Downloads
Note: All content downloadable from alpenglow.info is copyright protected and licensed under the Creative Commons-Attribution-NonCommercial-ShareAlike License Germany version 3.0.
See http://creativecommons.org/licenses/by-nc-sa/3.0/de/
Photo Diskspace statistics script
I have written a script that crawls your photo disk and creates statistics on how many megabytes were added each month, broken down by file format and camera. This is really useful for predicting how much disk space you will need in the future, especially when deciding which hard disk to buy. I did so in 2009 and 2010, and of course it made me buy a larger disk than I had originally planned.
Prerequisites
You will need two things to run this script successfully:
- A Python interpreter. I have tested versions 2.5 and 2.6, and both worked for me. Visit the Python.org home page and download the package suited to your operating system. Install it following the instructions. I wouldn’t expect the script to work with Python 3.
- Phil Harvey’s excellent Exiftool. Download the executable version for your operating system. Install according to the instructions.
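A quick way to confirm that Exiftool can actually be started is sketched below. This is not part of the downloadable script: it is a separate helper written for Python 3, the function name exiftool_version is made up for illustration, and the path passed to it is a placeholder for wherever Exiftool lives on your machine. It relies only on Exiftool's -ver option, which prints the version number.

```python
import subprocess
import sys


def exiftool_version(exiftool_path):
    """Return Exiftool's version string, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [exiftool_path, "-ver"],      # -ver prints the Exiftool version
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,                   # give up if the tool waits for a keypress
        )
    except (OSError, subprocess.TimeoutExpired):
        # Executable missing, not runnable, or hanging.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    # Placeholder path: replace with the location of your own Exiftool.
    print("Exiftool:", exiftool_version("exiftool"))
```

If the second line prints None, the path is probably wrong or the executable cannot be started from there.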
Download
Click here to download my script: photoSizeHistogram.zip
Instructions for use
After downloading and installing the two prerequisite packages as detailed above, download the statistics script, unzip it, and put it somewhere on your disk where you will find it again. Caveat: We are going to use the command line for this, so fasten your seat belt! The following steps worked for me:
- First, open the photoSizeHistogram.py script using a text editor. We need to tell it where it can find the Exiftool we have just installed. Look for the line that reads
exiftool = 'F:/Fotos/exiftool.exe'
somewhere close to the top of the script. Edit the path to point to the Exiftool on your computer. Warning: If your exiftool executable is called “exiftool(-k).exe” or similar, please rename it to exiftool.exe, else the tool will pause waiting for a keypress after every invocation. This is useful for drag and drop usage, but not for the automation we are intending to do.
- Now, open a command line, change to the top-level directory of the photo tree you want to calculate the statistics for, and run the script from there, redirecting its output to a CSV file. For me, this looks like this:
f:\Fotos> c:\apps\python2.6\python.exe f:\scripts\photoSizeHistogram.py > statistics_2011.csv
Adapt this to your local environment, and make sure to enter everything on one line. This will run the script, which will emit a progress message each time it enters a new directory, and will create a CSV file called statistics_2011.csv in the photo directory. As it is actually extracting EXIF data from all photos, this can take some time. It ran for 3 hours on my old PC with about 100,000 files.
- Open the created CSV file with the spreadsheet of your choice (I use OpenOffice), and feel free to visualize in any way you like. The CSV file contains one column per camera and file type combination detected, and one line for each month where data was found. The numbers are the megabytes created in the corresponding month.
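If you would rather get a quick projection without a spreadsheet, the CSV can also be summed up with a few lines of Python. The following is only a sketch under assumptions: it is written for Python 3 (so it is independent of the script itself), it assumes a comma-separated file with a header row, the month in the first column and megabytes in the remaining columns, and it treats empty cells as zero. The function name summarize and the sample data are made up for illustration.

```python
import csv
import io


def summarize(csv_file):
    """Return (total megabytes per column, average megabytes added per month)."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = header[1:]                  # skip the assumed month column
    totals = dict.fromkeys(columns, 0.0)
    months = 0
    for row in reader:
        if not row:                       # ignore blank lines
            continue
        months += 1
        for name, cell in zip(columns, row[1:]):
            cell = cell.strip()
            if cell:                      # empty cell: nothing added that month
                totals[name] += float(cell)
    grand_total = sum(totals.values())
    average = grand_total / months if months else 0.0
    return totals, average


# Made-up sample data in the assumed layout, not real measurements.
sample = io.StringIO(
    "Month,Camera A NEF,Camera A JPEG\n"
    "2010-01,1200,300\n"
    "2010-02,800,\n"
)
totals, average = summarize(sample)
print(totals)
print("average MB per month:", average)
print("projected MB for the next 12 months:", average * 12)
```

To run it on real output, open the file with open("statistics_2011.csv", newline="") and pass the file object to summarize. If your file uses a different delimiter or layout, the reader call and the column handling need adjusting.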
Configurability
This script is still pretty basic, but if you do not use a Nikon DSLR, you will want to change the line that contains the valid file extensions that the script will try to open. Use the text editor and change the line
valid_extensions = ["nef", "jpg", "cr2"]
to list your own file name extensions. You might also want to explore different axes, and change the EXIF tags that are used to create the columns of the output. They are listed in the line with
bin_tags = ["Camera Model Name", "File Type"]
For example, if you do not care about the file type and just want to sum up per camera, just change the line to read
bin_tags = ["Camera Model Name"]
and you should be fine.
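To make the effect of these two settings more concrete, here is a small sketch of how extension filtering and binning by tags could work in principle. It is not the code from photoSizeHistogram.py: the function names, the in-memory photo list and the camera name are hypothetical, it is written for Python 3, and it assumes a megabyte of 1024 * 1024 bytes. Only valid_extensions and bin_tags are taken from the script.

```python
from collections import defaultdict

valid_extensions = ["nef", "jpg", "cr2"]
bin_tags = ["Camera Model Name", "File Type"]


def has_valid_extension(filename, extensions=valid_extensions):
    """True if the file name ends in one of the listed extensions (case-insensitive)."""
    _, dot, ext = filename.rpartition(".")
    return bool(dot) and ext.lower() in extensions


def bin_sizes(photos, tags=bin_tags):
    """Sum megabytes per month and per combination of tag values.

    `photos` is a list of (month, size_in_bytes, metadata_dict) tuples,
    a hypothetical stand-in for what Exiftool would report per file.
    """
    table = defaultdict(lambda: defaultdict(float))
    for month, size_bytes, metadata in photos:
        # One output column per distinct combination of the chosen tag values.
        column = " ".join(metadata.get(tag, "unknown") for tag in tags)
        table[month][column] += size_bytes / (1024.0 * 1024.0)
    return table


# Made-up sample data for illustration only.
MB = 1024 * 1024
photos = [
    ("2010-01", 10 * MB, {"Camera Model Name": "Camera A", "File Type": "NEF"}),
    ("2010-01", 2 * MB, {"Camera Model Name": "Camera A", "File Type": "JPEG"}),
    ("2010-02", 12 * MB, {"Camera Model Name": "Camera A", "File Type": "NEF"}),
]

by_camera_and_type = bin_sizes(photos)
by_camera_only = bin_sizes(photos, tags=["Camera Model Name"])
print(dict(by_camera_and_type["2010-01"]))
print(dict(by_camera_only["2010-01"]))
```

With both tags, January is split into one column for NEF and one for JPEG. With the camera tag alone, the two are added together in a single column per camera.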