About this Blog

Photo Writing is the web version of the Photo Writing mini-magazine produced by Limephoto and Emil von Maltitz since 2010. As of 2015 it is now completely online. Feel free to browse through the articles and please leave comments in the comments section if you would like to engage with us.

Friday, November 2, 2018

A Brief Introduction To Storage and Backup


As photographers we are now well past the days when we stored all our old film negatives in a shoe box in the closet. Now, instead of several hundred photographs to sift through on a Sunday afternoon, we have tens of thousands of images that we need to sort through, store and somehow back up. A concern I often hear from the photographers I engage with is how to store all these images. More than just storing the images, how do we keep them future-safe and back them up in case of loss or damage?



I recently went through yet another hard drive upgrade as I ran out of space for the umpteenth time since I started down the road of digital image capture and storage (to be fair I only really need to consider an upgrade every three years or so). Now there is a glut of articles on the internet on best-practice Digital Asset Management (DAM for short). The problem with them is that they are often extremely complex, or extremely expensive. Ultimately, after going through a number of articles as well as attempting a book on video libraries (David Austerberry’s “Digital Asset Management”), I have come to the conclusion that there is no perfect workflow for storing visual data. The best system is the one that works for you! Which means you have to build it yourself in a way that makes sense to you.

First, what are the criteria that we need to consider? Leaving aside the searchability of an image or clip, the files themselves need to be 1) readable into the future, 2) easily accessible in the here and now, 3) backed up, 4) simple to manage and 5) affordable, up to a point.

DAM Map

Future Proof

The problem is that data files change with time, as do the physical means that we have for reading them. For example, who still uses CD-ROMs for storing images (and on that point, who remembers using floppy disks with their ‘extraordinarily large’ 1.44MB of data capacity)? The newer laptops and computers don’t even have a DVD/CD drive anymore. Yet, 15 years ago the DVD was considered one of the most stable forms of data storage. Then there are the file types themselves. We seem to have settled down a little now, but there is still a plethora of imaging file types that may or may not withstand the test of time.

My personal take has been to continue using the native RAW format as one version of the backed-up files that I keep. I know that Adobe brought out their RAW format DNG as a supposed future-proof file format, but considering the number of changes it has gone through in the last decade, I am not convinced it is any more future-proof than say Nikon’s NEF or Canon’s CR2 files. Should any of the major camera manufacturers ever decide to kill their RAW formats, it would not be particularly difficult to convert an existing library (or libraries) from one format to another using Photoshop or any other bitmap editor (and probably most RAW editors too).

So the answer, for me at least, is to also stick to what are considered the standard image formats. These would be TIFF and JPEG files. Their ubiquity means that they are likely to be around for a long time yet, which makes them the safest formats to store in. The fact that there are already better compressed formats than JPEG, yet JPEG remains in everyday use, is an indication that moving away from it is going to be a tall order. These formats will move on… eventually. In the meantime they are far safer than something like .PSD, since Adobe have now made that format somewhat more proprietary with their Creative Cloud deal.

Accessible

One of the most popular options for many photographers is to use a hard drive array like those offered by Drobo, G-Tech, Western Digital and many others. These all work fairly well and also give you options as to how to utilise the hard drive space. For instance, a mirrored RAID setup essentially halves the capacity but keeps a duplicate of everything on a second drive. This means that a two-drive array offering 12TB in total can be set up so that it actually stores only 6TB, but 6TB that is backed up at the same time, so that if one drive fails you don’t lose any data. Instead, you replace the failed drive with a fresh one and the software rebuilds the backup. This is the major selling point of arrays like the Drobo 5C. A huge plus to the use of an array (aside from the automatic backup) is that you get fast access to a huge number of files through one tethered cable. In the case of most high-end arrays you will be using a Thunderbolt or USB-C cable, which makes the array operate essentially as if it were an internal hard drive (i.e. it’s fast).
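The capacity trade-off described above can be sketched in a few lines of Python. This is only a rough illustration of the simplest case, a straight mirror across identical drives; the function name and its parameters are made up for this example, and real arrays (Drobo’s included) use their own schemes that will give different numbers.

```python
def usable_capacity_tb(drive_sizes_tb, mirrored=True):
    """Estimate usable space for a very simple array.

    mirrored=True  -> every drive holds the same data (a plain mirror),
                      so usable space equals the smallest drive.
    mirrored=False -> the drives are simply added together, with no redundancy.
    """
    if not drive_sizes_tb:
        raise ValueError("need at least one drive")
    if mirrored:
        return min(drive_sizes_tb)
    return sum(drive_sizes_tb)


# Two 6TB drives: 12TB of raw space, or 6TB once mirrored.
print(usable_capacity_tb([6, 6], mirrored=False))  # 12
print(usable_capacity_tb([6, 6], mirrored=True))   # 6
```

In other words, the mirrored array in the example gives up half of its space in return for surviving a single drive failure.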

The good news is that with the move away from CDs and DVDs as backup, accessibility has become easier. It is now possible to access hundreds of thousands of images through one cable connected to the computer. In fact, if you are happy to accept a loss in performance, you can access imagery wirelessly through several types of wireless routers and servers. Using any connection interface from FireWire (now completely phased out but still available at massively reduced prices for photographers trying to get their foot in the backup door) through to the new USB-C also makes for fast data transfer.

The downside is cost. A four-bay array can cost in the region of R8000 (US$550) and that’s without any hard drives added. Usually you are looking at a startup cost of over R20,000 for the initial array. This still doesn’t negate the need to back up though. Essentially, critical files should still be backed up to a separate hard drive. A separate hard drive is important because of things like drive enclosure failure. It is still possible for the enclosure to go haywire and corrupt the drives. It’s unlikely, but not impossible, as I discovered a few years ago with a portable drive of mine that went on the fritz during a shoot in Johannesburg. I didn’t lose anything because I had a backup (in fact the drive was the backup and the original files were still on my laptop’s drive and I hadn’t overwritten my CF cards).

Backup!

The basic idea is to follow the rule of 3, 2, 1. To wit, you need three copies of a file. Two of these copies can be on site on separate drives (hence two hard drives), and then one copy online. If you follow the most basic of backup regimens (an original file, a backed-up file on a separate hard drive and an online file), you are pretty much covered for the life of the image. If a hard drive crashes, you have the physical backup drive. If the house burns down or you are burgled and lose both hard drives, then there is the online version of the image to fall back on.
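For those who like to see a rule written down as a checklist, here is a minimal Python sketch of the 3, 2, 1 idea as described above: at least three copies, at least two of them on separate on-site drives, and at least one off site. The `Copy` class, the function name and the drive labels are all invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    device: str    # hypothetical label, e.g. "working drive" or "online gallery"
    offsite: bool  # True for an online or off-premises copy


def satisfies_321(copies):
    """Return True if the copies meet the 3, 2, 1 rule as described in this article."""
    devices = {c.device for c in copies}
    onsite = {c.device for c in copies if not c.offsite}
    offsite = {c.device for c in copies if c.offsite}
    return len(devices) >= 3 and len(onsite) >= 2 and len(offsite) >= 1


basic_regimen = [
    Copy("working drive", offsite=False),
    Copy("backup drive", offsite=False),
    Copy("online gallery", offsite=True),
]
print(satisfies_321(basic_regimen))      # True
print(satisfies_321(basic_regimen[:2]))  # False: nothing off site
```

The second call fails the check because two drives in the same house are both lost in a fire or a burglary.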

The problem is that we tend to have tens of thousands of images these days that need to be backed up somehow. That is a lot of files and a lot of data. Unfortunately this means that image storage is going to cost money, more than just the cost of a separate hard drive. For those who are adamant about storing every single file they own, there is the expensive route of simply obtaining a server with several terabytes of storage. The tricky part is that you actually need two servers, or at the very least an enormous server that can be partitioned with RAID.

Cost and complexity kept getting in the way when I designed my storage system. Ultimately I didn’t get a RAID system as not only were they extremely expensive (sometimes unobtainable in South Africa as well), but they would still need to be backed up on a separate hard drive. Couple this with the fact that once you select a server system you end up being tied to it, and I wanted a system that was slightly more flexible. The answer I ended up with was to invest in several dedicated FireWire hard drives (I am slowly moving over to Thunderbolt). Each drive (the smallest is 3TB, the largest 12TB split between two drives) is responsible for one thing. My large Thunderbolt 12TB drive only has RAW files that I am currently working on. This is backed up automatically to a second, older FireWire drive. The system of dedicated drives has already saved my image library on two occasions in the last five years.

Exported images are housed on a separate drive which I have named my Output Library. This too is backed up automatically to a separate drive (it doesn’t have to be an expensive drive either, incidentally). All images that go into the Output Library are also sent to the internet as high-res Adobe RGB JPEG files.

Regularly culling and backing up images is a chore. It needs to be done fairly easily and without headache (making life simple, as required in my criteria for a backup regimen). Like most people I started with manual copy and paste from one drive to another. Clearly this is not a good idea. Eventually I settled on Intego Backup Manager Pro. What I liked about it was the ease of use and the fact that I could create actions that would back up individual folders as opposed to whole drives. There are other software packages that do the same thing, maybe even better. For me, it was what was accessible, easy to use and didn’t cost a kidney. Another nice thing about letting a programme do the heavy lifting for you is that you can opt to set it to overwrite or to make incremental backups.
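To give an idea of what a folder-level incremental backup involves, here is a minimal Python sketch using only the standard library. It is not how any particular backup programme works internally; the function name and the example paths are made up, and it assumes the destination drive preserves file modification times. It copies a file only if it is missing from the backup folder, or if the copy there is older or a different size.

```python
import shutil
from pathlib import Path


def backup_folder(source, destination):
    """Incrementally copy one folder to another.

    Files already present in the destination with the same size and an
    equal or newer modification time are skipped. Returns the relative
    paths of the files that were actually copied.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = destination / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                continue  # backup copy is already up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also copies the modification time
        copied.append(rel)
    return copied


# Hypothetical usage, one call per folder you care about:
# backup_folder("/Volumes/Working/RAW", "/Volumes/Backup/RAW")
```

Running it a second time straight after the first should copy nothing, which is the whole point of an incremental backup. Note that this sketch never deletes anything from the backup folder, so files you cull from the working drive stay in the backup until you remove them yourself.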

The last bit of my ‘system’ is that RAW files eventually get archived onto yet another drive, or drives actually. This means they get shifted from my large 12TB drive, which does all the heavy lifting, to smaller, slower drives that rarely get accessed. When a shoot is complete and I am unlikely to go back over the files, the folder of images is archived. These archive folders can be backed up, but this is less important than keeping fresh backups of finished images which are delivered to clients.

I am by no means an expert when it comes to backing up data, or even in creating a workable DAM system. However, I am continuously asked what it is that I do to back up images. In an ideal world my system would be based on larger, more complex hard drive servers. If money weren’t a consideration I would be working off a multiple-drive array like the G-Speed Shuttle SSD for my heavy lifting. I’d back this up to a slower array like the G-Speed Shuttle XL or the Drobo 5C. The problem is that this system would cost in excess of R145,000 (approximately US$10,000). The system of drives that I have put together over the years is scalable as my needs grow and, more importantly, is literally a sixth of the price. Another important point is that by using smaller drives as opposed to one large array, if a physical incident occurs that damages the unit and compromises the data, I lose only a small portion of my entire library as opposed to the whole thing. Considering that home and office theft is a real issue around the world (and very much a salient point where I am writing this from in South Africa), inexpensive smaller drives are less likely to be stolen than an expensive-looking enclosure with drives. Ultimately the best protection against data being lost through theft is to have an offsite copy of your image files. The bare minimum in that case (which is pretty much what I subscribe to) is to have an internet copy of your finished files.

The Cloud

Using the internet is going to cost money. The good news is that the amount it costs is actually quite reasonable when you consider how many images can be stored. The downside is that if you want to store anything except your finished JPEG files, the costs ratchet up steeply. This is understandable when you consider that a RAW file from a camera like the Nikon D850 comes in at around 50MB (using lossless compression on a 14-bit file). Even JPEG files can end up being quite large despite being 8-bit files.
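A quick back-of-the-envelope calculation shows why RAW files push the costs up. The 50MB per file comes from the D850 example above; the figure of 10,000 images is just an illustrative round number, and the function name is made up for this sketch.

```python
def library_size_gb(image_count, mb_per_image):
    """Rough library size in gigabytes (using 1GB = 1000MB)."""
    return image_count * mb_per_image / 1000


# 10,000 RAW files at roughly 50MB each
print(library_size_gb(10_000, 50))  # 500.0
```

That is half a terabyte for every 10,000 frames, before a single backup copy is made. Swap in your own file sizes and image counts to see what your library would need.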

Cloud Storage solutions



I personally use SmugMug, and have done for the past 7 or 8 years. What drew me to their offering was that they allowed unlimited JPEG uploads. If anything the service has gotten better in the intervening years and I now have clients who rely on the way I deliver the images through SmugMug. They are by no means the only image storage site though, with notable competition from Zenfolio, PhotoShelter and Adobe Portfolio (when I first looked into online image storage the decision was essentially between SmugMug, Zenfolio and PhotoShelter). You can also use dedicated file storage sites like Dropbox, iCloud and Google (Amazon also offers a photo storage facility now).

The peace of mind of an online storage facility is immense. Yes, SmugMug only has JPEG files, but then all the images that I have on Getty have been uploaded as JPEG files and these have been sold around the world, sometimes as enormous billboard images. If you are truly concerned about having a 16-bit TIFF copy of all your best images, then consider an offsite backup of your finished files. This doesn’t require an enormous array of drives. All it needs is a portable drive that can be moved between sites easily.

Simplicity

This is where most people get hung up in creating their backup system, myself included. All the books, websites and blog articles end up becoming extremely complicated. In reality all you need is a simple programme that is easy to set up and that will then automatically back up your files. Some Mac users rely on Time Machine, and this does work well if your storage requirements fit on one drive. If you are running between multiple drives, then a third-party piece of software is maybe a better idea.

As mentioned above I use Intego Backup Manager Pro. It was nice and simple to set up and can do everything in the background without my worrying about it. This is important as it takes the complexity away from backing up files. If you really want to make life simple in terms of backing up, simply plug in a drive that backs up the drive you are working on. I would recommend creating a workflow system that you stick to in importing, editing and exporting images. I write about my post-shoot workflow in the article on this link. This was written in July 2014 and still holds true to my workflow in 2018.

Ultimately it doesn’t really matter what hard drives you invest in or how you manage your files within those drives. As long as you follow the 3, 2, 1 rule and automate the process as much as possible, backup will be simple and fairly secure. I have followed the system above for the last decade, and it has saved my images on two occasions. Unfortunately I have lost two sets of images in the last decade as well. Both of these losses happened when I didn’t follow the system mentioned above, as I was being hasty and downright careless. The big disasters though were crashed hard drives. On both occasions I didn’t lose my image files because they had been backed up as described.
