I’ve gathered some thoughts about storage issues in a production group environment. But let’s face it: anyone who uses a computer deals with some of these, so cherry-pick the ideas that improve your digital life.
Remember the last time you collaborated with others on the same document and wondered: Who has the most recent version? How do I know the changes I’ve made won’t get lost in the shuffle? This is the problem of version control. The promise of networked storage is that everyone sees the same thing at the same time. But if you work across more than one drive, or your collaborators “Save As…” often, you’ll spend a lot of time and energy tracking versions. Some software (such as ProTools) forces the issue by allowing only one Write authorization at a time; if anyone else needs to work simultaneously, the group ends up making changes in more than one place. Dividing up the work along clear lines, and providing rules for how the pieces get recombined, are the key workflow issues in these cases.
If your files are low bandwidth — text, pictures, compressed media — then you have lots of great options for networked storage. If your files are high bandwidth — uncompressed audio, video, CAD — then you may find it more cost-effective to store them close to you rather than remotely. Fat pipes are expensive over a long haul. Even the most expensive three-foot cable is a sunk cost; you pay for it once. Remote storage tends to have ongoing costs, both for connectivity and for the storage itself. People who work with high-bandwidth media files don’t just move them around; we also play them. If you can’t play from your remote storage because of bandwidth limitations, you will copy to local storage and create version control overhead. Some software (ProTools again) can’t read from or write to low-bandwidth storage at all.
What happens when something goes wrong? If “The Show must go on,” then relying only on remote storage could pose a problem. It’s easier to power-cycle a local device, right? Granted, there are now remote storage solutions backed by an army of people smarter than I am who can keep the lights on. But that army comes at a cost. And if I don’t back up my data remotely, I can have the opposite problem: a failure of the stuff right next to me (a power outage, say) stops work with no alternative. So some combination of local and remote storage is important. Make sure to decide which storage location is Primary. More often than not, primary local storage has helped me meet deadlines because I maintain control of it.
Your personal privacy is one thing. But if you work with data that gives your team an advantage, or data whose loss would benefit your competitors, then you need security. Local networks have always offered a security advantage, but remote storage solutions have gotten better and better, to the point where I no longer find security a compelling argument one way or the other. I do think it is important to understand the security context in which you work and to take appropriate steps to protect files that need to remain secure.
Where did I put that project? Naming conventions go a long way toward helping you find what you need, but if you work with a high volume of projects, even a clever naming system may not do the trick. A Search Key is a unique identifier that distinguishes one item from every other. Think Social Security number. I like to use a five- or six-digit number at the end of every project folder to identify it. All other references to that project — invoices, documentation, correspondence, etc. — include that code, so I can get to the correct data set quickly.
Project Life Cycle
In my experience there are different ways we keep files safe from failure by machines and people. Let’s differentiate Safety, Backup and Archive, because these terms are so often used interchangeably.
Safety – This is a running copy made while work is underway. During a recording session in years past, for example, this might have been a second recorder; these days it’s a redundant array that ensures a failure of the primary storage doesn’t nuke the work happening right now. In other words, if the primary storage fails, work doesn’t stop.
Backup – At periodic intervals (daily, say) and/or at significant milestones, a copy of the work is put somewhere else. If primary storage fails, you’ve lost only the work done since the last backup. A backup shouldn’t cause version control problems, because you don’t use it for anything but protection. This copy is like a firehouse: you hope you never need it, but when there’s a fire to put out, you are very glad to have it.
Archive – Projects go dormant over time; I move files off of active storage after six or more months of inactivity. But that doesn’t mean the client won’t contact me in two years to revisit the material. The process of deep-sixing data for use years later is known as archiving. I’ll tidy up the files and try to re-format or re-configure things in an effort to future-proof them. I like to gather all related documentation, screenshots and anything else that can help someone use the data again after we’ve all forgotten the particulars, then put that material in a master folder. Next I zip the whole thing, compute an MD5 hash of it and copy both to two different external drives. Every three to five years the zipped archive and its MD5 hash move to a fresh set of drives.
Follow-up articles: Business Continuity and Disaster Recovery, Media Storage Flavors
See also Rob Schlette’s excellent article 5 Steps for More Dependable Hard Drives at theProAudioFiles.com.