Backing up data means making safety copies, usually on set. It is a safety measure, designed to ensure that a duplicate of the camera original exists should something happen to the original files. As such, this type of backup is not organized in any fashion beyond the order in which the material happened to be captured while shooting. A backup is really just a spare copy of the original.
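A spare copy is only useful if it actually matches the original. As a minimal sketch of that idea, the Python below copies one file and compares checksums before trusting the copy. The function names, the choice of SHA-256, and the folder layout are illustrative assumptions, not a description of any particular on-set tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large camera files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy one file into backup_dir and confirm the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

Note that the copy lands in the backup folder under its original name and in shooting order. Nothing here organizes the material, which is exactly the difference between a backup and an archive.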
By contrast, an archive is an organized library of content, stored on stable, long-term media. Spinning hard drives, which are prone to eventual physical decay and damage, are not suitable long-term storage media. Material must be organized with a searchable structure. There should be a searchable system of added commentary data, informational tagging, logging, metadata, and color correction handles if the footage has traveled through a post process.
An archive should not merely be considered storage of media after a production is completed. The use of a proper library storage system for an ongoing production increases efficiency and protects the media assets.
Types of Archiving for an Ongoing Production
Online storage is for a production that requires constant and instant access to all material. An example would be a long form documentary with various elements gathered over an extended period of time. This type of online storage is more expensive to maintain but fast to use.
Near-line storage is used for a production that might require routine access to archived material for reuse or reference. An example would be a TV series with standard "B-roll" or interstitial footage. This material would need to be available on media that is quickly accessed but which might not be available in real time, requiring a few seconds or minutes to access or compile. Near-line storage is much less expensive and more stable than an online archive.
Offline storage utilizes material generally not accessible in real time or full resolution, but which is on stable media. An example of this would be legacy footage from a TV series' previous seasons, which generally would not be needed on a routine or ongoing basis. It would still be organized and considered an active library, but access would require greater time and effort. The advantages of offline storage are lower cost and high stability.
For longer-term storage, production media is not appropriate. Production materials should be transitioned to suitable archival media, and the production media should be considered transitory.
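The three tiers above come down to two questions: does the production need the material instantly, and if not, is it reused routinely? The sketch below captures that decision in Python. The `Tier` and `choose_tier` names and the two yes/no inputs are simplifying assumptions made for illustration; a real facility would weigh cost and capacity as well.

```python
from enum import Enum


class Tier(Enum):
    ONLINE = "online"
    NEAR_LINE = "near-line"
    OFFLINE = "offline"


def choose_tier(needs_instant_access: bool, routinely_reused: bool) -> Tier:
    """Pick a storage tier from how the production expects to use the material."""
    if needs_instant_access:
        return Tier.ONLINE      # constant, instant access to everything
    if routinely_reused:
        return Tier.NEAR_LINE   # quick access, but a short wait is acceptable
    return Tier.OFFLINE         # organized and stable, slower to retrieve


# The three examples from the text:
print(choose_tier(True, True).value)    # long-form documentary elements
print(choose_tier(False, True).value)   # standard B-roll for a TV series
print(choose_tier(False, False).value)  # footage from previous seasons
```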
Here are a few common types of production media, along with the reasons they don't work as archival media.
Hard Drives
A hard drive stores data on spinning platters, with a moving arm that positions the heads to read and write the media. While hard drive technology has improved over the years and ruggedized models are available, all hard drives have moving parts and will eventually succumb to mechanical failure.
SSD (Solid State Drive)
With no moving parts, SSDs are more resistant to mechanical failure than hard drives, but they do have an eventual failure rate that may surprise many. SSDs are rated for a finite number of write (program/erase) cycles, after which their ability to store content is jeopardized. They are also rather expensive per gig as a storage medium.
Solid State Cards (SxS, P2, SD, CF, etc.)
Cards are the capture media for many current camera systems. This is efficient on set as they are relatively small and robust around the camera. However, they are a transitory form of storage, and once the data moves away from the on-set environment, it should move off these on-set storage devices as well. Cards are inefficient, often with relatively slow read/write capabilities and limited capacity. They are not an organized form of media, as they often contain the original camera files in the order they were shot, with no later reorganization in terms of the actual content on them. There is also no consideration for additional production or post-production notation on the media, such as logging or color correction.
Cards will eventually fail, as they need to physically be inserted into and extracted from card readers, putting wear onto their electrical contacts and mechanical housings. Card media is perhaps the most expensive per gig option as a long-term storage medium. Their small capacity means they become a physical library that must be managed and stored. Files organized electronically stay that way until someone chooses to move them, while files physically organized by stacks of cards, drives, tapes, etc. must be actively maintained.
Video Tapes
The "Old Reliable" of storage media actually isn't so reliable at all. Tapes degrade over time just sitting on the shelf, with additional wear occurring during every usage. Video tape data retrieval is a very inefficient use of time, as one must shuttle through a tape to the appropriate portion and then play back material in real time in order to access it. Tapes also require the use of an expensive deck, which also must be maintained. These physical formats all expire over time, requiring one to either maintain an elderly legacy format that is no longer supported by the manufacturer or invest massive time and expense in transitioning to a new storage format. Like cards, tapes must be stored in a physical library system that must be actively maintained.
Appropriate Options for Archival Media
LTO-5 -- Data-on-Tape
LTO-5 stores approximately 1.5 terabytes of data on a tape that costs less than $100. This is equal to about 1,000 minutes of ProRes 422HQ material. Content needs to be transferred off the tape and reconstituted before use, so LTO-5 is most appropriate for offline storage. Current generations of LTO-5 systems allow material to be accessed as individual files through a directory system, instead of requiring an entire data tape to be reconstituted, which greatly speeds up data access. This system, called LTFS (Linear Tape File System), makes the tape appear as a virtual hard drive, with a standard directory of the file contents stored within. This allows for simple "drag and drop" file access. LTO is developed cooperatively by the companies in the LTO consortium, and the LTFS format is openly published, so that no system becomes proprietary and universal support is encouraged.
Unlike other systems of data storage discussed in this article, LTO-5 enjoys widespread use as IT informational storage. It is used in business, finance, medical and other industries, so as a format, it is strongly supported and does not appear to be going away anytime soon. LTO-5 tapes have a long life expectancy, as does the format itself.
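Because LTFS presents a tape as an ordinary directory tree, standard file tools can list its contents. As a hedged sketch, the Python below walks a mounted volume and returns each file's path and size. The mount point is a hypothetical path chosen for illustration, and nothing in the function is specific to LTFS; it would behave the same on any mounted folder.

```python
from pathlib import Path


def list_volume(mount_point: Path) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under mount_point."""
    entries = []
    for path in sorted(mount_point.rglob("*")):
        if path.is_file():
            relative = path.relative_to(mount_point).as_posix()
            entries.append((relative, path.stat().st_size))
    return entries


if __name__ == "__main__":
    # Hypothetical mount point; substitute wherever the tape is mounted.
    tape = Path("/mnt/ltfs")
    if tape.exists():
        for name, size in list_volume(tape):
            print(f"{size:>15,d}  {name}")
```

A listing like this can be saved alongside the project notes, so that the contents of a shelved tape are known without mounting it.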
Sony XDCAM Optical Disc
Optical discs have minimal moving parts and are sealed within their protective jackets, utilizing laser light to read from and write to the discs. This makes them less sensitive to environmental damage than LTO-5 tape. Currently, the discs have a 128 GB capacity, equal to about 80 minutes of ProRes 422HQ. While XDCAM is a proprietary Sony codec, the optical disc system can be used to store any file format, making the system format-agnostic. A benefit of an XDCAM video file is that it can play back instantly from the disc. Other content stored on an XDCAM disc must first be downloaded or transcoded. Compared to LTO-5, XDCAM optical disc storage has a higher cost per gig and a smaller storage capacity per unit, but it offers faster access to media, which makes it more suited to near-line storage.
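The running times quoted for LTO-5 tape and XDCAM disc can be checked with simple arithmetic. The sketch below assumes a data rate of 220 Mbit/s, Apple's published target for ProRes 422 HQ at 1920x1080 and 29.97 frames per second, and treats capacities as decimal gigabytes. Both are assumptions added here; other frame sizes and frame rates will shift the results.

```python
# Assumed data rate: ProRes 422 HQ target at 1920x1080, 29.97 fps.
PRORES_422_HQ_MBIT_PER_S = 220.0


def minutes_of_footage(capacity_gb: float,
                       mbit_per_s: float = PRORES_422_HQ_MBIT_PER_S) -> float:
    """Convert a capacity in decimal gigabytes to minutes of footage."""
    capacity_mbit = capacity_gb * 1000 * 8  # GB -> MB -> megabits
    seconds = capacity_mbit / mbit_per_s
    return seconds / 60


for name, capacity_gb in (("LTO-5 tape", 1500), ("XDCAM optical disc", 128)):
    print(f"{name}: about {minutes_of_footage(capacity_gb):.0f} minutes")
```

Under these assumptions the tape holds roughly 909 minutes and the disc roughly 78, in line with the rounded figures of about 1,000 and 80 minutes given in the text.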
Taking Archive Storage to the Next Level: Asset Management
By organizing a library system, files can be put to work on an ongoing basis. Production content no longer needs to be thought of as one-time use raw material, but instead as assets to be utilized for future production, and as a revenue-generating property or commodity. Advanced systems for archive storage offer proxy video, thumbnails, tags, video windows and other functionality for content seekers to make intelligent searches through an ever-growing database.
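To make the idea of a searchable library concrete, here is a minimal sketch of a tag-based catalog using Python's built-in sqlite3 module. The table layout, function names, and sample entries are hypothetical and stand in for what a commercial asset management system would provide, along with proxies and thumbnails that this sketch does not attempt.

```python
import sqlite3


def build_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with one table of assets and one of tags."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE assets (
            id INTEGER PRIMARY KEY,
            filename TEXT NOT NULL,
            location TEXT NOT NULL  -- e.g. which tape or disc holds the file
        );
        CREATE TABLE tags (
            asset_id INTEGER NOT NULL REFERENCES assets(id),
            tag TEXT NOT NULL
        );
        CREATE INDEX idx_tags_tag ON tags(tag);
        """
    )
    return conn


def add_asset(conn: sqlite3.Connection, filename: str, location: str,
              tags: list[str]) -> int:
    """Register one file and its tags; tags are stored in lowercase."""
    cursor = conn.execute(
        "INSERT INTO assets (filename, location) VALUES (?, ?)",
        (filename, location),
    )
    asset_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO tags (asset_id, tag) VALUES (?, ?)",
        [(asset_id, tag.lower()) for tag in tags],
    )
    return asset_id


def find_by_tag(conn: sqlite3.Connection, tag: str) -> list[str]:
    """Return the filenames carrying the given tag, in alphabetical order."""
    rows = conn.execute(
        "SELECT a.filename FROM assets a "
        "JOIN tags t ON t.asset_id = a.id "
        "WHERE t.tag = ? ORDER BY a.filename",
        (tag.lower(),),
    )
    return [row[0] for row in rows]
```

With a catalog like this, a producer looking for reusable material could call `find_by_tag(conn, "b-roll")` and learn both which files qualify and, with a small extension to the query, which tape or disc to pull.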