I managed to get all my media collated onto a single set of disks. There are two 3TB RAID-1 disks, and I have a backup on a couple of other disks. Still, this all feels pretty fragile. I’m going to need to figure out a richer backup strategy.
It totals 2.2TB of heavily duplicated data (still, a lot less than the 6TB I thought it was). However, there are a couple of disappointing holes in my archiving. A bad interaction between Symantec’s PGP WholeDiskEncryption and Apple’s TimeMachine, combined with Apple MigrationAssistant’s occasional loss of local mail storage, is to blame. In particular, I’m missing nearly all of 2009’s mail; various recovery strategies are underway. My archives from the DOS years 1985-1999 and the GNU/Linux years 2000-2008 are a lot better than those from the OS X years 2009-present. All my actual work files (and their history) 2004-present are safe in our SVN repository, but still, I hate to lose even a single byte.
The failure rates are interesting as well. Of the 18 hard disks, one (a 2.5″ drive pulled from a laptop c. 2005) had a serious hardware problem. Of the 53 zip disks, all but 4 had I/O errors, but the 67 CDs had only 2 I/O problems. The 157 1.44MB 3.5″ floppies were in between, with 22 damaged. ddrescue is of course the right tool for salvaging whatever is still readable, but if you lose even a few bytes of a PGP-encrypted file, or a bzip2, gzip, or zip compressed file, that data is lost.
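To put those counts side by side, here is a small Python sketch that turns them into percentages. The numbers are the ones quoted above; the dictionary layout and the `failure_rate` helper are just my own illustration, and "failed" here means "had at least one error", not "was a total loss".

```python
# Counts from the paragraph above: (total media, media with errors).
media = {
    "hard disks": (18, 1),
    "zip disks": (53, 53 - 4),  # "all but 4" had I/O errors
    "CDs": (67, 2),
    "3.5-inch floppies": (157, 22),
}


def failure_rate(total, failed):
    """Return the percentage of media that showed at least one error."""
    return 100.0 * failed / total


rates = {name: failure_rate(total, failed)
         for name, (total, failed) in media.items()}

# Print from most to least reliable.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} {rate:5.1f}%")
```

Roughly: CDs come out around 3%, hard disks around 6%, floppies around 14%, and zip disks over 90%.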
Total so far: 7.5M files in 900K directories. However, this includes zillions of zip and tar files which I need to unpack before indexing. More to come.
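For the unpacking step, something like the following Python sketch could list the archives that still need attention. This is not the script I actually use; `find_archives` is a hypothetical helper, and it relies on the standard library's `zipfile.is_zipfile` and `tarfile.is_tarfile`, which inspect file contents rather than trusting the extension.

```python
import os
import tarfile
import zipfile


def find_archives(root):
    """Yield paths under root that look like zip or tar archives."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if zipfile.is_zipfile(path) or tarfile.is_tarfile(path):
                    yield path
            except OSError:
                # Unreadable file (permissions, I/O error): skip it here
                # and deal with it separately.
                continue
```

Usage would be along the lines of `for path in find_archives("/some/archive/root"): print(path)`, where the path is a placeholder. On a tree with millions of files this opens every file once, so expect it to be slow.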