Preparing Digital Legacies

This post has its roots in a hachyderm (mastodon) exchange where SwiftOnSecurity asked if anyone knew anything about digitization services.  After a few back and forths, I realized I had a lot to say about the subject of Digital Legacy in general.  Hence, this post.

Mum

My mother, Claire Rodley, died in May of 2020, near the beginning of the pandemic.  She was born in 1932 and was 88 at her death.  She had a good run and left a fairly well-organized estate for me and my brother to disburse.

One of the things I took responsibility for as co-executor was dealing with Mum’s pictures.  She was the recorder of the family.  I don’t think my father ever took a picture in his life.  Oil on canvas was his medium.  Anyway, Mum left us with thousands of artifacts – prints of all sizes, 35mm and 110 slides and negatives, and a half-dozen VHS tapes.  About half of these were boxed in an orderly fashion.  The rest were in bags and boxes or in randomly sized albums with evocative names that were mostly unhelpful in placing them in time.  “Santo Domingo”, “John’s Kids” … You get the idea.

My goal in processing the pictures was to get them into a form where I could distribute a complete set of them to Mum’s three siblings and my four siblings organized such that they’d have a chance of finding things they were interested in.  Looking back, the process divided into 4 major tasks:

  • Collection
  • Digitization
  • Organization
  • Archiving and Distribution

So first up, getting it all in one place.

Collection

(Calendar time – 1 month)

Mum never did the digital photo thing so gathering the stuff together was pretty easy. It was all physical artifacts that Mum organized pretty well.  Taking pictures off the wall, digging prints out of albums, all pretty easy.  A stack of boxes that would sit next to my office chair for the length of the digitization project.  

Digitization 

(Calendar time – 18 months)

There are a lot of photo digitization services.  I compared services, then went eeny, meeny, miny, moe and picked ScanDigital.  (If you’re comparing services, I’m pretty sure ScanDigital and ScanCafe are the same company).  Having not done this before, my trust level was non-existent.  These were all unique assets, one-offs, and if ScanDigital lost them those images would be gone forever.  So I hedged my bets and sent off a couple of small batches.  Those went well so I sent off bigger and bigger batches.  But I never lost my skepticism.  I always waited until the previous batch came back before sending off a new one.  If ScanDigital suddenly lost its mind I only wanted to lose one batch in the process.

ScanDigital claims to accept albums and to return them to you as received but I never tested this.  Didn’t trust it and keeping the albums together wasn’t a priority since that organization of pictures mattered only to Mum.

In all, it took from March 2021 to November 2022 to get them all done.  

Looking back it seems like it took about a month from shipping the artifacts out to receiving the flash drives and artifacts back from ScanDigital.

One note about ScanDigital – they are always running a promotion for 25-50% off retail price so look for those.  If you send even one batch to ScanDigital without using a coupon code, you’re doing it wrong.  

Organizing for Distribution/Archiving 

(Calendar time – 2-3 months)

What I wanted to do for distribution was package the images with some sort of viewer that made discovery easy.  But I never found one.  What I ended up doing was making a simple file-folder structure that organized things by year, or set of years.  Sorting the images into those structures was not easy and certainly came out no more than 75% correct.  There were no dates on the artifacts, mostly, so it was all from our collective memory of when things happened.

This process took at least a couple of months of 2-3 hour sessions 3 or more times per week. The result was heartbreakingly imprecise but projects like this have a time limit on them where if you let them dangle unshipped, they never get out the door.  So at one point I simply had to say “close enough” and ship it.

I distributed sets of these photos to all interested parties on external SSDs with the expectation that I’d get 10+ years out of them – time enough for everyone to decide what they wanted to do with them.  Sadly, I didn’t check the data retention stats on the drives, and unpowered SSDs can start losing data in as little as a year.  This was a mistake.

Archiving

The physical artifacts all went back to my brother, a museum professional, to do “something” with.  Not wanting to get any of those boxes back, I didn’t ask what “something” was.  As for the digital archiving, that had to wait until I dealt with my own digital legacy.

Collection Part Deux

(Calendar time – 6 months)

My situation is very different from Mum’s.  She was all film, no digital photos and no video.  I was film for my first 30 years, 1960 to 1990.  A little late to the digital party, probably due to cost and enduring fondness for my 35mm camera.  So I had all the same types of physical artifacts: 35mm prints and slides, 110 slides, assorted print sizes in boxes and albums, along with disposable cameras and Hi8 and MiniDV tapes, one of which was stuck in a videotape recorder that would not power up and release it.  Along with that I had 40+ years of mostly unorganized digital assets on retired disk drives (100+ of those including a 5MB, 8-inch hard drive), memory sticks, flash drives, 3.5-inch and 5.25-inch floppies, backup tapes, online services, old phones and active machines.

A note about day-to-day life.  I’m not a total bozo.  I (have tried to) keep all my pictures available on my main machine under a single directory organized by when the batch was added to the collection.  So I wasn’t totally disorganized, but I also wasn’t alone on this journey.  I’ve had a significant other for the last 35 years and two digitally savvy children creating their own memories in their own way on their own devices.  And for most of this time I haven’t had to think about preserving these things beyond my own working life which for most of the time seemed endless.

All of this is complicated by the fact that several years ago I collected all the video I could find onto a network drive, then promptly hosed that drive.  (If you must know, the OS got corrupted, I removed the data drive, then accidentally Etcher’d a Raspberry Pi image onto it).  So I’m actually a bit of a bozo. Undoubtedly some unique content was lost there.  But … knowing that other copies of that content might still exist somewhere put the onus on the collection process to be as exhaustive and exhausting as possible.

Among the treasures this over-the-top data search discovered:

  • From the broken videotape recorder, in a 3 hour, 2-man operation with tape-splicing involved, I recovered a single 18 minute segment of Hi8 tape that recorded bath time for my then 2-year old daughter.  Totally worth it.
  • From a random hard drive, phone video of the Hanson Machine Gun Shoot, a ridiculous spectacle of loud noise, ammunition wastage and wanton fire-starting as tracer ammunition deflected off concrete and out into the woods.
  • From the years-old rolls of 35mm film I sent to TheDarkroom, a handful of pictures of a birthday party, our dearly departed Newf Klondike and some shots of Scituate Harbor taken from the nose turret of a B-24 in the years before riding around in WWII airplanes was proved to be a bad idea.

I scoured all the media and online services (Google Photos, YouTube …) we had for jpg, 3gp, mov, avi and wmv files, and boy, there were a lot of them out there.  I scooped them all into a single directory, organized by source, without worrying about whether or not I already had a copy of them.  Save that for the organizing phase.
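For anyone who would rather script the scooping than drag folders around by hand, here is a rough Python sketch of the idea. It is not what I actually ran; the function name `collect_media` and the folder layout (one subfolder per source under a single archive directory) are assumptions made up for illustration. It copies rather than moves, and it makes no attempt to spot duplicates.

```python
import shutil
from pathlib import Path

# The extensions named above; extend the set for other formats.
MEDIA_EXTENSIONS = {".jpg", ".3gp", ".mov", ".avi", ".wmv"}


def collect_media(source_root: Path, source_name: str, archive_root: Path) -> int:
    """Copy every media file under source_root into archive_root/source_name.

    The relative path inside the source is preserved, so two files with the
    same name in different folders do not overwrite each other.
    archive_root should live outside source_root.
    Returns the number of files copied.
    """
    copied = 0
    for path in sorted(source_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
            target = archive_root / source_name / path.relative_to(source_root)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also keeps file timestamps
            copied += 1
    return copied
```

You would call it once per source, for example `collect_media(Path("E:/"), "old_phone", Path("D:/everything"))`, where both paths and the source name are placeholders.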

Digitization

(Calendar time – 2 months)

I was a lot more trusting of ScanDigital the second time around, so I sent most of my physical artifacts off to be digitized in bigger batches, and they didn’t disappoint.

Organizing for Archiving

(Calendar time – several weeks)

Organization options haven’t improved as far as I can tell so it was back to the folder of years.  The biggest problem in organizing my pictures, as opposed to Mum’s, was de-duplication.  I collected every digital image into one big directory structure then backed this unexamined lump of hundreds of gigabytes of image data, dupes and all, up to regular DVDs in the likely case that I did something stupid in the organization process and lost something.

Since not losing data was the priority, I ended up with up to seven copies of some pictures as they were copied hither and yon and then gathered back in again.  The tool I used to deal with this was Duplicate Cleaner.  It was fast, allowed for detection based solely on file contents (not name/size), had a nice preview of found-duplicates and multiple ways of making the mark-for-deletion process way, way easier, such as “mark for deletion all dupes with a particular folder name”. Just as important, it warns you if you’re about to delete all the copies of a particular file.  Handy for people who have a habit of doing stuff like that. I never caught it flagging something as a duplicate that wasn’t one.
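To make “detection based solely on file contents” concrete, here is a minimal Python sketch of the general idea. To be clear, this is not how Duplicate Cleaner works internally (I have no idea what it does under the hood); it is just the textbook approach of hashing every file and grouping identical hashes. The function names are mine, and where Duplicate Cleaner warns you before deleting every copy, this sketch simply refuses and keeps one.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks so big videos fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content; names and dates are ignored."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [group for group in by_digest.values() if len(group) > 1]


def mark_for_deletion(groups: list[list[Path]], folder_name: str) -> list[Path]:
    """Mark dupes that sit under a folder called folder_name.

    If that would mark every copy in a group, the first copy is kept anyway
    (a stand-in for the "you are about to delete all copies" warning).
    """
    marked = []
    for group in groups:
        candidates = [p for p in group if folder_name in p.parent.parts]
        if len(candidates) == len(group):
            candidates = candidates[1:]
        marked.extend(candidates)
    return marked
```

Nothing here deletes anything. It only produces a list to review, which is about the level of trust a script like this deserves.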

The final form is a file folder structure with years at the top level and at most one layer of subdirectories with meaningful names (e.g. “Grace’s graduation”).

Archiving

My goal for all this stuff is simply to get it into my own kids’ hands in a form that lets them decide, at their leisure, what to do with it.  I’m 63 years old, so archives I make now need to last 30+ years so they can have a good decade to procrastinate about it.

My original, off-the-top-of-my-head plan for all this digital legacy was to use USB external SSDs as a long-term archive. It was a hachyderm conversation that alerted me to the short data retention of unpowered SSDs (thanks to Patrick Lam on hachyderm.io).  After considering and discarding the idea of mechanical drives due to mechanical rot and interface deprecation I came to the solution I’ve actually implemented – M-DISC DVDs.

I’d originally disregarded the idea of disks because of the ugly and rapid physical deterioration I’d seen on CD-Rs in my own collection.  A lot of that was because I used cheap-ass CDs but still, when you see bits of foil peeling off a storage medium you’re disinclined to go any further with it. 

But with all the other options gone I looked back down the DVD path.  And the news was good.  HTL M-DISC archival discs use inorganic material as the raw storage and some manufacturers claim 1,000 years of data retention.  I’ll settle for 30.

Actual Archiving – Update

So archival storage will be accomplished using M-DISC discs, burned with Power2Go.

The first obstacle to confront with archiving to disc is how to size your chunks. 100GB is not 100GB, nor is 25GB 25GB. Here are the actual sizes I’ve been getting out of these discs.

  • 100GB – 91GB
  • 25GB – 23GB
  • 4.7GB –

I could have had the backup software do the arrangement automatically to squeeze more out of each disk, but I didn’t. So for my 230GB of pictures I ended up with:

  • 2 100GB discs
  • 1 25GB disc
  • 2 4.7GB discs
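For the curious, the automatic arrangement I skipped is basically a bin-packing problem. Here is a minimal Python sketch of one common greedy approach, first-fit decreasing. It is not what Power2Go or any other backup software does as far as I know, the folder names and sizes are invented for the example, and the only real number in it is the 91GB I actually got out of a 100GB disc.

```python
def pack_folders(folder_sizes_gb: dict[str, int], capacity_gb: int) -> list[list[str]]:
    """First-fit decreasing: place the biggest folders first, each on the
    first disc that still has room, opening a new disc only when none does."""
    discs = []  # each entry: {"free": remaining GB, "folders": [names]}
    biggest_first = sorted(folder_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in biggest_first:
        if size > capacity_gb:
            raise ValueError(f"{name} ({size} GB) does not fit on one disc")
        for disc in discs:
            if size <= disc["free"]:
                disc["free"] -= size
                disc["folders"].append(name)
                break
        else:
            # No existing disc had room, so start a new one.
            discs.append({"free": capacity_gb - size, "folders": [name]})
    return [disc["folders"] for disc in discs]


# Hypothetical folder sizes in GB, packed onto discs with 91 GB usable.
example = {"1990s": 60, "2000s": 50, "2010s": 40, "2020s": 30}
layout = pack_folders(example, capacity_gb=91)
# layout == [["1990s", "2020s"], ["2000s", "2010s"]]
```

The catch is that a greedy packer will happily split a decade across discs in ways no human would choose, which is a decent argument for doing it by hand the way I did.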

In Power2Go the distinctions between disc types can be confusing. 100GB and 25GB are Blu-ray projects, 4.7GB are DVD projects.

Reliability and Verification

There is a setting for “quick verification/complete verification”. This defaults to “quick” but we’re preserving for the ages so I chose “complete”. This paid off immediately as one of the first 2 expensive 100G discs failed complete verification. Sigh.

After the second verification failure I set a checkbox on the burn dialog labelled “enable defect management” that claimed it would help but would also make the burn slower. It did indeed make the burn slower, and Power2Go crashed on the first disc I tried it with, so I turned it off.

The machine I’m doing this on is a bit of a beast – with 7 hard disks, 3 of them USB, 1 USB burner, 1 SATA burner, onboard USB AND an extra PCIe4 card (for USB-C), 4 monitors and assorted other USB things attached to it. It seems that stressing the USB system might have a bad effect on the burner, as a massive burst of network file I/O from a USB drive coincided with a crash of the USB burner. At $12 a pop, the 100s are too expensive to take chances with, so be nice to the machine in the three hours it takes to burn one. That said, I’ve had verification failures on discs where the machine was freshly rebooted and otherwise idle.

I ended up needing 10 good 100G discs, 5 good 25G discs and 10 good 4.7G discs. Getting there, I had the following verification failure rate:

  • 100G – 10 successful, 5 failed verification after appearing to write correctly, 3 crashed the burner itself because I Remote Desktop’d into the burner machine and apparently using Remote Desktop is a hard NO in this situation.
  • 25G – 5 successful, 0 failed
  • 4.7G – 10 successful, 1 failed verification
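Turning those counts into rates is simple arithmetic, sketched below in Python. One assumption to flag loudly: I’m counting each of the three Remote Desktop crashes as an attempt that cost a disc. The $12 price is the per-disc figure for the 100s mentioned above, and the cost ignores any refunds.

```python
# Burn results as listed above. Assumption: each burner crash counts as
# one attempt that used up a disc.
results = {
    "100G": {"ok": 10, "failed_verify": 5, "crashed": 3},
    "25G": {"ok": 5, "failed_verify": 0, "crashed": 0},
    "4.7G": {"ok": 10, "failed_verify": 1, "crashed": 0},
}


def failure_rate(counts: dict[str, int]) -> float:
    """Fraction of burn attempts that did not end in a verified disc."""
    attempts = sum(counts.values())
    return (attempts - counts["ok"]) / attempts


rates = {size: failure_rate(counts) for size, counts in results.items()}

# Cost of the 100G attempts that did not produce a good disc, at $12 each.
wasted_100g = results["100G"]["failed_verify"] + results["100G"]["crashed"]
wasted_cost = wasted_100g * 12
```

That works out to 8 bad attempts out of 18 for the 100s (about 44%), none out of 5 for the 25s, and 1 out of 11 for the 4.7s (about 9%). The 8 bad 100s represent $96 of discs before returns.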

Each 100G disc took 3 hours to write and verify. 25G 50 minutes, 4.7G 5 minutes.

Interestingly, Amazon accepted returns of DVDs that failed verification. Remember to cut up the disc to render the already written data unreadable.

Also interestingly, I was able to copy out all the files on the 4.7G disc that “failed verification” and got no errors.
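Copying files out without errors is encouraging but it only shows the disc is readable, not that the bytes match. A stricter check is to compare checksums between the original folder and the disc. Here is a minimal Python sketch of that, with function names I made up. It is not what Power2Go’s verification does (I don’t know its internals), just an independent second opinion.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its content hash."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_trees(source: Path, copy: Path) -> tuple[list[str], list[str]]:
    """Return (missing, changed): files absent from the copy, and files
    present in both places whose contents differ."""
    src, dst = tree_digests(source), tree_digests(copy)
    missing = sorted(set(src) - set(dst))
    changed = sorted(name for name in src.keys() & dst.keys() if src[name] != dst[name])
    return missing, changed
```

Point `source` at the folder you burned and `copy` at the mounted disc. Two empty lists mean every file made it across intact.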

Why This, Why Now?

When people are interviewed after they’ve suffered a home-destroying natural disaster – fire, flood, tornado, hurricane – they never mourn their crumpled SUV, their furniture splintered to matchsticks or even the house itself.  They cry over the lifetime of memories lost in their family photos.  There are few things more important, and many things that are way less important that occupy way more of our time and energy.

Why now?  My picture/video management practices aren’t screamingly bad. And I’m only 63 which as everyone says is the new 47.  Why not continue on a few more years like this?  Why go through this now?  In 2017 I had a silent heart attack that damaged my heart and only got detected in 2020 and stented in 2021.  The months between detection and stent were difficult.  Apparently what I’d read was correct.  I will die, maybe way earlier than I’d expected and I even have a pretty good idea of how it will go down.  Along with my fresh understanding of how much work Mum’s digital legacy took, this added urgency to putting this stuff in a form that “people who aren’t me” can use.

When my family want to remind themselves of how I was, how they were and how we all were together, they’ll have a place to go.