Forensics

Prompted by recent events, I dug out one of my old blog articles. It is about recovering deleted data, in particular RAW and bitmap photos. Since it also has to do with photography to some degree, I am posting it here without further comment.
 
Yep. I did it. I killed my NAS. Thankfully I managed to restore a lot of data. Not all of it, but a lot. But let’s start at the beginning.

The beginning

I have a QNAP DiskStation 219P II with two 1 TB HDDs in a RAID 1 array. It held lots of photos, private documents, a small archive of important documents (mostly PDFs) and my small library of music.

It did, until I wiped the disks clean. There I was, shocked, because I couldn’t come up with a story to tell my wife that would explain why I had deleted two years’ worth of photos of our little boy. And anything else, for that matter.

The accident

The plan was to wipe the disks clean and get rid of any “off-the-record” data I might have on the old drives. For that task I moved everything of importance to an external disk and started to wipe the RAID array. Everything went fine until I noticed that the external disk contained no data. Obviously I had failed to mount the disk properly, or it had been unmounted for some reason. I did copy everything somewhere, but not to my backup disk.
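A guard against exactly this kind of mistake could look like the following. It is only a minimal sketch: safe_backup is a hypothetical helper name, and it assumes a Linux box where the mountpoint tool from util-linux is available.

```shell
#!/bin/sh
# safe_backup SRC DEST
# Hypothetical helper: copy the contents of SRC to DEST only if DEST is
# really a mounted filesystem, not just an empty directory on the source disk.
safe_backup() {
    src=$1
    dest=$2
    # mountpoint -q exits non-zero when DEST is not a mount point
    if ! mountpoint -q "$dest"; then
        echo "refusing to copy: $dest is not a mount point" >&2
        return 1
    fi
    # -a keeps timestamps and permissions; "src/." copies the contents of SRC
    cp -a "$src"/. "$dest"/
}
```

Called as safe_backup /share/photos /mnt/backup (both paths are placeholders), it refuses to copy anything when the backup disk is not mounted.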

The shock

The first thing to do was to pull the plug. Literally, since the disks were still being overwritten with random data.

While thinking about my options (after I had slammed my head against every wall I could find), the first thing that came to my mind was photorec from the testdisk suite. I had not gone through with the wiping, and I must have copied my backup from my disks to the same disks. That means I might have two copies of everything of importance, and one of them might still be intact. Since the disks were set up as RAID 1, I might even have four copies.

Say hello to: photorec

PhotoRec 6.13, Data Recovery Utility, November 2011
Christophe GRENIER <grenier@cgsecurity.org>

http://www.cgsecurity.org

PhotoRec is free software, and
comes with ABSOLUTELY NO WARRANTY.

Select a media (use Arrow keys, then press Enter):
>Disk /mnt/nfs_Public/timecapsule.dd - 1000 GB / 931 GiB (RO)

>[Proceed ] [ Sudo ] [ Quit ]

Note: Some disks won't appear unless you're root user.
Disk capacity must be correctly detected for a successful recovery.
If a disk listed above has incorrect size, check HD jumper settings, BIOS
detection, and install the latest OS patches and disk drivers.

So I ejected the drives, connected them to a Linux box using an external eSATA enclosure, fired up photorec and sat back for a few hours to enjoy my cold. Starting photorec in a screen session was a good idea: the ETA display went up to 20h by lunchtime, 80h by evening, and it was showing a whopping 120h when I went to bed. It turned out that the tool was recovering every little text file it could find, and it needed a short moment to copy each file after finding it. In the end I had a lot of old configuration and metadata files that I would never need again, plus a constantly rising ETA, since there were millions of txt files scattered all over my disks.

Fortunately you can tell photorec which file types you are interested in. I limited my selection to pdf, tif (including different RAW formats, e.g. NEF, arw), crw, jpeg, gz, bz2, zip (including OpenDocument etc.), mp3 and doc.
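The file type selection can be made in photorec's interactive File Opt menu. As far as I understand the scripted-run documentation, it can also be passed on the command line via /cmd. The sketch below only builds that option string. photorec_fileopt is a hypothetical helper, and the family names (jpg rather than jpeg, for example) are assumptions to check against the File Opt menu of your photorec version.

```shell
#!/bin/sh
# photorec_fileopt EXT...
# Hypothetical helper: build the comma-separated string for photorec's /cmd
# option that disables all file families and re-enables only the given ones.
photorec_fileopt() {
    opts="fileopt,everything,disable"
    for ext in "$@"; do
        opts="$opts,$ext,enable"
    done
    # "search" starts the recovery once the options are set
    printf '%s,search\n' "$opts"
}

# Possible usage (not executed here; the output directory name is a placeholder):
#   photorec /log /d recovered_files /cmd /mnt/nfs_Public/timecapsule.dd \
#       "$(photorec_fileopt pdf tif crw jpg gz bz2 zip mp3 doc)"
```

Building the string in a function keeps the list of wanted types in one place, which is handy when the scan has to be repeated.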

So about 10 hours later I had 16000 JPEGs, a few Office documents and a lot of zips and PDFs, but no NEFs. All my RAW photos were gone. Everything from the first two years of my son. My first studio photos. Everything!

But wait, that can’t be… I fired up Google and found this in the photorec release notes:

Bug fixes

  • Fix an endless loop during .caf file recovery
  • Fix tiff recovery including some raw file formats, 64-bit version wasn’t affected

So I was working on a 32-bit Atom machine… Download the source, compile it, start it, wait another half day. There they were! A few thousand NEF files. Not all of my roughly 10000 images, but more than half.

… and I wasn’t able to open them. After a bit of digging and a few red herrings it turned out that the files were just mixed up. Sometimes I could see overlays of parts of other images. Some files were just garbled. Only a few could be recovered completely.

The cleanup

After running photorec a few times I was left with the following:

  • around 17000 JPEGs. Some of them were full-resolution JPEGs that were probably 1:1 previews rendered by Lightroom and backed up to my NAS via Time Machine
  • a few thousand NEFs, most of which I could not open anymore
  • a few hundred PDFs, Docs, OpenDocuments and other stuff

But everything was mixed up and scattered over multiple folders. Photorec creates a lot of folders, each containing about 500-600 recovered files of all types.

Enter: fdupes and find

fdupes compares and optionally deletes files based on file size, MD5 checksum and, if that’s not enough, a byte-by-byte comparison. The brave just type:

 fdupes -d -N -r   .

This makes fdupes recurse (-r) into all subdirectories of ‘.’ and delete (-d) all duplicates while not prompting (-N) for confirmation; the first file of each set is kept. You can also skip -d and redirect the output to a file.

lennart@shuttle:~$ fdupes fdupes_test/
fdupes_test/1.txt 
fdupes_test/2.txt

fdupes_test/3.txt
fdupes_test/4.txt
fdupes_test/5.txt

You’ll get blocks of files with identical content. Just delete all but one file in each block.
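If the fdupes listing was saved to a file, the "all but one per block" rule can be applied with a few lines of awk. This is a minimal sketch, assuming the blank-line-separated block format shown above. list_extra_copies is a hypothetical helper name and dupes.txt a placeholder file name.

```shell
#!/bin/sh
# list_extra_copies
# Hypothetical helper: read an fdupes listing on stdin and print every file
# except the first one of each blank-line-separated block.
list_extra_copies() {
    awk '
        BEGIN { keep = 1 }
        /^[[:space:]]*$/ { keep = 1; next }   # blank line: a new block starts
        keep { keep = 0; next }               # first file of a block: keep it
        { print }                             # further files are extra copies
    '
}

# Possible usage (review the list before deleting anything):
#   fdupes -r . > dupes.txt
#   list_extra_copies < dupes.txt > to_delete.txt
```

fdupes itself offers -f (--omitfirst) to leave out the first file of each set, so the helper is mainly useful for a listing that already exists.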

Use find from the GNU findutils to sort your files based on their file extensions. I created a directory for each extension I could find. How do you find all the extensions, you may ask? coreutils and awk to the rescue:

find timecapsule -type f -exec basename {} \; | awk -F. '{print $NF}' | sort | uniq

Then I called find to search for each specific extension and used -exec to mv the files:

for dir in $(ls -1 ./sorted)
do
 echo "$dir"
 find ./recovered_files -iname "*.$dir" -exec mv {} "./sorted/$dir/" \;
done
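The loop above expects a directory per extension to exist under ./sorted already. Creating them by hand works; a small sketch that derives them from the recovered files might look like this. make_ext_dirs is a hypothetical helper, and lower-casing the names is my assumption. It fits the case-insensitive -iname match used above.

```shell
#!/bin/sh
# make_ext_dirs SRC DEST
# Hypothetical helper: create DEST/<ext> for every file extension that
# occurs below SRC. Extensions are lower-cased, so JPG and jpg share a folder.
make_ext_dirs() {
    src=$1
    dest=$2
    # -name '*.*' skips files without any extension
    find "$src" -type f -name '*.*' |
        awk -F. '{ print tolower($NF) }' |
        sort -u |
        while read -r ext; do
            mkdir -p "$dest/$ext"
        done
}

# Possible usage: make_ext_dirs ./recovered_files ./sorted
```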

Remove the empty directories afterwards:

find . -depth -type d -empty -exec rmdir {} \;

The last bit of work I could do was checking the integrity of the RAW files. For that I used ufraw-batch:

mkdir -p nef_recovery
for file in sorted/nef/*
do
 ufraw-batch "$file" --out-type=jpeg \
  --wb=camera \
  --exposure=auto \
  --compression=93 \
  --out-path nef_recovery/ 2>&1 \
  | tee -a nef_recovery/log
done

And move all corrupted files to a folder called trash:

cd nef_recovery; mkdir ./trash; for file in $(grep "Corrupt" log |
cut -f 2 -d ":"|tr -d " "); do mv $file ./trash; done

Of 10000 RAW files only about 350 survived. I was able to recover about 8000 JPEGs, though, mostly full-size preview images. Better than nothing.

And now the fun part: digging through each remaining file to find out what it contains. Sadly that’s a task I still haven’t got a script for. So get yourself a coffee and get to work!

Update:

I managed to get hold of the hard disk from my old Time Capsule. 8700 NEFs scraped already, 2000 processed with ufraw and “only” 440 corrupted files. YAY! Let’s hope the EXIF headers are still intact.

Update 2:

And there’s another pot of gold on my father’s notebook. Yesss!
