Thomas, 1 May 2006

Recovering/Undeleting Files From ext3 Partitions

I have written previously about my efforts to recover my GnuPG secret keyring (secring.gpg), my email and other fun stuff. I had only limited success, though: I managed to recover some email, but not the secring.gpg file (not all of it, anyway) or the certificate for my online bank. If you thought I learned something from that ordeal, I am afraid I have to disappoint you.

Friday evening I was rummaging through some changes in one of the university projects I am involved with, and I had a few days' work sitting uncommitted in my local Subversion checkout. Now, the makefile of this project leaves a lot to be desired, and one of the things it lacks is proper dependency checking: when a header is changed, the source files including that header are not automatically recompiled on the next make. To work around this (yeah yeah, I should write a proper makefile instead, yada yada) I routinely do a rm *o to remove the object files I know need recompilation. You can probably imagine what happened next: I forgot the "o", so the pattern matched all the source files as well, and they were deleted.

I was a bit wiser this time and immediately decided to see if I could recover the files before too much of the disk layout changed. The files (along with the rest of /home) were stored on an ext3 filesystem. I googled for ext3 recovery tools, thinking such tools were common and easy to use; after all, the filesystem was still intact and there was probably some dormant inode floating around on disk. What I found instead was that ext3 does not allow undeletion of files, so my only option was to scan everything.

I decided to dust off my scanner and some other tools and started the recovery process. I knew that the deleted code contained the string "void output_direction_tin(", so I made my scanner look for that string. Then I pointed it at /dev/hdd3, the partition holding my /home, and waited while the scanner rummaged through 30 gigs of gibberish. The scanner found 17 matches at the following byte offsets:

Opening /dev/hdd2
Done.
Starting read..
Match: 6912250812
Match: 6937423690
Match: 6937444526
Match: 6937452718
Match: 6962577584
Match: 6971031370
Match: 6971052206
Match: 6971056302
Match: 9009694462
Match: 9009785063
Match: 9010125031
Match: 9103993029
Match: 9105934533
Match: 9236583016
Match: 9236763589
Match: 22171115613
Match: 22300949900
Writing out 17 matches.

I then used another program to read approximately 40 from each of these offsets and dump the result to temporary files, which I then inspected by hand. I suspect you are all on the edge of your seats by now, sweating with the pure excitement that is sure to come from the conclusion of this post. I actually _did_ manage to recover what I wanted, and everything is now committed to the Subversion server (which is backed up). ;)

For future reference, here are the programs I wrote. They are crude and simple but if they worked for me, they might work for you.
scanner.cpp: Scans the file given as input (this would usually be a raw device like /dev/hda1) searching for the pattern specified in "magic" at the top of the source code. It outputs byte offsets when it finds a match.
write_from.cpp: Takes four arguments: the input file/device name, the output file name, the offset to seek to and the number of bytes to write out.
To compile, you probably need large-file support in your libc/kernel to be able to work with 64-bit offsets (needed when you have more than a couple of gigs of data). I compiled like this:
g++ scanner.cpp -O3 -o scanner -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
g++ write_from.cpp -O3 -o write_from -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
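The source of write_from.cpp is not reproduced in this post, so here is a rough sketch of the same idea rather than the original program. The function name write_from, its signature and the 64 KB buffer size are assumptions made for the example: it seeks to a byte offset in the input and copies at most a given number of bytes to the output file.

```cpp
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for write_from.cpp: copies up to `count` bytes,
// starting at byte `offset` of `in_path`, into `out_path`.
// Returns the number of bytes actually written.
std::uint64_t write_from(const std::string& in_path,
                         const std::string& out_path,
                         std::ifstream::off_type offset,
                         std::uint64_t count) {
    std::ifstream in(in_path.c_str(), std::ios::binary);
    std::ofstream out(out_path.c_str(), std::ios::binary | std::ios::trunc);
    if (!in || !out) return 0;

    // Jump straight to the offset reported by the scanner.
    in.seekg(offset, std::ios::beg);
    if (!in) return 0;

    std::vector<char> buffer(64 * 1024);  // buffer size is an arbitrary choice
    std::uint64_t written = 0;
    while (written < count && in) {
        std::uint64_t want =
            std::min<std::uint64_t>(buffer.size(), count - written);
        in.read(buffer.data(), static_cast<std::streamsize>(want));
        std::streamsize got = in.gcount();
        if (got <= 0) break;              // end of input reached
        out.write(buffer.data(), got);
        if (!out) break;                  // output error, stop copying
        written += static_cast<std::uint64_t>(got);
    }
    return written;
}
```

A small main that parses the four command-line arguments and calls this function once per match offset would give roughly the workflow described above.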

By the way, another thing I had trouble finding was information on how to handle large (more than 2 or 4 gigs) files using the streams in C++. The key is to use the off_type of the individual stream classes, which will be 64 bits wide given the right compilation parameters (at least this worked using GCC 4.0.3). For input streams (istream, ifstream) this type is named istream::off_type, and the compilation parameters are as above.
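As a small sketch of that idea (the helper names stream_offset_bits and byte_at are invented for this example), the code below reports how wide istream::off_type is on the current build and uses that type to seek to an absolute position before reading a single byte. Whether the width actually comes out as 64 bits depends on the platform and compiler, so it is worth checking rather than assuming.

```cpp
#include <climits>
#include <cstdint>
#include <fstream>
#include <istream>
#include <limits>
#include <string>

// Width in bits of the offset type used by input streams on this build.
int stream_offset_bits() {
    return static_cast<int>(sizeof(std::istream::off_type) * CHAR_BIT);
}

// Hypothetical helper: returns the byte stored at absolute position `offset`
// in `path`, or -1 if the offset does not fit in off_type or the seek/read
// fails (for example because the offset lies past the end of the file).
int byte_at(const std::string& path, std::uint64_t offset) {
    typedef std::istream::off_type off_type;

    // Refuse offsets that would be truncated by a too-narrow off_type.
    if (offset >
        static_cast<std::uint64_t>(std::numeric_limits<off_type>::max())) {
        return -1;
    }

    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return -1;

    in.seekg(static_cast<off_type>(offset), std::ios::beg);
    char c;
    if (!in.get(c)) return -1;
    return static_cast<unsigned char>(c);
}
```

Printing stream_offset_bits() once at startup is a cheap way to confirm that the build can address offsets such as 22300949900 from the scan above.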

I hope this will prove useful to someone; otherwise I will at least have put these two programs somewhere safe for my own later use :) (No, I still do not plan to learn anything from this).

Note that the scanner program is roughly equivalent to running:

fgrep [pattern] --byte-offset [file/device]

which might be faster.

Comments (3)
  1. What a drama! I was actually sitting on the edge of my seat… :-) I’m glad the story had a happy ending and that the files are now safely stored here at DAIMI.

  2. I’m afraid, that by calling grep as “fgrep” you have exactly disabled the support for regular expressions :-)

  3. Yeah, fixed :)
