javier arturo rodríguez

reiserfs and dd_rescue for data recovery

20060716 16:48 by javier

Last Thursday the hard disk drive on a development machine died big time. First it started to behave erratically, and dmesg showed that it was having trouble with some bad blocks. It did not survive a reboot: ReiserFS would not mount it on boot, and reiserfsck running from an Accelerated Knoppix CD refused to bring it back to life. At first glance, the disk was beyond repair.
Of course, upon closer inspection, it turned out that the warranty had expired exactly two months earlier. Normally, after swearing my heart out, I would just replace the disk and make myself a nice paperweight or some other piece of modern art (I’ve been looking forward to making one of those nice HDD clocks), but in the guts of that particular HDD were some uncommitted changes that I just wasn’t in the mood to rewrite. Besides, even though most of the data was expendable, the configuration hadn’t been backed up in quite a while (Yes, there is a pattern here).
So here’s the recipe I usually apply in these situations using Kurt Garloff’s dd_rescue. First get a brand-new HDD of approximately the same capacity and place both disks in a working Linux box (depending on your needs, booting from Knoppix might do). Let’s call the old, dying HDD /dev/hdg, and the spankin’ new disk will be /dev/hde. For the sake of simplicity, let’s assume that /dev/hdg was partitioned into /dev/hdg1 for swap and /dev/hdg2 for data.
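Before copying anything, it may be worth comparing the sizes of the source and target partitions, since a drive of a different model can turn out slightly smaller. The following is only a sketch: `size_in_bytes` and `target_fits` are hypothetical helper names, and it assumes `blockdev` from util-linux is available for block devices.

```shell
#!/bin/sh
# Print the size of a regular file or block device in bytes.
size_in_bytes() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"      # util-linux; usually needs root
    else
        echo $(( $(wc -c < "$1") ))    # arithmetic strips any padding from wc
    fi
}

# Hypothetical helper: succeed only if the target can hold the source.
target_fits() {
    src_size=$(size_in_bytes "$1")
    dst_size=$(size_in_bytes "$2")
    if [ "$dst_size" -ge "$src_size" ]; then
        echo "ok: target has $((dst_size - src_size)) bytes to spare"
    else
        echo "warning: target is $((src_size - dst_size)) bytes smaller than source"
        return 1
    fi
}
```

With the device names used in this post, the call would be `target_fits /dev/hdg2 /dev/hde2`, run as root.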
First we’ll copy the entire data partition from /dev/hdg2 to /dev/hde2:

# dd_rescue /dev/hdg2 /dev/hde2

This will take a long, long time. dd_rescue starts with a reasonable block size, but whenever it encounters an error it retries a few times with a smaller block size before skipping the defective blocks and moving along. This is useful because it copies the data in every readable block, instead of giving up at the first error like dd does. In my case, this took more than a day for a 248GB partition.
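To make the fallback idea concrete, here is a rough sketch of the same strategy built on plain dd. It is not dd_rescue's actual implementation: `copy_with_fallback` is a hypothetical helper, the two block sizes are illustrative values rather than dd_rescue's defaults, and it does not retry reads the way dd_rescue does.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST in large blocks; when a large block
# fails to read, retry that region in small blocks and skip the bad ones.
copy_with_fallback() {
    src=$1; dst=$2
    soft=${3:-65536}    # large block size in bytes (illustrative)
    hard=${4:-512}      # small block size in bytes (illustrative)
    # wc -c is fine for a regular file; a block device would need
    # something like blockdev --getsize64 instead.
    size=$(( $(wc -c < "$src") ))
    ratio=$((soft / hard))
    nblocks=$(( (size + soft - 1) / soft ))
    i=0
    while [ "$i" -lt "$nblocks" ]; do
        # Try one large block at offset i * soft.
        if ! dd if="$src" of="$dst" bs="$soft" skip="$i" seek="$i" \
                count=1 conv=notrunc 2>/dev/null; then
            # The large read failed: walk the same region in small blocks.
            j=0
            while [ "$j" -lt "$ratio" ]; do
                k=$((i * ratio + j))
                dd if="$src" of="$dst" bs="$hard" skip="$k" seek="$k" \
                    count=1 conv=notrunc 2>/dev/null \
                    || echo "skipped small block $k" >&2
                j=$((j + 1))
            done
        fi
        i=$((i + 1))
    done
}
```

On a healthy file the fallback branch never runs and the result is a plain copy; on a failing disk only the unreadable small blocks are left out. For real recoveries dd_rescue itself remains the better tool.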
Once the data is on the new disk you can try to mount it directly, although it is a good idea to run reiserfsck first to make sure that the files you’ll copy are usable.

# reiserfsck /dev/hde2
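The exit status of reiserfsck hints at what to do next. The sketch below maps each status to a suggestion; the meanings follow my reading of the reiserfsck(8) man page and should be verified against the version on your system. `explain_reiserfsck_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: translate a reiserfsck exit status into a next step.
# The status meanings are taken from the reiserfsck(8) man page as I
# understand it; check your local man page before relying on them.
explain_reiserfsck_status() {
    case "$1" in
        0)  echo "no errors found" ;;
        1)  echo "errors were found and corrected" ;;
        2)  echo "reboot is needed" ;;
        4)  echo "fatal errors remain: consider reiserfsck --rebuild-tree" ;;
        6)  echo "fixable errors remain: consider reiserfsck --fix-fixable" ;;
        8)  echo "operational error: check the device and try again" ;;
        16) echo "usage or syntax error" ;;
        *)  echo "unexpected status $1: consult the man page" ;;
    esac
}
```

A typical use would be to run `reiserfsck /dev/hde2` and then call `explain_reiserfsck_status $?` right after it.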

Now here you might run into a small obstacle. Ideally I would buy the exact same model as the old drive for recovery purposes, because an exact bit-for-bit copy will then work in most cases, partition maps and all. However, in this case I bought a different brand, which resulted in a slightly smaller drive and a completely different geometry. When this happens, reiserfsck will complain about the different partition size and suggest that you rebuild the superblock:

# reiserfsck --rebuild-sb /dev/hde2

Now you can do a normal reiserfsck.
When you’re done just mount the new partition and copy your data to a safe place:

# mount /dev/hde2 /mnt/tmp
# rsync -a --progress /mnt/tmp/etc /backup/dir/
# rsync -a --progress /mnt/tmp/home/arturo /another/backup/dir/
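Before reformatting anything, it may be worth confirming that the backup matches what was read from the recovered partition. This is a minimal sketch using `diff -r -q`; `verify_copy` is a hypothetical helper, and diff may complain about special files such as device nodes or sockets.

```shell
#!/bin/sh
# Hypothetical helper: compare a recovered tree against its backup copy.
# Prints "identical" when diff finds no differences; otherwise diff lists
# the differing files and the function returns 1.
verify_copy() {
    if diff -r -q "$1" "$2"; then
        echo "identical"
    else
        echo "differences found between $1 and $2"
        return 1
    fi
}
```

Since rsync without a trailing slash on the source creates the directory inside the destination, the matching call for the first copy above would be `verify_copy /mnt/tmp/etc /backup/dir/etc`.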

After this you can reformat the new drive for normal usage. Mine is being debootstrapped as I write this.
This little recipe has saved quite some data and a few disks, including most of mcleod’s late Xbox hard disk. As usual your mileage may vary, but with a little luck you just might get some of your files back.
Now about that crappy Maxtor HDD… I might just go for the wind chimes instead.

5 Responses to “reiserfs and dd_rescue for data recovery”

  1. javier Says:

    Greetings, namesake!

    I found your blog a while ago, although unfortunately I haven’t had time to follow it. I’ve found it interesting for many reasons: TiVo, a shared interest in technology, we’re from the same institution (Tec), your pictures of it were very useful to me…

    Anyway, I hope to drop by more often.

    javier.

  2. Robin Says:

    It is good to see people sharing their experiences like this, which gives some knowledge about what to do in case of a possible data loss in their system. I have seen most people lose their data just because they are unaware that there is something called data recovery, with which they can recover most of their data with almost all the files intact. I got my data recovered by a service provider, Disk Doctors Labs, and since then most of the people I sent to Disk Doctors gave me a good and positive response.

  3. Diego Says:

    As always, my friend codehead to the rescue; I know this is the second time this has happened to me and the second time I bother you for help… You know it will also be to your benefit when we sync our multimedia collections ;)

    A hug from the Land down Under!

  4. Diego Says:

    Dude, the dd_rescue process has finished successfully:

    dd_rescue: (info): /dev/sdb2 (480320047.5k): EOF
    Summary for /dev/sdb2 -> /dev/sda1:
    dd_rescue: (info): ipos: 480320047.5k, opos: 480320047.5k, xferd: 480320047.5k
    errs: 168, errxfer: 84.0k, succxfer: 480319963.5k
    +curr.rate: 832kB/s, avg.rate: 9812kB/s, avg.load: -2.9%

    But now I am worried about the fsck, because it has hung at 40%, just like with the original hard drive:

    Will rebuild the filesystem (/dev/sda1) tree
    Will put log info to 'stdout'

    Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
    Replaying journal..
    Reiserfs journal '/dev/sda1' in blocks [18..8211]: 0 transactions replayed
    ###########
    reiserfsck --rebuild-tree started at Wed Aug 8 08:02:26 2007
    ###########

    Pass 0:
    ####### Pass 0 #######
    Loading on-disk bitmap .. ok, 103818443 blocks marked used
    Skipping 11875 blocks (super block, journal, bitmaps) 103806568 blocks will be read
    0%....20%....40%

    It hasn’t finished yet, but I fear a bad outcome; if the problem is not physical, is there anything I can do to fix this?

    Regards my friend…

  5. Diego Says:

    Please forgive my total lack of patience, the process ended just fine:

    327035 directory entries were hashed with "r5" hash.
    "r5" hash is selected
    Flushing..finished
    Read blocks (but not data blocks) 103806568
    Leaves among those 116376
    Objectids found 324508

    Pass 1 (will try to insert 116376 leaves):
    ####### Pass 1 #######
    Looking for allocable blocks .. finished
    0%....20%....40%....60%....80%....100%
    Flushing..finished
    116376 leaves read
    116176 inserted
    - pointers in indirect items pointing to metadata 7 (zeroed)
    200 not inserted
    non-unique pointers in indirect items (zeroed) 2984
    ####### Pass 2 #######

    Pass 2:
    0%....20%....40%....60%vpf-10260: The file we are inserting the new item (134320 14974 0xae532001 IND (1), len 4048, location 48 entry count 0, fsck need 0, format new) into has no StatData, insertion was skipped
    ....80%....100%
    Flushing..finished
    Leaves inserted item by item 200
    Pass 3 (semantic):
    ####### Pass 3 #########
    vpf-10680: The file [249620 343697] has the wrong block count in the StatData (1414608) - corrected to (1403248)
    vpf-10680: The file [335046 335060] has the wrong block count in the StatData (23448) - corrected to (22904)
    vpf-10680: The file [241839 243017] has the wrong block count in the StatData (64) - corrected to (56)
    Flushing..finished
    Files found: 285283
    Directories found: 33617
    Symlinks found: 4426
    Others: 1167
    Pass 3a (looking for lost dir/files):
    ####### Pass 3a (lost+found pass) #########
    Looking for lost directories:
    Flushing..finished
    Pass 4 - finished
    Deleted unreachable items 2
    Flushing..finished
    Syncing..finished
    ###########
    reiserfsck finished at Wed Aug 8 13:18:34 2007
    ###########

    Now I am doing a "dd if=/dev/zero of=/dev/sda bs=1M" run on the original drive to see if it can be rescued ;)
