
Rescuing data on HDD



CrimsonS
03-31-2009, 11:12 PM
Hello,

I have about two years' worth of experience with Linux, mainly Gentoo, and still consider myself a newbie...

I have been stuck on this problem for a week now, and I have tried so many solutions that I am starting to get lost myself, so I'll try to explain as much as I can. Also, this computer's main user is my girlfriend, so some info may escape me (although she is not very techie and knows not to mess with things she doesn't know about). First of all, here are my system specs:


root@1[knoppix]# lspci -v
0000:00:00.0 Host bridge: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge (rev 03)
Subsystem: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge
Flags: bus master, fast devsel, latency 0
Memory at d8000000 (32-bit, prefetchable) [size=64M]
Capabilities: [e4] #09 [0104]
Capabilities: [a0] AGP version 2.0

0000:00:01.0 PCI bridge: Intel Corp. 82845 845 (Brookdale) Chipset AGP Bridge (rev 03) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, fast devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
Memory behind bridge: dc000000-ddffffff
Prefetchable memory behind bridge: d0000000-d7ffffff

0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev 12) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
I/O behind bridge: 0000c000-0000cfff
Memory behind bridge: de000000-de0fffff

0000:00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 12)
Flags: bus master, medium devsel, latency 0

0000:00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 12) (prog-if 80 [Master])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 244b
Flags: bus master, medium devsel, latency 0
I/O ports at f000 [size=16]

0000:00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 12) (prog-if 00 [UHCI])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 244b
Flags: bus master, medium devsel, latency 0, IRQ 19
I/O ports at d000 [size=32]

0000:00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 12)
Subsystem: Micro-Star International Co., Ltd.: Unknown device 244b
Flags: medium devsel, IRQ 17
I/O ports at 5000 [size=16]

0000:00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 12) (prog-if 00 [UHCI])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 244b
Flags: bus master, medium devsel, latency 0, IRQ 23
I/O ports at d800 [size=32]

0000:01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev b2) (prog-if 00 [VGA])
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 16
Memory at dc000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (32-bit, prefetchable) [size=128M]
Capabilities: [60] Power Management version 2
Capabilities: [44] AGP version 2.0

0000:02:03.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RT8139
Flags: bus master, medium devsel, latency 32, IRQ 19
I/O ports at c000 [size=256]
Memory at de000000 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2

0000:02:09.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
Subsystem: Micro-Star International Co., Ltd.: Unknown device 5280
Flags: bus master, medium devsel, latency 32, IRQ 21
I/O ports at c400 [size=256]
Capabilities: [c0] Power Management version 2

What happened: A default Ubuntu 8.10 install (on hda) was running fine before I decided to clean the dust out of the computer (without any of the data backed up, of course...). At the same time, I figured I would add a couple of parts to it, mainly another HDD (hdb) that I had messed up with Gentoo a few years back. When I started the computer up again, it worked fine. When my gf got to it, the mouse wasn't working, so she decided to reboot.

From this point on, the Ubuntu drive would not boot, and GRUB instead printed 'Error 17' (cannot mount selected partition, i.e. the filesystem type is not recognized). So I booted a Knoppix LiveCD and tried a couple of things without success.

Here is the partition table for hda:


root@1[knoppix]# fdisk -l /dev/hda

Disk /dev/hda: 40.0 GB, 40020664320 bytes
255 heads, 63 sectors/track, 4865 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1        4660    37431418+  83  Linux
/dev/hda2            4661        4865     1646662+   5  Extended
/dev/hda5            4661        4865     1646631   82  Linux swap / Solaris


I read that a failed filesystem check could be due to a damaged superblock, and that I could restore the superblock from a backup, which I tried without success:


root@0[knoppix]# mkfs.ext3 -n /dev/hda1
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
4685824 inodes, 9357854 blocks
467892 blocks (5.00%) reserved for the super user
First data block=0
286 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624

root@0[knoppix]# fsck.ext3 -b 32768 -f -y -C0 /dev/hda1
e2fsck 1.38 (30-Jun-2005)
/dev/hda1: Attempt to read block from filesystem resulted in short read while reading block 1289

/dev/hda1: Attempt to read block from filesystem resulted in short read reading journal superblock

fsck.ext3: Attempt to read block from filesystem resulted in short read while checking ext3 journal for /dev/hda1
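
The backup locations that mkfs.ext3 -n lists follow a fixed rule: with the sparse_super feature, copies of the superblock sit in block group 1 and in groups whose number is a power of 3, 5 or 7. The sketch below recomputes that list and prints (rather than runs) one read-only e2fsck attempt per backup. The function names are made up for illustration, and the arithmetic assumes the first data block is 0, as in the mke2fs output above. Since the errors above are short reads from the device itself, another superblock copy may well fail the same way.

```shell
#!/bin/sh
# Hypothetical helpers; nothing here touches a disk.

# is_power_of N BASE: succeeds if N is BASE^k for some k >= 0.
is_power_of() {
    n=$1
    base=$2
    while [ "$n" -gt 1 ] && [ $((n % base)) -eq 0 ]; do
        n=$((n / base))
    done
    [ "$n" -eq 1 ]
}

# backup_superblocks GROUPS BLOCKS_PER_GROUP
# Prints the block number of each backup superblock, one per line.
# Assumes sparse_super and "First data block=0".
backup_superblocks() {
    groups=$1
    per_group=$2
    g=1
    while [ "$g" -lt "$groups" ]; do
        if is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7; then
            echo $((g * per_group))
        fi
        g=$((g + 1))
    done
}

# print_fsck_attempts DEVICE GROUPS BLOCKS_PER_GROUP BLOCK_SIZE
# Echoes one read-only (-n) e2fsck command per backup superblock.
print_fsck_attempts() {
    device=$1
    bs=$4
    backup_superblocks "$2" "$3" | while read -r sb; do
        echo "e2fsck -n -b $sb -B $bs $device"
    done
}
```

With the values reported above, print_fsck_attempts /dev/hda1 286 32768 4096 prints eleven commands, starting with the one for block 32768. Printing them first makes it easy to review each command before running anything against a failing drive.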


'dmesg' spat out a bunch of errors about hda1 that I couldn't decipher:



root@0[knoppix]# dmesg | grep hda1
Buffer I/O error on device hda1, logical block 0
Buffer I/O error on device hda1, logical block 1
Buffer I/O error on device hda1, logical block 2
Buffer I/O error on device hda1, logical block 3
Buffer I/O error on device hda1, logical block 4
Buffer I/O error on device hda1, logical block 5
Buffer I/O error on device hda1, logical block 6
Buffer I/O error on device hda1, logical block 7
Buffer I/O error on device hda1, logical block 8
Buffer I/O error on device hda1, logical block 9
Buffer I/O error on device hda1, logical block 10
Buffer I/O error on device hda1, logical block 11
Buffer I/O error on device hda1, logical block 12
Buffer I/O error on device hda1, logical block 13
Buffer I/O error on device hda1, logical block 14
Buffer I/O error on device hda1, logical block 15
Buffer I/O error on device hda1, logical block 16
Buffer I/O error on device hda1, logical block 17
Buffer I/O error on device hda1, logical block 18
Buffer I/O error on device hda1, logical block 19
Buffer I/O error on device hda1, logical block 20
Buffer I/O error on device hda1, logical block 21
Buffer I/O error on device hda1, logical block 23
Buffer I/O error on device hda1, logical block 0
Buffer I/O error on device hda1, logical block 1
Buffer I/O error on device hda1, logical block 2
Buffer I/O error on device hda1, logical block 3
Buffer I/O error on device hda1, logical block 4
Buffer I/O error on device hda1, logical block 5
Buffer I/O error on device hda1, logical block 6
Buffer I/O error on device hda1, logical block 7
Buffer I/O error on device hda1, logical block 10312
Buffer I/O error on device hda1, logical block 10313
Buffer I/O error on device hda1, logical block 10314
Buffer I/O error on device hda1, logical block 10315
Buffer I/O error on device hda1, logical block 10317
Buffer I/O error on device hda1, logical block 10318
Buffer I/O error on device hda1, logical block 10319
Buffer I/O error on device hda1, logical block 10317
Buffer I/O error on device hda1, logical block 10318
Buffer I/O error on device hda1, logical block 10319
Buffer I/O error on device hda1, logical block 10312
Buffer I/O error on device hda1, logical block 10313
Buffer I/O error on device hda1, logical block 10314
Buffer I/O error on device hda1, logical block 10315
Buffer I/O error on device hda1, logical block 10312
Buffer I/O error on device hda1, logical block 10313
Buffer I/O error on device hda1, logical block 10314
Buffer I/O error on device hda1, logical block 10315
Buffer I/O error on device hda1, logical block 10317
Buffer I/O error on device hda1, logical block 10318
Buffer I/O error on device hda1, logical block 10319
Buffer I/O error on device hda1, logical block 10317
Buffer I/O error on device hda1, logical block 10318
Buffer I/O error on device hda1, logical block 10319
Buffer I/O error on device hda1, logical block 10312
Buffer I/O error on device hda1, logical block 10313
Buffer I/O error on device hda1, logical block 10314
Buffer I/O error on device hda1, logical block 10315
Buffer I/O error on device hda1, logical block 0
Buffer I/O error on device hda1, logical block 1
Buffer I/O error on device hda1, logical block 2
Buffer I/O error on device hda1, logical block 3
Buffer I/O error on device hda1, logical block 4
Buffer I/O error on device hda1, logical block 5
Buffer I/O error on device hda1, logical block 6
Buffer I/O error on device hda1, logical block 7
Buffer I/O error on device hda1, logical block 8
Buffer I/O error on device hda1, logical block 9
Buffer I/O error on device hda1, logical block 10
Buffer I/O error on device hda1, logical block 11
Buffer I/O error on device hda1, logical block 12
Buffer I/O error on device hda1, logical block 13
Buffer I/O error on device hda1, logical block 14
Buffer I/O error on device hda1, logical block 15
Buffer I/O error on device hda1, logical block 16
Buffer I/O error on device hda1, logical block 17
Buffer I/O error on device hda1, logical block 18
Buffer I/O error on device hda1, logical block 19
Buffer I/O error on device hda1, logical block 20
Buffer I/O error on device hda1, logical block 21
Buffer I/O error on device hda1, logical block 23
Buffer I/O error on device hda1, logical block 0
Buffer I/O error on device hda1, logical block 1
Buffer I/O error on device hda1, logical block 2
Buffer I/O error on device hda1, logical block 3
Buffer I/O error on device hda1, logical block 4
Buffer I/O error on device hda1, logical block 5
Buffer I/O error on device hda1, logical block 6
Buffer I/O error on device hda1, logical block 7
Buffer I/O error on device hda1, logical block 10312
Buffer I/O error on device hda1, logical block 10313
Buffer I/O error on device hda1, logical block 10314
Buffer I/O error on device hda1, logical block 10315
Buffer I/O error on device hda1, logical block 10317
Buffer I/O error on device hda1, logical block 10318
Buffer I/O error on device hda1, logical block 10319
Buffer I/O error on device hda1, logical block 10317
Buffer I/O error on device hda1, logical block 10318
Buffer I/O error on device hda1, logical block 10319
Buffer I/O error on device hda1, logical block 10312
Buffer I/O error on device hda1, logical block 10313
Buffer I/O error on device hda1, logical block 10314
Buffer I/O error on device hda1, logical block 10315
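
The dmesg lines repeat the same few block numbers many times, so it can help to collapse them into distinct ranges. A minimal sketch, assuming the "Buffer I/O error on device ..., logical block N" wording shown above; summarize_io_errors is a hypothetical helper name.

```shell
#!/bin/sh
# summarize_io_errors: reads kernel log text on stdin and prints the
# distinct failing logical blocks as sorted ranges, one per line.
summarize_io_errors() {
    # Keep only the block number from each matching line.
    sed -n 's/.*logical block \([0-9][0-9]*\).*/\1/p' |
    sort -n -u |
    awk '
        function print_range(a, b) {
            if (a == b) print a
            else print a "-" b
        }
        NR == 1        { start = $1; prev = $1; next }
        $1 == prev + 1 { prev = $1; next }
                       { print_range(start, prev); start = $1; prev = $1 }
        END            { if (NR > 0) print_range(start, prev) }
    '
}
```

Used as dmesg | grep hda1 | summarize_io_errors, the log above reduces to four ranges: 0-21, 23, 10312-10315 and 10317-10319. In other words, the failing reads are concentrated at the very start of the partition plus one small area further in.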

So I figured the whole disk was damaged and used smartmontools to check it for errors (I ran both the short and the long test):


root@0[knoppix]# smartctl -a /dev/hda
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: ST340014A
Serial Number: 3JX2ZH3X
Firmware Version: 3.06
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Tue Mar 31 18:04:12 2009 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 89) The previous self-test completed having
the electrical element of the test
failed.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 31) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   049   046   006    Pre-fail  Always       -       186591624
  3 Spin_Up_Time            0x0003   098   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       16
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       268295550
  9 Power_On_Hours          0x0032   056   056   000    Old_age   Always       -       38746
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       421
194 Temperature_Celsius     0x0022   038   058   000    Old_age   Always       -       38
195 Hardware_ECC_Recovered  0x001a   049   045   000    Old_age   Always       -       186591624
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       58
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       58
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

This seems contradictory, but I can't really say, having never run these tests before... Can someone please help me out?
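
On the apparent contradiction: the overall PASSED verdict generally only says that no normalized attribute value has dropped to its threshold, and it does not mean the raw counters are clean. A minimal sketch that pulls the three sector-related counters out of the attribute table, assuming the ten-column layout that smartctl printed above; flag_smart_attributes is a hypothetical helper name.

```shell
#!/bin/sh
# flag_smart_attributes: reads a smartctl attribute table on stdin and
# prints NAME=RAW for attributes 5, 197 and 198 when the raw value is
# non-zero. Assumes ID is the first field and RAW_VALUE is the last.
flag_smart_attributes() {
    awk 'NF >= 10 && $1 ~ /^(5|197|198)$/ && $NF + 0 > 0 { print $2 "=" $NF }'
}
```

Used as smartctl -A /dev/hda | flag_smart_attributes, the table above yields Reallocated_Sector_Ct=1, Current_Pending_Sector=58 and Offline_Uncorrectable=58. Pending and uncorrectable sectors are ones the drive could not read, which is consistent with the I/O errors in dmesg even though the overall verdict is PASSED.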

rusty
04-01-2009, 01:05 AM
It looks like you've got a messed-up filesystem, and maybe a drive on the brink of failure. Your best bet at this point might be to zero out the drive, see what smartctl reports, and start with a fresh install. There are some tools for data recovery; the one you might want to look at is lde, which can be downloaded via apt.
Another command you can try, for diagnostic purposes, is badblocks. And of course there's testdisk.

http://lde.sourceforge.net/

http://linux.die.net/man/8/badblocks

http://www.cgsecurity.org/wiki/TestDisk

http://en.wikipedia.org/wiki/E2fsprogs


HTH

OErjan
04-01-2009, 12:12 PM
Uhm, actually, I have had a similar thing happen, and it was caused by a misjumpered IDE drive: the jumpers were set so that I had two master drives on the same cable, which confused Linux quite a bit.
Similar things have happened with disks jumpered as cable select. Remove the extra disk and reboot without it; if things work, check the jumpering again.

EDIT: the reason Linux requires correct jumpering is that it does not use the BIOS for disks in the same way as M$ systems.

CrimsonS
04-01-2009, 04:01 PM
rusty: Thank you for your help, but those tests all come out pretty random too...

OErjan: Thanks, I tried several combinations, including plugging it in alone, but it doesn't change much. Come to think of it, the BIOS takes quite some time to recognise the drive, and one of the pins on the back of the drive seems damaged. It just seems weird that it would let go all of a sudden like this.

My gf still seems convinced that a "professional" could recover the data. Does anyone recommend using "data recovery experts" for this sort of damage?

Harry Kuhman
04-01-2009, 04:48 PM
My gf still seems convinced that a "professional" could recover the data. Does anyone recommend using "data recovery experts" for this sort of damage?
I would be extremely cautious about this. All are extremely expensive, and some have a reputation for, once they get their hands on your hard drive, stating that the problem is going to be more complex than they expected and extorting you for more money. And if you don't pay up, the drive that you get back (if you even get it back) may no longer have anything recoverable on it. So if you do go that route, try to at least confirm that your "recovery expert" is legitimate and doesn't have a history of extorting customers. And, of course, expect dishonest "professionals" to post their own favorable reviews.

CrimsonS
04-01-2009, 09:42 PM
Thank you, Harry Kuhman, it's as I thought... I'll pass the info along.

OP
04-23-2009, 04:11 PM
http://www.sysresccd.org/Main_Page

"SystemRescueCd is a Linux system on a bootable CD-ROM for repairing your system and recovering your data after a crash. It aims to provide an easy way to carry out admin tasks on your computer, such as creating and editing the partitions of the hard disk."