setht
06-19-2007, 04:35 AM
Hello, all. "New to Linux" - yep. I've been knee-deep in Linux for about a week now. And I've got a RAID5 array that I'd love to mount so I can get my approx. 700GB of irreplaceable Final Cut projects, Logic audio projects, DVD Studio Pro projects, screenplays, photos, and other such intellectual property off in one piece. I'm hoping the kind and wise among you can give me a few hints - or at least suggest next steps.
In 2005 I built a Rebyte NAS precisely because I didn't want to learn Linux. I think it's JFS; I wish I'd written that down. It has functioned without issue for two years as a file server. It has *NOT* been backed up because, like most idiots, I assumed that a RAID5 array *was* a backup (yes, lesson learned, thankyouverymuch). Last Saturday it corrupted some photos while I was browsing them, threw a whole bunch of filesystem errors, and then, when I rebooted, wanted a floppy.
Bad news. Long story short, the BIOS on the mobo blew up. So: new motherboard, new CPU, new memory, new power supply, same Promise ATA133 IDE card, same 4 Maxtor 300GB IDE drives.
The Rebyte runs on a Delkin flash card; little IDE dood. It lives on the mobo; the drives live on the Promise. So we hook everything back up, and power on.
I got:
--- rd:4 wd:2 fd:2
disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdf1
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:hdg1
disk 3, s:0, o:0, n:3 rd:3 us:1 dev:[dev 00:00]
raid5: failed to run raid set md0
md: pers->run() failed...
md: do_md_run() returned -22
md: md0 stopped.
md: unbind<hdg1,1>
md: export_rdev(hdg1)
md: unbind<hdf1,0>
md: ... autorun DONE
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
ds: no socket drivers loaded!
kmod: failed to exec /sbin/modprobe -s -k block-major-3, errno = 2
VFS: Cannot open root device "303" or 03:03
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 03:03
...not cool. Not cool at all. Now would be a good time to mention that Rebyte hasn't been supported since, oh, two months after I bought it. They've packed up for parts unknown.
So, some theorizing: since it's a LILO Linux install, the array (which I think was JFS - wish I'd written that down!) should be addressable by some Linux flavor or other. The thinking was: boot Ubuntu or the like off a LiveCD and try to get things hoppin'.
Well, the Promise card doesn't like four hard drives and a CD-ROM. No way, no how. So that cooks off any notion of Ubuntu. Fortunately we got Knoppix (KDE 3.5.5, kernel 2.6.19) up and running off a USB flash drive. So: I've got four drives on my desktop, they won't mount, but they're called hde1, hdf1, hdg1, and hdh1.
Awright. Progress.
This is about where I start gettin' jiggy with the commands that I have only the vaguest understanding of (remember: I went with an embedded Linux appliance so I wouldn't have to make sense of this stuff... that plan turned out well).
~$ sudo mdadm --examine /dev/hde1
/dev/hde1:
Magic : a92b4efc
Version : 00.90.00
UUID : e632930d6:7bfeb139:d5864e82:7e7b2ad7
Creation Time : Thu May 19 08:05:12 2005
Raid Level : raid5
Device Size : 293049600 (279.47 GiB 300.08 GB)
Array Size : 879148800 (838.42 GiB 900.25 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Sat May 21 10:19:09 2005
State : Clean
Active Devices: 4
Working Devices: 4
Failed Devices: 0
Spare Devices : 0
Checksum : f5f885fc - correct
Events : 0.28
Layout : left-symmetric
Chunk Size : 32k
Number Major Minor RaidDevice State
this 0 33 1 0 active sync /dev/hde1
0 0 33 1 0 active sync /dev/hde1
1 1 33 65 1 active sync /dev/hdf1
2 2 34 1 2 active sync /dev/hdg1
3 3 34 65 3 active sync /dev/hdh1
Hokay. There's a RAID array there; I just can't talk to it. (Anybody see anything else that I, with my severely untrained eye, do not see?)
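One sanity check I can do on the numbers above: if I understand RAID5 right, usable capacity should be (members - 1) × member size, since one member's worth of space goes to parity. The reported figures do line up:

```shell
# Checking the mdadm --examine numbers above: RAID5 usable capacity is
# (n - 1) * member size, because one member's worth of space holds parity.
member_kib=293049600   # "Device Size" from mdadm --examine, in 1 KiB blocks
members=4
array_kib=$(( (members - 1) * member_kib ))
echo "$array_kib"      # matches the reported "Array Size" of 879148800
```

So at least the superblock arithmetic is self-consistent, which I'm taking as a good sign.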
~$ sudo fdisk -l /dev/hde1 (or hdf1, or hdg1, or hdh1)
Disk /dev/hde1: 300.0 GB, 300082889728 bytes
255 heads, 63 sectors/track, 36482 cylinders
Units = cylinders of 16065 *512 = 8225280 bytes
Disk /dev/hde1 doesn't contain a valid partition table
"doesn't contain a valid partition table" worries me. I like the sound of it much better than "Unable to mount root fs" but it still isn't reassuring.
~$ sudo fsck /dev/hde1
fsck 1.40-WIP (14-Nov-2006)
e2fsck 1.40-WIP (14-Nov-2006)
Group descriptors look bad... trying backup blocks...
Superblock has an invalid ext3 journal (inode 8).
Clear<y>? no
fsck.ext3: Illegal inode number while checking ext3 journal for /dev/hde1
I don't know what an invalid ext3 journal is. I *DO* know that I'm not about to clear something when I don't know what it is. But it seems like a problem. How much trouble am I likely to get into by letting it clear the journal?
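What I'm *guessing* is the safe move here (somebody please confirm): run e2fsck with -n, which answers no to every question and opens the device read-only, so it reports damage without touching anything. Echoed as a dry run, and /dev/md0 is my assumption that the array gets assembled first:

```shell
# Sketch only: e2fsck -n answers "no" to all prompts and opens read-only,
# so it can't clear the journal by accident. /dev/md0 is hypothetical here -
# the array would have to be assembled before this makes sense.
check_cmd="sudo e2fsck -n /dev/md0"
echo "$check_cmd"   # printing instead of running, until the plan is confirmed
```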
~$ sudo fsck /dev/hdf1
fsck 1.40-WIP (14-Nov-2006)
e2fsck 1.40-WIP (14-Nov-2006)
Couldn't find ext2 superblock, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/hdf1
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else) then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
I'm thinking it can't be read because it's not the first drive in the RAID. Am I correct in this thinking?
Didn't want to e2fsck nuthin' without knowing what I was doing. Which I don't. Which is what I'm hoping you wonderful people can help me with.
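My working theory (again, please confirm before I touch anything): fsck on a bare member is always going to look broken, because the ext3 filesystem is striped across all four disks and only exists on the assembled md device. So the step I *think* has to come first is a read-only assemble - sketched here as a dry run:

```shell
# Dry-run sketch: assemble the four members read-only so nothing gets written,
# then all fsck/mount work targets /dev/md0, never the bare hd?1 devices.
assemble_cmd="sudo mdadm --assemble --readonly /dev/md0 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1"
echo "$assemble_cmd"   # echoed, not executed, until someone sanity-checks it
```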
~$ sudo mount /dev/hde1
mount: wrong fs type, bad option, bad superblock on /dev/hde1,
missing codepage or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
~$ dmesg | tail
[drm] Initialized i915 1.5.0 20060119 on minor 0
ADDRCONF(NETDEV_UP): eth1: link is not ready
EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
EXT3-fs: group descriptors corrupted!
EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
EXT3-fs: group descriptors corrupted!
EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
EXT3-fs: group descriptors corrupted!
EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
EXT3-fs: group descriptors corrupted!
~$ sudo tune2fs -l /dev/hde1
tune2fs 1.40-WIP (14-Nov-2006)
Filesystem Volume Name: None
Filesystem UUID: 9b0cdb0b-b18c-45b9-8ba8-10a0c6acf4a5
Filesystem magic number: 0xEF53
Filesystem Revision #: 1 (dynamic)
Filesystem features: has_journal filetype sparse_super large_file
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode Count: 109903872
Block Count: 219787200
Reserved block count: 10989360
Free blocks: 45568125
Free inodes: 31402013
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Last mount time: Sat May 21 08:07:54 2005
Last write time: Tue Jun 19 09:22:58 2007
Mount count: 14
Maximum mount count: -1
Last checked: Thu May 19 08:05:28 2005
Check interval: 0 (<none>)
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
~$ sudo tune2fs -l /dev/hdf1 ## (or g1, or h1)
tune2fs 1.40-WIP (14-Nov-2006)
tune2fs: Bad magic number in super-block while trying to open /dev/hdf1
Couldn't find valid filesystem superblock.
...so that's what I know about my array.
QUESTIONS:
1) How much trouble will I get into by clearing the invalid ext3 journal on hde1? What should I do next if I do that?
2) What will e2fsck -b 8193 <hdf1> or g or h do to my array? Is this a very, very bad idea?
3) What other commands can I run on this sucker to find out more about what's going on?
4) What do I need to do so I can mount the array, suck my data CLEAN OFF of those drives and onto something else, and rebuild it as something else (currently thinking Serverelements Naslite 2 USB, but open to suggestions)?
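For question 4, here's the evacuation step as I imagine it, with placeholder mount points - assuming the array eventually mounts read-only somewhere:

```shell
# Hypothetical rescue copy: pull everything off the (read-only mounted) array
# onto a separate disk before attempting any repairs. Paths are placeholders.
rescue_cmd="rsync -a --progress /mnt/raid/ /mnt/rescue/"
echo "$rescue_cmd"   # dry run; the destination needs >700GB free
```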
Any help anybody can offer would be deeply, deeply appreciated. I'm in way over my head; unfortunately I've discovered I'm in way over my friends' heads, too (and a good half dozen of them are sysadmins). I just want my data back. Once I have it I'll go run Ubuntu like a good little boy; please keep me from hurting myself or my data!
Oh, and take your time. I've got to be out of town for a week, so I'm not doing anything to it until next Wednesday. I would *love* to be able to put the silly thing back together again once I get back and get on with my life...
Deepest, sincerest thanks,
Seth