Thread: Please help me reassemble my RAID5! (long)

  1. #1
    Junior Member registered user
    Join Date
    Jun 2007
    Location
    Seattle
    Posts
    11

    Please help me reassemble my RAID5! (long)

    Hello, all. "New to Linux" - yep. I've been knee-deep in Linux for about a week now. And I've got a RAID5 array that I'd love to mount and get my approx. 700GB of irreplaceable Final Cut projects, Logic audio projects, DVD Studio Pro projects, screenplays, photos, and other such intellectual property off of in one piece. I'm hoping that the kind and wise among you can give me a few hints - or at least suggest next steps.

    In 2005 I built a Rebyte NAS precisely because I didn't want to learn Linux. I think it's JFS; I wish I'd written that down. It has functioned without issue for two years as a file server. It has *NOT* been backed up because, like most idiots, I assumed that a RAID5 array *was* a backup (yes, lesson learned, thankyouverymuch). Last Saturday it corrupted some photos while I was browsing them, threw me a whole bunch of file system errors, and then when I rebooted, it wanted a floppy.

    Bad news. Long story short, the BIOS on the mobo blew up. So: new motherboard, new CPU, new memory, new power supply, same Promise ATA133 IDE card, same 4 Maxtor 300GB IDE drives.

    The Rebyte runs on a Delkin flash card; little IDE dood. It lives on the mobo; the drives live on the Promise. So we hook everything back up, and power on.

    I got:

    --- rd:4 wd:2 fd:2
    disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
    disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdf1
    disk 2, s:0, o:1, n:2 rd:2 us:1 dev:hdg1
    disk 3, s:0, o:0, n:3 rd:3 us:1 dev:[dev 00:00]
    raid5: failed to run raid set md0
    md: pers->run() failed...
    md :do_md_run() returned -22
    md: md0 stopped.
    md: unbind<hdg1,1>
    md: export_rdev(hdg1)
    md: unbind<hdf1,0>
    md: ... autorun DONE
    NET4: Linux TCP/IP 1.0 for NET4.0
    IP Protocols: ICMP, UDP, TCP
    IP: routing cache hash table of 8192 buckets, 64Kbytes
    TCP: Hash tables configured (established 262144 bind 65536)
    NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
    ds: no socket drivers loaded!
    kmod: failed to exec /sbin/modprobe -s -k block-major-3, errno = 2
    VFS: Cannot open root device "303" or 03:03
    Please append a correct "root=" boot option
    Kernel panic: VFS: Unable to mount root fs on 03:03



    ...not cool. Not cool at all. Now would be a good time to mention that Rebyte hasn't been supported since, oh, two months after I bought it. They've packed up for parts unknown.

    So some theorizing - since it's a LILO linux install, the array (which I think was JFS - wish I'd written that down!) should be addressable by some Linux flavor or other. The thinking was: mount Ubuntu or the like off a LiveCD and try and get things hoppin'.

    Well, the Promise Card doesn't like four hard drives and a CD-ROM. No way, no how. So that cooks off any notion of Ubuntu. Fortunately we got Knoppix (KDE 3.5.5, release 2.6.19) up and running off a USB flash drive. So: I've got four drives on my desktop, they won't mount, but they're called hde1, hdf1, hdg1, and hdh1.

    Awright. Progress.

    This is about where I start gettin' jiggy with the commands that I have only the vaguest understanding of (remember: I went with an embedded Linux appliance so I wouldn't have to make sense of this stuff... that plan turned out well).


    ~$ sudo mdadm --examine /dev/hde1

    /dev/hde1:
    Magic : a92b4efc
    Version : 00.90.00
    UUID : e632930d6:7bfeb139:d5864e82:7e7b2ad7
    Creation Time : Thu May 19 08:05:12 2005
    Raid Level : raid5
    Device Size : 293049600 (279.47 GiB 300.08 GB)
    Array Size : 879148800 (838.42 GiB 900.25 GB)
    Raid Devices : 4
    Total Devices : 4
    Preferred Minor : 0

    Update Time : Sat May 21 10:19:09 2005
    State : Clean
    Active Devices: 4
    Working Devices: 4
    Failed Devices: 0
    Spare Devices : 0
    Checksum : f5f885fc - correct
    Events : 0.28

    Layout : left-symmetric
    Chunk Size : 32k

    Number Major Minor RaidDevice State
    this 0 33 1 0 active sync /dev/hde1
    0 0 33 1 0 active sync /dev/hde1
    1 1 33 65 1 active sync /dev/hdf1
    2 2 34 1 2 active sync /dev/hdg1
    3 3 34 65 3 active sync /dev/hdh1



    Hokay. There's a RAID array there; I just can't talk to it. (Anybody see anything else that I, with my severely untrained eye, do not see?)
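
One thing that stands out in the output above is that hde1 reports an Update Time in May 2005, while the trouble started in June 2007. That may or may not matter, but it makes the Update Time and Events fields of all four members worth comparing side by side. A minimal sketch of a filter for that follows; the function name `summarize_examine` is made up for illustration, and it assumes the field labels look like the ones in the `mdadm --examine` output quoted above.

```shell
# Hypothetical helper: read `mdadm --examine` output on stdin and print
# only the "Update Time" and "Events" fields as KEY=VALUE lines.
summarize_examine() {
    sed -n -E 's/^[[:space:]]*(Update Time|Events)[[:space:]]*:[[:space:]]*(.*)$/\1=\2/p'
}

# Demonstration on the lines quoted in this post (no disks are touched).
summarize_examine <<'EOF'
    Update Time : Sat May 21 10:19:09 2005
    State : Clean
    Events : 0.28
EOF

# On the real machine the same filter could be applied per member, e.g.:
#   for d in /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1; do
#       echo "== $d"; sudo mdadm --examine "$d" | summarize_examine
#   done
```

`mdadm --examine` only reads the superblock, so running the loop in the comment should not change anything on the drives.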


    ~$ sudo fdisk -l /dev/hde1 (or hdf1, or hdg1, or hdh1)

    Disk /dev/hde1: 300.0 GB, 300082889728 bytes
    255 heads, 63 sectors/track, 36482 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Disk /dev/hde1 doesn't contain a valid partition table



    "doesn't contain a valid partition table" worries me. I like the sound of it much better than "Unable to mount root fs" but it still isn't reassuring.

    ~$ sudo fsck /dev/hde1

    fsck 1.40-WIP (14-Nov-2006)
    e2fsck 1.40-WIP (14-Nov-2006)
    Group descriptors look bad... trying backup blocks...
    Superblock has an invalid ext3 journal (inode 8).
    Clear<y>? no

    fsck.ext3: Illegal inode number while checking ext3 journal for /dev/hde1



    I don't know what an invalid ext3 journal is. I *DO* know that I'm not about to clear something what I don't know what it is. But it seems like a problem. How much trouble am I likely to get into by letting it clear it?


    ~$ sudo fsck /dev/hdf1

    fsck 1.40-WIP (14-Nov-2006)
    e2fsck 1.40-WIP (14-Nov-2006)
    Couldn't find ext2 superblock, trying backup blocks...
    fsck.ext2: Bad magic number in super-block while trying to open /dev/hdf1

    The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else) then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>


    I'm thinking that not being able to be read is because it's not the first drive in the RAID. Am I correct in this thinking?
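
That guess can be checked with a little arithmetic. The sketch below assumes the usual md left-symmetric RAID5 mapping (the layout reported by `mdadm --examine` above), that array data starts at offset 0 of each member (version 0.90 superblocks sit at the end of the device), and that the primary ext2/ext3 superblock lives at byte 1024 of the filesystem. The function name `raid5_ls_member` is invented for this example.

```shell
# raid5_ls_member OFFSET_BYTES CHUNK_BYTES NDISKS
# Print the RaidDevice index that holds the data chunk containing
# OFFSET_BYTES of the array, for a left-symmetric RAID5 layout.
raid5_ls_member() {
    offset=$1
    chunk=$2
    ndisks=$3
    data_disks=$(( ndisks - 1 ))
    chunk_no=$(( offset / chunk ))          # which data chunk of the array
    stripe=$(( chunk_no / data_disks ))     # which stripe that chunk is in
    idx=$(( chunk_no % data_disks ))        # position within the stripe
    parity=$(( data_disks - stripe % ndisks ))  # parity rotates backwards
    echo $(( (parity + 1 + idx) % ndisks ))
}

# Primary superblock at byte 1024, 32k chunks, 4 members:
raid5_ls_member 1024 32768 4
```

Under those assumptions the result is 0, i.e. the first member (hde1 here), which would fit with tune2fs finding a superblock on hde1 and not on the other three.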

    Didn't want to e2fsck nuthin' without knowing what I was doing. Which I don't. Which is what I'm hoping you wonderful people can help me with.


    ~$ sudo mount /dev/hde1
    mount: wrong fs type, bad option, bad superblock on /dev/hde1,
    missing codepage or other error
    In some cases useful info is found in syslog - try
    dmesg | tail or so

    ~$ dmesg | tail

    [drm] Initialized i915 1.5.0 20060119 on minor 0
    ADDRCONF(NETDEV_UP): eth1: link is not ready
    EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
    EXT3-fs: group descriptors corrupted!
    EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
    EXT3-fs: group descriptors corrupted!
    EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
    EXT3-fs: group descriptors corrupted!
    EXT3-fs error (device hde1): ext3_check_descriptors: Block Bitmap for group 896 not in group (block 130055736)!
    EXT3-fs: group descriptors corrupted!

    ~$ sudo tune2fs -l /dev/hde1

    tune2fs 1.40-WIP (14-Nov-2006)
    Filesystem Volume Name: None
    Filesystem UUID: 9b0cdb0b-b18c-45b9-8ba8-10a0c6acf4a5
    Filesystem magic number: 0xEF53
    Filesystem Revision #: 1 (dynamic)
    Filesystem features: has_journal filetype sparse_super large_file
    Default mount options: (none)
    Filesystem state: clean with errors
    Errors behavior: Continue
    Filesystem OS type: Linux
    Inode Count: 109903872
    Block Count: 219787200
    Reserved block count: 10989360
    Free blocks: 45568125
    Free inodes: 31402013
    First block: 0
    Block size: 4096
    Fragment size: 4096
    Blocks per group: 32768
    Fragments per group: 32768
    Inodes per group: 16384
    Inode blocks per group: 512
    Last mount time: Sat May 21 08:07:54 2005
    Last write time: Tue Jun 19 09:22:58 2007
    Mount count: 14
    Maximum mount count: -1
    Last checked: Thu May 19 08:05:28 2005
    Check interval: 0 (<none>)
    Reserved blocks uid: 0 (user root)
    Reserved blocks gid: 0 (group root)
    First inode: 11
    Inode size: 128
    Journal inode: 8

    ~$ sudo tune2fs -l /dev/hdf1 ## (or g1, or h1)
    tune2fs 1.40-WIP (14-Nov-2006)
    tune2fs: Bad magic number in super-block while trying to open /dev/hdf1
    Couldn't find valid filesystem superblock.


    ...so that's what I know about my array.
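
The numbers quoted above can be cross-checked against each other. This is only a sketch of the arithmetic, using the values copied from the `mdadm --examine` and `tune2fs -l` output in this post and assuming mdadm reports sizes in KiB; the variable names are made up.

```shell
device_kib=293049600     # "Device Size" from mdadm --examine
array_kib=879148800      # "Array Size" from mdadm --examine
raid_devices=4           # "Raid Devices"
fs_blocks=219787200      # "Block Count" from tune2fs -l
fs_block_size=4096       # "Block size" from tune2fs -l

# RAID5 keeps one member's worth of parity, so usable space is (n - 1) members.
expected_array_kib=$(( (raid_devices - 1) * device_kib ))

# Size of the ext3 filesystem, converted to KiB for comparison.
fs_kib=$(( fs_blocks * fs_block_size / 1024 ))

echo "array size from members: $expected_array_kib KiB"
echo "array size reported:     $array_kib KiB"
echo "filesystem size:         $fs_kib KiB"
```

All three come out to 879148800 KiB. That suggests, without proving it, that the superblock tune2fs found on hde1 describes an ext3 filesystem spanning the whole array rather than something that belongs to hde1 alone, which is one more reason to leave the individual partitions alone.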

    QUESTIONS:

    1) How much trouble will I get into by clearing the invalid ext3 journal on hde1? What should I do next if I do that?

    2) What will e2fsck -b 8193 <hdf1> or g or h do to my array? Is this a very, very bad idea?

    3) What other commands can I run on this sucker to find out more about what's going on?

    4) What do I need to do so I can mount the array, suck my data CLEAN OFF of those drives and onto something else, and rebuild it as something else (currently thinking Serverelements Naslite 2 USB, but open to suggestions)?

    Any help anybody can offer would be deeply, deeply appreciated. I'm in way over my head; unfortunately I've discovered I'm in way over my friends' heads, too (and a good half dozen of them are sysadmins). I just want my data back. Once I have it I'll go run Ubuntu like a good little boy; please keep me from hurting myself or my data!

    Oh, and take your time. I've got to be out of town for a week, so I'm not doing anything to it until next Wednesday. I would *love* to be able to put the silly thing back together again once I get back and get on with my life...

    Deepest, sincerest thanks,

    Seth


  2. #2
    Senior Member registered user
    Join Date
    Jan 2007
    Posts
    104

    Re: Please help me reassemble my RAID5! (long)

    I'm probably not the right guy for this job since I don't have a software RAID array at any level, but after a quick google search for "linux software raid repair" it appears that the answer to question #2 in this link might be of some use:

    http://www.linux.com/howtos/Software...-HOWTO-4.shtml

    Hopefully the ckraid utility can help you out.

  3. #3
    Member registered user
    Join Date
    Apr 2004
    Posts
    50
    Okay so my first bit of advice is step lightly. It looks like you have a good chance of recovering data but you need to be very careful of just running random fsck commands.

    What you need to do is rebuild the RAID under Knoppix temporarily and try to recover/fsck/etc. the RAID device (i.e. /dev/md/0 or /dev/md0), not the individual partitions that make up that RAID. If you had actually let fsck run on an individual partition and write superblocks, you would likely have ruined your chances of recovery.

    I actually cover how to mount a modern-style software RAID in the update to Knoppix Hacks, but unfortunately it's still in the editing phase now so instead I can at least give you some of the basic steps in the updated software RAID hack so you can mount the drive:

    Code:
    $ sudo modprobe md
    $ sudo mdadm --assemble --auto=yes /dev/md0 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1
    If these commands finished without error, then if you cat /proc/mdstat you should see a listing for the new md0 array you have assembled (if not come back with the error output and we can go from there). If the array was assembled, then attempt to mount the drive:

    Code:
    $ sudo mkdir /mnt/md0
    $ sudo mount /dev/md0 /mnt/md0
    If the mount command can recognize the filesystem it should be able to automatically mount it without extra options. Otherwise if it complains, again come back with the error. You may have to experiment with --dry-run style fscks (ie ones that show you what they would do, but actually write no changes) to /dev/md0 with the different filesystem fsck tools to see which filesystem it has.
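
As a rough illustration of the "look but don't write" idea, here is a sketch that only prints the commands it would run unless DRY_RUN is set to 0. It assumes the array assembled as /dev/md0 and that the filesystem is ext2/ext3 (which is what the tune2fs output earlier in the thread suggests); the `run` and `inspect_array_readonly` helpers are made up for this example.

```shell
# Default is to print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# inspect_array_readonly MD_DEVICE MOUNT_POINT
inspect_array_readonly() {
    md_dev="$1"
    mnt="$2"
    run cat /proc/mdstat               # is the array listed and active?
    run sudo e2fsck -n "$md_dev"       # -n: open read-only, answer "no" to everything
    run sudo mkdir -p "$mnt"
    run sudo mount -o ro "$md_dev" "$mnt"   # read-only mount
}

inspect_array_readonly /dev/md0 /mnt/md0
```

Everything here is pointed at the md device, never at hde1 through hdh1 directly.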

    Try some of these and if you run into snags, come back to the thread.

    Good Luck.

  4. #4
    Junior Member registered user
    Join Date
    Jun 2007
    Location
    Seattle
    Posts
    11
    Thanks a bunch, guys. "Step lightly" has been my mantra; I'd rather wait and not have the data for a while than trash it. Which is why I crossed my fingers every time and did my level best to be non-invasive... simply typing "fdisk" makes me nervous as hell...

    I'll be back at the server on Wednesday. I will gently, cautiously, slowly apply both of your advice; I'll report back with whatever I find out.

    Thanks again for your help!

  5. #5
    Junior Member registered user
    Join Date
    Jun 2007
    Location
    Seattle
    Posts
    11

    In the interests of stepping lightly:

    So I tried this:

    $ sudo modprobe md

    ...and it just sorta did it. If I understand the man pages correctly, I'm basically telling the kernel "Prepare for a Software RAID." Yes?

    $ sudo mdadm --assemble --auto=yes /dev/md0 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1

    Did this, and it says:

    mdadm: device 3 in /dev/md0 has wrong state in superblock, but /dev/hdh1 seems ok
    mdadm: failed to RUN_ARRAY /dev/md0: Input/output error


    ...and, in the interests of not getting in deeper than I can handle, I calmly, humbly and appreciatively await further advice.

    Thanks, greenfly!

  6. #6
    Member registered user
    Join Date
    Nov 2005
    Location
    Idaho
    Posts
    59
    I can't help with the problem itself, but if the data is important and you have the $$$, I would buy new HDDs of the same size and do a sector-to-sector copy from each problem HDD to a new one.
    If you do that, then use the new hdd's for recovery and if an error is made you still have the original hdds to try again.

  7. #7
    Junior Member registered user
    Join Date
    Jun 2007
    Location
    Seattle
    Posts
    11
    Topfarmer - I've considered that. I haven't done it yet because I don't have a clear gameplan to get there. I have a Pentium III running XP which has IDE, I have this 2.8 GHz Celeron with no OS (and no spare drive to put one on) but which runs Knoppix off a jump drive. I have two macs, neither of which have IDE connectors on them.

    The Celeron board has four spare SATA connectors; I'd go SATA if anything. So is there a handy program that runs over Knoppix or Ubuntu or XP that will allow me to do a sector-to-sector clone? If someone can give me some clear instructions as to what to buy and where to get it, I'll probably do that.

  8. #8
    Administrator Site Admin-
    Join Date
    Apr 2003
    Location
    USA
    Posts
    5,441
    dd will let you do a sector by sector copy. You don't really need to buy drives with the same size and geometry (which can often be very difficult, particularly for older drives). One thing that you could do that would be only minimally more risky would be to use dd to make sector by sector copies of each raid drive to one or more larger drives. Then go ahead and try to recover from the original drives. If the data on the drives gets mucked up, it can be recovered by again copying with dd from the backups you just made.
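
A minimal sketch of what such a copy and a follow-up check might look like. It runs on scratch files rather than real disks, the function name `image_copy` is invented, and the `cmp` check assumes source and destination are the same length (true for an image file, not for a larger raw drive).

```shell
# image_copy SRC DST: raw block copy with dd, then verify byte for byte.
image_copy() {
    src="$1"
    dst="$2"
    # A larger block size only speeds the copy up; the data is the same.
    dd if="$src" of="$dst" bs=1048576 2>/dev/null || return 1
    # cmp -s prints nothing and exits 0 only if both are identical.
    cmp -s "$src" "$dst"
}

# Demonstration on a scratch file standing in for one RAID member.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/fake_hde1" bs=1024 count=256 2>/dev/null
if image_copy "$work/fake_hde1" "$work/hde1.img"; then
    echo "copy verified"
else
    echo "copy FAILED verification"
fi
rm -rf "$work"
```

With real hardware the source would be something like /dev/hde1 and the destination a file on the new, larger drive. Double-check `if=` and `of=` before pressing Enter, since swapping them overwrites the original.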

  9. #9
    Junior Member registered user
    Join Date
    Jun 2007
    Location
    Seattle
    Posts
    11

    DD drives

    So, let's say I were to buy 4 500GB SATA seagates (which I'm itching to do). Let's say I plug them into this mainboard. Using DD I would be able to do a sector by sector copy from, say, hdg1 to one of these new drives? And then I would do it with all four of them, so I would have hdi1, hdj1, hdk1 and hdl1?

    Could I do it one at a time (since I'm not seeing eight drives being happy in this enclosure)? Say, have one of the 300GBs in there, copy it to one of the 500GBs, put another 300GB in there, copy it to another 500GB, and so on? Then rebuild the RAID using the SATA drives? Or rebuild the raid using the IDE drives?

  10. #10
    Administrator Site Admin-
    Join Date
    Apr 2003
    Location
    USA
    Posts
    5,441
    dd can copy device images (drives) to files, or back again. So if you got a SATA drive large enough to hold more than one hard disk image, then you could copy multiple drive images to files on one disk, and of course, copy those images back to the drives later. If your new drive is larger than one drive but smaller than two, then getting as many drives as you already have would be the simplest approach. dd is quite flexible and you could split a drive dump to multiple files, but this does add complexity.
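
For what the split-dump variant might look like, here is a sketch that pipes dd into `split` and later joins the pieces again. It is demonstrated on a scratch file; the function names `split_image` and `join_image` and the piece size are arbitrary choices for illustration.

```shell
# split_image SRC PREFIX CHUNK_BYTES: dump SRC into PREFIXaa, PREFIXab, ...
split_image() {
    dd if="$1" bs=65536 2>/dev/null | split -b "$3" - "$2"
}

# join_image PREFIX DST: concatenate the pieces (in name order) into DST.
join_image() {
    cat "$1"* | dd of="$2" bs=65536 2>/dev/null
}

# Demonstration: a 300 KiB scratch "drive" split into 100 KiB pieces.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/fake_drive" bs=1024 count=300 2>/dev/null
split_image "$work/fake_drive" "$work/piece." 102400
pieces=$(ls "$work"/piece.* | wc -l | tr -d ' ')
join_image "$work/piece." "$work/rejoined"
if cmp -s "$work/fake_drive" "$work/rejoined"; then
    echo "pieces: $pieces, rejoined image matches"
fi
rm -rf "$work"
```

The pieces only go back together correctly if they are concatenated in the same order they were written, which is why the sketch relies on the sorted suffixes that `split` generates.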

    It should be obvious that you only need to attach one ide drive to your sata system at a time to copy it. This will involve a bit more swapping around and system cycling, but the limited ide connectors on many new systems may require this.

    Rather than ask more questions about this, I suggest reading the man page for dd and even doing a google search for dd linux for more information.
