RAID and journaling fs - a bad idea?



ThomasX
03-18-2005, 03:11 PM
I just wanted to set up a RAID1 to have something I can count on. But then I started to think about it: what happens if you have a power failure (hard switch-off) in a system with, e.g., reiserfs and RAID1? On one HD the write transaction may have finished before power-off, while on the second HD it was still in progress. So at the next power-on, the write gets rolled back on the second HD but not on the first. Now you have a mirrored system whose two halves hold different data.

Do you have any idea how (or if) this is solved?

Dave_Bechtel
03-20-2005, 06:24 PM
--I'm sure the designers thought of this, as Reiserfs is used in major production systems.

--However, you can always go here ( http://www.namesys.com/ ) and check the FAQ, or send them an email.

--My take on it is that you're better off WITH journaling than without. But I also recommend buying a UPS.

--BTW, can anyone recommend a good UPS with Linux support? ;-)


Cuddles
03-20-2005, 07:58 PM
Dave Bechtel,

Not quite sure on this, but as for your query on UPSes...

I don't think you can go wrong with APC -=- I have been using them for many years without problems.

I don't have the serial interface running; I just use the power and phone plugs. But the units are "rock solid" and have held me through brown-outs every other minute. I used to live in Tucson, Arizona, where the thunder and lightning storms get bad during monsoon season.

Considering how big APC has gotten, I can't imagine that they haven't at least ported their software over to Linux, if not created completely new support software for it... Possibly check the web site and hit up their tech support?

As for just the hardware, I am a devoted buyer of APC, and feel quite safe having used them for more than twenty years on more than 3 systems - not to mention the warranty that backs their products.

Ms. Cuddles

koaschten
03-20-2005, 08:29 PM
APC does support Linux, at least on the serial port. There is even software to monitor it; not sure if it has a client, but the daemon works like a charm, monitoring it from a Windows workstation.

Dave_Bechtel
03-21-2005, 09:30 AM
--Thanks for the UPS advice! :)

ThomasX
03-21-2005, 01:41 PM
Back to my original question: I got an answer.

I assumed that RAID is on top of the fs, but I was told that actually the fs is on top of the RAID module. Therefore there is only *one* fs "instance" involved, which either gets rolled back or not.

However, a new question arises now: if a journaling fs gives safety on power failure, but RAID software sits in between, does that possibly break your power-off safety for the sake of safety against HD hardware failure?
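
To make the layering concrete, here is a minimal toy sketch in Python. It is not how any real RAID driver is written: the class ToyMirror and its fail_after parameter are made up for illustration. It assumes that after an unclean shutdown the mirror is resynchronized by copying one member over the other, so the file system on top only ever sees a single block device and replays its journal once.

```python
class ToyMirror:
    """Toy RAID1: two in-memory 'disks' exposed as one block device."""

    def __init__(self, num_blocks):
        self.disks = [[b""] * num_blocks, [b""] * num_blocks]

    def write(self, block, data, fail_after=None):
        # Update each member in turn. fail_after (hypothetical knob)
        # simulates power loss after that many members were written.
        for i, disk in enumerate(self.disks):
            if fail_after is not None and i >= fail_after:
                return
            disk[block] = data

    def in_sync(self):
        return self.disks[0] == self.disks[1]

    def resync(self):
        # Assumed recovery step: copy the first member over the second.
        self.disks[1] = list(self.disks[0])

    def read(self, block):
        # The file system reads through the mirror, not from one disk.
        return self.disks[0][block]


mirror = ToyMirror(num_blocks=4)
mirror.write(0, b"old")
mirror.write(0, b"new", fail_after=1)  # power fails after the first disk
diverged = not mirror.in_sync()        # the two halves now differ
mirror.resync()                        # the halves agree again
```

Whether the block ends up "old" or "new" after the resync does not matter to the file system; what matters is that both halves agree before the journal is replayed on top.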

ThomasX
03-21-2005, 02:24 PM
> --However, you can always go here ( http://www.namesys.com/ ) and check the FAQ

Yeah, that was a good pointer.

However, I found this entry:
"Does using ReiserFS mean I can just press the power off button without running "shutdown" or "init 0," etc? Does it mean there is no risk of data loss?"
(http://www.namesys.com/faq.html#poweroff)
The answer was "no" :-( and they used the term "data journaling", which will be introduced in the future.

For xfs it doesn't appear to be any better(?)
(I found http://oss.sgi.com/projects/xfs/faq.html)

tom p
03-21-2005, 07:09 PM
> The answer was "no" :-(
Journaling file systems only guarantee consistency of the file system, not that no data will be lost after a catastrophic failure.

Thomas
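
A toy illustration of that distinction, in Python. This is only a sketch of the general write-ahead journal idea, not ReiserFS or XFS internals; the record format ("begin", "set", "commit") and the function name replay are invented for the example.

```python
def replay(journal):
    """Rebuild metadata from a toy journal, applying only committed transactions."""
    state = {}
    pending = {}
    for record in journal:
        kind = record[0]
        if kind == "begin":
            pending = {}
        elif kind == "set":
            _, key, value = record
            pending[key] = value
        elif kind == "commit":
            state.update(pending)
            pending = {}
    # Whatever is still pending was never committed and is discarded.
    return state


journal = [
    ("begin",), ("set", "a.txt", "size=10"), ("commit",),
    ("begin",), ("set", "b.txt", "size=20"),  # power lost before the commit
]
recovered = replay(journal)
```

After replay the structure is consistent (a.txt is fully there), but the change to b.txt that was in flight at power-off is gone. That is the "consistent, yet data can still be lost" case.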

Dave_Bechtel
03-21-2005, 08:21 PM
--I think you're a little too paranoid. :?

--Buy a UPS, and do frequent backups, and you should be fine.


ThomasX
03-22-2005, 04:04 PM
> --I think you're a little too paranoid.

Yes, probably. On the other hand, I really like to hard-switch-off my server, as it is an AT board (no ATX, no soft power button) and has no keyboard attached to it. But OK, I should change my habits :?

Cuddles
03-22-2005, 05:14 PM
ThomasX,

Not sure whether these answers pertain to your issues, but I'll state them just in case...

I have a few systems, each running a different "Knoppix" version, with different motherboards, different video, etc... I have one system that "can" actually power off completely (when the screen says "power off", it actually does), and have not had another system that would do that. I think it has to do with ACPI or APM. But that really isn't as important as "seeing" that "power off" message. During the shutdown, the system "gracefully" unmounts the HDDs and shuts down all the running processes, which is "about" the same as what Windows does. Just hitting the power button can cause damage, because things are left open and possibly in a half-saved state. The shutdown process ensures that these processes "know" they need to clean up and go away, because the system is going bye-bye.

With a RAID system, the problem comes from the cache and the time between "commits". If commits are too infrequent, you will have changes destined for your hard drives still sitting in the cache, although access to this data will be faster. If commits are too frequent, the opposite results: your hard drives will be kept as up-to-date as possible, but more physical accesses to the drives will be needed.
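
The commit trade-off can be put into numbers with a small Python sketch. It is a deliberately simplified model: it assumes the cache is flushed at every multiple of commit_interval, and the function name and the sample times are made up.

```python
def writes_lost_at_crash(write_times, commit_interval, crash_time):
    """Count writes that were still only in the cache when power was lost.

    Toy model: the cache is flushed at every multiple of commit_interval,
    so everything written after the last flush before the crash is lost.
    """
    last_flush = (crash_time // commit_interval) * commit_interval
    return sum(1 for t in write_times if last_flush < t <= crash_time)


write_times = [1, 4, 6, 9, 11]   # seconds at which data was written
lost_short = writes_lost_at_crash(write_times, commit_interval=5, crash_time=12)
lost_long = writes_lost_at_crash(write_times, commit_interval=30, crash_time=12)
```

With a 5-second interval only the write at second 11 is still in the cache at the crash; with a 30-second interval all five are.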

A RAID system's power comes from the "distributive" process: placing "pieces" of your data across more than one drive in the hope that if, God forbid, a drive failure happens, more of your data can be salvaged from the good parts, and the parts that get lost can be rebuilt. This depends on the RAID level you use. I used to work on an HP 3000 with ten drives on the RAID. The RAID software allowed "rebuilding" the data of a lost or replaced drive. But this is what I have found: a RAID system is not intended for recovery, but for faster access times. The main reason for a RAID system is to "distribute" the work of "getting" the data to the user who requested it, not the recovery process. Hence most "big" mainframe computer systems come with a RAID system and at least two or more drives "dedicated" to the RAID. It dates from a time when hard drives didn't have the fastest access times, and the thought was that two or more hard drives, accessing data independently at the same time, could come up with the "complete" data faster than just one drive. This is mostly moot now, considering the faster access times that hard drives have today. Access times were slower back then because systems "usually" used the "big" CDC300 disk packs, and moving 30 heads, tied together, back and forth across the 10-inch disk surface slowed things down. Having five of these "packs" accessing independently was, by sheer mechanics, a lot faster.

Another concern with RAID systems: the drives should be "physically" different, or you don't gain anything. For example, a RAID system using /dev/hda1, /dev/hda2, /dev/hda3, etc... would not gain you anything. For disaster recovery, the device is "physically" the same hard drive, and if it crashed, the complete RAID would be lost. For access times the same holds true: the same heads are being asked to access the different locations. For a "proper" advantage from a RAID system, you would probably want to use devices like /dev/hda1, /dev/hdb1, etc... Lastly, consider your cables and controller usage. The best possible cable / controller usage for a RAID would be having the devices on /dev/hda and /dev/hdc, or /dev/hdb and /dev/hdd -=- both of these examples use two separate controller channels for the data, and thus can transmit requests and data at the same time without one getting in the other's way.
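
A quick way to sanity-check a planned set of RAID members against these two rules, sketched in Python. It only looks at device names and assumes the conventional Linux IDE naming (hda/hdb on the first channel, hdc/hdd on the second); the function names are made up, and it does not query any real hardware.

```python
import re


def physical_disk(partition):
    """Return the disk part of an IDE device name, e.g. /dev/hda1 -> hda."""
    match = re.fullmatch(r"/dev/(hd[a-z])\d*", partition)
    if match is None:
        raise ValueError(f"not an IDE device name: {partition}")
    return match.group(1)


def ide_channel(disk):
    # Assumed naming convention: hda/hdb -> channel 0, hdc/hdd -> channel 1, ...
    return (ord(disk[-1]) - ord("a")) // 2


def check_members(partitions):
    """Return a list of warnings for a proposed set of RAID members."""
    disks = [physical_disk(p) for p in partitions]
    warnings = []
    if len(set(disks)) < len(disks):
        warnings.append("members share a physical disk")
    channels = [ide_channel(d) for d in sorted(set(disks))]
    if len(set(channels)) < len(channels):
        warnings.append("members share an IDE channel")
    return warnings


same_disk = check_members(["/dev/hda1", "/dev/hda2"])
same_channel = check_members(["/dev/hda1", "/dev/hdb1"])
good = check_members(["/dev/hda1", "/dev/hdc1"])
```

The first call flags the shared disk, the second flags the shared channel, and the hda/hdc pair passes both checks.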

Now, on the subject of journaling file systems... Think of it as a backup file for your word processor. In case of something going wrong, it "can" rebuild the file to "close" to where it was. Considering that, the more that needs to be "rebuilt", the harder it will be to "recover" completely from a disaster. I have been using the ext3 file system, and only recently moved to the Reiser file system. I don't know much about the Reiser FS, but I used ext3 a lot longer and had many "recoveries" with it. I used to have a "stubborn" system that seemed to like freezing whenever I played an audio CD or a DVD movie. It would try to read something from these devices and, as far as I can tell, get lost. The system would freeze waiting for the data it requested, and the device would completely forget that it was supposed to be supplying it. During all of this, the system would simply lock up completely and, from what I could tell, indefinitely. The only way to regain control was to hit the main power and pray that the system would come back up. During that reboot, you could see, when it went to mount the home and root areas, that ext3 was attempting to recover and rebuild from a disaster. I might add, either I was lucky or the ext3 FS was just "real" good, but no data loss was apparent.

As a last thought, neither of these two "systems" is a replacement for doing backups. Way back, running that HP 3000 with the RAID on ten disks, we not only did backups, I think we did twice as many as I usually do. Nothing, I repeat, nothing, can replace having a good backup: for disaster recovery, for physical drive failures, and even for the "oops, I deleted the wrong file" scenario.

Hopefully this helps,
Ms. Cuddles