
Error messages when loading persistent image



moon
01-16-2008, 01:52 AM
I created a persistent disk image on the hard disk using the menu "Knoppix->Configure->Create a persistent Knoppix disk image". After creating it, I reboot, and when I get the persistent image dialog I answer yes and the image loads with no errors. The second time I reboot, and every time thereafter, I get the list of error messages shown at the end of this post. I have a newer laptop running the Knoppix 5.1.1 DVD; its hard drive is a SATA drive and is recognized as sda. I have an older desktop running the Knoppix 5.1.1 CD; its hard drive is recognized as hda. The same thing happens on both computers. I googled this and found a few people who have had this problem in the past, but no solutions were given. I have tried formatting the partition holding the persistent image as both ext2 and ext3, and the same thing happens in both cases. Even though I get these error messages, my settings seem to be saved from one session to the next. Should I just ignore these messages?

note - I tried an experiment once and modified the exit script to output the error messages when the umount command was given and the message was something like "device busy" or "file system busy" - sorry, I didn't write it down :(

note2 - This sounds very similar to a problem that was just posted today by Alexey931 titled "USB Flash unmounted incorrectly on every boot".

Here are the messages:

********************************************************************************

e2fsck 1.40-WIP (14-Nov-2006)
/media/sda6/knoppix.img was not cleanly unmounted, check forced.
....

Deleted inode xx has zero dtime. Fix? yes
.....

Block bitmap differences: ......................
Fix? yes

Free block counts wrong for group #0 .....
Fix? yes

Free block counts wrong .............
Fix? yes

Inode bitmap differences: ..............
Fix? yes

Free inode count wrong for group #0 ..............
Fix? yes

Free inodes count wrong .........................
Fix? yes

moon
01-16-2008, 07:22 PM
If anyone is able to load a persistent image and NOT get error messages, could you post answers to the following to help me troubleshoot?

1. What is the size of the partition you are using for the persistent image and how much of the partition is used? You can get these answers by running QTParted.

2. What is the file system of the partition? ext2, ext3, etc.. Also in QTParted (although sometimes I think running Partition Manager in Windows gives better answers).

3. When you shut down the system what is the list of directories that are unmounted? UNIONFS, KNOPPIX.IMG, etc..

Thanks in advance.
Bill

moon
01-16-2008, 07:29 PM
Another question for those with no error messages. Sorry I didn't get it into the previous post.

When booting, what options are you selecting in the persistent image dialog? The default options are: home, system, init.

johnrw
02-09-2008, 10:13 AM
I'll confirm this bug. I decided to try using the kernel's Emergency Remount Read-Only hotkey. On the next reboot... the volume was mounted without an error.

This is what I was getting on every bootup except the first one after I created the image, like you.
I also use the defaults offered in the mount dialog.


e2fsck 1.40-WIP (14-Nov-2006)
/media/hdb6/knoppix.img was not cleanly unmounted, check forced.

Pass 1: Checking inodes, blocks, and sizes
Deleted inode 102 has zero dtime.
Fix? yes
Deleted inode 103 has zero dtime.
Fix? yes
Deleted inode 104 has zero dtime.
Fix? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

Block bitmap differences: -(8231--8373) -(12288--12299) -(14336--14765)
Fix? yes
Free blocks count wrong for group #0 (30624, counted=31209).
Fix? yes
Free blocks count wrong (547538, counted=548123).
Fix? yes
Inode bitmap differences: -(102--104)
Fix? yes
Free inodes count wrong for group #0 (15897, counted=15900).
Fix? yes
Free inodes count wrong (378704, counted=378707).
Fix? yes

/media/hdb6/knoppix.img: ***** FILE SYSTEM WAS MODIFIED *****
/media/hdb6/knoppix.img: 5293/384000 files (0.8% non-contiguous), 219877/768000 blocks
>> /home/knoppix mounted OK from /media/hdb6/knoppix.img.
>> Read-only CD/DVD system successfully merged with read-write /media/hdb6/knoppix.img.
Network device eth0 detected, DHCP broadcasting for IP. (Backgrounding)
INIT: Entering runlevel: 5
Starting Common Unix Printing System: cupsd.

root!tty1:/# cat /dev/vcs1 > /home/knoppix/Desktop/knoppix.img.error0


Then after I did the Emergency RO remount... and powered off... Then rebooted... and mounted the image without errors and then rebooted using the "reboot" command... I got this on the next startup.


e2fsck 1.40-WIP (14-Nov-2006)
/media/hdb6/knoppix.img was not cleanly unmounted, check forced.

Pass 1: Checking inodes, blocks, and sizes
Deleted inode 102 has zero dtime
Fix? yes
Deleted inode 103 has zero dtime
Fix? yes
Deleted inode 104 has zero dtime
Fix? yes
Deleted inode 105 has zero dtime
Fix? yes
Deleted inode 106 has zero dtime
Fix? yes
Deleted inode 107 has zero dtime
Fix? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

Block bitmap differences: -(804--1122) -(1124--1309) -(26628--26779) -(30720--30743)
Fix? yes
Free blocks count wrong for group #0 (30528, counted=31209)
Fix? yes
Free blocks count wrong (541949, counted=542630).
Fix? yes
Inode bitmap differences: -(102--107)
Fix? yes
Free inodes count wrong for group #0 (15894, counted=15900).
Fix? yes
Free inodes count wrong (378605, counted=378611).
Fix? yes

/media/hdb6/knoppix.img: ***** FILE SYSTEM WAS MODIFIED *****
/media/hdb6/knoppix.img: 5389/384000 files (0.9% non-contiguous), 225370/768000 blocks
>> /home/knoppix mounted OK from /media/hdb6/knoppix.img.
>> Read-only CD/DVD system successfully merged with read-write /media/hdb6/knoppix.img.
Network device eth0 detected, DHCP broadcasting for IP. (Backgrounding)
INIT: Entering runlevel: 2
Starting Common Unix Printing System: cupsd.

root!tty1:/# cat /dev/vcs1 > /home/knoppix/Desktop/knoppix.img.error1

Now I am not one who knows enough about disk repairing in the linux world, yet... to be able to see what is being done here... but in case someone wants to jump on this bug with us... I hope that this helps.
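
For anyone trying to read the fsck output above, here is a rough way to check that the repairs are internally consistent. Each -(A--B) entry in "Block bitmap differences" is a range of blocks that the bitmap marked as in use but that e2fsck found to be free, so the range sizes should add up to the change in the free blocks count. Deleted inodes with zero dtime plus blocks waiting to be released is the pattern one would expect if files that were deleted while still open never got closed before the filesystem went down, though that is an inference from the logs and not something e2fsck says. The sketch below is only arithmetic on numbers copied from the two logs; the helper name sum_ranges is made up for illustration, and it handles range entries only, not single-block ones.

```shell
#!/bin/sh
# Sum the sizes of the block ranges in an e2fsck "Block bitmap differences" line.
# Only "-(A--B)" range entries are handled; anything else is ignored.
sum_ranges() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        sed -n 's/^-(\([0-9]*\)--\([0-9]*\))$/\1 \2/p' |
        awk '{ total += $2 - $1 + 1 } END { print total + 0 }'
}

# Ranges copied from the two logs in this post.
first='-(8231--8373) -(12288--12299) -(14336--14765)'
second='-(804--1122) -(1124--1309) -(26628--26779) -(30720--30743)'

# Compare with the "Free blocks count wrong (old, counted=new)" lines.
echo "first log:  $(sum_ranges "$first") blocks released, free count changed by $((548123 - 547538))"
echo "second log: $(sum_ranges "$second") blocks released, free count changed by $((542630 - 541949))"
```

Both logs balance: 585 blocks in the first and 681 in the second, which also matches the group #0 corrections (31209 - 30624 and 31209 - 30528).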

Now I do use Gilles van Ruymbeke's minirt_511a.gz when I am booting a knoppix.iso residing on a fixed disk. I did take a look at the files used, and besides Gilles's different loop.ko file... everything else... well, they looked to be the same. Minirt_511a.gz also has a cloop.ko module that I didn't see in the Knoppix minirt.gz.

So there is something preventing a successful umount of the filesystem.
Here is the output of the mount command by itself.


/dev/root on / type ext2 (rw)
/ramdisk on /ramdisk type tmpfs (rw,size=826960k,mode=755)
/UNIONFS on /UNIONFS type aufs (rw,br:/ramdisk:/KNOPPIX:/KNOPPIX2)
/dev/sda2 on /cdrom type vfat (ro,nodev,fmask=0022,dmask=0022,codepage=cp437,iocharset=iso8859-1)
/dev/cloop on /KNOPPIX type iso9660 (ro)
/dev/cloop2 on /KNOPPIX2 type iso9660 (ro)
/proc/bus/usb on /proc/bus/usb type usbfs (rw,devmode=0666)
/dev/pts on /dev/pts type devpts (rw)
/dev/hdb6 on /media/hdb6 type ext2 (rw,nosuid,nodev)
/media/hdb6/knoppix.img on /KNOPPIX.IMG type ext2 (rw,loop=/dev/loop0)
persistent on /UNIONFS type aufs (rw,br:/KNOPPIX.IMG:/KNOPPIX:/KNOPPIX2)

johnrw
02-09-2008, 11:12 AM
So what is the shutdown script that suppresses error messages?
I usually boot to a console... and then I type "init 5" when I want to use X.
When I want to end the session I just type reboot or poweroff as the need arises.

echo $PATH
/home/knoppix/.dist/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/usr/local/sbin:/usr/local/bin:/usr/games:/usr/NX/bin:.


locate reboot
/etc/init.d/knoppix-reboot
/etc/init.d/reboot
/etc/rc6.d/S90knoppix-reboot
/sbin/grub-reboot
/sbin/reboot

and a locate poweroff yields...
locate poweroff
/lib/modules/2.6.19/kernel/drivers/char/ipmi/ipmi_poweroff.ko
/sbin/poweroff
/usr/lib/klibc/bin/poweroff
/usr/share/man/man8/poweroff.8.gz

So a poweroff command executes /sbin/poweroff, and a reboot command executes /sbin/reboot

I will look into those later today... but I think a good goal would be to enable errors to be reported on shutdown.

moon
02-10-2008, 01:20 AM
Hi John,

The exit script I was referring to is in the file /etc/init.d/knoppix-halt. It is also in /etc/init.d/knoppix-reboot (These files might be identical). At line 204 are 3 umount commands, each followed by "2>/dev/null" (no quotes). The "2>/dev/null" is what eats any error messages. If you delete these 3 occurrences of "2>/dev/null" you will see the error messages. You might want to add some newlines to keep the messages from running into each other and wrapping. It also might not hurt to put in a read command to stop the output so you can look at it but I don't think that I did that.
I also just noticed a comment at line 7 of each of these files indicating that the author found it "difficult to unmount everything cleanly" all the time. Hmmm...
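
To illustrate what the "2>/dev/null" does, here is a small sketch that can be run safely as an ordinary user. It does not call the real umount; fake_umount is a made-up stand-in that fails the way a busy unmount does, and the log file is a temporary file, not a path from the Knoppix scripts.

```shell
#!/bin/sh
# Stand-in for a failing umount: writes a message to stderr and returns non-zero.
fake_umount() {
    echo "umount: $1: device is busy" >&2
    return 1
}

log=$(mktemp)

# As in the stock script: stderr is discarded, so the failure is silent.
fake_umount /KNOPPIX.IMG 2>/dev/null || echo "exit status $? (no message shown)"

# With stderr appended to a log file instead, the message is preserved.
fake_umount /KNOPPIX.IMG 2>>"$log" || echo "exit status $? (message logged)"

echo "log contents:"
cat "$log"
```

Logging to a file only helps if the file lives on a filesystem that is still writable at that point in the shutdown, so a location outside the persistent image may be the safer choice.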

Regards,
Bill

johnrw
02-10-2008, 08:55 AM
Hi Bill,

About knoppix-halt and knoppix-reboot. They are the same file... as one is just a link to the other. I forget which is the link.

Okay... I tried changing line 204 to:

umount -d "$mp" >>/home/knoppix/0.out 2>&1 || { umount -r "$mp" >>/home/knoppix/1.out 2>&1 ; umount -l "$mp" >>/home/knoppix/2.out 2>&1 ; }


Here is 0.out

umount: /UNIONFS: device is busy
umount: /UNIONFS: device is busy
umount: /UNIONFS: device is busy
umount: /UNIONFS: device is busy
umount: /KNOPPIX.IMG: device is busy
umount: /KNOPPIX.IMG: device is busy

Here is 1.out

umount: persistent busy - remounted read-only
umount: /KNOPPIX.IMG: device is busy
umount: /KNOPPIX.IMG: device is busy

and 2.out is zero bytes... but there.
I got some errors on the console when running shutdown, about not being able to create the file. I suspect they were about 2.out.
It seems as though the image is unmounted though... there may still be some open files.
How can I tell fsck to save files like chkdsk .chk files?

johnrw
02-10-2008, 09:37 PM
Wouldn't it be better if KNOPPIX.IMG was unmounted first?
How can UnionFS be unmounted when it contains mounted filesystems?
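
One way to look at the ordering question: the mount output above lists the persistent aufs union last, with /KNOPPIX.IMG as one of its branches (br:/KNOPPIX.IMG:...). A union keeps its branches busy while it is mounted, so the usual approach is to unmount in the reverse of mount order: the union first, then the image, then the partition holding the image. Below is a minimal sketch of that ordering, working on a copy of three lines from the mount output and not on the live system. The helper name unmount_order is hypothetical, and it assumes mount points contain no spaces.

```shell
#!/bin/sh
# Three lines copied from the mount output, in the order the mounts were made.
mounts='/dev/hdb6 on /media/hdb6 type ext2 (rw,nosuid,nodev)
/media/hdb6/knoppix.img on /KNOPPIX.IMG type ext2 (rw,loop=/dev/loop0)
persistent on /UNIONFS type aufs (rw,br:/KNOPPIX.IMG:/KNOPPIX:/KNOPPIX2)'

# Print the mount points (third field) newest first.
unmount_order() {
    printf '%s\n' "$1" |
        awk '{ mp[NR] = $3 } END { for (i = NR; i >= 1; i--) print mp[i] }'
}

unmount_order "$mounts"
```

This prints /UNIONFS, then /KNOPPIX.IMG, then /media/hdb6, which is the order in which each unmount has a chance of succeeding.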

johnrw
02-11-2008, 05:08 AM
Well, there is this word "persistent" that is bugging me. It appears in the output of mount above... and in the knoppix-image menu options, i.e. "Add as persistent, writable system area". Now I unchecked that... and mounted it. Upon reboot and remount, no error.

So what does that persistent aspect/option mean?

I also tried something crazy... I threw in a fuser -v -m right before the line 204 in knoppix-reboot... and at the first invocation...
it mentioned 2 files were still in use... pump and init.

I also tried something else... saying no to all knoppix.img mounting options... and then doing a
umount /KNOPPIX.IMG, which was successful. Then a restart of knoppix-image mounts without a fsck check. So it is the way the system is being brought down.
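
fuser -m reports every process using the filesystem mounted at the given path. Where fuser is not available, roughly the same information can be gathered on Linux by scanning /proc for processes whose working directory, executable, or open files live under the mount point. The following is a sketch of that idea and not a replacement for fuser: the function name holders is made up, it only sees processes the caller is allowed to inspect (so it would need to run as root during shutdown), and it misses things fuser would catch, such as memory-mapped files.

```shell
#!/bin/sh
# List the PIDs of processes that have their cwd, executable, or an open
# file under the directory given as $1. Linux only: relies on /proc.
holders() {
    dir=$1
    for p in /proc/[0-9]*; do
        for link in "$p"/cwd "$p"/exe "$p"/fd/*; do
            # Unreadable or vanished entries are skipped silently.
            target=$(readlink "$link" 2>/dev/null) || continue
            case $target in
                "$dir"|"$dir"/*)
                    echo "${p#/proc/}"
                    break   # one hit per process is enough
                    ;;
            esac
        done
    done
}

# Example: who is keeping the persistent image busy?
holders /KNOPPIX.IMG
```

On a system without /KNOPPIX.IMG mounted the example call simply prints nothing.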

takayama
02-14-2008, 04:05 AM
If you kill the "pump" before the shutdown, then fsck will not happen.
(If you have a NFS mounted disk, unmount it before you kill the pump.)

This trouble of the persistent home seems to be related to
the debian bug report
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=261803
"pump stays around and stops /var/ and /usr/ from being unmounted"

When remastering, /etc/init.d/knoppix-halt should be modified to stop pump properly.

Credit for this post: knoppix math development mailing list in Japanese.

johnrw
02-14-2008, 05:19 AM
takayama!

Thanks for jumping in with the answer. So maybe the fuser command wasn't so crazy after all... :)
I had been meaning to try and kill pump and see how that worked... but it slipped my mind.
Then I saw your post... and killed it, and rebooted. Bingo! No error so you are indeed correct in this.
Thank You!

Btw, I would like to have an option to make the knoppix.img an ext3 filesystem... and noticed that ext2 is hardcoded in the scripts.
I had the knoppix.img on an ext3 partition... but the journal did not fix this problem. For example... hdb6 is ext3. Knoppix.img resides on hdb6... in the root. I kept getting those errors on bootup and wondered why if hdb6 was journaled.

Bug fixed.

moon
02-14-2008, 04:39 PM
What I am curious about is this. Do the great majority of Knoppix users out there have these error messages and just ignore them? Or are there just 2 or 3 of us that have these messages?

johnrw
02-14-2008, 08:17 PM
About what all users get and do about it... I can say this. It happened no matter how I had mine set up. So I would guess everybody got the errors. But they also could not see any loss of functionality caused by it.

In /etc/init.d/knoppix-reboot at line 53 I changed mine to:
BLACKLIST="cardmgr dhcpcd dhclient init aufs* *mount portmap udevd ntfs-3g"

and now no errors.
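
For readers wondering how a list like this behaves: the entries look like shell glob patterns (aufs*, *mount), and the assumption here is that a process whose name matches any entry is spared when the script kills whatever is still running. The sketch below shows that kind of matching. It is not the code from knoppix-reboot, and the function name is_blacklisted is invented for the example.

```shell
#!/bin/sh
# The modified list from this post: pump is no longer on it.
BLACKLIST="cardmgr dhcpcd dhclient init aufs* *mount portmap udevd ntfs-3g"

# Return success if the process name $1 matches any pattern in BLACKLIST.
is_blacklisted() {
    name=$1
    set -f      # keep the patterns from expanding against files in the cwd
    for pattern in $BLACKLIST; do
        case $name in
            $pattern) set +f; return 0 ;;
        esac
    done
    set +f
    return 1
}

for name in pump umount aufsd init; do
    if is_blacklisted "$name"; then
        echo "$name: spared"
    else
        echo "$name: killed"
    fi
done
```

With pump off the list it falls into the "killed" group, while umount, aufsd and init still match an entry and are left alone.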

I am also going to point out that it shuts down much faster too.
Cheers!

mlap
03-02-2008, 01:15 AM
MOON:

I have been having this problem for as long as I can remember and have just ignored it, but I knew it was an issue. Lately it's become more irritating and hence I stumbled on this thread.

It would be a great convenience if this worked correctly. If anyone comes across a patch/update to correct this, please share! :D

In the meantime I'll try johnrw's BLACKLIST trick... (Thanks)

moon
03-03-2008, 08:45 PM
Hello all,

I've been meaning to get back to this but I haven't had a chance until now. johnrw and takayama did a great job of getting to the bottom of this issue. John's BLACKLIST fix does work in getting rid of the errors. The only problem with this is that pump is required if any NFS filesystems are to be unmounted. If pump is removed from the BLACKLIST, it will be killed before the NFS filesystems are unmounted. If NFS filesystems are not used, removing pump from the BLACKLIST will work just fine. However I do use NFS sometimes to exchange files between my two computers. My solution was to insert the pump-killing code after the NFS filesystems are unmounted. The code that I came up with is at the end of this message.

The only thing I am not entirely sure of is how much time to allow between the sending of the kill signal and when it can be assumed that the process (pump) is killed. In the code below, I allowed 1 second however this was entirely arbitrary. Maybe on a very slow computer it would not be enough time. If in doubt, this time can be increased. If anyone has any comments on how the timing of the kill command should be handled, it would be greatly appreciated.

If /etc/init.d/knoppix-halt has not been modified the new code should be inserted at line 199. Before any changes are made, the code around line 199 looks like this:

esac
done

# Clean up all mount references carefully, except for /cdrom and /KNOPPIX


After the new code is added it looks like this: ( I hope no unprintable characters got into this as a result of pasting it here)


esac
done

# Kill pump here so umounts are clean.
# Get pump's process id from ps for use in kill command.
ps | gawk '$4~/pump/{print $1 }' | ( read pumpPID #Closing parenthesis comes later
    if [ -n "$pumpPID" ]
    then
        kill -15 $pumpPID 2>/dev/null # Send TERM signal
        sleep 1 # Sleep for 1 second
        kill -9 $pumpPID 2>/dev/null # Send KILL signal
        sleep 1 # Sleep for 1 second
    fi
) # Don't forget to put closing parenthesis here

# Clean up all mount references carefully, except for /cdrom and /KNOPPIX


Regards,
Bill

johnrw
03-03-2008, 09:46 PM
Bill... why not make a while loop to be sure?
This slight change would make sure pump is dead (I hope).

# Kill pump here so umounts are clean.
# Get pump's process id from ps for use in kill command.
pumpkill() {
    ps | gawk '$4~/pump/{print $1 }' | ( read pumpPID #Closing parenthesis comes later
        if [ -n "$pumpPID" ]
        then
            kill -15 $pumpPID 2>/dev/null # Send TERM signal
            sleep 1 # Sleep for 1 second
            kill -9 $pumpPID 2>/dev/null # Send KILL signal
            sleep 1 # Sleep for 1 second
        fi
    ) # Don't forget to put closing parenthesis here
} # and the closing brace

while [[ ! -z `ps -A | grep pump` ]] ; do
    pumpkill
done

I think your way is better than the BLACKLIST hack... I haven't even used NFS yet.
But I have been enjoying seeing those clean image mounts since we resolved to fix this bug.

cheers

moon
03-04-2008, 12:27 AM
Hi John,

I was thinking about a loop but my reasoning is this. If a loop is used and if the kill signal is sent to pump and something abnormal happens and pump refuses to kill itself, the loop will continue indefinitely and the system would hang while shutting down. Maybe it's best to just give the kill command(s) once and hope for the best. If pump gets killed, all well and good - if it doesn't, I don't think there is anything that can be done about it anyway so maybe it's best to just continue with the shutdown script. I notice that that is what the author of knoppix-halt did when killing the other processes earlier in the script. I've done a lot of Windows programming but I am kind of new to Linux and I certainly don't know about the nuances of shutting down a Linux system so I am far from an expert on this. Any comments would be appreciated.
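
A middle ground between a fixed sleep and an open-ended loop is a bounded wait: send TERM, poll for a limited number of seconds to see whether the process has gone, and only then send KILL. The sketch below shows the general pattern. The function name stop_process and the default of 5 seconds are assumptions for the example, and none of it is taken from knoppix-halt.

```shell
#!/bin/sh
# Ask process $1 to exit, wait up to $2 seconds (default 5), then force it.
# Always returns, so a stubborn process cannot hang the shutdown.
stop_process() {
    pid=$1
    timeout=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0       # already gone
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0      # exited after TERM
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -KILL "$pid" 2>/dev/null                   # last resort
    return 0
}

# Demonstration on a harmless background process.
sleep 100 &
victim=$!
stop_process "$victim" 2
wait "$victim" 2>/dev/null || true
if kill -0 "$victim" 2>/dev/null; then
    echo "still running"
else
    echo "stopped"
fi
```

In the shutdown script this would be called with the PID obtained from ps, for example stop_process "$pumpPID" 5. Note that kill -0 also succeeds for an exited child that its parent has not yet reaped, so in the demonstration the full timeout may elapse even though sleep has already died; pump is not a child of the shutdown script, so that case should not arise there.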

Later,
Bill

johnrw
03-04-2008, 06:09 PM
Well you're right again Bill...
I am lazy. Here is an infinite loopkiller version... lol
I'm sure you knew that too... but for

LMAX=2
LCOUNT=0

# Retry while pump is still running, but give up after LMAX+1 attempts.
while [[ ! -z `ps -A | grep pump` ]] && [ "$LCOUNT" -le "$LMAX" ] ; do
    pumpkill
    LCOUNT=$(($LCOUNT+1))
done

Otoh... I guess we could just upgrade pump in the persistent $home.
Did anyone try that? Ok... I will see if a new version fixes it.

johnrw
03-04-2008, 06:29 PM
http://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=pump;dist=unstable
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=261803


Package: pump
Version: 0.8.19-2
Severity: normal


unlike dhcp3-client (dhclient3) and its usage, pump continues to run
during shutdown even after everything else has been killed.

consequently, /var and /usr cannot be unmounted, because pump is still
using those partitions (even though lsof /var does not think so).

luckily, dhcp3-client _does_ work properly!

Well this is an old bug... 3 years and 220 days old; Modified 3 years and 220 days ago.

johnrw
03-04-2008, 07:24 PM
Hi Bill,

Ok... I installed the latest pump... and put the knoppix-halt script back to the default actions...
and got errors on reboot.

So instead of loops and whatnot... down around line 170 I killed pump with its own kill daemon option
after the last network command I could see. Comments in the file seem to indicate a 'bring network down'
procedure was envisioned but not implemented formally. So just try this out with some nfs stuff.


# Unmount network filesystems first before shutting down network
NFSBUSY=""
NETMOUNTS="$(gawk '{if($1~/:/){print $2}}' /proc/mounts 2>/dev/null)"
if [ -n "$NETMOUNTS" ]; then
    echo "${BLUE}Unmounting network filesystems.${NORMAL}"
    # Preload programs we need later, in case we lose the network too early
    swapoff --help >/dev/null 2>&1
    losetup --help >/dev/null 2>&1
    mount --help >/dev/null 2>&1
    umount --help >/dev/null 2>&1
    gawk --help >/dev/null 2>&1
    tac --help >/dev/null 2>&1
    # Umount NFS (if not busy)
    umount -t nfs,nfs4,smbfs -alv 2>/dev/null
fi
# see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=261803 for why this is needed here.
# When knoppix-persistent home feature is used... pump prevents their umount from succeeding.
# unless killed explicitly.
pump -k

# Umount devpts early (otherwise UNIONFS may stay busy)
umount -t devpts -a 2>/dev/null

# UNIONFS umount
if [ "$(ls -l1d /lib | gawk '{print $NF}')" = "/UNIONFS/lib" ]
then

cheers

If it works for you... then I'll post this as a bug to fix on the debian-knoppix list with link to this topic.

moon
03-06-2008, 12:46 AM
pump -k --- I like it. The simplest solution is usually the best. I put it at the same place you did and I get no error messages. Unfortunately the power supply on my other computer died so I can't do any NFS testing right now. It won't be until next week that I will be able to get another supply and put it in. I would go ahead and post the bug because even if this isn't the final solution, something should be done about it anyway. When I get the other computer up, I will do some NFS testing and see what happens when I shut the system down. I suspect that it will work ok. Will post the results here.
Later,
Bill

johnrw
03-06-2008, 02:17 AM
Hi Bill,

Have you noticed whether or not the shutdown is also faster? Ok well I wrote the list about this just now.
Hopefully it will make it into the next release.