View Full Version : Race bug in xsession-knoppix



mack
07-23-2003, 04:56 PM
:!: I encountered a race bug in the xsession-knoppix package, in /etc/X11/Xsession.d/45xsession.

At first, I thought something was wrong with my remastering. See discussion in http://www.knoppix.net/forum/viewtopic.php?t=3558.

After failing to find any change of mine that could cause the KDE configuration to reset to defaults while loading, I created a new remaster, starting from Knoppix 3.2, and made sure not to make any changes other than running a daemon I need from an rc file that is totally unrelated to KDE or X11. The bug was still there, but it was intermittent: sometimes the environment was OK and sometimes it wasn't. :?

After a lot of digging and adding debug messages all over the place (which made the problem worse), I realized that when 45xsession copies the kde config files in this line:


\cp -ua /etc/skel/{.kde*,Desktop} $HOME/

it fails to copy some important files from /etc/skel/.kde/share/config/, claiming that the directory already exists in the ramdisk homedir and has no write permission. I added

rm -rf /ramdisk/home/knoppix/.kde

right before the above line, and still, sometimes these config files were created before the cp had a chance to put the real files there. Some other process was racing with 45xsession to create files in .kde. :shock:

Looking a bit further up in 45xsession reveals a

sleep 2

right before the cp takes place, after ksplash is executed. Adjusting the sleep time solves the problem here, but on slower hardware, or with more processes using the CPU during init, the problem can reappear.

:arrow: To sum it up, ksplash, at a certain point, creates .kde/share/config if it's not there yet, and that prevents cp from putting the real files there.
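One way to confirm the symptom after login is to compare the skeleton config directory with the one in the home directory. The following is only a minimal sketch: the function name list_missing_config is hypothetical and is not part of 45xsession or any Knoppix package.

```shell
#!/bin/sh
# list_missing_config SKEL_DIR HOME_DIR: hypothetical diagnostic helper.
# Prints the name of every regular file under SKEL_DIR/.kde/share/config
# that is absent from HOME_DIR/.kde/share/config.
# Returns 1 if at least one file is missing, 0 otherwise.
list_missing_config() {
    src=$1/.kde/share/config
    dst=$2/.kde/share/config
    missing=0
    for f in "$src"/*; do
        # Skip non-files (and the literal pattern if the glob matched nothing).
        [ -f "$f" ] || continue
        name=${f##*/}
        if [ ! -f "$dst/$name" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

On a running system this could be called as list_missing_config /etc/skel "$HOME"; any file name it prints was not copied by the cp line.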

I don't know why ksplash has to be loaded so early instead of after the non-persistent homedir is created, but in the meantime, here's a real work-around which isn't time-based (so the sleep 2 can be eliminated):

:idea: Right before ksplash is executed, I do

mkdir -p $HOME/.kde/share/config

This is ugly but solves it for now. As a real solution, ksplash has to be executed AFTER the creation of the non-persistent homedir, or be patched to create these dirs with proper permissions.
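The ordering described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual contents of 45xsession: the function name start_splash_then_copy and its arguments are hypothetical, and the splash program is passed in as a parameter so the sketch does not depend on ksplash being installed.

```shell
#!/bin/sh
# start_splash_then_copy SKEL_DIR SPLASH_CMD...: hypothetical sketch of the
# ordering from the post; it is not the real 45xsession script.
start_splash_then_copy() {
    skel=$1
    shift
    # Create the config directory ourselves, before the splash program can,
    # so that it exists with our ownership and default (writable) permissions.
    mkdir -p "$HOME/.kde/share/config" || return 1
    # Launch the splash program (ksplash in the post) in the background.
    "$@" &
    # Copy the skeleton. The line in the post also passes -u and uses
    # brace expansion; a plain loop is used here to stay POSIX sh.
    for item in "$skel"/.kde* "$skel"/Desktop; do
        [ -e "$item" ] || continue
        cp -a "$item" "$HOME/" || return 1
    done
}
```

Because the directory already exists when the splash program starts, no sleep is needed between launching it and running cp.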

Who maintains this package?

Mack

Stephen
07-23-2003, 06:42 PM
Who maintains this package?

Mack

Not sure who maintains it, but the bugzilla (http://www.knoppix.net/bugs/) is the place to report bugs. I'm sure someone will look at it once the problem is known to them.

Fabianx
07-23-2003, 09:00 PM
Klaus Knopper, the master himself. Thank you for your bug report and your detailed error analysis. I had this problem today too and asked myself what it could be. I think the mkdir is not a workaround but the solution. It's ksplash's problem if it creates "wrong" permissions.

You ask why ksplash is loaded so early?

Well, it's all about psychology, you know. Joe User sees a grey screen (standard X), waits, and nothing happens, so he turns the computer off or gets angry or bored.

But if there is quick visual feedback that "something is happening", Joe User is happy and waits for it ...

That's the whole secret of why people can wait through a Microsoft boot (there is a bar moving), but not if there is just text that has been standing there for a minute while nothing happens.

Klaus Knopper does not make Knoppix so good only because it has great hardware detection, as many think. No, it's the small details that matter, and Klaus Knopper takes care of them. (Likewise, with OpenOffice a quick splash screen is started to keep the user happy.)

There _must_ be some magic in Knoppix :-)) <=:-)

cu

Fabian

mack
07-25-2003, 01:30 AM
Klaus Knopper, the master himself. Thank you for your bug report and your detailed error analysis. I had this problem today too and asked myself what it could be. I think the mkdir is not a workaround but the solution. It's ksplash's problem if it creates "wrong" permissions.

I'm glad to have saved you the need to hunt this bug. :)
It was a particularly tricky one to track down. I actually remastered again for it, with the original KNOPPIX filesystem - no files changed - just an extra driver in initrd. When the bug still appeared in a system with nothing changed but a driver that is totally unrelated to KDE or X11, I figured it must be some weird timing thing. Took days to figure out!

Re mkdir, I guess we could call it a solution, at least until ksplash is fixed. The funny thing is that I didn't even find where ksplash calls mkdir(2) for it. I straced it from 45xsession, didn't see an mkdir call, but the dirs were just there after a few seconds. I guess it triggers some other process which causes it, by some IPC method. (It doesn't fork anything that causes it - strace followed forks).
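Since strace did not show who creates the directory, another way to at least pin down when it appears, and with which mode and owner, is to poll for it. This is only a sketch under that assumption; watch_for_dir is a hypothetical helper name, and the one-second polling interval is arbitrary.

```shell
#!/bin/sh
# watch_for_dir DIR [MAX_TRIES]: hypothetical helper, not part of 45xsession.
# Polls about once per second until DIR exists, then prints its "ls -ld"
# line (mode, owner, timestamp) and returns 0.
# Returns 1 if DIR has not appeared after MAX_TRIES polls (default 10).
watch_for_dir() {
    dir=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if [ -d "$dir" ]; then
            ls -ld "$dir"
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

Started in the background just before ksplash, for example as watch_for_dir "$HOME/.kde/share/config" 30, it would log the permissions the directory had when it first showed up.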


You ask why ksplash is loaded so early?

Well, it's all about psychology, you know. Joe User sees a grey screen (standard X), waits, and nothing happens, so he turns the computer off or gets angry or bored.

But if there is quick visual feedback that "something is happening", Joe User is happy and waits for it ...

That's the whole secret of why people can wait through a Microsoft boot (there is a bar moving), but not if there is just text that has been standing there for a minute while nothing happens.

Yes, I know all that - I read the comment before ksplash is executed. :)

Me, I'd prefer to just eliminate the 'quiet' keyword from the kernel params and read the kernel messages while booting. It's much more fun than watching a bar. But then again, that's me :)


Klaus Knopper does not make Knoppix so good only because it has great hardware detection, as many think. No, it's the small details that matter, and Klaus Knopper takes care of them. (Likewise, with OpenOffice a quick splash screen is started to keep the user happy.)

There _must_ be some magic in Knoppix :-)) <=:-)

cu

Fabian

I totally agree. Knoppix rocks. 8)

On a side note, I still think openoffice takes way too much time to load. A splash screen is not enough for that. They should splash some tetris or something for the user to play with, as openoffice loads. :lol: