Gentoo Archives: gentoo-user

From: Joshua Murphy <poisonbl@×××××.com>
To: gentoo-user <gentoo-user@l.g.o>
Subject: Re: [gentoo-user] Broken upgrade from udev troubles.
Date: Thu, 17 Dec 2009 06:34:42
Message-Id: c30988c30912162123y6e2de48ah763f80d6f2e1a9b7@mail.gmail.com
In Reply to: [gentoo-user] Broken upgrade from udev troubles. by Tom Bennet
1 On Wed, Dec 16, 2009 at 10:07 PM, Tom Bennet <twbennet@×××××.com> wrote:
2 > I spent the day recovering from a Gentoo upgrade, and thought I'd document
3 > the experience in case it helps someone else.
4 >
5 > I'm running a custom kernel 2.6.25-gentoo-r7 on amd64, though I don't think
6 > the rarer hardware is relevant.
7 >
8 > I tend to put off upgrading my Gentoo box because anytime I do, something
9 > breaks.  I'm afraid I haven't changed my opinion about that.  Anyway, I did
10 > "emerge --update --deep world" and plugged my ears. Some 600-odd packages
11 > (and a few simpler problems) later, the system seemed to be doing okay.  So
12 > I thought I'd see if it could survive a reboot.  No, it couldn't.
13 >
14 > On boot it failed checking the root file system and dropped into the repair
15 > shell.  The fsck failed because the root pseudo device file,
16 > /dev/md0, didn't exist.  The root file system was actually fine, though.
17 > Inside the repair shell, I could see all the files from my root, but there
18 > wasn't much in /dev.  (I have the md stuff compiled in to the kernel, and
19 > don't use an initrd, so it wasn't an initrd problem.)
20 >
21 > Short Solution
22 >
23 > The problem was with udev, the facility which automatically populates the
24 > /dev directory.  During the upgrade, emerge noted that my kernel version was
25 > a bit early, but acceptable.  What was missing, apparently, was the signalfd
26 > syscall, which that kernel version either doesn't have or I hadn't
27 > configured.  Apparently, udev has only started using signalfd recently, so
28 > the solution was to downgrade to an older version of udev (udev-141 to be
29 > precise).
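
Whether signalfd was compiled out can be checked against the kernel config. A minimal sketch, assuming the config is readable at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC) or at a path you pass in, such as /usr/src/linux/.config. The function name has_signalfd is made up for illustration; CONFIG_SIGNALFD is the kernel option that enables the syscall.

```shell
# Report whether a kernel config enables the signalfd syscall.
# Usage: has_signalfd [config-file]   (plain text, or gzip if named *.gz)
has_signalfd() {
    cfg="${1:-/proc/config.gz}"
    [ -r "$cfg" ] || { echo "unknown: cannot read $cfg"; return 2; }
    case "$cfg" in
        *.gz) reader="zcat" ;;
        *)    reader="cat" ;;
    esac
    # An enabled option appears as CONFIG_SIGNALFD=y; a disabled one
    # appears only as a "# CONFIG_SIGNALFD is not set" comment.
    if "$reader" "$cfg" | grep -q '^CONFIG_SIGNALFD=y'; then
        echo "signalfd enabled"
    else
        echo "signalfd missing"
        return 1
    fi
}
```

For example, "has_signalfd /usr/src/linux/.config" checks the source tree's config, which may differ from the kernel actually booted.
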
30 >
31 > What I Actually Did To Get There
32 >
33 > Of course, I didn't know that at first.  Just had a fun unbootable system.
34 > I might have been able to simply emerge the downgrade from the repair shell
35 > (the network did come up), but I didn't know to try that yet.  I figured I
36 > wanted to find some way to make the system boot.  Since the failing file
37 > check is done from /etc/init.d/checkroot, I added a mknod command to create
38 > the device node before trying to run the file check.  At the start of the
39 > start() method:
40 >
41 >         if [ ! -e /dev/md0 ] ; then
42 >            mknod -m 0660 /dev/md0 b 9 0
43 >         fi
44 >
45 > It's a hack, not a solution, but it did make the system boot, to a rather
46 > crippled state.  Since there were a lot of devices missing, a lot of
47 > services wouldn't start.  (If you're using a more boring root partition, it
48 > might be something like "mknod -m 0660 /dev/sda1 b 8 1")
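
The major and minor numbers in those mknod calls don't have to be hard-coded. The kernel exports them through sysfs as a "major:minor" line in /sys/block/<name>/dev. A rough sketch, assuming /sys is already mounted at that point in the boot. The function name is hypothetical, and it only prints the command rather than running it.

```shell
# Print the mknod command for a block device, using the numbers sysfs reports.
# Usage: mknod_cmd_from_sysfs <name> [sysfs-root]
mknod_cmd_from_sysfs() {
    name="$1"
    sysroot="${2:-/sys}"
    numbers_file="$sysroot/block/$name/dev"
    [ -r "$numbers_file" ] || return 1
    # The file holds a single line such as "9:0".
    IFS=: read -r major minor < "$numbers_file"
    echo "mknod -m 0660 /dev/$name b $major $minor"
}
```

Note that partitions sit one level down, under their parent disk (e.g. /sys/block/sda/sda1/dev), so the path would need adjusting for a root on sda1.
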
49 >
50 > So I had managed by now to gather that udev wasn't working, but I didn't
51 > know why.  My first thought was to try "/etc/init.d/udev start", to see if
52 > it would start.  But it told me that the script is written for baselayout-2,
53 > and I shouldn't use it on baselayout-1.  Following a bit of googling about
54 > what the heck a baselayout is, I gathered that I was using baselayout-1, and
55 > so the service wasn't supposed to be started that way.   So it wasn't a bug
56 > that it wouldn't start that way.  Another page suggested trying to run it
57 > directly, with "/sbin/udevd --daemon", which gave the message "error getting
58 > signalfd".  That told me why it didn't start. This message was also in the
59 > logs, but for some reason I didn't look there until later.
60 >
61 > So back to Google, and I found a message on a Debian board noting that udev
62 > had started using signalfd recently.  This suggested an old version might do
63 > the trick.  I tried one, and it did.
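
To keep the next world update from pulling the newer udev straight back in, the downgrade can be paired with a mask. A sketch only, assuming /etc/portage/package.mask is a plain file rather than a directory, and that an ebuild for exactly udev-141 is still in the tree. The helper name mask_atom is made up for illustration.

```shell
# Append a mask atom to a Portage mask file unless it is already present.
# Usage: mask_atom <atom> [mask-file]
mask_atom() {
    atom="$1"
    file="${2:-/etc/portage/package.mask}"
    # -x matches whole lines, -F treats the atom as a fixed string.
    grep -qxF -- "$atom" "$file" 2>/dev/null || echo "$atom" >> "$file"
}

# Typical use, as root (not run here):
#   mask_atom '>sys-fs/udev-141'
#   emerge --oneshot '=sys-fs/udev-141'
```

The mask should come out again once the kernel has signalfd support, otherwise udev stays pinned at the old version indefinitely.
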
64
65 I really only have two things to say, after reading this... First, and
66 this really does overshadow the second in weight, thank you for the
67 excellently presented writeup of problem *and* solution, as more often
68 than should ever be the case (less so here, but across the net as a whole),
69 problems are mentioned, solutions are offered, and rarely does a good,
70 clear, "this worked" follow. Secondly... it's been my experience, with
71 Gentoo, that things break far more often when I allow longer delays
72 between updating than when I keep up to date with everything, and it's
73 held true for me on both x86 and ~x86 systems (as has the headache
74 when I've put updates off).
75
76 And.. I reiterate a part of the "first"... Thank you for the writeup.
77
78 --
79 Poison [BLX]
80 Joshua M. Murphy