Re: [gentoo-user] Broken upgrade from udev troubles. - gentoo-user

From:	Joshua Murphy <poisonbl@×××××.com>
To:	gentoo-user <gentoo-user@l.g.o>
Subject:	Re: [gentoo-user] Broken upgrade from udev troubles.
Date:	Thu, 17 Dec 2009 06:34:42
Message-Id:	`c30988c30912162123y6e2de48ah763f80d6f2e1a9b7@mail.gmail.com`
In Reply to:	[gentoo-user] Broken upgrade from udev troubles. by Tom Bennet

On Wed, Dec 16, 2009 at 10:07 PM, Tom Bennet <twbennet@×××××.com> wrote:
> I spent the day recovering from a Gentoo upgrade, and thought I'd document
> the experience in case it helps someone else.
>
> I'm running a custom kernel 2.6.25-gentoo-r7 on amd64, though I don't think
> the rarer hardware is relevant.
>
> I tend to put off upgrading my Gentoo box because anytime I do, something
> breaks.  I'm afraid I haven't changed my opinion about that.  Anyway, I did
> "emerge --update --deep world" and plugged my ears.  Some 600-odd packages
> (and a few simpler problems) later, the system seemed to be doing okay.  So
> I thought I'd see if it could survive a reboot.  No, it couldn't.
>
> On boot it failed checking the root file system and dropped into the repair
> shell.  The reason the fsck failed is that the root pseudo device file,
> /dev/md0, didn't exist.  The root file system was actually fine, though.
> Inside the repair shell, I could see all the files from my root, but there
> wasn't much in /dev.  (I have the md stuff compiled into the kernel, and
> don't use an initrd, so it wasn't an initrd problem.)
>
> Short Solution
>
> The problem was with udev, the facility which automatically populates the
> /dev directory.  During the upgrade, emerge noted that my kernel version was
> a bit early, but acceptable.  What was missing, apparently, was the signalfd
> syscall, which that kernel version either doesn't have or I hadn't
> configured.  Apparently, udev has only started using signalfd recently, so
> the solution was to downgrade to an older version of udev (udev-141 to be
> precise).
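
For anyone landing here with the same symptoms: once the downgrade works, it may be worth pinning it so the next world update doesn't pull the newer udev straight back in. A minimal sketch, assuming /etc/portage/package.mask is a plain file (on some setups it is a directory, in which case this won't work as written). The helper name mask_newer_than is made up for illustration, not anything Portage ships.

```shell
#!/bin/sh
# Hypothetical helper: append a ">atom" mask line to a package.mask file,
# unless that exact line is already present.
# Usage: mask_newer_than <category/package-version> [mask-file]
mask_newer_than() {
    atom=$1
    mask_file=${2:-/etc/portage/package.mask}   # assumed default location
    line=">$atom"
    # -q quiet, -x whole-line match, -F fixed string (no regex surprises)
    if [ -f "$mask_file" ] && grep -qxF -- "$line" "$mask_file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$mask_file"
}

# On the affected box (as root) this would be roughly:
#   mask_newer_than sys-fs/udev-141
#   emerge --oneshot =sys-fs/udev-141
```

Calling it twice leaves a single mask line, so it is safe to rerun. The mask is a stopgap; it should come out again once the kernel has been updated.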

>
> What I Actually Did To Get There
>
> Of course, I didn't know that at first.  Just had a fun unbootable system.
> I might have been able to simply emerge the downgrade from the repair shell
> (the network did come up), but I didn't know to try that yet.  I figured I
> wanted to find some way to make the system boot.  Since the failing file
> check is done from /etc/init.d/checkroot, I added a mknod command to create
> the device node before trying to run the file check.  At the start of the
> start() method:
>
>         if [ ! -e /dev/md0 ] ; then
>            mknod -m 0660 /dev/md0 b 9 0
>         fi
>
> It's a hack, not a solution, but it did make the system boot, to a rather
> crippled state.  Since there were a lot of devices missing, a lot of
> services wouldn't start.  (If you're using a more boring root partition, it
> might be something like "mknod -m 0660 /dev/sda1 b 8 1")
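
If you'd rather not hard-code the major and minor numbers, the kernel's own table can supply them. A rough sketch, assuming /proc is mounted in the repair shell and that /proc/partitions has its usual four columns (major, minor, #blocks, name). The function names devnum and ensure_node are invented for this example.

```shell
#!/bin/sh
# Print "major minor" for a block device listed in a /proc/partitions
# style table (columns: major minor #blocks name).
# Usage: devnum <name> [table-file]
devnum() {
    awk -v name="$1" '
        $4 == name { print $1, $2; found = 1 }
        END { exit !found }          # non-zero exit if the name was not listed
    ' "${2:-/proc/partitions}"
}

# Create /dev/<name> if it is missing, using the numbers the kernel reports.
ensure_node() {
    name=$1
    [ -e "/dev/$name" ] && return 0
    nums=$(devnum "$name") || return 1
    # $nums is deliberately unquoted so it splits into major and minor.
    mknod -m 0660 "/dev/$name" b $nums
}

# e.g. in the repair shell:  ensure_node md0
```

On a box like the one described, "devnum md0" should print the same 9 and 0 that were typed by hand above. It is still the same hack, just with fewer magic numbers.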

>
> So I had managed by now to gather that udev wasn't working, but I didn't
> know why.  My first thought was to try "/etc/init.d/udev start", to see if
> it would start.  But it told me that the script is written for baselevel-2,
> and I shouldn't use it on baselevel-1.  Following a bit of googling about
> what the heck a baselevel is, I gathered that I was using baselevel-1, and
> so the service wasn't supposed to be started that way.  So it wasn't a bug
> that it wouldn't start that way.  Another page suggested trying to run it
> directly, with "/sbin/udevd --daemon", which gave the message "error getting
> signalfd".  That told me why it didn't start.  This message was also in the
> logs, but for some reason I didn't look there until later.
>
> So back to Google, and I found a message on a Debian board noting that udev
> had started using signalfd recently.  This suggested an old version might do
> the trick.  I tried one, and it did.
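
On the "doesn't have or I hadn't configured" question, a quick look at the kernel config can at least rule out one of the two. A small sketch, assuming the option is called CONFIG_SIGNALFD (as it is in the mainline kernels I've looked at) and that the config for the running kernel lives in /usr/src/linux/.config. The helper name kconfig_enabled is mine.

```shell
#!/bin/sh
# Report (via exit status) whether a kernel config file has an option set to =y.
# Usage: kconfig_enabled <OPTION> [config-file]
kconfig_enabled() {
    # -q quiet, -x whole-line match, -F fixed string
    grep -qxF -- "$1=y" "${2:-/usr/src/linux/.config}"   # assumed default path
}

# e.g.
#   kconfig_enabled CONFIG_SIGNALFD && echo "signalfd is configured"
#
# If the kernel was built with CONFIG_IKCONFIG_PROC, the running kernel's
# config can be read directly instead:
#   zcat /proc/config.gz | grep SIGNALFD
```

This only shows whether the option is switched on. It says nothing about whether that kernel version supports signalfd in the way the newer udev calls it, so a "yes" here doesn't guarantee the newer udev will start.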

I really only have two things to say after reading this... First, and
this really does overshadow the second in weight, thank you for the
excellently presented writeup of problem *and* solution. More often
than should ever be the case (less so here, but across the net as a
whole), problems are mentioned, solutions are offered, and rarely does
a good, clear "this worked" follow. Secondly... it's been my experience
with Gentoo that things break far more often when I allow longer delays
between updates than when I keep up to date with everything, and that
has held true for me on both x86 and ~x86 systems (as has the headache
when I've put updates off).

And... I reiterate a part of the "first": thank you for the writeup.

--
Poison [BLX]
Joshua M. Murphy

