On Thursday, September 15, 2011 10:18:27 PM Robin H. Johnson wrote:
> On Thu, Sep 15, 2011 at 11:00:47PM +0200, Joost Roeleveld wrote:
> > > See below on the existing udev retry queue that is hiding many of the
> > > issues from you. These hidden issues are also negatively affecting boot
> > > times (failures and retries take time).
> >
> > I don't actually mind too much about the boot time. If there are retries
> > like this, I would expect this to be visible in the system logs.
>
> udev does not log rule failures to the best of my knowledge.

In other words, it silently fails...
That is unfortunate.

> > Either udev does this already and the execution sequence is always the
> > same. In which case my suggestion above would follow the same sequence
> > as the queue would be on a First-in First-out basis.
> > Or, if udev doesn't do this yet, udev will end up having the same
> > problem.
> It doesn't do it presently. The worst case I've seen is having an early
> rule that succeeds, but gives different results w/ /var mounted vs. not
> mounted. This rule didn't actually fail, so it does not get retried...

And here is my main concern with this. The udev team don't list all the
possible filesystems where things can go wrong. They only mention /usr.

> > > 1. While the binary invoked by a given rule might reside entirely on a
> > >    mount that is already available, how do you ensure that the other
> > >    mountpoints required by said binary are ALSO available (the
> > >    bluetooth and ALSA rules actually need /var, what if you have
> > >    a bluetooth keyboard? [see footnote]).
> >
> > This is why I would suggest the "actiond" process to be started after
> > localmount.
>
> That's too late. What about all the udev rules required to get the
> device nodes for localmount to succeed?

Shouldn't these already exist for currently working setups?

> Anyway, take your proposed split actiond/udev solution to the upstream
> udev list. I don't believe that we have the manpower to develop it here.
> If we did develop it here, I don't believe it will gain enough traction
> amongst other distros, and we'll be stuck supporting it.
>
> I personally don't think your split solution covers the usage cases well
> enough, but that's an actual decision best left to the upstream udev
> developers. Please take the discussion there, and don't pursue it on
> this list.

Ok.

> > > The upstream discussions I've been party to previously (both on lists
> > > and in person), have been trying to avoid needing a full dependency
> > > system in udev, because it's a huge degree of additional complexity.
> >
> > I don't see why it would not be possible to pause actioning of these
> > scripts till the boot-process says all required mounts are available.
>
> You still have to declare which scripts need to be paused, and/or which
> rules inside the scripts need to be paused. Even worse are cases where
> mounting some of the localmount entries requires those scripts to have
> completed.

In other words, a dependency on the rules would be needed?
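
To make the "pause until mounted" idea concrete, here is a minimal shell
sketch. It is purely hypothetical and not an existing udev feature: the queue
file path and the function names mounts_ready, defer_action and replay_actions
are made up for illustration. Deferred command lines are stored oldest first
and replayed in that order; mountpoint(1) from util-linux tests the mounts.

```shell
#!/bin/sh
# Hypothetical sketch only; udev offers nothing like this out of the box.
# QUEUE_FILE holds one deferred command line per row, oldest first.
QUEUE_FILE="${QUEUE_FILE:-/run/deferred-actions.queue}"

# Succeed only if every given path is currently a mountpoint.
mounts_ready() {
    for m in "$@"; do
        mountpoint -q "$m" || return 1
    done
    return 0
}

# Append a command line to the queue instead of running it now.
defer_action() {
    printf '%s\n' "$*" >> "$QUEUE_FILE"
}

# Run the queued command lines first-in, first-out, then empty the queue.
replay_actions() {
    [ -f "$QUEUE_FILE" ] || return 0
    while IFS= read -r line; do
        # stdin is redirected so the child cannot swallow the rest of the queue
        sh -c "$line" </dev/null || echo "deferred action failed: $line" >&2
    done < "$QUEUE_FILE"
    : > "$QUEUE_FILE"
}
```

An init script running after localmount could then call
"mounts_ready /usr /var && replay_actions". This does not answer the objection
quoted above: something still has to decide which actions get deferred, and
actions that localmount itself depends on cannot be deferred at all.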

> > I see this as a "solution" for the situation where someone decides to use
> > less-common hardware and force this solution onto everyone else.
>
> I'm trying to get as little forced on us as possible. Gentoo is one of
> the few distros at this point that doesn't already require initramfs.
> While we're going to carry on supporting not requiring an initramfs as
> long as possible (and documenting what is needed for that), we also
> don't want this to become a stumbling block for users that just want
> their system to work, and with how upstream udev and other projects are
> going, there is a real possibility of breakage caused by their code, far
> more than the potential of breakage because /usr failed to mount.

I agree with you on this one. That is also why I am trying to get a clear
picture of all the possible alternatives.

> > If I would want to put my /usr filesystem on a bluetooth harddrive (for
> > instance my mobile phone), then I would not expect to have this work
> > without a lot of extra effort.
>
> While that is in the realms of extreme, having /usr or /var on NFS isn't
> extreme at all.

I agree, I just used this example to explain that it shouldn't be necessary to
force an initramfs on all users just because there is a small group who wants
to have an extreme setup.

> > > udev has a retry queue already, see udev-postmount:
> > > ===
> > > # Run the events that failed at first udev trigger
> > > udevadm trigger --type=failed -v
> > > ===
> >
> > This is a retry-queue. That's a good start already, but why not redo this
> > queue and put ALL the scripts into that queue until after localmount?
>
> See above, about rules that are required for localmount to be able to
> complete. The most prevalent ones would probably be devices by-uuid and
> by-label. Those symlinks come from udev...

These must also come from somewhere else as this also works in the initramfs
stage. Which is, from what I gather, used to get to the stage where udev can
run.
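
For what it's worth, UUID= and LABEL= specifications can be resolved without
the udev symlinks: findfs(8) from util-linux (busybox ships an applet of the
same name) can locate the device by scanning the block devices itself. A
minimal sketch, assuming findfs is present in the early environment; the
function name resolve_spec is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: turn the first field of an fstab entry into a device path.
# UUID=... and LABEL=... are handed to findfs(8); anything else is passed through.
resolve_spec() {
    case "$1" in
        UUID=*|LABEL=*) findfs "$1" ;;
        *) printf '%s\n' "$1" ;;
    esac
}
```

Note that fstab entries written as /dev/disk/by-uuid/... or
/dev/disk/by-label/... paths are passed through unchanged here, so those would
still depend on the symlinks that udev creates.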

> > > > With a smaller udev, the chances of it failing should also be less
> > > > (less code-lines to check) and as long as the /dev-entries are
> > > > created, these can be used to manually mount the other
> > > > partitions to get to the point where the system can be fixed to
> > > > get it back to a workable state.
> > >
> > > The problem is NOT in the udev codebase. It's in udev rules. Even at
> > > the rule level, it's mostly rules for packages other than udev
> > > itself.
> >
> > Yes, but as I already stated, the problem-rules do not exist on all
> > systems. My systems for instance don't have any pointing to anything
> > other than /etc/...
> > These scripts also don't call anything that isn't mounted at the time.
>
> Does your desktop use ALSA?
> /lib/udev/rules.d/90-alsa-restore.rules
> invokes
> "/usr/sbin/alsactl restore ..."
> Which in turn reads from /var/lib/alsa/asound.state.
>
> We presently have the restore() function in /etc/init.d/alsasound that
> repeats this, because that rule fails to work often during boot
> (non-existence of the state file causes it to use built-in defaults
> instead).
>
> udev runs that rule as soon as the hardware turns up, which is often
> before localmount.

I have doubts about having all these things handled by udev. As you said,
there is an init-script that handles this. Is the ultimate goal to get rid of
init-scripts and have everything done automagically?
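
A quick way to check a given system for rules like the ALSA one is to search
the rule files for RUN, PROGRAM or IMPORT{program} keys that name a path under
/usr or /var. The following is only a rough sketch: the function name
find_late_rules is made up, the default directories are an assumption about
where the rules live, and it only catches direct path references, not binaries
on / that later read from /var themselves.

```shell
#!/bin/sh
# Hypothetical helper: print rule lines whose RUN, PROGRAM or IMPORT{program}
# key names an absolute path under /usr or /var.
# Arguments: rule directories to scan (assumed defaults if none are given).
find_late_rules() {
    [ "$#" -gt 0 ] || set -- /lib/udev/rules.d /etc/udev/rules.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -H prints the file name, -n the line number of each match
        grep -HnE '(RUN|PROGRAM|IMPORT\{program\})[^,]*"/(usr|var)/' \
            "$dir"/*.rules 2>/dev/null
    done
    return 0
}
```

Running "find_late_rules" without arguments scans the assumed default
directories; an empty result only means that no rule names such a path
directly.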

> > That system has been running without incident for several years. Why do I
> > suddenly have to make that system more complex?
>
> Just because there are no visible errors, doesn't mean that they don't
> exist. This move to encourage initramfs is to ensure that there aren't
> any major breakage incidents soon. What if udev upstream suddenly
> starts hard requiring /usr to be mounted, and not doing retries at all?
> How many systems are going to break, and users going to complain about
> needing to use livecds to recover?

A lot. And those will be very vocal.
I have a few goals with this thread and one of them is to try to figure out
how best to keep users from being affected by this.
I am not a developer (I would like to try to do some programming), but I do
have a lot of ideas. And due to the lack of information available to the
users, I decided to check here.

I truly appreciate the time everyone is taking to try to answer the questions
I have. I hope to be able to come to a clearer understanding of the whole
thing.

> > If the changes proposed are actually done (moving everything out of /
> > and into /usr) then udev won't be available to create the /dev-entries.
> > A pre-populated set would work for most, but /dev/mapper used to require
> > an initramfs as this device would have different numbers upon boot. If
> > this is still the case, how would I be able to get LVM and MDADM to run
> > to get to my partitions?
>
> DEVTMPFS creates the first batch, and udev creates the rest.
>
> The deciding case then becomes:
> - Is the device for your /usr entry in fstab created by udev or
>   something else?
>
> MD: done by devtmpfs
> LVM: done by udev+lvm
> by-uuid/by-label: done by udev
>
> by-uuid and by-label present a lot of annoyance to the minimal
> initramfs. We have to ensure that we explicitly support them, which has
> increased the complexity of the initramfs.

My /usr is on LVM. That requires udev.

My understanding is:
- udev needs /usr to be mounted to work
- udev is needed to sort out LVM to get access to /usr

How does the initramfs handle this?
And why can't this be implemented in localmount?

--
Joost