Gentoo Archives: gentoo-dev

From: Alexis Ballier <aballier@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints
Date: Sun, 09 Jul 2017 09:29:46
Message-Id: 20170709112932.6364a23c@gentoo.org
In Reply to: Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints by "Michał Górny"
1 On Sat, 08 Jul 2017 23:56:07 +0200
2 Michał Górny <mgorny@g.o> wrote:
3
4 > On sob, 2017-07-08 at 22:34 +0200, Alexis Ballier wrote:
5 > > On Sat, 08 Jul 2017 20:44:24 +0200
6 > > Michał Górny <mgorny@g.o> wrote:
7 > >
8 > > > On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote:
9 > > > > On Sat, 08 Jul 2017 11:43:39 +0200
10 > > > > Michał Górny <mgorny@g.o> wrote:
11 > > > >
12 > > > > > Hi, everyone.
13 > > > > >
14 > > > > > I think the affairs have settled enough and I've finished
15 > > > > > filling in the pre-GLEP for REQUIRED_USE auto-enforcing. It's
16 > > > > > got all the algorithms, rationale and separated reference
17 > > > > > implementation.
18 > > > > >
19 > > > > > If there are no major concerns raised, I will soon start
20 > > > > > working on writing an optimized implementation for
21 > > > > > pkgcore/pkgcheck and integrating the verification algos with
22 > > > > > the CI.
23 > > > > >
24 > > > > > The pre-GLEP for review is here:
25 > > > > >
26 > > > > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse
27 > > > >
28 > > > >
29 > > > > Constraint group reordering algorithm
30 > > > >
31 > > > > I really think we should only consider REQUIRED_USE with
32 > > > > forced/masked useflags instantiated there. And ban (in repoman)
33 > > > > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free
34 > > > > and 'b' masked is perfectly ok now but it hides a serious
35 > > > > problem in the package/profile. Instantiating this would give:
36 > > > > "a? ( False )" and be an error just like we have depend.bad &
37 > > > > co. This is independent of auto solving or not, it's already
38 > > > > wrong.
39 > > >
40 > > > As I've already explained you multiple times, this obtains
41 > > > *exactly the same* effect. However, it's much simpler when it's
42 > > > done like this because it makes it possible to reuse the already
43 > > > defined algorithms instead of having to precisely define how to
44 > > > preprocess REQUIRED_USE for this and cover all the corner cases.
45 > >
46 > > Simpler??? I don't think so. What I wrote clearly pinpoints that:
47 > > When you'll write the algorithm for "Verifying that the constraints
48 > > do not alter immutable flags" you'll notice this is exactly that
49 > > and can be put as a preprocessing step and then you can drop all
50 > > the corner cases considerations for immutable flags. I never
51 > > understood why you're insisting that much on immutables: they're
52 > > really not natural, not simple, not standard, and carrying them all
53 > > over seems to be a burden to me.
54 >
55 > I wrote the algorithms, and they're simple. This specific check is
56 > the combination of three simple steps:
57 >
58 > a. reordering the groups based on immutables,
59 >
60 > b. transforming the AST into flat form,
61 >
62 > c. verifying each flat constraint.
63 >
64 > The first step is trivial -- it's basically 'move true to front, false
65 > to back'. The second step is more complex but it's needed anyway,
66 > and quite well-defined, especially with the assumption that all
67 > the groups always have at least one flag inside. The third step is
68 > trivial again because it's just checking the conditions and
69 > constraints against a list.
70 >
71 > The alternative to reordering is altering the groups. Altering means
72 > we need to have separate logic for every type of group while sorting
73 > works the same in all of them. Altering means we need to explicitly
74 > special case forcing 1 and >1 items, and masking all items, for each
75 > group. Again, sorting does not need to be concerned about that because
76 > the check following it (also trivial) will catch it anyway.
77
78 You don't seem to get how normalizing and constant
79 propagation/elimination works.
80
81 Basically, reordering would be:
82 And(list) -> And( forced(list) + free(list) + masked(list) )
83 Or(list) -> Or( ... . )
84 etc.
85
86 While normalizing is:
87 And(list) -> False if False appears in a normalize(x) for x in list,
88 True if normalize(x) for x in list is empty or all True,
89 And(normalize(x) for x in list if != True and != False)
90 etc.
91
92 That's described in one point: Apply boolean algebra rules.
93
94 What I don't like about reordering is that it is too tightly coupled to
95 the following solving algorithm and the restricted syntax, while really,
96 having REQUIRED_USE constraints asking you to enable a masked flag is
97 something we ought to kill even without solving as they hide broken
98 deps and make all our QA checks less useful.
99
100
101 Finally, reordering, being essentially a local thing, does not have the
102 proper behavior in a general AST:
103 '|| ( ( a b ) c )' with 'a' and 'b' masked will be left invariant by
104 reordering and the resulting expanded form will be rejected while
105 constant propagation/elimination will reduce that to 'c' and be good.
106 Hence, the reordering check cannot be used as a good input for checking
107 for broken REQUIRED_USE: I really think things like 'vulkan? ( ||
108 ( video_cards_i965 video_cards_radeonsi ) )' should be a repoman error
109 on stable profiles where both those video cards are masked and vulkan is
110 not. For that, we need to support the whole PMS-defined syntax, not a
111 reduced one.
112 Basically, this duplicates the code and work on that side and makes you
113 having to carry the reordering burden all along your GLEP :)
114
115
116 > Of course, you could say that you will get a little better error
117 > message, like 'all flags inside || are masked' instead of '!b -> a'
118 > will alter immutable flag. But that's purely an implementation
119 > detail. It's not worth making the reference algorithms longer.
120
121 Well, as noted above, I don't think this is related to any solver.
122
123 > > > > Reordering is a dangerous path as we've already seen since it
124 > > > > can create unexpected loops for the solver.
125 > > >
126 > > > Freeform reordering is dangerous, and I've removed that already.
127 > > > Reordering restricted to immutables can not cause any issues that
128 > > > any other solution wouldn't cause.
129 > >
130 > > You're very likely right there. Any proof? Esp. any proof that the
131 > > checker still guarantees the existence of a solution in all cases?
132 > > I'm not asking for a formal proof, but simply a bit more details
133 > > than just an assertion saying it's fine.
134 >
135 > The case for checker is just the same as with any other kind of
136 > immutability processing. We need to run the reordering, transform
137 > and verification separately for every possible combination of
138 > immutable flags.
139 >
140 > The reordering explicitly alters the results of the transform, and
141 > with the trivial implication form of the flattened constraints
142 > the verification stage checks will find any problems that may arise
143 > from it, just like they would find any problem from doing a similar
144 > thing verbatim.
145
146 Ok that works. I don't see much difference between
147 'check(reorder(formula,immutables))' and
148 'check(reduce(formula,immutables))'... except the latter does not
149 have to deal with any immutable anymore.
150
151 > >
152 > > > > Working on instantiated REQUIRED_USE constraints would probably
153 > > > > simplify quite a bit your GLEP too: you already have the
154 > > > > "Verifying that the constraints do not alter immutable flags"
155 > > > > part that roughly does the same thing as instantiating, except
156 > > > > if you assume it's already true you can skip the reordering.
157 > > >
158 > > > Except that the reordering can be described in 2 points, and so
159 > > > can be the immutability verification. Please prove that you can
160 > > > provide a simpler explanation that doesn't fail in any of the
161 > > > corner cases.
162 > >
163 > > Except reordering is an invention and immutable checking is simply
164 > > applying boolean logic rules to your implication and check that no
165 > > "False" can appear. You can simply start by applying boolean logic
166 > > and forget about reordering.
167 >
168 > No, it is not. You do not have the values of all the items inside
169 > the group, just some of them. Depending on how many of them do you
170 > have and what are them, you need to transform the group
171 > appropriately, e.g. by removing items, replacing the group or failing
172 > entirely.
173
174 Yes, that's boolean algebra rules. You propagate constants from leaves
175 to root in the AST and if some 'False' appears in your AST when you've
176 reached the root you fail. I agree one needs some practice on recursive
177 structures to understand that quickly and easily though.
178
179
180 > > > > One big advantage of working on ASTs is that it becomes trivial
181 > > > > to suggest a proper reordering.
182 > > >
183 > > > Reordering is never a trivial problem. Unless I'm missing
184 > > > something, it is technically possible that a 'reordering' will
185 > > > actually require a sub- item being moved out of the containing
186 > > > group.
187 > >
188 > > Not if done at the AST level.
189 > >
190 > > > And to be honest, I find the output of the verification script in
191 > > > this regard quite useful. That is, it saying 'X affects Y, so it
192 > > > needs to go before it' is quite clear to me. I don't think most
193 > > > developers would actually need to script to pinpoint a specific
194 > > > location for every single constraint.
195 > >
196 > > In most cases this is sufficient.
197 > > Think of a more complex case:
198 > > A -> B
199 > > B -> C
200 > > A -> D
201 > > D -> C
202 > >
203 > > |-> B -|
204 > > A -| |->C
205 > > |-> D -|
206 > >
207 > > It's starting to be a more complex mental exercise to get the proper
208 > > ordering when given the 1st form only.
209 > >
210 > >
211 > > Actually, considering people rant against git merges because they
212 > > want linear history in the graph log but fail to understand 'git
213 > > log' is precisely about displaying such a linear ordering, I'm
214 > > ready to bet someone will rant :)
215 >
216 > We can discuss this when you have a working algorithm. Right now, it's
217 > a purely theoretical exercise unless someone can come up with
218 > a reasonable way of implementing it.
219
220 Hmmm. lol ?
221 May I suggest you spend 30 minutes on wikipedia about what topological
222 sorting is, what it does and what purpose it serves?
223 It's a bit annoying to see you completely lost every time it comes up.
224
225 In a few words, from a list of 'X affects Y' it gives you a compatible
226 linear ordering. While it can be done by hand on the easy cases, that's
227 something computers are much better at doing on the more complex cases.
228
229
230 > > > > -------
231 > > > >
232 > > > > Restrictions on REQUIRED_USE format
233 > > > >
234 > > > > I still fail to see the point here. One can simply apply the
235 > > > > rewriting you suggest below and be done with it. Rationale is
236 > > > > not very convincing to me:
237 > > > >
238 > > > > - avoiding unpredictable results of automatic flag adjustments:
239 > > > > A deterministic algorithm is, by definition,
240 > > > > predictable.
241 > > >
242 > > > s/unpredictable/surprising/?
243 > > >
244 > > > The goal is for it do something that the developer *not reading
245 > > > the spec* could reasonably predict happening.
246 > >
247 > >
248 > > There is a huge gap between failing to solve a constraint
249 > > voluntarily because it has one too much nesting level and having a
250 > > repoman warning telling 'it is not recommended you write it that
251 > > way'. The latter ensures said developer is able to predict what is
252 > > happening, the former just adds annoyance onto users for no real
253 > > reason.
254 >
255 > Except it doesn't because it's extremely uncommon (and even unlikely)
256 > and I am successfully exterminating the last occurrences. Implementing
257 > support for something that will be never used is a waste of time.
258
259 Except you're already wasting your time exterminating some cases for
260 which support is already written. You still don't know whether your
261 restricted syntax will be accepted in PMS (which mostly depends if
262 people feel comfortable with its expressivity), so you don't know if
263 you won't have to plug support for it later. May I remind you the
264 numbers I ran? Among all the rejected "too complex" usages of requse,
265 only *1* could have led to an invalid solution by the solver.
266
267 Feel free to do so, I mean, it cannot hurt to simplify requse in the
268 tree, but keep in mind that maybe support for more syntaxes will still
269 be needed. Esp. since I learnt recently that PMS factors the definition
270 of ||, &&, ^^ and co with all the other usages of it (DEPEND, etc.).
271
272 Anyway, I'm thinking it's more important to move on with this, and
273 iterate later, than getting everything perfect beforehand.
274
275
276 > > > > - improving readability of REQUIRED_USE constraints:
277 > > > > No need for a restriction for that. If people want to
278 > > > > shoot themselves in the foot, it is not a PMS problem. I see
279 > > > > that like proposing death penalty for those who commit
280 > > > > suicide :)
281 > > >
282 > > > This is not PMS. This is a GLEP which serves both the purpose of
283 > > > a technical specification with extended rationale and a policy
284 > > > document.
285 > >
286 > > Isn't the goal of it to have it in a future EAPI ?
287 >
288 > I don't see how that's relevant. Nobody will be copying the whole GLEP
289 > verbatim into the PMS.
290
291 Unless you're writing a throwaway GLEP, the syntax restrictions are
292 supposed to go to PMS since that's what defines it...
293
294
295 > > > > - keeping the specification and implementation relatively
296 > > > > simple: You already define everything for working without
297 > > > > restriction. Plus, unlimited implication nesting has the same
298 > > > > complexity.
299 > > >
300 > > > No, I don't. I don't cover the meaning of nested groups and things
301 > > > just explode when they come into game.
302 > >
303 > >
304 > > You mostly do with the rewriting part, you're only refusing to admit
305 > > it :)
306 >
307 > You mean the transform? It doesn't cover the possibility of those
308 > groups containing anything but plain flags. As we've already
309 > established, the results become non-trivial when they do and those
310 > cases are certainly not covered here.
311
312 That's why I said mostly. The missing parts are really small, trust me.
313
314
315 [...]
316
317 Alexis.

Replies