Gentoo Archives: gentoo-dev

From: Alexis Ballier <aballier@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE)
Date: Sun, 11 Jun 2017 16:05:33
Message-Id: 20170611180518.5e28ddfa@gentoo.org
In Reply to: Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE) by "Michał Górny"
1 On Fri, 09 Jun 2017 18:21:50 +0200
2 Michał Górny <mgorny@g.o> wrote:
3
4 > (cut off the parts where I agree and there's nothing to add)
5 >
6 > On pią, 2017-06-09 at 16:16 +0200, Alexis Ballier wrote:
7 > > [...]
8 > > > > In your example above, we'd call 'nsolve("|| ( X )")' and
9 > > > > 'nsolve("|| ( Y )")' (or even simpler, depending on how
10 > > > > simplify() is defined). If both X and Y are masked on a
11 > > > > profile, then that'd reduce to 'nsolve("False")' which would
12 > > > > rant.
13 > > >
14 > > > So you're talking about reducing prior to transforming? Yes,
15 > > > that'd work. As I mentioned in one of the first mails wrt my
16 > > > reference implementation, I've used reordering (stable sort)
17 > > > instead of reducing since it was simpler.
18 > > >
19 > > > If you reduce (simplify), you need to account for special cases
20 > > > like getting '|| ()' etc. If you reorder only, things just fail
21 > > > the normal way.
22 > >
23 > > While the reordering idea seems nice as it factors both user
24 > > preference and masks, the problem with reordering is that nothing
25 > > guarantees that the solver won't try to enable a masked flag. We'd
26 > > have to deal with that somehow.
27 >
28 > Well, yes and no.
29 >
30 > The algorithm always needs to account for the possibility of
31 > constraints altering immutable flags. Of course, there's more than
32 > one way of doing it.
33 >
34 > AFAIU you are aiming for separate processing of immutable flags
35 > and explicit failure if the constraints would attempt to force value
36 > of those flags. That surely makes sense for verification.
37
38 The semi-hidden goal here for me is to have purely ast rewriting rules
39 giving a list of implications. This makes the solver trivial as those
40 are read as "if condition then constraint" and can be used as input for
41 the checker. Failing that, this would need to be done on the checker
42 side anyway and then we might run into problems like the checker not
43 really checking reality since the solver behaves a little bit
44 differently.
45
46 > My approach was simpler -- marking the flags immutable, and failing if
47 > something actually tries to alter their value. I think it's a simpler
48 > solution for the plain solver and it works as well. After all, we do
49 > not want the solver to attempt to find workarounds for the problem
50 > but just fail.
51
52 This should be equivalent: masked flags will be toggled as last resort
53 and fail; eliminated flags will not be toggled at all and fail if
54 having them as immutable causes a contradiction
55
56 > The above applies clearly to the plain conflicts such as:
57 >
58 > foo? ( bar )
59 >
60 > where bar is immutable. The n-ary operators can be flexed a little.
61 > That's what reordering achieves -- it makes sure they come as the most
62 > or the least preferred as appropriate. Which means that the same
63 > algorithm either succeeds (by not having to touch them) or fails at
64 > attempting to flip them.
65 >
66 > Think of:
67 >
68 > ?? ( a b c )
69 >
70 > with both b&c in use.force. This gets reordered to:
71 >
72 > ?? ( b c a )
73 >
74 > The order between b&c doesn't matter. Since b comes first now, it
75 > forces c being disabled. Since c is immutable, the solver fails with
76 > ImmutabilityError. We solve the problem with minimal code redundancy.
77
78 Considering that code should ideally be checked, that'd be '?? ( a true
79 true )' reducing to 'false' and a repoman error.
80
81
82 > > I think reordering should be kept for user preferences
83 > > (soft-enable/soft-disable) while masks for hard-no's or hard-yes'es.
84 > >
85 > >
86 > > Be careful with reordering though:
87 > > '^^ ( a b ) b? ( a )' can be solved in one pass.
88 > > (it disables b if both are set and enables a if none are set)
89 > >
90 > > while:
91 > > '^^ ( b a ) b? ( a )' loops
92 > > (if both are set it disables 'a' for the 1st clause but then
93 > > enables it for the 2nd)
94 > >
95 > > This is not checked by nsolve().
96 >
97 > Yes, this is a problem I was considering. I planned to think a bit if
98 > we could possibly generate some more complex transformations to
99 > trigger nsolve to notice this kind of issues.
100
101
102 Except feeding the n! possible reorderings to nsolve() and checking them
103 all I don't see many other possibilities.
104
105 We could think about a transformation that would be order agnostic,
106 like '|| ( a b c )' giving the same output as '|| ( b c a )' but then
107 this would not express any preference anymore. Remember: The whole
108 point of the solver is to break the symmetry of SAT formulas so that
109 there is a natural way to solve them instead of just "figure out some
110 useflags that make it work". In other words, order does actually
111 matter a lot, otherwise you fall into the "solve SAT" trap again.
112
113
114 > And now two updates on other matters.
115 >
116 > Firstly, all-of inside ??. Per the construct:
117 >
118 > ?? ( ( a b ) c )
119 >
120 > the expansion would be:
121 >
122 > [a b]? ( !c ) c? ( ![a b] )
123 >
124 > which means we should be able to easily affect the effective behavior
125 > of both implementations by defining how to handle/expand negations of
126 > all- of groups.
127 >
128 > It's then a matter of replacing it with:
129 >
130 > a. !a, or
131 >
132 > b. !a !b.
133 >
134 > As you pointed out, a. has the advantage that we alter less flags. b.
135 > has the advantage that we alter more flags -- so it's less likely
136 > we'll actually leave some unused flag enabled. Whichever we choose, it
137 > probably doesn't matter as I can't think of a valid use case for this
138 > constraint that would clearly define the result.
139
140
141 I think this can result in (otherwise solvable) errors too:
142 "c? ( ![a b] ) b" is ok if expanded to "c? ( !a ) b" but not if
143 expanded to "c? ( !a !b ) b" with USE="-* c".
144
145 This is quite a stupid and extreme example but I can't think of any
146 case that solve() would work when negating all of them but not when
147 negating only the first.
148
149 > Secondly, nested n-ary operators. I have taken the following snippet
150 > as a simple example:
151 >
152 > || ( a || ( b c ) )
153 >
154 > Logically (and per constraint checking algo), this should be
155 > equivalent to:
156 >
157 > || ( a b c )
158 >
159 > However, if we expand it to implication form, we get:
160 >
161 > ![ || ( b c ) ] => a
162 >
163 > ![ !c => b ] => a
164 >
165 > At this point, we already see some contract problem/ambiguity. Per
166 > contract, we are supposed not to alter any flags on LHS of
167 > implication. However, we have another implication there, so it is
168 > unclear if RHS of that nested implication should be mutable or not.
169 >
170 > Let's consider the nested implication first:
171 >
172 > !c => b
173 >
174 > Per the constraint checking rules, this constraint is met (evaluates
175 > to true) either if c is enabled, or both c is disabled and b is
176 > enabled. In other words, it fails (evaluates to false) only if both b
177 > and c are disabled. Putting that into a table we get:
178 >
179 > b c | ?
180 > 0 0 | 0 (fail -- LHS matches, RHS does not)
181 > 0 1 | 1 (LHS does not match)
182 > 1 0 | 1 (LHS & RHS matches)
183 > 1 1 | 1 (LHS does not match)
184 >
185 > Per the solving rules, in the only failing case we should enforce RHS
186 > -- i.e. enable b.
187 >
188 > Now, let's consider its negation:
189 >
190 > ![ !c => b ]
191 >
192 > Per the rules of logic, it is true (= matches the constraint) only if
193 > both b and c are disabled. While it is unclear if we should be
194 > enforcing RHS inside negation, logically saying I don't think it can
195 > actually happen. Because:
196 >
197 > 1. if b=c=0, the whole negated constraint is satisfied, so we
198 > shouldn't apply any solving to it (attempting to solve it would be
199 > inconsistent with the case where whole REQUIRED_USE is satisfied
200 > immediately),
201 >
202 > 2. in any other case, the inner constraint is satisfied, so there's no
203 > change to be applied.
204 >
205 > That considered, I think we could reasonably replace the negation of
206 > implication with plain conjunction of LHS and negation of RHS:
207 >
208 > [ !c !b ]
209 >
210 > This would render the outer expression as:
211 >
212 > [ !c !b ] => a
213 >
214 > which is equivalent to || ( a b c ).
215 >
216 > The question is whether applying that rule for implications nested on
217 > LHS of another implication is going to work fine for all our
218 > expansions.
219 >
220
221 We have two issues here I think: We want to preserve the semantics of
222 the whole thing, that is: solve(rewrite(x)) always does the exact same
223 thing as solve(x) (and its definition) and rewrite(x) should not grow x
224 exponentially (or at least, not too much). To add to the fun, we want
225 solve(rewrite(x)) to always provide a valid combination and rewrite(x)
226 to play nice with nsolve() (or alter a bit nsolve to play nice with it).
227
228 || ( foo? ( bar ) baz ) would read as "if not baz and foo then bar else
229 baz"; this is equivalent to "foo? ( || ( bar baz ) ) !foo? ( baz )".
230 However, this roughly doubles the size of the formula for each nested
231 implication.
232
233 If we blindly apply the || expansion, then we get:
234 '!baz? ( foo? ( bar ) )' so that the solver does not give a valid
235 answer for USE='-*' as it won't change anything and this does not
236 match the original constraint.
237
238
239 foo -> bar is logically equivalent to || ( !foo bar ); semantically
240 this is wrong since we prefer leftmost and this would mean 'if not bar
241 then disable foo' while we want 'if foo then bar'. So we should write
242 it '|| ( bar !foo )'. This is why, while truth tables are necessary,
243 you should not rely on them either since they fail to capture the non
244 commutativity of the ||, && & co.
245
246 After replacing, we get:
247 || ( || ( bar !foo ) baz )
248 and we're back to your original problem.
249
250 I'm starting to think the nested implications are not implications at
251 all.
252
253 If we want '|| ( foo? ( bar ) baz )' to be equivalent to 'foo? ( ||
254 ( bar baz ) ) !foo? ( baz )' then 'foo? ( bar )' should evaluate to
255 false when foo is disabled so that the algorithm chooses 'baz'.
256 That'd be: 'foo? ( bar )' is replaced by '( foo bar )', giving:
257
258 '|| ( ( foo bar ) baz )' being rewritten as '!baz? ( foo bar )'. This
259 starts to make more sense: Now the solver will always provide a valid
260 answer. We can also see that the leftmost preference is now very
261 strong as it will seriously force the leftmost clause to be true.
262 There is not much better we can do here I think.
263
264 Now, consider its AllOf equivalent '( foo? ( bar ) baz )'; replacing as
265 above gives '( foo bar baz )' enabling them all. While this works, this
266 is likely not what we want. Better replace it by '( || ( bar !foo ) baz
267 )' as a real implication, giving two toplevel implications: 'foo? ( bar
268 )' and 'baz'.
269
270 And finally '?? ( foo? ( bar ) baz )': "if baz and foo then not bar".
271 Here it is equivalent to '?? ( ( foo bar ) baz )'.
272
273
274 While it is a bit annoying to have to look at a clause's parent in the
275 AST in order to rewrite it, we have now eliminated nested implications
276 without exponential grow.
277
278
279 Note also that we need to merge nested implications too:
280 || ( a? ( b? ( c ) ) d )
281 should read as:
282 || ( [a,b]?[c] d ) and be translated to:
283 || ( ( a b c ) d )
284
285 This reads as: 'if not d then a, b and c'
286
287 If we do not merge them and apply the algorithm recursively in a
288 breadth-first way we get:
289 || ( ( a b? ( c ) ) d )
290 And then, since we have an AllOf clause:
291 || ( ( a || ( c !b ) ) d )
292
293 Reading as: 'if not d then ( if b and not c then a )' where the solver
294 will not change anything with e.g. USE='-* c' leaving the initial
295 constraint unsatisfied. We can force this behavior with the following
296 constraint: '|| ( a? ( b? ( c ) e ) d )' (so that a? ( AllOf(['b?c',
297 'e'] )' does not allow easy merging of the implications).
298
299 So we have to forget about merging nested implications and have a
300 problem here. Even for the solver. I think the safest way is to always
301 replace nested implications 'foo? ( ... )' by AllOf([foo, ...]).
302 While it would give some unexpected results, this would make the solver
303 work in every case.
304
305 || ( a? ( b? ( c ) ) d ) will be '|| ( ( a b c ) d )' in every case.
306 || ( a? ( b? ( c ) e ) d ) will be '|| ( ( a b c e ) d )'.
307
308
309
310 Then comes '??':
311 ?? ( c1 c2 ... ck ) is expanded to:
312 c1? ( !c2 ... !ck )
313 c2? ( !c3 ... !ck )
314 etc.
315
316
317 One problem we've seen is computing things like !c1 when c1 is, for
318 example, an implication. The above replacement makes that easy since
319 we're now in a boolean algebra and can bubble down the negations to
320 useflags.
321
322 ?? ( a? ( b ) c d )
323 becomes:
324 ?? ( ( a b ) c d )
325 then:
326 (( a b ) ? (!c !d))
327 c? ( !d )
328 which is good.
329
330 ?? ( c a? ( b ) d )
331 becomes:
332 ?? ( c ( a b ) d )
333 then:
334 c? ( !(a b) !d ) -> c? ( || ( !a !b ) !d )
335 (a b)? ( !d )
336 then, expanding everything:
337 c? !d
338 c? b? !a
339 a? b? !d
340
341 which is good too.
342
343
344 We can define how to bubble down the negations:
345 Not(AnyOf(l)) -> AllOf([Not(x) for x in l])
346 Not(AllOf(l)) -> AnyOf([Not(x) for x in l])
347
348
349 Another option you suggested is Not(AllOf(l)) -> AllOf([Not(x) for x in
350 l]). This preserves solving but breaks Not(Not(x)) == x here. In the
351 above this would change 'c? ( !(a b) !d )' as 'c? ( !a !b !d )' which
352 would work too but be more constraining.
353
354 After eliminating nested implications, ^^, ?? and bubbling down the
355 negations, the remaining AST for the constraints is much simpler:
356 rclause =
357 AllOf(list of rclauses)
358 | AnyOf(list of rclauses)
359 | Not(useflag)
360 | useflag
361
362 And a REQUIRED_USE is a list of Implication([list of flags or !flags],
363 rclause), where the list is read as 'AllOf' which we want to transform
364 as a list of 'Implication([list of flags or !flags], [list of flags
365 or !flags])'.
366
367 We can do that recursively:
368 transform (expr: rclause) -> list(Implication[list of flags or !flags],
369 [list of flags or !flags])
370
371 Here I assume Implication([],[b]) is 'b') and Implication([...],[]) is
372 nothing.
373
374 transform (expr:rclause):
375 match expr with:
376 useflag | !useflag -> [ Implication([], [expr]) ]
377 | AnyOf(l) ->
378 f = l.pop(0)
379 if len(l)>0: return merge(Not(AnyOf(l)), transform(f))
380 else return transform(f)
381 | AllOf(l) ->
382 r = []
383 for i in l: r+=transform(i)
384 return r
385
386 And the merge function:
387
388 merge (condition:rclause, consequences:list of implications):
389 match reduce(rclause) with:
390 useflag | !useflag as x -> [ Implication([x]+c.conditions,
391 c.consequences) for c in consequences ]
392 | AnyOf(l) ->
393 r = []
394 for c in l:
395 r+=merge(c,consequences)
396 return r
397 | AllOf(l) ->
398 if len(l) <= 0: return consequences
399 return merge(l[0], merge(AllOf(l[1:]),consequences))
400
401
402 Note that this is exponential (we're basically computing the DNF form
403 of the formula here), but the exponent is roughly the number of
404 alternating nested || and &&, so that should still be ok.
405
406
407
408 Let's apply that to your examples:
409 transform(|| ( a || ( b c ) ))
410 -> merge( Not(AnyOf('|| ( b c )'), transform('a'))
411 -> merge( AllOf(AllOf(['!b', '!c']), [Implication([],['a'])])
412 (bubbling down the Not())
413 -> merge( AllOf(['!b', '!c']), merge(AllOf([]), [Implication([],['a'])])
414 -> merge( AllOf(['!b', '!c']), [Implication([],['a'])])
415 -> merge( '!b', merge(AllOf(['!c'], [Implication([],['a'])]))
416 -> merge( '!b', merge('!c', [Implication([],['a'])]))
417 -> merge( '!b', [Implication([!c], ['a'])])
418 -> Implication(['!b','!c'],['a'])
419
420 transform(|| ( a b c ))
421 -> merge( Not(AnyOf(['b', 'c']), [Implication([],['a'])])
422 -> merge( AllOf(['!b', '!c']), [Implication([],['a'])])
423 And we're at the same point as above.
424
425 Suppose we write it: '|| ( a !c? ( b ) )'. This gets rewritten by the
426 nested implication remover as '|| ( a ( !c b ) )'.
427
428 transform(|| ( a ( !c b ) ))
429 -> merge(Not(AnyOf('(!c b)')), [Implication([],[a]))
430 -> merge(AllOf([AnyOf([c, !b])])), [Implication([],[a]))
431 -> merge(AnyOf([c,!b]), [Implication([],[a]))
432 -> merge(c, [Implication([],[a])) + merge(!b, [Implication([],[a]))
433 -> [ Implication([c],[a]), Implication([!b],[a]) ]
434
435 aka 'c? ( a ) !b? ( a )'
436 While it works, we can see a serious bias towards the left-most in a ||.
437
438
439 I think this handles all the cases. I'll try to update the repo with
440 that algo.
441
442 Bests,
443
444 Alexis.

Replies