1 |
On Sat, 08 Jul 2017 23:56:07 +0200 |
2 |
Michał Górny <mgorny@g.o> wrote: |
3 |
|
4 |
> On sob, 2017-07-08 at 22:34 +0200, Alexis Ballier wrote: |
5 |
> > On Sat, 08 Jul 2017 20:44:24 +0200 |
6 |
> > Michał Górny <mgorny@g.o> wrote: |
7 |
> > |
8 |
> > > On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote: |
9 |
> > > > On Sat, 08 Jul 2017 11:43:39 +0200 |
10 |
> > > > Michał Górny <mgorny@g.o> wrote: |
11 |
> > > > |
12 |
> > > > > Hi, everyone. |
13 |
> > > > > |
14 |
> > > > > I think the affairs have settled enough and I've finished |
15 |
> > > > > filling in the pre-GLEP for REQUIRED_USE auto-enforcing. It's |
16 |
> > > > > got all the algorithms, rationale and separated reference |
17 |
> > > > > implementation. |
18 |
> > > > > |
19 |
> > > > > If there are no major concerns raised, I will soon start |
20 |
> > > > > working on writing an optimized implementation for |
21 |
> > > > > pkgcore/pkgcheck and integrating the verification algos with |
22 |
> > > > > the CI. |
23 |
> > > > > |
24 |
> > > > > The pre-GLEP for review is here: |
25 |
> > > > > |
26 |
> > > > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse |
27 |
> > > > |
28 |
> > > > |
29 |
> > > > Constraint group reordering algorithm |
30 |
> > > > |
31 |
> > > > I really think we should only consider REQUIRED_USE with |
32 |
> > > > forced/masked useflags instantiated there. And ban (in repoman) |
33 |
> > > > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free |
34 |
> > > > and 'b' masked is perfectly ok now but it hides a serious |
35 |
> > > > problem in the package/profile. Instantiating this would give: |
36 |
> > > > "a? ( False )" and be an error just like we have depend.bad & |
37 |
> > > > co. This is independent of auto solving or not, it's already |
38 |
> > > > wrong. |
39 |
> > > |
40 |
> > > As I've already explained you multiple times, this obtains |
41 |
> > > *exactly the same* effect. However, it's much simpler when it's |
42 |
> > > done like this because it makes it possible to reuse the already |
43 |
> > > defined algorithms instead of having to precisely define how to |
44 |
> > > preprocess REQUIRED_USE for this and cover all the corner cases. |
45 |
> > |
46 |
> > Simpler??? I don't think so. What I wrote clearly pinpoints that: |
47 |
> > When you'll write the algorithm for "Verifying that the constraints |
48 |
> > do not alter immutable flags" you'll notice this is exactly that |
49 |
> > and can be put as a preprocessing step and then you can drop all |
50 |
> > the corner cases considerations for immutable flags. I never |
51 |
> > understood why you're insisting that much on immutables: they're |
52 |
> > really not natural, not simple, not standard, and carrying them all |
53 |
> > over seems to be a burden to me. |
54 |
> |
55 |
> I wrote the algorithms, and they're simple. This specific check is |
56 |
> the combination of three simple steps: |
57 |
> |
58 |
> a. reordering the groups based on immutables, |
59 |
> |
60 |
> b. transforming the AST into flat form, |
61 |
> |
62 |
> c. verifying each flat constraint. |
63 |
> |
64 |
> The first step is trivial -- it's basically 'move true to front, false |
65 |
> to back'. The second step is more complex but it's needed anyway, |
66 |
> and quite well-defined, especially with the assumption that all |
67 |
> the groups always have at least one flag inside. The third step is |
68 |
> trivial again because it's just checking the conditions and |
69 |
> constraints against a list. |
70 |
> |
71 |
> The alternative to reordering is altering the groups. Altering means |
72 |
> we need to have separate logic for every type of group while sorting |
73 |
> works the same in all of them. Altering means we need to explicitly |
74 |
> special case forcing 1 and >1 items, and masking all items, for each |
75 |
> group. Again, sorting does not need to be concerned about that because |
76 |
> the check following it (also trivial) will catch it anyway. |
77 |
|
78 |
You don't seem to get how normalizing and constant |
79 |
propagation/elimination works. |
80 |
|
81 |
Basically, reordering would be: |
82 |
And(list) -> And( forced(list) + free(list) + masked(list) ) |
83 |
Or(list) -> Or( ... . ) |
84 |
etc. |
85 |
|
86 |
While normalizing is: |
87 |
And(list) -> False if False appears in a normalize(x) for x in list, |
88 |
True if normalize(x) for x in list is empty or all True, |
89 |
And(normalize(x) for x in list if != True and != False) |
90 |
etc. |
91 |
|
92 |
That's described in one point: Apply boolean algebra rules. |
93 |
|
94 |
What I don't like about reordering is that it is too tightly coupled to |
95 |
the following solving algorithm and the restricted syntax, while really, |
96 |
having REQUIRED_USE constraints asking you to enable a masked flag is |
97 |
something we ought to kill even without solving as they hide broken |
98 |
deps and make all our QA checks less useful. |
99 |
|
100 |
|
101 |
Finally, reordering, being essentially a local thing, does not have the |
102 |
proper behavior in a general AST: |
103 |
'|| ( ( a b ) c )' with 'a' and 'b' masked will be left invariant by |
104 |
reordering and the resulting expanded form will be rejected while |
105 |
constant propagation/elimination will reduce that to 'c' and be good. |
106 |
Hence, the reordering check cannot be used as a good input for checking |
107 |
for broken REQUIRED_USE: I really think things like 'vulkan? ( || |
108 |
( video_cards_i965 video_cards_radeonsi ) )' should be a repoman error |
109 |
on stable profiles where both those video cards are masked and vulkan is |
110 |
not. For that, we need to support the whole PMS-defined syntax, not a |
111 |
reduced one. |
112 |
Basically, this duplicates the code and work on that side and makes you |
113 |
having to carry the reordering burden all along your GLEP :) |
114 |
|
115 |
|
116 |
> Of course, you could say that you will get a little better error |
117 |
> message, like 'all flags inside || are masked' instead of '!b -> a' |
118 |
> will alter immutable flag. But that's purely an implementation |
119 |
> detail. It's not worth making the reference algorithms longer. |
120 |
|
121 |
Well, as noted above, I don't think this is related to any solver. |
122 |
|
123 |
> > > > Reordering is a dangerous path as we've already seen since it |
124 |
> > > > can create unexpected loops for the solver. |
125 |
> > > |
126 |
> > > Freeform reordering is dangerous, and I've removed that already. |
127 |
> > > Reordering restricted to immutables can not cause any issues that |
128 |
> > > any other solution wouldn't cause. |
129 |
> > |
130 |
> > You're very likely right there. Any proof? Esp. any proof that the |
131 |
> > checker still guarantees the existence of a solution in all cases? |
132 |
> > I'm not asking for a formal proof, but simply a bit more details |
133 |
> > than just an assertion saying it's fine. |
134 |
> |
135 |
> The case for checker is just the same as with any other kind of |
136 |
> immutability processing. We need to run the reordering, transform |
137 |
> and verification separately for every possible combination of |
138 |
> immutable flags. |
139 |
> |
140 |
> The reordering explicitly alters the results of the transform, and |
141 |
> with the trivial implication form of the flattened constraints |
142 |
> the verification stage checks will find any problems that may arise |
143 |
> from it, just like they would find any problem from doing a similar |
144 |
> thing verbatim. |
145 |
|
146 |
Ok that works. I don't see much difference between |
147 |
'check(reorder(formula,immutables))' and |
148 |
'check(reduce(formula,immutables))'... except the latter does not |
149 |
have to deal with any immutable anymore. |
150 |
|
151 |
> > |
152 |
> > > > Working on instantiated REQUIRED_USE constraints would probably |
153 |
> > > > simplify quite a bit your GLEP too: you already have the |
154 |
> > > > "Verifying that the constraints do not alter immutable flags" |
155 |
> > > > part that roughly does the same thing as instantiating, except |
156 |
> > > > if you assume it's already true you can skip the reordering. |
157 |
> > > |
158 |
> > > Except that the reordering can be described in 2 points, and so |
159 |
> > > can be the immutability verification. Please prove that you can |
160 |
> > > provide a simpler explanation that doesn't fail in any of the |
161 |
> > > corner cases. |
162 |
> > |
163 |
> > Except reordering is an invention and immutable checking is simply |
164 |
> > applying boolean logic rules to your implication and check that no |
165 |
> > "False" can appear. You can simply start by applying boolean logic |
166 |
> > and forget about reordering. |
167 |
> |
168 |
> No, it is not. You do not have the values of all the items inside |
169 |
> the group, just some of them. Depending on how many of them do you |
170 |
> have and what are them, you need to transform the group |
171 |
> appropriately, e.g. by removing items, replacing the group or failing |
172 |
> entirely. |
173 |
|
174 |
Yes, that's boolean algebra rules. You propagate constants from leaves |
175 |
to root in the AST and if some 'False' appears in your AST when you've |
176 |
reached the root you fail. I agree one needs some practice on recursive |
177 |
structures to understand that quickly and easily though. |
178 |
|
179 |
|
180 |
> > > > One big advantage of working on ASTs is that it becomes trivial |
181 |
> > > > to suggest a proper reordering. |
182 |
> > > |
183 |
> > > Reordering is never a trivial problem. Unless I'm missing |
184 |
> > > something, it is technically possible that a 'reordering' will |
185 |
> > > actually require a sub- item being moved out of the containing |
186 |
> > > group. |
187 |
> > |
188 |
> > Not if done at the AST level. |
189 |
> > |
190 |
> > > And to be honest, I find the output of the verification script in |
191 |
> > > this regard quite useful. That is, it saying 'X affects Y, so it |
192 |
> > > needs to go before it' is quite clear to me. I don't think most |
193 |
> > > developers would actually need to script to pinpoint a specific |
194 |
> > > location for every single constraint. |
195 |
> > |
196 |
> > In most cases this is sufficient. |
197 |
> > Think of a more complex case: |
198 |
> > A -> B |
199 |
> > B -> C |
200 |
> > A -> D |
201 |
> > D -> C |
202 |
> > |
203 |
> > |-> B -| |
204 |
> > A -| |->C |
205 |
> > |-> D -| |
206 |
> > |
207 |
> > It's starting to be a more complex mental exercise to get the proper |
208 |
> > ordering when given the 1st form only. |
209 |
> > |
210 |
> > |
211 |
> > Actually, considering people rant against git merges because they |
212 |
> > want linear history in the graph log but fail to understand 'git |
213 |
> > log' is precisely about displaying such a linear ordering, I'm |
214 |
> > ready to bet someone will rant :) |
215 |
> |
216 |
> We can discuss this when you have a working algorithm. Right now, it's |
217 |
> a purely theoretical exercise unless someone can come up with |
218 |
> a reasonable way of implementing it. |
219 |
|
220 |
Hmmm. lol ? |
221 |
May I suggest you spend 30 minutes on wikipedia about what topological |
222 |
sorting is, what it does and what purpose it serves? |
223 |
It's a bit annoying to see you completely lost every time it comes up. |
224 |
|
225 |
In a few words, from a list of 'X affects Y' it gives you a compatible |
226 |
linear ordering. While it can be done by hand on the easy cases, that's |
227 |
something computers are much better at doing on the more complex cases. |
228 |
|
229 |
|
230 |
> > > > ------- |
231 |
> > > > |
232 |
> > > > Restrictions on REQUIRED_USE format |
233 |
> > > > |
234 |
> > > > I still fail to see the point here. One can simply apply the |
235 |
> > > > rewriting you suggest below and be done with it. Rationale is |
236 |
> > > > not very convincing to me: |
237 |
> > > > |
238 |
> > > > - avoiding unpredictable results of automatic flag adjustments: |
239 |
> > > > A deterministic algorithm is, by definition, |
240 |
> > > > predictable. |
241 |
> > > |
242 |
> > > s/unpredictable/surprising/? |
243 |
> > > |
244 |
> > > The goal is for it do something that the developer *not reading |
245 |
> > > the spec* could reasonably predict happening. |
246 |
> > |
247 |
> > |
248 |
> > There is a huge gap between failing to solve a constraint |
249 |
> > voluntarily because it has one too much nesting level and having a |
250 |
> > repoman warning telling 'it is not recommended you write it that |
251 |
> > way'. The latter ensures said developer is able to predict what is |
252 |
> > happening, the former just adds annoyance onto users for no real |
253 |
> > reason. |
254 |
> |
255 |
> Except it doesn't because it's extremely uncommon (and even unlikely) |
256 |
> and I am successfully exterminating the last occurrences. Implementing |
257 |
> support for something that will be never used is a waste of time. |
258 |
|
259 |
Except you're already wasting your time exterminating some cases for |
260 |
which support is already written. You still don't know whether your |
261 |
restricted syntax will be accepted in PMS (which mostly depends if |
262 |
people feel comfortable with its expressivity), so you don't know if |
263 |
you won't have to plug support for it later. May I remind you the |
264 |
numbers I ran? Among all the rejected "too complex" usages of requse, |
265 |
only *1* could have led to an invalid solution by the solver. |
266 |
|
267 |
Feel free to do so, I mean, it cannot hurt to simplify requse in the |
268 |
tree, but keep in mind that maybe support for more syntaxes will still |
269 |
be needed. Esp. since I learnt recently that PMS factors the definition |
270 |
of ||, &&, ^^ and co with all the other usages of it (DEPEND, etc.). |
271 |
|
272 |
Anyway, I'm thinking it's more important to move on with this, and |
273 |
iterate later, than getting everything perfect beforehand. |
274 |
|
275 |
|
276 |
> > > > - improving readability of REQUIRED_USE constraints: |
277 |
> > > > No need for a restriction for that. If people want to |
278 |
> > > > shoot themselves in the foot, it is not a PMS problem. I see |
279 |
> > > > that like proposing death penalty for those who commit |
280 |
> > > > suicide :) |
281 |
> > > |
282 |
> > > This is not PMS. This is a GLEP which serves both the purpose of |
283 |
> > > a technical specification with extended rationale and a policy |
284 |
> > > document. |
285 |
> > |
286 |
> > Isn't the goal of it to have it in a future EAPI ? |
287 |
> |
288 |
> I don't see how that's relevant. Nobody will be copying the whole GLEP |
289 |
> verbatim into the PMS. |
290 |
|
291 |
Unless you're writing a throwaway GLEP, the syntax restrictions are |
292 |
supposed to go to PMS since that's what defines it... |
293 |
|
294 |
|
295 |
> > > > - keeping the specification and implementation relatively |
296 |
> > > > simple: You already define everything for working without |
297 |
> > > > restriction. Plus, unlimited implication nesting has the same |
298 |
> > > > complexity. |
299 |
> > > |
300 |
> > > No, I don't. I don't cover the meaning of nested groups and things |
301 |
> > > just explode when they come into game. |
302 |
> > |
303 |
> > |
304 |
> > You mostly do with the rewriting part, you're only refusing to admit |
305 |
> > it :) |
306 |
> |
307 |
> You mean the transform? It doesn't cover the possibility of those |
308 |
> groups containing anything but plain flags. As we've already |
309 |
> established, the results become non-trivial when they do and those |
310 |
> cases are certainly not covered here. |
311 |
|
312 |
That's why I said mostly. The missing parts are really small, trust me. |
313 |
|
314 |
|
315 |
[...] |
316 |
|
317 |
Alexis. |