1 |
On sob, 2017-07-08 at 22:34 +0200, Alexis Ballier wrote: |
2 |
> On Sat, 08 Jul 2017 20:44:24 +0200 |
3 |
> Michał Górny <mgorny@g.o> wrote: |
4 |
> |
5 |
> > On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote: |
6 |
> > > On Sat, 08 Jul 2017 11:43:39 +0200 |
7 |
> > > Michał Górny <mgorny@g.o> wrote: |
8 |
> > > |
9 |
> > > > Hi, everyone. |
10 |
> > > > |
11 |
> > > > I think the affairs have settled enough and I've finished filling |
12 |
> > > > in the pre-GLEP for REQUIRED_USE auto-enforcing. It's got all |
13 |
> > > > the algorithms, rationale and separated reference implementation. |
14 |
> > > > |
15 |
> > > > If there are no major concerns raised, I will soon start working |
16 |
> > > > on writing an optimized implementation for pkgcore/pkgcheck |
17 |
> > > > and integrating the verification algos with the CI. |
18 |
> > > > |
19 |
> > > > The pre-GLEP for review is here: |
20 |
> > > > |
21 |
> > > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse |
22 |
> > > |
23 |
> > > |
24 |
> > > Constraint group reordering algorithm |
25 |
> > > |
26 |
> > > I really think we should only consider REQUIRED_USE with |
27 |
> > > forced/masked useflags instantiated there. And ban (in repoman) |
28 |
> > > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free |
29 |
> > > and 'b' masked is perfectly ok now but it hides a serious problem |
30 |
> > > in the package/profile. Instantiating this would give: "a? ( False |
31 |
> > > )" and be an error just like we have depend.bad & co. This is |
32 |
> > > independent of auto solving or not, it's already wrong. |
33 |
> > |
34 |
> > As I've already explained you multiple times, this obtains *exactly |
35 |
> > the same* effect. However, it's much simpler when it's done like this |
36 |
> > because it makes it possible to reuse the already defined algorithms |
37 |
> > instead of having to precisely define how to preprocess REQUIRED_USE |
38 |
> > for this and cover all the corner cases. |
39 |
> |
40 |
> Simpler??? I don't think so. What I wrote clearly pinpoints that: |
41 |
> When you'll write the algorithm for "Verifying that the constraints do |
42 |
> not alter immutable flags" you'll notice this is exactly that and can |
43 |
> be put as a preprocessing step and then you can drop all the corner |
44 |
> cases considerations for immutable flags. I never understood why you're |
45 |
> insisting that much on immutables: they're really not natural, not |
46 |
> simple, not standard, and carrying them all over seems to be a burden |
47 |
> to me. |
48 |
|
49 |
I wrote the algorithms, and they're simple. This specific check is |
50 |
the combination of three simple steps: |
51 |
|
52 |
a. reordering the groups based on immutables, |
53 |
|
54 |
b. transforming the AST into flat form, |
55 |
|
56 |
c. verifying each flat constraint. |
57 |
|
58 |
The first step is trivial -- it's basically 'move true to front, false |
59 |
to back'. The second step is more complex but it's needed anyway, |
60 |
and quite well-defined, especially with the assumption that all |
61 |
the groups always have at least one flag inside. The third step is |
62 |
trivial again because it's just checking the conditions and constraints |
63 |
against a list. |
64 |
|
65 |
The alternative to reordering is altering the groups. Altering means we |
66 |
need to have separate logic for every type of group while sorting works |
67 |
the same in all of them. Altering means we need to explicitly special |
68 |
case forcing 1 and >1 items, and masking all items, for each group. |
69 |
Again, sorting does not need to be concerned about that because |
70 |
the check following it (also trivial) will catch it anyway. |
71 |
|
72 |
Of course, you could say that you will get a little better error |
73 |
message, like 'all flags inside || are masked' instead of '!b -> a' will |
74 |
alter immutable flag. But that's purely an implementation detail. It's |
75 |
not worth making the reference algorithms longer. |
76 |
|
77 |
> > > Reordering is a dangerous path as we've already seen since it can |
78 |
> > > create unexpected loops for the solver. |
79 |
> > |
80 |
> > Freeform reordering is dangerous, and I've removed that already. |
81 |
> > Reordering restricted to immutables can not cause any issues that any |
82 |
> > other solution wouldn't cause. |
83 |
> |
84 |
> You're very likely right there. Any proof? Esp. any proof that the |
85 |
> checker still guarantees the existence of a solution in all cases? |
86 |
> I'm not asking for a formal proof, but simply a bit more details than |
87 |
> just an assertion saying it's fine. |
88 |
|
89 |
The case for checker is just the same as with any other kind of |
90 |
immutability processing. We need to run the reordering, transform |
91 |
and verification separately for every possible combination of immutable |
92 |
flags. |
93 |
|
94 |
The reordering explicitly alters the results of the transform, and with |
95 |
the trivial implication form of the flattened constraints |
96 |
the verification stage checks will find any problems that may arise from |
97 |
it, just like they would find any problem from doing a similar thing |
98 |
verbatim. |
99 |
|
100 |
> |
101 |
> > > Working on instantiated REQUIRED_USE constraints would probably |
102 |
> > > simplify quite a bit your GLEP too: you already have the "Verifying |
103 |
> > > that the constraints do not alter immutable flags" part that roughly |
104 |
> > > does the same thing as instantiating, except if you assume it's |
105 |
> > > already true you can skip the reordering. |
106 |
> > |
107 |
> > Except that the reordering can be described in 2 points, and so can be |
108 |
> > the immutability verification. Please prove that you can provide |
109 |
> > a simpler explanation that doesn't fail in any of the corner cases. |
110 |
> |
111 |
> Except reordering is an invention and immutable checking is simply |
112 |
> applying boolean logic rules to your implication and check that no |
113 |
> "False" can appear. You can simply start by applying boolean logic and |
114 |
> forget about reordering. |
115 |
|
116 |
No, it is not. You do not have the values of all the items inside |
117 |
the group, just some of them. Depending on how many of them do you have |
118 |
and what are them, you need to transform the group appropriately, e.g. |
119 |
by removing items, replacing the group or failing entirely. |
120 |
|
121 |
> > > One big advantage of working on ASTs is that it becomes trivial to |
122 |
> > > suggest a proper reordering. |
123 |
> > |
124 |
> > Reordering is never a trivial problem. Unless I'm missing something, |
125 |
> > it is technically possible that a 'reordering' will actually require |
126 |
> > a sub- item being moved out of the containing group. |
127 |
> |
128 |
> Not if done at the AST level. |
129 |
> |
130 |
> > And to be honest, I find the output of the verification script in this |
131 |
> > regard quite useful. That is, it saying 'X affects Y, so it needs to |
132 |
> > go before it' is quite clear to me. I don't think most developers |
133 |
> > would actually need to script to pinpoint a specific location for |
134 |
> > every single constraint. |
135 |
> |
136 |
> In most cases this is sufficient. |
137 |
> Think of a more complex case: |
138 |
> A -> B |
139 |
> B -> C |
140 |
> A -> D |
141 |
> D -> C |
142 |
> |
143 |
> |-> B -| |
144 |
> A -| |->C |
145 |
> |-> D -| |
146 |
> |
147 |
> It's starting to be a more complex mental exercise to get the proper |
148 |
> ordering when given the 1st form only. |
149 |
> |
150 |
> |
151 |
> Actually, considering people rant against git merges because they want |
152 |
> linear history in the graph log but fail to understand 'git log' is |
153 |
> precisely about displaying such a linear ordering, I'm ready to bet |
154 |
> someone will rant :) |
155 |
|
156 |
We can discuss this when you have a working algorithm. Right now, it's |
157 |
a purely theoretical exercise unless someone can come up with |
158 |
a reasonable way of implementing it. |
159 |
|
160 |
> > > ------- |
161 |
> > > |
162 |
> > > Restrictions on REQUIRED_USE format |
163 |
> > > |
164 |
> > > I still fail to see the point here. One can simply apply the |
165 |
> > > rewriting you suggest below and be done with it. Rationale is not |
166 |
> > > very convincing to me: |
167 |
> > > |
168 |
> > > - avoiding unpredictable results of automatic flag adjustments: |
169 |
> > > A deterministic algorithm is, by definition, predictable. |
170 |
> > |
171 |
> > s/unpredictable/surprising/? |
172 |
> > |
173 |
> > The goal is for it do something that the developer *not reading |
174 |
> > the spec* could reasonably predict happening. |
175 |
> |
176 |
> |
177 |
> There is a huge gap between failing to solve a constraint voluntarily |
178 |
> because it has one too much nesting level and having a repoman warning |
179 |
> telling 'it is not recommended you write it that way'. The latter |
180 |
> ensures said developer is able to predict what is happening, the former |
181 |
> just adds annoyance onto users for no real reason. |
182 |
|
183 |
Except it doesn't because it's extremely uncommon (and even unlikely) |
184 |
and I am successfully exterminating the last occurrences. Implementing |
185 |
support for something that will be never used is a waste of time. |
186 |
|
187 |
> > > - improving readability of REQUIRED_USE constraints: |
188 |
> > > No need for a restriction for that. If people want to shoot |
189 |
> > > themselves in the foot, it is not a PMS problem. I see that |
190 |
> > > like proposing death penalty for those who commit |
191 |
> > > suicide :) |
192 |
> > |
193 |
> > This is not PMS. This is a GLEP which serves both the purpose of |
194 |
> > a technical specification with extended rationale and a policy |
195 |
> > document. |
196 |
> |
197 |
> Isn't the goal of it to have it in a future EAPI ? |
198 |
|
199 |
I don't see how that's relevant. Nobody will be copying the whole GLEP |
200 |
verbatim into the PMS. |
201 |
|
202 |
> > > - keeping the specification and implementation relatively simple: |
203 |
> > > You already define everything for working without |
204 |
> > > restriction. Plus, unlimited implication nesting has the same |
205 |
> > > complexity. |
206 |
> > |
207 |
> > No, I don't. I don't cover the meaning of nested groups and things |
208 |
> > just explode when they come into game. |
209 |
> |
210 |
> |
211 |
> You mostly do with the rewriting part, you're only refusing to admit |
212 |
> it :) |
213 |
|
214 |
You mean the transform? It doesn't cover the possibility of those groups |
215 |
containing anything but plain flags. As we've already established, |
216 |
the results become non-trivial when they do and those cases are |
217 |
certainly not covered here. |
218 |
|
219 |
> > > ------- |
220 |
> > > |
221 |
> > > |
222 |
> > > Do you have numbers on the checker run on all inputs from |
223 |
> > > gentoo-x86 ? Since we're dealing with heuristics those are |
224 |
> > > particularly important to validate we're not rejecting too many |
225 |
> > > constructs. |
226 |
> > |
227 |
> > I think I've mentioned that a few times in the GLEP. The validation |
228 |
> > and verification bits were tested on all REQUIRED_USE cases from a few |
229 |
> > days ago. For the former, all cases are listed in rationale. For |
230 |
> > the latter, the reference implementation has test cases for every |
231 |
> > construct that triggered an issue, including the false positives that |
232 |
> > were fixed. I've only skipped a few that were completely redundant |
233 |
> > (e.g. different versions of the same package that added 1-2 |
234 |
> > irrelevant items). |
235 |
> |
236 |
> Unless I'm missing something, rationale seems more about cases rejected |
237 |
> by the restricted syntax. Numbers I'm talking about is the # of rejected |
238 |
> constraints vs accepted (and assumed solvable) ones. |
239 |
|
240 |
I'll crunch some fresh numbers based on today's repository. |
241 |
|
242 |
-- |
243 |
Best regards, |
244 |
Michał Górny |