Gentoo Archives: gentoo-dev

From: Alexis Ballier <aballier@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints
Date: Sat, 08 Jul 2017 20:34:29
Message-Id: 20170708223416.071583a0@gentoo.org
In Reply to: Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints by "Michał Górny"
1 On Sat, 08 Jul 2017 20:44:24 +0200
2 Michał Górny <mgorny@g.o> wrote:
3
4 > On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote:
5 > > On Sat, 08 Jul 2017 11:43:39 +0200
6 > > Michał Górny <mgorny@g.o> wrote:
7 > >
8 > > > Hi, everyone.
9 > > >
10 > > > I think the affairs have settled enough and I've finished filling
11 > > > in the pre-GLEP for REQUIRED_USE auto-enforcing. It's got all
12 > > > the algorithms, rationale and separated reference implementation.
13 > > >
14 > > > If there are no major concerns raised, I will soon start working
15 > > > on writing an optimized implementation for pkgcore/pkgcheck
16 > > > and integrating the verification algos with the CI.
17 > > >
18 > > > The pre-GLEP for review is here:
19 > > >
20 > > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse
21 > >
22 > >
23 > > Constraint group reordering algorithm
24 > >
25 > > I really think we should only consider REQUIRED_USE with
26 > > forced/masked useflags instantiated there. And ban (in repoman)
27 > > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free
28 > > and 'b' masked is perfectly ok now but it hides a serious problem
29 > > in the package/profile. Instantiating this would give: "a? ( False
30 > > )" and be an error just like we have depend.bad & co. This is
31 > > independent of auto solving or not, it's already wrong.
32 >
33 > As I've already explained you multiple times, this obtains *exactly
34 > the same* effect. However, it's much simpler when it's done like this
35 > because it makes it possible to reuse the already defined algorithms
36 > instead of having to precisely define how to preprocess REQUIRED_USE
37 > for this and cover all the corner cases.
38
39 Simpler??? I don't think so. What I wrote clearly pinpoints that:
40 When you'll write the algorithm for "Verifying that the constraints do
41 not alter immutable flags" you'll notice this is exactly that and can
42 be put as a preprocessing step and then you can drop all the corner
43 cases considerations for immutable flags. I never understood why you're
44 insisting that much on immutables: they're really not natural, not
45 simple, not standard, and carrying them all over seems to be a burden
46 to me.
47
48 > > Reordering is a dangerous path as we've already seen since it can
49 > > create unexpected loops for the solver.
50 >
51 > Freeform reordering is dangerous, and I've removed that already.
52 > Reordering restricted to immutables can not cause any issues that any
53 > other solution wouldn't cause.
54
55 You're very likely right there. Any proof? Esp. any proof that the
56 checker still guarantees the existence of a solution in all cases?
57 I'm not asking for a formal proof, but simply a bit more details than
58 just an assertion saying it's fine.
59
60 > > Working on instantiated REQUIRED_USE constraints would probably
61 > > simplify quite a bit your GLEP too: you already have the "Verifying
62 > > that the constraints do not alter immutable flags" part that roughly
63 > > does the same thing as instantiating, except if you assume it's
64 > > already true you can skip the reordering.
65 >
66 > Except that the reordering can be described in 2 points, and so can be
67 > the immutability verification. Please prove that you can provide
68 > a simpler explanation that doesn't fail in any of the corner cases.
69
70 Except reordering is an invention and immutable checking is simply
71 applying boolean logic rules to your implication and check that no
72 "False" can appear. You can simply start by applying boolean logic and
73 forget about reordering.
74
75
76 > > Concept for transforming REQUIRED_USE into implications
77 > >
78 > > Ok, now I probably understand better the concept of common prefix.
79 > > I'm definitely biased here, but I would feel better with a more
80 > > recursive presentation of it. Assume we have 'validate(list of
81 > > clauses)'; basically, the common prefix idea is that for an
82 > > implication 'foo? ( consequences = list of clauses )' you first
83 > > validate the consequences as if it were a REQUIRED_USE (so that the
84 > > subtree rooted at foo is not self-conflicting) and then consider
85 > > the whole thing as a clause. The idea would then be to have similar
86 > > checks as to what you've written but working on trees (ASTs)
87 > > instead of flattened clauses. This would avoid having to deal with
88 > > unique identities (these would come for free) and IMHO would be
89 > > easier to understand. I'm not sure how to do this though, I'll ping
90 > > you when I have some idea.
91 >
92 > Well, the problem of common prefix is quite complex, and I'm not even
93 > sure if it's really worth more consideration. After all, we're
94 > prettych much talking about doing:
95 >
96 > a? ( !a ... )
97 >
98 > which has extremely low usability and even lower likeness of
99 > occurring.
100
101
102 Hmm. I think you came up with more valid cases. Can't remember a
103 precise one though.
104
105
106 > > One big advantage of working on ASTs is that it becomes trivial to
107 > > suggest a proper reordering.
108 >
109 > Reordering is never a trivial problem. Unless I'm missing something,
110 > it is technically possible that a 'reordering' will actually require
111 > a sub- item being moved out of the containing group.
112
113 Not if done at the AST level.
114
115 > And to be honest, I find the output of the verification script in this
116 > regard quite useful. That is, it saying 'X affects Y, so it needs to
117 > go before it' is quite clear to me. I don't think most developers
118 > would actually need to script to pinpoint a specific location for
119 > every single constraint.
120
121 In most cases this is sufficient.
122 Think of a more complex case:
123 A -> B
124 B -> C
125 A -> D
126 D -> C
127
128 |-> B -|
129 A -| |->C
130 |-> D -|
131
132 It's starting to be a more complex mental exercise to get the proper
133 ordering when given the 1st form only.
134
135
136 Actually, considering people rant against git merges because they want
137 linear history in the graph log but fail to understand 'git log' is
138 precisely about displaying such a linear ordering, I'm ready to bet
139 someone will rant :)
140
141
142 > > -------
143 > >
144 > > Restrictions on REQUIRED_USE format
145 > >
146 > > I still fail to see the point here. One can simply apply the
147 > > rewriting you suggest below and be done with it. Rationale is not
148 > > very convincing to me:
149 > >
150 > > - avoiding unpredictable results of automatic flag adjustments:
151 > > A deterministic algorithm is, by definition, predictable.
152 >
153 > s/unpredictable/surprising/?
154 >
155 > The goal is for it do something that the developer *not reading
156 > the spec* could reasonably predict happening.
157
158
159 There is a huge gap between failing to solve a constraint voluntarily
160 because it has one too much nesting level and having a repoman warning
161 telling 'it is not recommended you write it that way'. The latter
162 ensures said developer is able to predict what is happening, the former
163 just adds annoyance onto users for no real reason.
164
165
166 > > - improving readability of REQUIRED_USE constraints:
167 > > No need for a restriction for that. If people want to shoot
168 > > themselves in the foot, it is not a PMS problem. I see that
169 > > like proposing death penalty for those who commit
170 > > suicide :)
171 >
172 > This is not PMS. This is a GLEP which serves both the purpose of
173 > a technical specification with extended rationale and a policy
174 > document.
175
176 Isn't the goal of it to have it in a future EAPI ?
177
178
179 > > - keeping the specification and implementation relatively simple:
180 > > You already define everything for working without
181 > > restriction. Plus, unlimited implication nesting has the same
182 > > complexity.
183 >
184 > No, I don't. I don't cover the meaning of nested groups and things
185 > just explode when they come into game.
186
187
188 You mostly do with the rewriting part, you're only refusing to admit
189 it :)
190
191
192 > > -------
193 > >
194 > >
195 > > Do you have numbers on the checker run on all inputs from
196 > > gentoo-x86 ? Since we're dealing with heuristics those are
197 > > particularly important to validate we're not rejecting too many
198 > > constructs.
199 >
200 > I think I've mentioned that a few times in the GLEP. The validation
201 > and verification bits were tested on all REQUIRED_USE cases from a few
202 > days ago. For the former, all cases are listed in rationale. For
203 > the latter, the reference implementation has test cases for every
204 > construct that triggered an issue, including the false positives that
205 > were fixed. I've only skipped a few that were completely redundant
206 > (e.g. different versions of the same package that added 1-2
207 > irrelevant items).
208
209 Unless I'm missing something, rationale seems more about cases rejected
210 by the restricted syntax. Numbers I'm talking about is the # of rejected
211 constraints vs accepted (and assumed solvable) ones.
212
213
214 > The solver was tested on all combinations of most of the corner
215 > cases. I have only skipped those that would require humongous number
216 > of combinations.
217
218 That's great!
219
220 > I haven't tested all combinations of masked/forced flags from
221 > the profiles though. I plan to do this by implementing it in pkgcheck.
222 > I will also have better performance reports based on optimized
223 > algorithms and real use.
224
225 Makes sense.
226
227
228 Alexis.

Replies