Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE)
Date: Thu, 01 Jun 2017 21:31:40
Message-Id: 1496352685.30502.4.camel@gentoo.org
In Reply to: Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE) by Alexis Ballier
1 On czw, 2017-06-01 at 10:55 +0200, Alexis Ballier wrote:
2 > On Wed, 31 May 2017 21:02:24 +0200
3 > Michał Górny <mgorny@g.o> wrote:
4 >
5 > > On śro, 2017-05-31 at 19:39 +0200, Alexis Ballier wrote:
6 > > > > > Again, you *need* to process the constraints in order. '!a?
7 > > > > > ( b ) !b? ( a )' is not deterministic when none of a and b are
8 > > > > > enabled otherwise.
9 > > > >
10 > > > > You can't rely on any particular order of constraints, especially
11 > > > > when eclass stacking comes into play. You could try defining some
12 > > > > constraint- sorting algorithm but it's going to be complex and
13 > > > > it's usefulness will be limited anyway due to various kinds of
14 > > > > grouping.
15 > > >
16 > > >
17 > > > Better have some order: If half of the above constraint comes from
18 > > > ebuild and the other half comes from eclass, then PM might toggle
19 > > > 'a' or 'b' depending on the phase of the moon which is specifically
20 > > > what we're trying to avoid.
21 > >
22 > > No, it can't. That's the whole point. The algorithm must be defined so
23 > > that it is always predictable independently of order (maybe except
24 > > the ordering inside ^^, ||, ??) and independently of how it's nested
25 > > (i.e. 'a? ( b? ( c ) )' must give the same result as 'b? ( a? ( c )
26 > > )').
27 >
28 > This is a lot of handwaving without real description of how it would
29 > work. This starts to look a lot like confluence in rewriting systems
30 > and you want developers to write only confluent rules and repoman to
31 > decide that. Good luck with that.
32
33 I'm sorry, what I said was quite stupid. I did some thinking and testing
34 today, and the algorithm can be actually quite trivial. The real issue
35 is getting a correct input, that is REQUIRED_USE constraints that have
36 only one solution.
37
38 That is the whole problem I was signaling, and which I seem to have lost
39 sight of: solving is trivial, ensuring that the constraint can have only
40 one solution isn't. Hopefully, most of the simple constraints in use
41 will 'just work'.
42
43 The biggest issue for me right now is to find a way to determine whether
44 a particular REQUIRED_USE constraint can have only one solution,
45 independently of the order of non-n-ary constraints. However, some of it
46 can be easily eyeball-checked using a graph.
47
48 > Let's look at real world examples:
49 > gtk+:
50 >
51 > REQUIRED_USE="
52 > || ( aqua wayland X )
53 > xinerama? ( X )
54 > "
55
56 $ ./solve.py '|| ( aqua wayland X ) xinerama? ( X )'
57 [!wayland? => [!X? => [aqua]], xinerama? => [X]]
58
59        x         x
60      w i       w i
61      a n       a n
62      y e       y e
63    a l r     a l r
64    q a a     q a a
65    u n m     u n m
66  X a d a | X a d a
67  0 0 0 0 | 0 1 0 0
68  0 0 0 1 | 1 1 0 1
69  0 0 1 0 | 0 0 1 0 (==)
70  0 0 1 1 | 1 0 1 1
71  0 1 0 0 | 0 1 0 0 (==)
72  0 1 0 1 | 1 1 0 1
73  0 1 1 0 | 0 1 1 0 (==)
74  0 1 1 1 | 1 1 1 1
75  1 0 0 0 | 1 0 0 0 (==)
76  1 0 0 1 | 1 0 0 1 (==)
77  1 0 1 0 | 1 0 1 0 (==)
78  1 0 1 1 | 1 0 1 1 (==)
79  1 1 0 0 | 1 1 0 0 (==)
80  1 1 0 1 | 1 1 0 1 (==)
81  1 1 1 0 | 1 1 1 0 (==)
82  1 1 1 1 | 1 1 1 1 (==)
83
84 >
85 > Let's assume aqua is masked, which reduces to:
86 > REQUIRED_USE="
87 > || ( wayland X )
88 > xinerama? ( X )
89 > "
90
91 $ ./solve.py '|| ( aqua wayland X ) xinerama? ( X )' '!aqua'
92 [!X? => [!aqua? => [wayland]], xinerama? => [X]]
93
94        x         x
95      w i       w i
96      a n       a n
97      y e       y e
98    a l r     a l r
99    q a a     q a a
100    u n m     u n m
101  X a d a | X a d a
102  0 0 0 0 | 0 0 1 0
103  0 0 0 1 | 1 0 1 1
104  0 0 1 0 | 0 0 1 0 (==)
105  0 0 1 1 | 1 0 1 1
106  1 0 0 0 | 1 0 0 0 (==)
107  1 0 0 1 | 1 0 0 1 (==)
108  1 0 1 0 | 1 0 1 0 (==)
109  1 0 1 1 | 1 0 1 1 (==)
110
111 (to avoid having to explicitly play with empty clauses and such, it
112 doesn't remove masked flags, just shifts them to the end)
113
114 >
115 > With USE='-* xinerama' this one is invalid: applying the first
116 > constraint first enables wayland, then the 2nd enables X, giving a
117 > result of USE="xinerama wayland X". Applying the 2nd first enables X,
118 > then the 1st does nothing, giving a result of USE="xinerama X".
119
120 Indeed. To regain order-indepedence, you could do:
121
122 xinerama? ( X ) !xinerama? ( || ( aqua wayland X ) )
123
124 While it's non-obvious, it's the only reasonable way to ensure that
125 things don't start falling apart randomly.
126
127 > Now the funny one:
128 > $ portageq metadata / ebuild dev-lang/php-7.0.18 REQUIRED_USE
129 > cli? ( ^^ ( readline libedit ) ) truetype? ( gd ) webp? ( gd ) cjk?
130 > ( gd ) exif? ( gd ) xpm? ( gd ) gd? ( zlib ) simplexml? ( xml ) soap?
131 > ( xml ) wddx? ( xml ) xmlrpc? ( || ( xml iconv ) ) xmlreader? ( xml )
132 > xslt? ( xml ) ldap-sasl? ( ldap ) mhash? ( hash ) phar? ( hash ) qdbm?
133 > ( !gdbm ) readline? ( !libedit ) recode? ( !imap !mysqli ) sharedmem?
134 > ( !threads ) mysql? ( || ( mysqli pdo ) ) || ( cli cgi fpm apache2
135 > embed phpdbg )
136
137 It isn't as scary as it looks. If you graph it, then you can notice it's
138 just got a few separate sub-graphs:
139
140 1. [!cgi, !fpm, !phpdbg, !apache2, !embed] -> cli -> (readline <=> !libedit)
141
142 (the last bit having duplicate 'readline? ( !libedit )' which is wrong)
143
144 2. [truetype, webp, cjk, exif, xpm] -> gd -> zlib
145
146 3. [simplexml, soap, wddx, xmlrpc, !iconv, xmlreader, xslt] -> xml
147
148 4. ldap-sasl -> ldap
149
150 5. [mhash, phar] -> hash
151
152 6. qdbm -> !gdbm
153
154 7. recode -> [!imap, !mysqli]
155
156 [mysql, !pdo] -> mysqli
157
158 (note that there are both implications for mysqli and !mysqli -- for 4
159 cases they cause convergence error)
160
161 8. sharedmem -> !threads
162
163 All but 1 and 7 are trivial.
164
165 > There are probably dozens of ways to make that non deterministic.
166 > Here's one: USE='-*'. Apply '|| ( cli cgi fpm apache2 embed phpdbg )'
167 > last; this enables 'cli'. Since it's the last one, REQUIRED_USE is not
168 > satisfied because of 'cli? ( ^^ ( readline libedit ) )'.
169 > If you process it first, then you enable cli and readline and are good.
170
171 You just take a second iteration, and things fall into place. That's not
172 a problem.
173
174 > > > Also, what happens if we applied all the constraints and obtained
175 > > > some useflags setting that still fails REQUIRED_USE check ?
176 > >
177 > > It can't happen. If you can apply all the constraints, then implicitly
178 > > REQUIRED_USE is satisfied. If you can't apply all the constraints,
179 > > then it just fails. Of course, we want to ultimately avoid that case.
180 >
181 > See the php example above. How do you ensure this does not happen?
182 >
183 > Note that your assertion 'If you can apply all the constraints, then
184 > implicitly REQUIRED_USE is satisfied.' is false: You're changing the
185 > values of useflags, thus might violate some previously checked
186 > constraint.
187
188 So you reiterate. Either you reach a solution (preferably there's only
189 a single valid solution you can reach) or you hit a previous state which
190 indicates a loop and you fail.
191
192 > > > > The point would be: by definition, the solver should be able to
193 > > > > touch *any* USE flag. If it can't, it mostly implies that you
194 > > > > can't use the automatic solver, and so the case is irrelevant for
195 > > > > checking. Attempting to go beyond that is going to give a lot of
196 > > > > complexity and most likely kill the whole idea.
197 > > > >
198 > > > > One thing we need to worry about are masks. We need to think about
199 > > > > them.
200 > > >
201 > > > It makes zero difference for any solver if you replace variables
202 > > > (useflags) by 'true' or 'false' constants (masked/forced/user-forced
203 > > > useflags). It's even simpler actually. Whether the formula is not
204 > > > constantly 'true' (ie REQUIRED_USE is useless) or constantly
205 > > > 'false' (ie there's no way to solve it) is equivalent to solving
206 > > > SAT. We likely don't want that for repoman running on php.
207 > > >
208 > >
209 > > Well, probably yes. We just need to make sure to apply them correctly
210 > > in different contexts, to avoid accidentally skipping some
211 > > constraints. I think it would be reasonably to assume that:
212 >
213 > [...]
214 >
215 > Yes, you can eliminate constants. It is probably even confluent, i.e.
216 > the order in which you eliminate them does not matter and it always
217 > converges to the same result.
218 >
219
220 As mentioned above, I've chosen an even simpler path -- I just push true
221 parts up front, and false to back. This handles all the cases without
222 extra checking logic -- if immutables satisfy the constraint, it's
223 satisfied early; if they make it impossible to satisfy it, they just
224 make it fail at trying to touch an immutable flag.
225
226 $ ./solve.py "^^ ( c b a )" 'a b'
227 [!a? => [!c? => [b]], b? => [!a, !c], a? => [!c]]
228
229  a b c | a b c
230  1 1 0 | [unsolvable: immutable a mismatched]
231  1 1 1 | [unsolvable: immutable a mismatched]
232
233 My current code is on github [1]. It's ugly, slow and incomplete. It's
234 merely a proof-of-concept and testing toy but still could give some
235 clues.
236
237 [1]:https://github.com/mgorny/required-use
238
239 --
240 Best regards,
241 Michał Górny

Attachments

File name MIME type
signature.asc application/pgp-signature

Replies