1 |
On czw, 2017-06-01 at 10:55 +0200, Alexis Ballier wrote: |
2 |
> On Wed, 31 May 2017 21:02:24 +0200 |
3 |
> Michał Górny <mgorny@g.o> wrote: |
4 |
> |
5 |
> > On śro, 2017-05-31 at 19:39 +0200, Alexis Ballier wrote: |
6 |
> > > > > Again, you *need* to process the constraints in order. '!a? |
7 |
> > > > > ( b ) !b? ( a )' is not deterministic when none of a and b are |
8 |
> > > > > enabled otherwise. |
9 |
> > > > |
10 |
> > > > You can't rely on any particular order of constraints, especially |
11 |
> > > > when eclass stacking comes into play. You could try defining some |
12 |
> > > > constraint- sorting algorithm but it's going to be complex and |
13 |
> > > > it's usefulness will be limited anyway due to various kinds of |
14 |
> > > > grouping. |
15 |
> > > |
16 |
> > > |
17 |
> > > Better have some order: If half of the above constraint comes from |
18 |
> > > ebuild and the other half comes from eclass, then PM might toggle |
19 |
> > > 'a' or 'b' depending on the phase of the moon which is specifically |
20 |
> > > what we're trying to avoid. |
21 |
> > |
22 |
> > No, it can't. That's the whole point. The algorithm must be defined so |
23 |
> > that it is always predictable independently of order (maybe except |
24 |
> > the ordering inside ^^, ||, ??) and independently of how it's nested |
25 |
> > (i.e. 'a? ( b? ( c ) )' must give the same result as 'b? ( a? ( c ) |
26 |
> > )'). |
27 |
> |
28 |
> This is a lot of handwaving without real description of how it would |
29 |
> work. This starts to look a lot like confluence in rewriting systems |
30 |
> and you want developers to write only confluent rules and repoman to |
31 |
> decide that. Good luck with that. |
32 |
|
33 |
I'm sorry, what I said was quite stupid. I did some thinking and testing |
34 |
today, and the algorithm can be actually quite trivial. The real issue |
35 |
is getting a correct input, that is REQUIRED_USE constraints that have |
36 |
only one solution. |
37 |
|
38 |
That is the whole problem I was signaling, and which I seem to have lost |
39 |
sight of: solving is trivial, ensuring that the constraint can have only |
40 |
one solution isn't. Hopefully, most of the simple constraints in use |
41 |
will 'just work'. |
42 |
|
43 |
The biggest issue for me right now is to find a way to determine whether |
44 |
a particular REQUIRED_USE constraint can have only one solution, |
45 |
independently of the order of non-n-ary constraints. However, some of it |
46 |
can be easily eyeball-checked using a graph. |
47 |
|
48 |
> Let's look at real world examples: |
49 |
> gtk+: |
50 |
> |
51 |
> REQUIRED_USE=" |
52 |
> || ( aqua wayland X ) |
53 |
> xinerama? ( X ) |
54 |
> " |
55 |
|
56 |
$ ./solve.py '|| ( aqua wayland X ) xinerama? ( X )' |
57 |
[!wayland? => [!X? => [aqua]], xinerama? => [X]] |
58 |
|
59 |
x x |
60 |
w i w i |
61 |
a n a n |
62 |
y e y e |
63 |
a l r a l r |
64 |
q a a q a a |
65 |
u n m u n m |
66 |
X a d a | X a d a |
67 |
0 0 0 0 | 0 1 0 0 |
68 |
0 0 0 1 | 1 1 0 1 |
69 |
0 0 1 0 | 0 0 1 0 (==) |
70 |
0 0 1 1 | 1 0 1 1 |
71 |
0 1 0 0 | 0 1 0 0 (==) |
72 |
0 1 0 1 | 1 1 0 1 |
73 |
0 1 1 0 | 0 1 1 0 (==) |
74 |
0 1 1 1 | 1 1 1 1 |
75 |
1 0 0 0 | 1 0 0 0 (==) |
76 |
1 0 0 1 | 1 0 0 1 (==) |
77 |
1 0 1 0 | 1 0 1 0 (==) |
78 |
1 0 1 1 | 1 0 1 1 (==) |
79 |
1 1 0 0 | 1 1 0 0 (==) |
80 |
1 1 0 1 | 1 1 0 1 (==) |
81 |
1 1 1 0 | 1 1 1 0 (==) |
82 |
1 1 1 1 | 1 1 1 1 (==) |
83 |
|
84 |
> |
85 |
> Let's assume aqua is masked, which reduces to: |
86 |
> REQUIRED_USE=" |
87 |
> || ( wayland X ) |
88 |
> xinerama? ( X ) |
89 |
> " |
90 |
|
91 |
$ ./solve.py '|| ( aqua wayland X ) xinerama? ( X )' '!aqua' |
92 |
[!X? => [!aqua? => [wayland]], xinerama? => [X]] |
93 |
|
94 |
x x |
95 |
w i w i |
96 |
a n a n |
97 |
y e y e |
98 |
a l r a l r |
99 |
q a a q a a |
100 |
u n m u n m |
101 |
X a d a | X a d a |
102 |
0 0 0 0 | 0 0 1 0 |
103 |
0 0 0 1 | 1 0 1 1 |
104 |
0 0 1 0 | 0 0 1 0 (==) |
105 |
0 0 1 1 | 1 0 1 1 |
106 |
1 0 0 0 | 1 0 0 0 (==) |
107 |
1 0 0 1 | 1 0 0 1 (==) |
108 |
1 0 1 0 | 1 0 1 0 (==) |
109 |
1 0 1 1 | 1 0 1 1 (==) |
110 |
|
111 |
(to avoid having to explicitly play with empty clauses and such, it |
112 |
doesn't remove masked flags, just shifts them to the end) |
113 |
|
114 |
> |
115 |
> With USE='-* xinerama' this one is invalid: applying the first |
116 |
> constraint first enables wayland, then the 2nd enables X, giving a |
117 |
> result of USE="xinerama wayland X". Applying the 2nd first enables X, |
118 |
> then the 1st does nothing, giving a result of USE="xinerama X". |
119 |
|
120 |
Indeed. To regain order-indepedence, you could do: |
121 |
|
122 |
xinerama? ( X ) !xinerama? ( || ( aqua wayland X ) ) |
123 |
|
124 |
While it's non-obvious, it's the only reasonable way to ensure that |
125 |
things don't start falling apart randomly. |
126 |
|
127 |
> Now the funny one: |
128 |
> $ portageq metadata / ebuild dev-lang/php-7.0.18 REQUIRED_USE |
129 |
> cli? ( ^^ ( readline libedit ) ) truetype? ( gd ) webp? ( gd ) cjk? |
130 |
> ( gd ) exif? ( gd ) xpm? ( gd ) gd? ( zlib ) simplexml? ( xml ) soap? |
131 |
> ( xml ) wddx? ( xml ) xmlrpc? ( || ( xml iconv ) ) xmlreader? ( xml ) |
132 |
> xslt? ( xml ) ldap-sasl? ( ldap ) mhash? ( hash ) phar? ( hash ) qdbm? |
133 |
> ( !gdbm ) readline? ( !libedit ) recode? ( !imap !mysqli ) sharedmem? |
134 |
> ( !threads ) mysql? ( || ( mysqli pdo ) ) || ( cli cgi fpm apache2 |
135 |
> embed phpdbg ) |
136 |
|
137 |
It isn't as scary as it looks. If you graph it, then you can notice it's |
138 |
just got a few separate sub-graphs: |
139 |
|
140 |
1. [!cgi, !fpm, !phpdbg, !apache2, !embed] -> cli -> (readline <=> !libedit) |
141 |
|
142 |
(the last bit having duplicate 'readline? ( !libedit )' which is wrong) |
143 |
|
144 |
2. [truetype, webp, cjk, exif, xpm] -> gd -> zlib |
145 |
|
146 |
3. [simplexml, soap, wddx, xmlrpc, !iconv, xmlreader, xslt] -> xml |
147 |
|
148 |
4. ldap-sasl -> ldap |
149 |
|
150 |
5. [mhash, phar] -> hash |
151 |
|
152 |
6. qdbm -> !gdbm |
153 |
|
154 |
7. recode -> [!imap, !mysqli] |
155 |
|
156 |
[mysql, !pdo] -> mysqli |
157 |
|
158 |
(note that there are both implications for mysqli and !mysqli -- for 4 |
159 |
cases they cause convergence error) |
160 |
|
161 |
8. sharedmem -> !threads |
162 |
|
163 |
All but 1 and 7 are trivial. |
164 |
|
165 |
> There are probably dozens of ways to make that non deterministic. |
166 |
> Here's one: USE='-*'. Apply '|| ( cli cgi fpm apache2 embed phpdbg )' |
167 |
> last; this enables 'cli'. Since it's the last one, REQUIRED_USE is not |
168 |
> satisfied because of 'cli? ( ^^ ( readline libedit ) )'. |
169 |
> If you process it first, then you enable cli and readline and are good. |
170 |
|
171 |
You just take a second iteration, and things fall into place. That's not |
172 |
a problem. |
173 |
|
174 |
> > > Also, what happens if we applied all the constraints and obtained |
175 |
> > > some useflags setting that still fails REQUIRED_USE check ? |
176 |
> > |
177 |
> > It can't happen. If you can apply all the constraints, then implicitly |
178 |
> > REQUIRED_USE is satisfied. If you can't apply all the constraints, |
179 |
> > then it just fails. Of course, we want to ultimately avoid that case. |
180 |
> |
181 |
> See the php example above. How do you ensure this does not happen? |
182 |
> |
183 |
> Note that your assertion 'If you can apply all the constraints, then |
184 |
> implicitly REQUIRED_USE is satisfied.' is false: You're changing the |
185 |
> values of useflags, thus might violate some previously checked |
186 |
> constraint. |
187 |
|
188 |
So you reiterate. Either you reach a solution (preferably there's only |
189 |
a single valid solution you can reach) or you hit a previous state which |
190 |
indicates a loop and you fail. |
191 |
|
192 |
> > > > The point would be: by definition, the solver should be able to |
193 |
> > > > touch *any* USE flag. If it can't, it mostly implies that you |
194 |
> > > > can't use the automatic solver, and so the case is irrelevant for |
195 |
> > > > checking. Attempting to go beyond that is going to give a lot of |
196 |
> > > > complexity and most likely kill the whole idea. |
197 |
> > > > |
198 |
> > > > One thing we need to worry about are masks. We need to think about |
199 |
> > > > them. |
200 |
> > > |
201 |
> > > It makes zero difference for any solver if you replace variables |
202 |
> > > (useflags) by 'true' or 'false' constants (masked/forced/user-forced |
203 |
> > > useflags). It's even simpler actually. Whether the formula is not |
204 |
> > > constantly 'true' (ie REQUIRED_USE is useless) or constantly |
205 |
> > > 'false' (ie there's no way to solve it) is equivalent to solving |
206 |
> > > SAT. We likely don't want that for repoman running on php. |
207 |
> > > |
208 |
> > |
209 |
> > Well, probably yes. We just need to make sure to apply them correctly |
210 |
> > in different contexts, to avoid accidentally skipping some |
211 |
> > constraints. I think it would be reasonably to assume that: |
212 |
> |
213 |
> [...] |
214 |
> |
215 |
> Yes, you can eliminate constants. It is probably even confluent, i.e. |
216 |
> the order in which you eliminate them does not matter and it always |
217 |
> converges to the same result. |
218 |
> |
219 |
|
220 |
As mentioned above, I've chosen an even simpler path -- I just push true |
221 |
parts up front, and false to back. This handles all the cases without |
222 |
extra checking logic -- if immutables satisfy the constraint, it's |
223 |
satisfied early; if they make it impossible to satisfy it, they just |
224 |
make it fail at trying to touch an immutable flag. |
225 |
|
226 |
$ ./solve.py "^^ ( c b a )" 'a b' |
227 |
[!a? => [!c? => [b]], b? => [!a, !c], a? => [!c]] |
228 |
|
229 |
a b c | a b c |
230 |
1 1 0 | [unsolvable: immutable a mismatched] |
231 |
1 1 1 | [unsolvable: immutable a mismatched] |
232 |
|
233 |
My current code is on github [1]. It's ugly, slow and incomplete. It's |
234 |
merely a proof-of-concept and testing toy but still could give some |
235 |
clues. |
236 |
|
237 |
[1]:https://github.com/mgorny/required-use |
238 |
|
239 |
-- |
240 |
Best regards, |
241 |
Michał Górny |