Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE) - gentoo-dev

From:	"Michał Górny" <mgorny@g.o>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE)
Date:	Fri, 09 Jun 2017 12:54:22
Message-Id:	`1497012847.25475.4.camel@gentoo.org`
In Reply to:	Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE) by Alexis Ballier

1

On pią, 2017-06-09 at 13:41 +0200, Alexis Ballier wrote:

2

> On Fri, 09 Jun 2017 11:19:20 +0200

3

> Michał Górny <mgorny@g.o> wrote:

4

>

5

> > On śro, 2017-06-07 at 11:56 +0200, Alexis Ballier wrote:

6

> > > On Wed, 07 Jun 2017 11:27:59 +0200

7

> > > Michał Górny <mgorny@g.o> wrote:

8

> > >

9

> > > > On śro, 2017-06-07 at 10:17 +0200, Alexis Ballier wrote:

10

> > > > > > Also, do I presume correctly that for all supported cases

11

> > > > > > (i.e. those which your nsolve does not reject), solve and

12

> > > > > > nsolve are compatible?

13

> > > > >

14

> > > > > Not sure what you mean here. nsolve does not solve anything, it

15

> > > > > just validates REQUIRED_USE so that it is guaranteed it can be

16

> > > > > solved.

17

> > > >

18

> > > > What I mean is whether you can guarantee that:

19

> > > >

20

> > > > a. for every X that nsolve(X) == ok, solve() will be able to find

21

> > > > a valid solution,

22

> > >

23

> > > yes

24

> > >

25

> > > > b. for every X that solve() can solve reliably, nsolve(X) == ok.

26

> > >

27

> > > no and that's not really possible

28

> >

29

> > I thought so. To expand it a little, could you confirm whether I

30

> > correctly presume that:

31

> >

32

> > a. for all 'good' results, the plain iterative solve() should be able

33

> > to find a solution with a single iteration?

34

>

35

> yes that's the point of it

36

>

37

> > b. for all 'need toposort' results, the solve() should be able to find

38

> > a solution with n>=1 iterations?

39

>

40

> yes; though n is only bounded by the # of clauses (expanded as

41

> implications) while it can be 1 if reorderer properly; I wouldn't bother

42

> doing several iterations and just reject that at repoman side since it's

43

> easily solved

44

45

I would prefer not having Portage fail randomly on local ebuilds where

46

the cost of multiple iterations is practically zero, and certainly it's

47

not worth the effort to ensure a particular ordering.

48

49

> > c. all of 'circular' results have at least one USE flag combination

50

> > that can not be solved by solve()?

51

>

52

> In theory no as that would imply your 1st b. In practice, I've only

53

> seen cases like that.

54

55

That was my thought as well. However, I've tested a few of example

56

failures and all of them broke solve() as well.

57

58

> > > > > > Implication(useflag, consequence) ->

59

> > > > >

60

> > > > >                  if not input[useflag]: raise "impossible"

61

> > > >

62

> > > > Why impossible? Unless I'm missing something, it's false

63

> > > > already.

64

> > >

65

> > > 'foo? bar' is always true if foo is false; so it's impossible to

66

> > > make it false

67

> >

68

> > Yes, you are correct. I was actually thinking of 'if LHS is false, we

69

> > do not enforce RHS'.

70

>

71

> I'm wrong actually. It can be falsified by setting foo to True and bar

72

> to False. More on it below.

73

74

Well, I'm not sure if it can still plainly apply here but the generic

75

contract was that in implication clauses only RHS is mutable.

76

77

> > > it's really a corner case as I think we don't allow nested

78

> > > implications inside ||, ^^, () or ??, which is the only way to

79

> > > reach that.

80

> >

81

> > Strictly speaking, there's no rule prohibiting that. And I think we

82

> > actually have or had used 'foo?' inside '||' at least in dependencies.

83

> > The basic idea is that if the flag is disabled, the contents disappear

84

> > from the containing '||' block.

85

>

86

> Interesting. Then we should, sadly, support that.

87

>

88

> Let's think a bit about its meaning.

89

> ?? ( X Y ) is "if X then not Y".

90

> ?? ( a b? ( c ) ) is "if a then not "b? ( c )", that is, "if a then b

91

> and not c", so that's rewritten as "a? ( b !c )".

92

>

93

> That doesn't really seem to match your "basic idea". Instead, this

94

> could be rewritten as:

95

> b? ( ?? ( a c ) ) !b? ( ?? ( a ) )

96

> that is: "b? ( a? ( !c ) )"

97

>

98

> "If a and b are enabled then disable c" seems a much better

99

> interpretation than "If a is enabled then enable b and disable c".

100

>

101

>

102

> Now, we can apply your basic idea to bubble up all the implications so

103

> that they're only at toplevel.

104

>

105

> Something([begin Implication(X,Y) end]) is rewritten as:

106

> Implication(X, Something([begin Y end]))

107

> Implication(!X, Something([begin end]))

108

109

Makes sense. Not that I really like cartesian products but since I had

110

to do it for one thing already, I don't see a major problem splitting

111

this one as well. I'll try to implement it today.

112

113

> > > > >

114

> > > > > Note how the above favors leftmost in all cases. If it needs to

115

> > > > > change something, it always tries to leave the leftmost

116

> > > > > untouched. Note also that it processes everything left to right

117

> > > > > (the AllOf case where REQUIRED_USE="AllOf(list of clauses)"

118

> > > > > ).

119

> > > >

120

> > > > You need to be able to reorder the clauses to handle use.force

121

> > > > and use.mask.

122

> > >

123

> > > Not sure if reorder is the best way. It sure works, but maybe we'd

124

> > > want a repoman error if e.g. 'foo? ( bar )' is in REQUIRED_USE, bar

125

> > > is masked but not foo. That'd be a matter of eliminating the

126

> > > constants in the ast and if we get 'false' for a profile we error

127

> > > out.

128

> >

129

> > Yes, we want that. It makes sense for pure implications. For n-ary

130

> > thingies, it's harder than that and I'd rather not require developers

131

> > to figure out a specific order to make things work.

132

> >

133

> > Think of the following:

134

> >

135

> >   || ( X Y )

136

> >

137

> > with X being masked on profile A, Y being masked on profile B. You

138

> > just can't get it right then.

139

> >

140

> > Of course, this is a bit unrealistic but I think a case when a

141

> > preferred (newer) provider is masked on some architectures is

142

> > realistic. I don't think we want to prefer a worse solution (or go

143

> > back to applying package.use workarounds) because of fringe arches.

144

> >

145

> > Another question is whether we could solve this specific problem via

146

> > applying some kind of 'if forced/masked' logic into the AST.

147

>

148

> I think we're having communication interferences here :) That's exactly

149

> what I'm talking about.

150

>

151

> More specifically:

152

>

153

> For each profile:

154

> 	formula=replace_masks_and_force_by_constants(required_use)

155

> 	simplify(formula)

156

> 	nsolve(formula)

157

>

158

> In your example above, we'd call 'nsolve("|| ( X )")' and 'nsolve("||

159

> ( Y )")' (or even simpler, depending on how simplify() is defined). If

160

> both X and Y are masked on a profile, then that'd reduce to

161

> 'nsolve("False")' which would rant.

162

163

So you're talking about reducing prior to transforming? Yes, that'd

164

work. As I mentioned in one of the first mails wrt my reference

165

implementation, I've used reordering (stable sort) instead of reducing

166

since it was simpler.

167

168

If you reduce (simplify), you need to account for special cases like

169

getting '|| ()' etc. If you reorder only, things just fail the normal

170

way.

171

172

> I think that for solve(), reordering is basically equivalent since we

173

> always favour leftmost. However, we have to somehow guarantee that

174

> solve() will never try to enable a masked flag. This is guaranteed by

175

> the above checks on profiles if enforced by repoman. If we apply the

176

> same logic for solve that'd basically be built-in (masked/forced flags

177

> wont appear in solve()'s input anymore), which I believe is better. Keep

178

> in mind that users can mask and force some useflags, so the repoman

179

> check won't catch that.

180

>

181

>

182

> > > > > So, the very first thing to do is to agree that the above solver

183

> > > > > (the trueify function) is what we want to implement and set

184

> > > > > this in stone. There's no point in implementing a proper requse

185

> > > > > checker if the algorithm is meant to change. Having a formal

186

> > > > > definition will also be necessary to mandate that in future

187

> > > > > EAPIs.

188

> > > > >

189

> > > > > Then, and only then, we'd need to have the above solver

190

> > > > > implemented into portage (hidden under a FEATURES) and import

191

> > > > > my nsolve into repoman (after due cleanup).

192

> > > > >

193

> > > >

194

> > > > Yes, that's my goal. However, before we can set the algorithm in

195

> > > > stone we need to verify that it will work in all of the supported

196

> > > > cases.

197

> > >

198

> > > Yep, that's the point of nsolve/classify :)

199

> > >

200

> > > > Preferably it should also be as simple as possible to avoid

201

> > > > putting too much complexity in the spec.

202

> > >

203

> > > Yes; not sure if anything simpler than the above can be achieved

204

> > > though.

205

> >

206

> > Actually, I think we should roughly proceed in the following way:

207

> >

208

> > 1. Design the solve() algorithm.

209

> >

210

> > 1a. Test solve() on isolated test cases.

211

> >

212

> > 1b. Integrate solve() into Portage (as optional feature) and encourage

213

> > wider testing.

214

> >

215

> > 2. Design the nsolve() verification based on solve().

216

> >

217

> > 2a. Test nsolve() separately.

218

> >

219

> > 2b. Start fixing simple cases that do not bring any controversy (like

220

> > most of the reorderings).

221

> >

222

> > 2c. Integrate nsolve() into repoman.

223

> >

224

> > 3. Spec the whole thing and decide how to go next.

225

> >

226

> >

227

> > Assuming that 1* and 2* is to be done simultaneously, and each

228

> > subpoint implies that if we hit any walls, we can go back to a

229

> > previous point and fix the design.

230

> >

231

> > I think we're mostly past 1a and 2a now. We should try to think a bit

232

> > more about the corner cases right now, and when we're done with that,

233

> > the next step would be to:

234

> >

235

> > A. start fixing some cases ('need topo sort' at least?) to improve

236

> > testing area for the implementation,

237

> >

238

> > B. integrate solve() into Portage,

239

> >

240

> > C. maybe optionally integrate nsolve() into repoman.

241

> >

242

> > However, I don't want to set the spec into stone until we have some

243

> > live testing via Portage with FEATURES=solve-required-use or alike.

244

> > For repoman, we can add the verification optionally (i.e. with the

245

> > above feature or via a separate command-line option) but I don't

246

> > think we should enable it by default until we verify it.

247

> >

248

>

249

> Yes, I agree. Since solve() and nsolve() are very related we should

250

> carefully think how to organize the code then in order to avoid

251

> duplication or them getting out of sync.

252

>

253

> As for enabling it into repoman, I'm not sure which is best. Safest way

254

> is obviously as you suggest, but a warning is not a blocker, so

255

> enabling it by default can have a nice effect of receiving hate mails

256

> from angry developers if something is wrong.

257

>

258

259

There's wisdom in this as well. I'm just a little bit worried that we're

260

first make developers change their REQUIRED_USE, and then have to change

261

the algorithm anyway ;-).

262

263

--

264

Best regards,

265

Michał Górny

Gentoo Archives: gentoo-dev

Attachments

Replies

1	On pią, 2017-06-09 at 13:41 +0200, Alexis Ballier wrote:
2	> On Fri, 09 Jun 2017 11:19:20 +0200
3	> Michał Górny <mgorny@g.o> wrote:
4	>
5	> > On śro, 2017-06-07 at 11:56 +0200, Alexis Ballier wrote:
6	> > > On Wed, 07 Jun 2017 11:27:59 +0200
7	> > > Michał Górny <mgorny@g.o> wrote:
8	> > >
9	> > > > On śro, 2017-06-07 at 10:17 +0200, Alexis Ballier wrote:
10	> > > > > > Also, do I presume correctly that for all supported cases
11	> > > > > > (i.e. those which your nsolve does not reject), solve and
12	> > > > > > nsolve are compatible?
13	> > > > >
14	> > > > > Not sure what you mean here. nsolve does not solve anything, it
15	> > > > > just validates REQUIRED_USE so that it is guaranteed it can be
16	> > > > > solved.
17	> > > >
18	> > > > What I mean is whether you can guarantee that:
19	> > > >
20	> > > > a. for every X that nsolve(X) == ok, solve() will be able to find
21	> > > > a valid solution,
22	> > >
23	> > > yes
24	> > >
25	> > > > b. for every X that solve() can solve reliably, nsolve(X) == ok.
26	> > >
27	> > > no and that's not really possible
28	> >
29	> > I thought so. To expand it a little, could you confirm whether I
30	> > correctly presume that:
31	> >
32	> > a. for all 'good' results, the plain iterative solve() should be able
33	> > to find a solution with a single iteration?
34	>
35	> yes that's the point of it
36	>
37	> > b. for all 'need toposort' results, the solve() should be able to find
38	> > a solution with n>=1 iterations?
39	>
40	> yes; though n is only bounded by the # of clauses (expanded as
41	> implications) while it can be 1 if reorderer properly; I wouldn't bother
42	> doing several iterations and just reject that at repoman side since it's
43	> easily solved
44
45	I would prefer not having Portage fail randomly on local ebuilds where
46	the cost of multiple iterations is practically zero, and certainly it's
47	not worth the effort to ensure a particular ordering.
48
49	> > c. all of 'circular' results have at least one USE flag combination
50	> > that can not be solved by solve()?
51	>
52	> In theory no as that would imply your 1st b. In practice, I've only
53	> seen cases like that.
54
55	That was my thought as well. However, I've tested a few of example
56	failures and all of them broke solve() as well.
57
58	> > > > > > Implication(useflag, consequence) ->
59	> > > > >
60	> > > > > if not input[useflag]: raise "impossible"
61	> > > >
62	> > > > Why impossible? Unless I'm missing something, it's false
63	> > > > already.
64	> > >
65	> > > 'foo? bar' is always true if foo is false; so it's impossible to
66	> > > make it false
67	> >
68	> > Yes, you are correct. I was actually thinking of 'if LHS is false, we
69	> > do not enforce RHS'.
70	>
71	> I'm wrong actually. It can be falsified by setting foo to True and bar
72	> to False. More on it below.
73
74	Well, I'm not sure if it can still plainly apply here but the generic
75	contract was that in implication clauses only RHS is mutable.
76
77	> > > it's really a corner case as I think we don't allow nested
78	> > > implications inside \|\|, ^^, () or ??, which is the only way to
79	> > > reach that.
80	> >
81	> > Strictly speaking, there's no rule prohibiting that. And I think we
82	> > actually have or had used 'foo?' inside '\|\|' at least in dependencies.
83	> > The basic idea is that if the flag is disabled, the contents disappear
84	> > from the containing '\|\|' block.
85	>
86	> Interesting. Then we should, sadly, support that.
87	>
88	> Let's think a bit about its meaning.
89	> ?? ( X Y ) is "if X then not Y".
90	> ?? ( a b? ( c ) ) is "if a then not "b? ( c )", that is, "if a then b
91	> and not c", so that's rewritten as "a? ( b !c )".
92	>
93	> That doesn't really seem to match your "basic idea". Instead, this
94	> could be rewritten as:
95	> b? ( ?? ( a c ) ) !b? ( ?? ( a ) )
96	> that is: "b? ( a? ( !c ) )"
97	>
98	> "If a and b are enabled then disable c" seems a much better
99	> interpretation than "If a is enabled then enable b and disable c".
100	>
101	>
102	> Now, we can apply your basic idea to bubble up all the implications so
103	> that they're only at toplevel.
104	>
105	> Something([begin Implication(X,Y) end]) is rewritten as:
106	> Implication(X, Something([begin Y end]))
107	> Implication(!X, Something([begin end]))
108
109	Makes sense. Not that I really like cartesian products but since I had
110	to do it for one thing already, I don't see a major problem splitting
111	this one as well. I'll try to implement it today.
112
113	> > > > >
114	> > > > > Note how the above favors leftmost in all cases. If it needs to
115	> > > > > change something, it always tries to leave the leftmost
116	> > > > > untouched. Note also that it processes everything left to right
117	> > > > > (the AllOf case where REQUIRED_USE="AllOf(list of clauses)"
118	> > > > > ).
119	> > > >
120	> > > > You need to be able to reorder the clauses to handle use.force
121	> > > > and use.mask.
122	> > >
123	> > > Not sure if reorder is the best way. It sure works, but maybe we'd
124	> > > want a repoman error if e.g. 'foo? ( bar )' is in REQUIRED_USE, bar
125	> > > is masked but not foo. That'd be a matter of eliminating the
126	> > > constants in the ast and if we get 'false' for a profile we error
127	> > > out.
128	> >
129	> > Yes, we want that. It makes sense for pure implications. For n-ary
130	> > thingies, it's harder than that and I'd rather not require developers
131	> > to figure out a specific order to make things work.
132	> >
133	> > Think of the following:
134	> >
135	> > \|\| ( X Y )
136	> >
137	> > with X being masked on profile A, Y being masked on profile B. You
138	> > just can't get it right then.
139	> >
140	> > Of course, this is a bit unrealistic but I think a case when a
141	> > preferred (newer) provider is masked on some architectures is
142	> > realistic. I don't think we want to prefer a worse solution (or go
143	> > back to applying package.use workarounds) because of fringe arches.
144	> >
145	> > Another question is whether we could solve this specific problem via
146	> > applying some kind of 'if forced/masked' logic into the AST.
147	>
148	> I think we're having communication interferences here :) That's exactly
149	> what I'm talking about.
150	>
151	> More specifically:
152	>
153	> For each profile:
154	> formula=replace_masks_and_force_by_constants(required_use)
155	> simplify(formula)
156	> nsolve(formula)
157	>
158	> In your example above, we'd call 'nsolve("\|\| ( X )")' and 'nsolve("\|\|
159	> ( Y )")' (or even simpler, depending on how simplify() is defined). If
160	> both X and Y are masked on a profile, then that'd reduce to
161	> 'nsolve("False")' which would rant.
162
163	So you're talking about reducing prior to transforming? Yes, that'd
164	work. As I mentioned in one of the first mails wrt my reference
165	implementation, I've used reordering (stable sort) instead of reducing
166	since it was simpler.
167
168	If you reduce (simplify), you need to account for special cases like
169	getting '\|\| ()' etc. If you reorder only, things just fail the normal
170	way.
171
172	> I think that for solve(), reordering is basically equivalent since we
173	> always favour leftmost. However, we have to somehow guarantee that
174	> solve() will never try to enable a masked flag. This is guaranteed by
175	> the above checks on profiles if enforced by repoman. If we apply the
176	> same logic for solve that'd basically be built-in (masked/forced flags
177	> wont appear in solve()'s input anymore), which I believe is better. Keep
178	> in mind that users can mask and force some useflags, so the repoman
179	> check won't catch that.
180	>
181	>
182	> > > > > So, the very first thing to do is to agree that the above solver
183	> > > > > (the trueify function) is what we want to implement and set
184	> > > > > this in stone. There's no point in implementing a proper requse
185	> > > > > checker if the algorithm is meant to change. Having a formal
186	> > > > > definition will also be necessary to mandate that in future
187	> > > > > EAPIs.
188	> > > > >
189	> > > > > Then, and only then, we'd need to have the above solver
190	> > > > > implemented into portage (hidden under a FEATURES) and import
191	> > > > > my nsolve into repoman (after due cleanup).
192	> > > > >
193	> > > >
194	> > > > Yes, that's my goal. However, before we can set the algorithm in
195	> > > > stone we need to verify that it will work in all of the supported
196	> > > > cases.
197	> > >
198	> > > Yep, that's the point of nsolve/classify :)
199	> > >
200	> > > > Preferably it should also be as simple as possible to avoid
201	> > > > putting too much complexity in the spec.
202	> > >
203	> > > Yes; not sure if anything simpler than the above can be achieved
204	> > > though.
205	> >
206	> > Actually, I think we should roughly proceed in the following way:
207	> >
208	> > 1. Design the solve() algorithm.
209	> >
210	> > 1a. Test solve() on isolated test cases.
211	> >
212	> > 1b. Integrate solve() into Portage (as optional feature) and encourage
213	> > wider testing.
214	> >
215	> > 2. Design the nsolve() verification based on solve().
216	> >
217	> > 2a. Test nsolve() separately.
218	> >
219	> > 2b. Start fixing simple cases that do not bring any controversy (like
220	> > most of the reorderings).
221	> >
222	> > 2c. Integrate nsolve() into repoman.
223	> >
224	> > 3. Spec the whole thing and decide how to go next.
225	> >
226	> >
227	> > Assuming that 1* and 2* is to be done simultaneously, and each
228	> > subpoint implies that if we hit any walls, we can go back to a
229	> > previous point and fix the design.
230	> >
231	> > I think we're mostly past 1a and 2a now. We should try to think a bit
232	> > more about the corner cases right now, and when we're done with that,
233	> > the next step would be to:
234	> >
235	> > A. start fixing some cases ('need topo sort' at least?) to improve
236	> > testing area for the implementation,
237	> >
238	> > B. integrate solve() into Portage,
239	> >
240	> > C. maybe optionally integrate nsolve() into repoman.
241	> >
242	> > However, I don't want to set the spec into stone until we have some
243	> > live testing via Portage with FEATURES=solve-required-use or alike.
244	> > For repoman, we can add the verification optionally (i.e. with the
245	> > above feature or via a separate command-line option) but I don't
246	> > think we should enable it by default until we verify it.
247	> >
248	>
249	> Yes, I agree. Since solve() and nsolve() are very related we should
250	> carefully think how to organize the code then in order to avoid
251	> duplication or them getting out of sync.
252	>
253	> As for enabling it into repoman, I'm not sure which is best. Safest way
254	> is obviously as you suggest, but a warning is not a blocker, so
255	> enabling it by default can have a nice effect of receiving hate mails
256	> from angry developers if something is wrong.
257	>
258
259	There's wisdom in this as well. I'm just a little bit worried that we're
260	first make developers change their REQUIRED_USE, and then have to change
261	the algorithm anyway ;-).
262
263	--
264	Best regards,
265	Michał Górny