Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints - gentoo-dev

From:	Alexis Ballier <aballier@g.o>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints
Date:	Sat, 08 Jul 2017 20:34:29
Message-Id:	`20170708223416.071583a0@gentoo.org`
In Reply to:	Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints by "Michał Górny"

1

On Sat, 08 Jul 2017 20:44:24 +0200

2

Michał Górny <mgorny@g.o> wrote:

3

4

> On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote:

5

> > On Sat, 08 Jul 2017 11:43:39 +0200

6

> > Michał Górny <mgorny@g.o> wrote:

7

> >

8

> > > Hi, everyone.

9

> > >

10

> > > I think the affairs have settled enough and I've finished filling

11

> > > in the pre-GLEP for REQUIRED_USE auto-enforcing. It's got all

12

> > > the algorithms, rationale and separated reference implementation.

13

> > >

14

> > > If there are no major concerns raised, I will soon start working

15

> > > on writing an optimized implementation for pkgcore/pkgcheck

16

> > > and integrating the verification algos with the CI.

17

> > >

18

> > > The pre-GLEP for review is here:

19

> > >

20

> > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse

21

> >

22

> >

23

> > Constraint group reordering algorithm

24

> >

25

> > I really think we should only consider REQUIRED_USE with

26

> > forced/masked useflags instantiated there. And ban (in repoman)

27

> > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free

28

> > and 'b' masked is perfectly ok now but it hides a serious problem

29

> > in the package/profile. Instantiating this would give: "a? ( False

30

> > )" and be an error just like we have depend.bad & co. This is

31

> > independent of auto solving or not, it's already wrong.

32

>

33

> As I've already explained you multiple times, this obtains *exactly

34

> the same* effect. However, it's much simpler when it's done like this

35

> because it makes it possible to reuse the already defined algorithms

36

> instead of having to precisely define how to preprocess REQUIRED_USE

37

> for this and cover all the corner cases.

38

39

Simpler??? I don't think so. What I wrote clearly pinpoints that:

40

When you'll write the algorithm for "Verifying that the constraints do

41

not alter immutable flags" you'll notice this is exactly that and can

42

be put as a preprocessing step and then you can drop all the corner

43

cases considerations for immutable flags. I never understood why you're

44

insisting that much on immutables: they're really not natural, not

45

simple, not standard, and carrying them all over seems to be a burden

46

to me.

47

48

> > Reordering is a dangerous path as we've already seen since it can

49

> > create unexpected loops for the solver.

50

>

51

> Freeform reordering is dangerous, and I've removed that already.

52

> Reordering restricted to immutables can not cause any issues that any

53

> other solution wouldn't cause.

54

55

You're very likely right there. Any proof? Esp. any proof that the

56

checker still guarantees the existence of a solution in all cases?

57

I'm not asking for a formal proof, but simply a bit more details than

58

just an assertion saying it's fine.

59

60

> > Working on instantiated REQUIRED_USE constraints would probably

61

> > simplify quite a bit your GLEP too: you already have the "Verifying

62

> > that the constraints do not alter immutable flags" part that roughly

63

> > does the same thing as instantiating, except if you assume it's

64

> > already true you can skip the reordering.

65

>

66

> Except that the reordering can be described in 2 points, and so can be

67

> the immutability verification. Please prove that you can provide

68

> a simpler explanation that doesn't fail in any of the corner cases.

69

70

Except reordering is an invention and immutable checking is simply

71

applying boolean logic rules to your implication and check that no

72

"False" can appear. You can simply start by applying boolean logic and

73

forget about reordering.

74

75

76

> > Concept for transforming REQUIRED_USE into implications

77

> >

78

> > Ok, now I probably understand better the concept of common prefix.

79

> > I'm definitely biased here, but I would feel better with a more

80

> > recursive presentation of it. Assume we have 'validate(list of

81

> > clauses)'; basically, the common prefix idea is that for an

82

> > implication 'foo? ( consequences = list of clauses )' you first

83

> > validate the consequences as if it were a REQUIRED_USE (so that the

84

> > subtree rooted at foo is not self-conflicting) and then consider

85

> > the whole thing as a clause. The idea would then be to have similar

86

> > checks as to what you've written but working on trees (ASTs)

87

> > instead of flattened clauses. This would avoid having to deal with

88

> > unique identities (these would come for free) and IMHO would be

89

> > easier to understand. I'm not sure how to do this though, I'll ping

90

> > you when I have some idea.

91

>

92

> Well, the problem of common prefix is quite complex, and I'm not even

93

> sure if it's really worth more consideration. After all, we're

94

> prettych much talking about doing:

95

>

96

>   a? ( !a ... )

97

>

98

> which has extremely low usability and even lower likeness of

99

> occurring.

100

101

102

Hmm. I think you came up with more valid cases. Can't remember a

103

precise one though.

104

105

106

> > One big advantage of working on ASTs is that it becomes trivial to

107

> > suggest a proper reordering.

108

>

109

> Reordering is never a trivial problem. Unless I'm missing something,

110

> it is technically possible that a 'reordering' will actually require

111

> a sub- item being moved out of the containing group.

112

113

Not if done at the AST level.

114

115

> And to be honest, I find the output of the verification script in this

116

> regard quite useful. That is, it saying 'X affects Y, so it needs to

117

> go before it' is quite clear to me. I don't think most developers

118

> would actually need to script to pinpoint a specific location for

119

> every single constraint.

120

121

In most cases this is sufficient.

122

Think of a more complex case:

123

A -> B

124

B -> C

125

A -> D

126

D -> C

127

128

   |-> B -|

129

A -|      |->C

130

   |-> D -|

131

132

It's starting to be a more complex mental exercise to get the proper

133

ordering when given the 1st form only.

134

135

136

Actually, considering people rant against git merges because they want

137

linear history in the graph log but fail to understand 'git log' is

138

precisely about displaying such a linear ordering, I'm ready to bet

139

someone will rant :)

140

141

142

> > -------

143

> >

144

> > Restrictions on REQUIRED_USE format

145

> >

146

> > I still fail to see the point here. One can simply apply the

147

> > rewriting you suggest below and be done with it. Rationale is not

148

> > very convincing to me:

149

> >

150

> > - avoiding unpredictable results of automatic flag adjustments:

151

> > 	A deterministic algorithm is, by definition, predictable.

152

>

153

> s/unpredictable/surprising/?

154

>

155

> The goal is for it do something that the developer *not reading

156

> the spec* could reasonably predict happening.

157

158

159

There is a huge gap between failing to solve a constraint voluntarily

160

because it has one too much nesting level and having a repoman warning

161

telling 'it is not recommended you write it that way'. The latter

162

ensures said developer is able to predict what is happening, the former

163

just adds annoyance onto users for no real reason.

164

165

166

> > - improving readability of REQUIRED_USE constraints:

167

> > 	No need for a restriction for that. If people want to shoot

168

> > 	themselves in the foot, it is not a PMS problem. I see that

169

> > 	like proposing death penalty for those who commit

170

> > suicide :)

171

>

172

> This is not PMS. This is a GLEP which serves both the purpose of

173

> a technical specification with extended rationale and a policy

174

> document.

175

176

Isn't the goal of it to have it in a future EAPI ?

177

178

179

> > - keeping the specification and implementation relatively simple:

180

> > 	You already define everything for working without

181

> > restriction. Plus, unlimited implication nesting has the same

182

> > complexity.

183

>

184

> No, I don't. I don't cover the meaning of nested groups and things

185

> just explode when they come into game.

186

187

188

You mostly do with the rewriting part, you're only refusing to admit

189

it :)

190

191

192

> > -------

193

> >

194

> >

195

> > Do you have numbers on the checker run on all inputs from

196

> > gentoo-x86 ? Since we're dealing with heuristics those are

197

> > particularly important to validate we're not rejecting too many

198

> > constructs.

199

>

200

> I think I've mentioned that a few times in the GLEP. The validation

201

> and verification bits were tested on all REQUIRED_USE cases from a few

202

> days ago. For the former, all cases are listed in rationale. For

203

> the latter, the reference implementation has test cases for every

204

> construct that triggered an issue, including the false positives that

205

> were fixed. I've only skipped a few that were completely redundant

206

> (e.g. different versions of the same package that added 1-2

207

> irrelevant items).

208

209

Unless I'm missing something, rationale seems more about cases rejected

210

by the restricted syntax. Numbers I'm talking about is the # of rejected

211

constraints vs accepted (and assumed solvable) ones.

212

213

214

> The solver was tested on all combinations of most of the corner

215

> cases. I have only skipped those that would require humongous number

216

> of combinations.

217

218

That's great!

219

220

> I haven't tested all combinations of masked/forced flags from

221

> the profiles though. I plan to do this by implementing it in pkgcheck.

222

> I will also have better performance reports based on optimized

223

> algorithms and real use.

224

225

Makes sense.

226

227

228

Alexis.

Gentoo Archives: gentoo-dev

Replies

1	On Sat, 08 Jul 2017 20:44:24 +0200
2	Michał Górny <mgorny@g.o> wrote:
3
4	> On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote:
5	> > On Sat, 08 Jul 2017 11:43:39 +0200
6	> > Michał Górny <mgorny@g.o> wrote:
7	> >
8	> > > Hi, everyone.
9	> > >
10	> > > I think the affairs have settled enough and I've finished filling
11	> > > in the pre-GLEP for REQUIRED_USE auto-enforcing. It's got all
12	> > > the algorithms, rationale and separated reference implementation.
13	> > >
14	> > > If there are no major concerns raised, I will soon start working
15	> > > on writing an optimized implementation for pkgcore/pkgcheck
16	> > > and integrating the verification algos with the CI.
17	> > >
18	> > > The pre-GLEP for review is here:
19	> > >
20	> > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse
21	> >
22	> >
23	> > Constraint group reordering algorithm
24	> >
25	> > I really think we should only consider REQUIRED_USE with
26	> > forced/masked useflags instantiated there. And ban (in repoman)
27	> > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free
28	> > and 'b' masked is perfectly ok now but it hides a serious problem
29	> > in the package/profile. Instantiating this would give: "a? ( False
30	> > )" and be an error just like we have depend.bad & co. This is
31	> > independent of auto solving or not, it's already wrong.
32	>
33	> As I've already explained you multiple times, this obtains *exactly
34	> the same* effect. However, it's much simpler when it's done like this
35	> because it makes it possible to reuse the already defined algorithms
36	> instead of having to precisely define how to preprocess REQUIRED_USE
37	> for this and cover all the corner cases.
38
39	Simpler??? I don't think so. What I wrote clearly pinpoints that:
40	When you'll write the algorithm for "Verifying that the constraints do
41	not alter immutable flags" you'll notice this is exactly that and can
42	be put as a preprocessing step and then you can drop all the corner
43	cases considerations for immutable flags. I never understood why you're
44	insisting that much on immutables: they're really not natural, not
45	simple, not standard, and carrying them all over seems to be a burden
46	to me.
47
48	> > Reordering is a dangerous path as we've already seen since it can
49	> > create unexpected loops for the solver.
50	>
51	> Freeform reordering is dangerous, and I've removed that already.
52	> Reordering restricted to immutables can not cause any issues that any
53	> other solution wouldn't cause.
54
55	You're very likely right there. Any proof? Esp. any proof that the
56	checker still guarantees the existence of a solution in all cases?
57	I'm not asking for a formal proof, but simply a bit more details than
58	just an assertion saying it's fine.
59
60	> > Working on instantiated REQUIRED_USE constraints would probably
61	> > simplify quite a bit your GLEP too: you already have the "Verifying
62	> > that the constraints do not alter immutable flags" part that roughly
63	> > does the same thing as instantiating, except if you assume it's
64	> > already true you can skip the reordering.
65	>
66	> Except that the reordering can be described in 2 points, and so can be
67	> the immutability verification. Please prove that you can provide
68	> a simpler explanation that doesn't fail in any of the corner cases.
69
70	Except reordering is an invention and immutable checking is simply
71	applying boolean logic rules to your implication and check that no
72	"False" can appear. You can simply start by applying boolean logic and
73	forget about reordering.
74
75
76	> > Concept for transforming REQUIRED_USE into implications
77	> >
78	> > Ok, now I probably understand better the concept of common prefix.
79	> > I'm definitely biased here, but I would feel better with a more
80	> > recursive presentation of it. Assume we have 'validate(list of
81	> > clauses)'; basically, the common prefix idea is that for an
82	> > implication 'foo? ( consequences = list of clauses )' you first
83	> > validate the consequences as if it were a REQUIRED_USE (so that the
84	> > subtree rooted at foo is not self-conflicting) and then consider
85	> > the whole thing as a clause. The idea would then be to have similar
86	> > checks as to what you've written but working on trees (ASTs)
87	> > instead of flattened clauses. This would avoid having to deal with
88	> > unique identities (these would come for free) and IMHO would be
89	> > easier to understand. I'm not sure how to do this though, I'll ping
90	> > you when I have some idea.
91	>
92	> Well, the problem of common prefix is quite complex, and I'm not even
93	> sure if it's really worth more consideration. After all, we're
94	> prettych much talking about doing:
95	>
96	> a? ( !a ... )
97	>
98	> which has extremely low usability and even lower likeness of
99	> occurring.
100
101
102	Hmm. I think you came up with more valid cases. Can't remember a
103	precise one though.
104
105
106	> > One big advantage of working on ASTs is that it becomes trivial to
107	> > suggest a proper reordering.
108	>
109	> Reordering is never a trivial problem. Unless I'm missing something,
110	> it is technically possible that a 'reordering' will actually require
111	> a sub- item being moved out of the containing group.
112
113	Not if done at the AST level.
114
115	> And to be honest, I find the output of the verification script in this
116	> regard quite useful. That is, it saying 'X affects Y, so it needs to
117	> go before it' is quite clear to me. I don't think most developers
118	> would actually need to script to pinpoint a specific location for
119	> every single constraint.
120
121	In most cases this is sufficient.
122	Think of a more complex case:
123	A -> B
124	B -> C
125	A -> D
126	D -> C
127
128	\|-> B -\|
129	A -\| \|->C
130	\|-> D -\|
131
132	It's starting to be a more complex mental exercise to get the proper
133	ordering when given the 1st form only.
134
135
136	Actually, considering people rant against git merges because they want
137	linear history in the graph log but fail to understand 'git log' is
138	precisely about displaying such a linear ordering, I'm ready to bet
139	someone will rant :)
140
141
142	> > -------
143	> >
144	> > Restrictions on REQUIRED_USE format
145	> >
146	> > I still fail to see the point here. One can simply apply the
147	> > rewriting you suggest below and be done with it. Rationale is not
148	> > very convincing to me:
149	> >
150	> > - avoiding unpredictable results of automatic flag adjustments:
151	> > A deterministic algorithm is, by definition, predictable.
152	>
153	> s/unpredictable/surprising/?
154	>
155	> The goal is for it do something that the developer *not reading
156	> the spec* could reasonably predict happening.
157
158
159	There is a huge gap between failing to solve a constraint voluntarily
160	because it has one too much nesting level and having a repoman warning
161	telling 'it is not recommended you write it that way'. The latter
162	ensures said developer is able to predict what is happening, the former
163	just adds annoyance onto users for no real reason.
164
165
166	> > - improving readability of REQUIRED_USE constraints:
167	> > No need for a restriction for that. If people want to shoot
168	> > themselves in the foot, it is not a PMS problem. I see that
169	> > like proposing death penalty for those who commit
170	> > suicide :)
171	>
172	> This is not PMS. This is a GLEP which serves both the purpose of
173	> a technical specification with extended rationale and a policy
174	> document.
175
176	Isn't the goal of it to have it in a future EAPI ?
177
178
179	> > - keeping the specification and implementation relatively simple:
180	> > You already define everything for working without
181	> > restriction. Plus, unlimited implication nesting has the same
182	> > complexity.
183	>
184	> No, I don't. I don't cover the meaning of nested groups and things
185	> just explode when they come into game.
186
187
188	You mostly do with the rewriting part, you're only refusing to admit
189	it :)
190
191
192	> > -------
193	> >
194	> >
195	> > Do you have numbers on the checker run on all inputs from
196	> > gentoo-x86 ? Since we're dealing with heuristics those are
197	> > particularly important to validate we're not rejecting too many
198	> > constructs.
199	>
200	> I think I've mentioned that a few times in the GLEP. The validation
201	> and verification bits were tested on all REQUIRED_USE cases from a few
202	> days ago. For the former, all cases are listed in rationale. For
203	> the latter, the reference implementation has test cases for every
204	> construct that triggered an issue, including the false positives that
205	> were fixed. I've only skipped a few that were completely redundant
206	> (e.g. different versions of the same package that added 1-2
207	> irrelevant items).
208
209	Unless I'm missing something, rationale seems more about cases rejected
210	by the restricted syntax. Numbers I'm talking about is the # of rejected
211	constraints vs accepted (and assumed solvable) ones.
212
213
214	> The solver was tested on all combinations of most of the corner
215	> cases. I have only skipped those that would require humongous number
216	> of combinations.
217
218	That's great!
219
220	> I haven't tested all combinations of masked/forced flags from
221	> the profiles though. I plan to do this by implementing it in pkgcheck.
222	> I will also have better performance reports based on optimized
223	> algorithms and real use.
224
225	Makes sense.
226
227
228	Alexis.