Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE) - gentoo-dev

From:	Alexis Ballier <aballier@g.o>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE)
Date:	Wed, 31 May 2017 08:38:36
Message-Id:	`20170531103819.417c2420@gentoo.org`
In Reply to:	Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints (codename: ENFORCED_USE) by "Michał Górny"

1

On Wed, 31 May 2017 10:03:12 +0200

2

Michał Górny <mgorny@g.o> wrote:

3

4

> On śro, 2017-05-31 at 09:32 +0200, Alexis Ballier wrote:

5

> > On Wed, 31 May 2017 08:55:17 +0200

6

> > Michał Górny <mgorny@g.o> wrote:

7

> >

8

> > > On wto, 2017-05-30 at 20:46 +0200, Alexis Ballier wrote:

9

> > > > On Tue, 30 May 2017 20:11:38 +0200

10

> > > > Michał Górny <mgorny@g.o> wrote:

11

> > > > [...]

12

> > > > > > > Of course, we could just validate all the possible cases

13

> > > > > > > via repoman, and reject the ebuild if there's at least one

14

> > > > > > > conflict between them. Not sure how to express that

15

> > > > > > > properly in the spec though. Not sure how it would work

16

> > > > > > > practically.

17

> > > > > >

18

> > > > > > Adding a 2^n check to repoman isn't gonna work well.

19

> > > > > >

20

> > > > >

21

> > > > > I'm not saying it's the most optimal algorithm of verifying

22

> > > > > the correctness of the constraints. It's just the one that's

23

> > > > > quite obvious -- relatively simple and reliable. If someone

24

> > > > > can come up with something better that covers at least the

25

> > > > > most common cases, I'm all for it.

26

> > > > >

27

> > > > > That said, this wouldn't be that much of a problem if we can

28

> > > > > keep the n low. For a start, we can rule out all flags that

29

> > > > > don't appear in REQUIRED_USE at all. Furthermore, I think we

30

> > > > > could also ignore the constraints for flags that don't appear

31

> > > > > there at least 'twice', and so on.

32

> > > >

33

> > > > :)

34

> > > >

35

> > > > You're applying classical techniques to lower the size of a SAT

36

> > > > instance so that your exponential algorithm does not explode,

37

> > > > but it's still hard.

38

> > > >

39

> > > > I'm not sure what you want: If you want to detect that there is

40

> > > > an impossible constraint, well, ebuild writer will notice soon

41

> > > > enough when testing it. If you want to detect that there is a

42

> > > > way to have a conflict between useflags, then there will be

43

> > > > valid cases where this will happen.

44

> > > >

45

> > > > That said, assuming we have REQUIRED_USE in CNF form, its

46

> > > > negation is in DNF form. Solving SAT on DNF formulas is easy

47

> > > > (linear I think), and this would give you an assignment of

48

> > > > useflags triggering an impossible constraint. e.g. 'foo? ( bar

49

> > > > )' with USE='foo -bar' in make.conf. This could be used to

50

> > > > trigger a repoman warning but basically every single ebuild

51

> > > > would trigger those.

52

> > >

53

> > > Not sure if we understand each other.

54

> > >

55

> > > I'd like the constraints to be plain straightforward, to the

56

> > > point of having only one acceptable solution. No special

57

> > > Portage-style algorithms that attempt to provide a reasonable

58

> > > solution to unreasonable input, resulting in horrible solutions

59

> > > that need 20 more hacks every few months.

60

> >

61

> > Yes, we definitely agree here. For that, you need to kill the SAT

62

> > solver and define (spec) what is the straightforward solution, aka a

63

> > deterministic and straightforward (*) algorithm. Otherwise, you fall

64

> > into the problems Ciaran explained in an earlier email.

65

> >

66

> >

67

> > (*) It's better for the algorithm to be simple enough so that

68

> > REQUIRED_USE can be written easily for achieving a given behavior.

69

> >

70

> >

71

> > > For example:

72

> > >

73

> > >   foo? ( bar )

74

> > >

75

> > > would mean 'if you have USE=foo, then USE=bar is enabled as

76

> > > well'. Not 'find some random solution which satisfies this'. In

77

> > > other words, here changing USE=foo into USE=-foo is not an

78

> > > acceptable solution.

79

> >

80

> >

81

> > What if I specifically set USE=-bar in make.conf ? Do we really

82

> > want PM to override that without telling me ?

83

>

84

> Yes. Unless you specifically and explicitly disable that (globally or

85

> for USE=bar), in which case the PM would just reject to proceed.

86

87

88

Then could you please explain how to get the list of useflags the

89

solver is allowed to toggle ? Preferably in a PMS-friendly way (aka no

90

USE=foo emerge). It's not clear to me what this would be and is

91

mandatory for determinism.

92

93

Note that most definitions are acceptable, but there must be one.

94

95

> > I believe that, from the ebuild POV, the ternary useflag model is

96

> > more appropriate: You have a whole bunch of ways to specify

97

> > useflags with portage (make.conf, package.use, profiles, command

98

> > line, ...). From the ebuild those are collapsed into 'user input'.

99

> > You only have IUSE (with its defaults) and that's what the

100

> > auto-solver should play with: those are the flags that can be

101

> > toggled.

102

>

103

> I see your point. However, it's merely a preference problem and we

104

> really don't want the constraints to become ternary and/or PM try to

105

> force the reverse solutions. That's an easy way to lose

106

> predictability.

107

108

They're not ternary anymore after processing ebuild: IUSE="foo +bar"

109

means instantiate foo as -foo if not specified, and bar as +bar.

110

The way I see it, ternary model is useful in the sense that the 3rd

111

undefined state is what the solver can toggle, the others are fixed

112

(either by user input or e.g. use.mask).

113

114

Basically, I see automatic solving of REQUIRED_USE as dynamic IUSE

115

defaults. But see above, this might not be the best way.

116

117

> Besides, I should point out that USE_EXPAND in make.profile

118

> and make.conf are strictly binary.

119

120

Last I checked IUSE="+use_expand_foo use_expand_bar" worked just as

121

useflags.

122

123

> Furthermore, the ternary idea

124

> starts becoming blurry when you have to deal with profiles that you

125

> explicitly want to disable in user configuration.

126

127

I'm not following you here.

128

129

> > > Now, this also means that every constraint that can't be solved in

130

> > > this easy fashion is invalid. We want to detect that, and warn the

131

> > > developer. Some of those could be tricky. Simple example:

132

> > >

133

> > >   foo? ( baz ) bar? ( !baz )

134

> > >

135

> > > This one is invalid because USE='foo bar' requires both 'baz' and

136

> > > '!baz' as a solution. Remember that we don't want to do any

137

> > > changes besides what's explicitly written there, no guessing.

138

> >

139

> > Besides that, what makes it invalid ?

140

>

141

> What makes it invalid is that you can't solve it in a predictable way.

142

143

You can fail in a predictable way and ebuild writer can adjust it to

144

avoid that. Again, you *need* to process the constraints in order. '!a?

145

( b ) !b? ( a )' is not deterministic when none of a and b are

146

enabled otherwise.

147

148

You'll get a message like:

149

"

150

 The constraint bar? ( !baz )' is violated.

151

    bar is enabled because $reason (say, make.conf)

152

    baz is enabled because of the constraint 'foo? ( baz )'

153

			      foo is enabled because $reason

154

"

155

156

> >  How is it more invalid than '?? ( foo bar )' ?

157

>

158

> This would go into:

159

>

160

>   foo? ( !bar )

161

162

Just to be clear: Are you suggesting banning '??' from the syntax or

163

simply an internal rewrite for the solver ?

164

165

> which gives a single predictable solution:

166

167

Then ebuild writer should not write 'foo? ( baz ) bar? ( !baz )' but

168

rather: 'foo? ( !bar baz ) bar? ( !baz )', which should cover more

169

cases. With USE="foo bar", the message would then be:

170

"

171

  The constraint 'foo? ( !bar )' is violated.

172

      foo is enabled because $reason

173

      bar is enabled because $reason

174

"

175

which is similar to the above except there's one less step for

176

explaining the reasons. It's not dramatic but it is, indeed, desirable

177

to have simple & clear reasons. I'd say that's more to the argument for

178

specifying completely how to solve that and leave those small

179

improvements to ebuild writers and/or QA.

180

181

[...]

182

> > > However, the

183

> > > following should be valid:

184

> > >

185

> > >   foo? ( baz ) bar? ( !foo !baz )

186

> > >

187

> > > Because now we clearly indicate that USE=bar disables USE=foo,

188

> > > and therefore makes the first constraint inapplicable. It clearly

189

> > > indicates course of action for all combinations:

190

> >

191

> > Ok, I now think you're aiming for giving full power to the solver,

192

> > overriding user inputs if necessary. Before going further, I think

193

> > we should first agree on what are the useflags such a solver can

194

> > toggle. I'm not sure 'USE=foo emerge blah' should disable foo

195

> > instead of failing for example.

196

> >

197

>

198

> As I said, it's a matter of configuration to decide which flags should

199

> be touched, and which not. Of course if we find that necessary, we may

200

> go into defining a specific set in the profiles or metadata.

201

>

202

> However, I would rather focus on getting a PoC solver and checker

203

> first, and play with existing constraints to see how it all works.

204

205

The solver seems on good tracks, at least from the algorithmic POV. The

206

checker, however, is not clear at all to me. The main reason is that to

207

determine if the solver will be able to solve it, it needs to know what

208

the solver can toggle and what not.

209

210

Alexis.

Gentoo Archives: gentoo-dev

Replies

1	On Wed, 31 May 2017 10:03:12 +0200
2	Michał Górny <mgorny@g.o> wrote:
3
4	> On śro, 2017-05-31 at 09:32 +0200, Alexis Ballier wrote:
5	> > On Wed, 31 May 2017 08:55:17 +0200
6	> > Michał Górny <mgorny@g.o> wrote:
7	> >
8	> > > On wto, 2017-05-30 at 20:46 +0200, Alexis Ballier wrote:
9	> > > > On Tue, 30 May 2017 20:11:38 +0200
10	> > > > Michał Górny <mgorny@g.o> wrote:
11	> > > > [...]
12	> > > > > > > Of course, we could just validate all the possible cases
13	> > > > > > > via repoman, and reject the ebuild if there's at least one
14	> > > > > > > conflict between them. Not sure how to express that
15	> > > > > > > properly in the spec though. Not sure how it would work
16	> > > > > > > practically.
17	> > > > > >
18	> > > > > > Adding a 2^n check to repoman isn't gonna work well.
19	> > > > > >
20	> > > > >
21	> > > > > I'm not saying it's the most optimal algorithm of verifying
22	> > > > > the correctness of the constraints. It's just the one that's
23	> > > > > quite obvious -- relatively simple and reliable. If someone
24	> > > > > can come up with something better that covers at least the
25	> > > > > most common cases, I'm all for it.
26	> > > > >
27	> > > > > That said, this wouldn't be that much of a problem if we can
28	> > > > > keep the n low. For a start, we can rule out all flags that
29	> > > > > don't appear in REQUIRED_USE at all. Furthermore, I think we
30	> > > > > could also ignore the constraints for flags that don't appear
31	> > > > > there at least 'twice', and so on.
32	> > > >
33	> > > > :)
34	> > > >
35	> > > > You're applying classical techniques to lower the size of a SAT
36	> > > > instance so that your exponential algorithm does not explode,
37	> > > > but it's still hard.
38	> > > >
39	> > > > I'm not sure what you want: If you want to detect that there is
40	> > > > an impossible constraint, well, ebuild writer will notice soon
41	> > > > enough when testing it. If you want to detect that there is a
42	> > > > way to have a conflict between useflags, then there will be
43	> > > > valid cases where this will happen.
44	> > > >
45	> > > > That said, assuming we have REQUIRED_USE in CNF form, its
46	> > > > negation is in DNF form. Solving SAT on DNF formulas is easy
47	> > > > (linear I think), and this would give you an assignment of
48	> > > > useflags triggering an impossible constraint. e.g. 'foo? ( bar
49	> > > > )' with USE='foo -bar' in make.conf. This could be used to
50	> > > > trigger a repoman warning but basically every single ebuild
51	> > > > would trigger those.
52	> > >
53	> > > Not sure if we understand each other.
54	> > >
55	> > > I'd like the constraints to be plain straightforward, to the
56	> > > point of having only one acceptable solution. No special
57	> > > Portage-style algorithms that attempt to provide a reasonable
58	> > > solution to unreasonable input, resulting in horrible solutions
59	> > > that need 20 more hacks every few months.
60	> >
61	> > Yes, we definitely agree here. For that, you need to kill the SAT
62	> > solver and define (spec) what is the straightforward solution, aka a
63	> > deterministic and straightforward (*) algorithm. Otherwise, you fall
64	> > into the problems Ciaran explained in an earlier email.
65	> >
66	> >
67	> > (*) It's better for the algorithm to be simple enough so that
68	> > REQUIRED_USE can be written easily for achieving a given behavior.
69	> >
70	> >
71	> > > For example:
72	> > >
73	> > > foo? ( bar )
74	> > >
75	> > > would mean 'if you have USE=foo, then USE=bar is enabled as
76	> > > well'. Not 'find some random solution which satisfies this'. In
77	> > > other words, here changing USE=foo into USE=-foo is not an
78	> > > acceptable solution.
79	> >
80	> >
81	> > What if I specifically set USE=-bar in make.conf ? Do we really
82	> > want PM to override that without telling me ?
83	>
84	> Yes. Unless you specifically and explicitly disable that (globally or
85	> for USE=bar), in which case the PM would just reject to proceed.
86
87
88	Then could you please explain how to get the list of useflags the
89	solver is allowed to toggle ? Preferably in a PMS-friendly way (aka no
90	USE=foo emerge). It's not clear to me what this would be and is
91	mandatory for determinism.
92
93	Note that most definitions are acceptable, but there must be one.
94
95	> > I believe that, from the ebuild POV, the ternary useflag model is
96	> > more appropriate: You have a whole bunch of ways to specify
97	> > useflags with portage (make.conf, package.use, profiles, command
98	> > line, ...). From the ebuild those are collapsed into 'user input'.
99	> > You only have IUSE (with its defaults) and that's what the
100	> > auto-solver should play with: those are the flags that can be
101	> > toggled.
102	>
103	> I see your point. However, it's merely a preference problem and we
104	> really don't want the constraints to become ternary and/or PM try to
105	> force the reverse solutions. That's an easy way to lose
106	> predictability.
107
108	They're not ternary anymore after processing ebuild: IUSE="foo +bar"
109	means instantiate foo as -foo if not specified, and bar as +bar.
110	The way I see it, ternary model is useful in the sense that the 3rd
111	undefined state is what the solver can toggle, the others are fixed
112	(either by user input or e.g. use.mask).
113
114	Basically, I see automatic solving of REQUIRED_USE as dynamic IUSE
115	defaults. But see above, this might not be the best way.
116
117	> Besides, I should point out that USE_EXPAND in make.profile
118	> and make.conf are strictly binary.
119
120	Last I checked IUSE="+use_expand_foo use_expand_bar" worked just as
121	useflags.
122
123	> Furthermore, the ternary idea
124	> starts becoming blurry when you have to deal with profiles that you
125	> explicitly want to disable in user configuration.
126
127	I'm not following you here.
128
129	> > > Now, this also means that every constraint that can't be solved in
130	> > > this easy fashion is invalid. We want to detect that, and warn the
131	> > > developer. Some of those could be tricky. Simple example:
132	> > >
133	> > > foo? ( baz ) bar? ( !baz )
134	> > >
135	> > > This one is invalid because USE='foo bar' requires both 'baz' and
136	> > > '!baz' as a solution. Remember that we don't want to do any
137	> > > changes besides what's explicitly written there, no guessing.
138	> >
139	> > Besides that, what makes it invalid ?
140	>
141	> What makes it invalid is that you can't solve it in a predictable way.
142
143	You can fail in a predictable way and ebuild writer can adjust it to
144	avoid that. Again, you need to process the constraints in order. '!a?
145	( b ) !b? ( a )' is not deterministic when none of a and b are
146	enabled otherwise.
147
148	You'll get a message like:
149	"
150	The constraint bar? ( !baz )' is violated.
151	bar is enabled because $reason (say, make.conf)
152	baz is enabled because of the constraint 'foo? ( baz )'
153	foo is enabled because $reason
154	"
155
156	> > How is it more invalid than '?? ( foo bar )' ?
157	>
158	> This would go into:
159	>
160	> foo? ( !bar )
161
162	Just to be clear: Are you suggesting banning '??' from the syntax or
163	simply an internal rewrite for the solver ?
164
165	> which gives a single predictable solution:
166
167	Then ebuild writer should not write 'foo? ( baz ) bar? ( !baz )' but
168	rather: 'foo? ( !bar baz ) bar? ( !baz )', which should cover more
169	cases. With USE="foo bar", the message would then be:
170	"
171	The constraint 'foo? ( !bar )' is violated.
172	foo is enabled because $reason
173	bar is enabled because $reason
174	"
175	which is similar to the above except there's one less step for
176	explaining the reasons. It's not dramatic but it is, indeed, desirable
177	to have simple & clear reasons. I'd say that's more to the argument for
178	specifying completely how to solve that and leave those small
179	improvements to ebuild writers and/or QA.
180
181	[...]
182	> > > However, the
183	> > > following should be valid:
184	> > >
185	> > > foo? ( baz ) bar? ( !foo !baz )
186	> > >
187	> > > Because now we clearly indicate that USE=bar disables USE=foo,
188	> > > and therefore makes the first constraint inapplicable. It clearly
189	> > > indicates course of action for all combinations:
190	> >
191	> > Ok, I now think you're aiming for giving full power to the solver,
192	> > overriding user inputs if necessary. Before going further, I think
193	> > we should first agree on what are the useflags such a solver can
194	> > toggle. I'm not sure 'USE=foo emerge blah' should disable foo
195	> > instead of failing for example.
196	> >
197	>
198	> As I said, it's a matter of configuration to decide which flags should
199	> be touched, and which not. Of course if we find that necessary, we may
200	> go into defining a specific set in the profiles or metadata.
201	>
202	> However, I would rather focus on getting a PoC solver and checker
203	> first, and play with existing constraints to see how it all works.
204
205	The solver seems on good tracks, at least from the algorithmic POV. The
206	checker, however, is not clear at all to me. The main reason is that to
207	determine if the solver will be able to solve it, it needs to know what
208	the solver can toggle and what not.
209
210	Alexis.