Re: [gentoo-catalyst] catalyst changes for improving automation - gentoo-catalyst

From:	Daniel Cordero <gentoo.catalyst@××××.ws>
To:	gentoo-catalyst@l.g.o
Cc:	gentoo-releng@l.g.o
Subject:	Re: [gentoo-catalyst] catalyst changes for improving automation
Date:	Wed, 04 Nov 2020 10:47:35
Message-Id:	`20201104104654.GB3468@dysnomia.localdomain`
In Reply to:	Re: [gentoo-catalyst] catalyst changes for improving automation by Matt Turner

On Tue, Nov 03, 2020 at 01:19:51PM -0500, Matt Turner wrote:
> On Tue, Nov 3, 2020 at 5:56 AM Daniel Cordero wrote:
> >
> > On Mon, Nov 02, 2020 at 10:44:07PM -0500, Matt Turner wrote:
> > > The catalyst-auto automation scripts live in a repo separate from
> > > catalyst. That increases the difficulty of changing catalyst's
> > > interface, and it doesn't seem to offer any advantages otherwise.
> > > (Keeping build specs in a separate repo allows them to be updated
> > > independent of catalyst and that is valuable). Additionally, since the
> > > primary way catalyst is used is via this automation, it makes sense to
> > > support this workflow in catalyst directly.
> > >
> >
> > What would be more heavily impacted are those users who may not already
> > have infra set up to do builds or just starting out using catalyst for
> > the first time and haven't written their own automation.
> >
> > I suggest prioritising the collection of up-to-date documentation,
> > especially regarding running catalyst manually, since it'll be
> > completely different to the literature that's currently out there.
>
> I'm a bit unsure what you mean. Do you suggest prioritizing
> documenting the current method of running catalyst before changing it?
>

I'm suggesting that documentation is more important than any trivial changes
to catalyst, especially with the large number of changes that have
happened recently. We'll still be running scripts on top of catalyst
that can handle these tasks on a day-to-day basis.

> > > But to get there, there are some changes to catalyst that I think are
> > > improvements on their own and simplify the path to integrating
> > > automation capabilities directly into catalyst. That's what I'd like
> > > to discuss here.
> > >
> > > I'd like to:
> > >
> > >  1) Replace the custom .spec file format with TOML
> > >
> >
> > Fine. Aside from the extra quotes and commas, I'd be happy with any well
> > defined format that can handle strings and lists properly.
> >

> > >  2) Combine .spec file sequences (e.g., stage1 -> stage2 -> stage3 ->
> > > livecd-stage1 -> livecd-stage2) into a single file. I suggest naming
> > > this a ".build" file. This will also allow us to remove the redundant
> > > information that currently has to be specified in stage1.spec,
> > > stage2.spec, stage3.spec, like rel_type, version, profile, etc. It
> > > also means that we remove the nonsensical ability to change settings
> > > from one stage to the next that should not change (e.g., rel_type,
> > > version).
> > >
> >
> > How would a target that depends on a different rel_type work? Forks in
> > the dependency tree.
> >
>
> I haven't given that a lot of thought yet, but it's something I would
> like to have a plan for.
>
> We build 32-bit and 64-bit systemd and non-systemd stages on SPARC, as
> well as a bootable ISO.
>
> 32-bit     systemd: stage1 -> stage3
> 32-bit non-systemd: stage1 -> stage3
> 64-bit     systemd: stage1 -> stage3
> 64-bit non-systemd: stage1 -> stage3 -> livecd-stage1 -> livecd-stage2
> (We skip stage2)
>
> This means that we have some build chains that are entirely
> independent from one another and could actually run in parallel. E.g.,
> a 32-bit build could happen at the same time a 64-bit build runs
> without any conflicts. Our SPARC system has 256 threads, so it would
> like to build in parallel if possible.
>
> Similarly, a stage1 build from one of the 32-bit build chains could
> happen in parallel with a stage3 build from the other. We wouldn't
> want to run the same type of build concurrently if they share a binary
> package cache, because we would inevitably spend CPU cycles doing
> duplicate work. E.g., the systemd stage3 build running in parallel
> with the non-systemd stage3.
>
> Whether all of those build chains should be specified in the same
> ".build" file... I don't know. It seems like it could get a bit
> unwieldy.
>
> Maybe we could have a top-level ".build" file that references each of
> these build chains, described in other files? If we did that, that
> would certainly allow us to specify a different rel_type per chain.
>
> I'm not aware of cases where we'd want different rel_types in the same
> chain. Do you know of such a case?
>

Well, rel_type is just a text field. I use it to create a server
(non-GUI) systemd stage4 and also a full KDE Plasma/systemd stage4.

They're both systemd stages, but they would otherwise use the same
output tarball name, so they get separated out into their own rel_type.

https://wiki.gentoo.org/wiki/File:Substrate_Stage_Paths.svg

Do both target chains define the stage1/3 without rebuilding it multiple
times? I imagine that a singular .spec file will still be runnable, but
I am not really in a position to implement a dependency graph calculator
into catalyst.
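
For what it's worth, the ordering part may not need much code. Below is
a minimal sketch using graphlib from the Python standard library
(Python 3.9 or newer), assuming each target simply names the targets it
is seeded from. The target names and the shape of the "deps" mapping are
hypothetical; nothing like this exists in catalyst today.

```python
from graphlib import TopologicalSorter  # standard library since Python 3.9

# Hypothetical targets: two stage4 flavours (separate rel_types) that
# share one stage1/stage3. Each key maps a target to its seed targets.
deps = {
    "systemd/stage1": set(),
    "systemd/stage3": {"systemd/stage1"},
    "server/stage4": {"systemd/stage3"},
    "plasma/stage4": {"systemd/stage3"},
}


def build_order(graph):
    """Return every target once, with seeds ahead of the targets using them."""
    return list(TopologicalSorter(graph).static_order())


def parallel_batches(graph):
    """Group targets into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything buildable right now
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock dependants
    return batches


print(build_order(deps))
print(parallel_batches(deps))
```

Because the shared stage1/stage3 appear only once in the mapping, they
are built once, and the two stage4 targets end up in the same batch.
This does not cover the shared binary package cache concern; that would
need an extra rule on top.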

> > >  3) Add ability to denote which stage builds produce artifacts we care
> > > about (and want to save and/or upload) and which are just temporary.
> > > If they're temporary (e.g., a stage1 build) we can delete the artifact
> > > after the build sequence has no further use of it, and we can skip
> > > compressing the result, etc.
> > >
> >
> > This feature should (haven't tested) already exist - it's just not
> > documented.
> >
> > compression_mode: rsync
> > options=['seedcache']
>
> Hah! I was completely unaware of this. Thanks.
>

I only figured this out because I've been so deep into the compression
code.

> > or don't call 'capture' and/or 'remove_chroot' in action_/finish_sequence.
> >
> > >
> > > To that end, I'm starting by figuring out what I would like the new
> > > spec file format to look like. Below are some open questions and then
> > > a strawman new-style spec file.
> > >
> > > • The .spec files in releng.git are really templates that are not
> > > directly usable without sed'ing @REPO_DIR@ and @TIMESTAMP@. It would
> > > be nice if they were directly usable as that would reduce confusion
> > > from users.
> > >   • Can we make them directly usable?
> > >   • Perhaps we can make catalyst handle the replacements directly?
> > >     • Calculating @TIMESTAMP@ is trivially doable—we do it today (see below)
> >
> > Maybe a strftime() template, or even fstring-like tokens?
> > (e.g. "{year}-{month}-{day}")
>
> One goal I have is to make it more transparent what is actually in a
> particular stage tarball or ISO and along with that to make it easier
> to reproduce the result.
>
> Obviously we'll want to keep the ability to specify a particular
> version, as you describe, but I think for Gentoo releases we will want
> to continue using a timestamp that's unambiguously tied to the git
> SHA1 of gentoo.git as is possible.
>
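
The two ideas are compatible. As a minimal sketch, assuming the
timestamp is handed in as a datetime (for a release that would be the
commit time of the gentoo.git HEAD used for the snapshot), a version
template could accept both strftime() directives and the fstring-like
tokens. The function name and the token names are hypothetical.

```python
from datetime import datetime, timezone


def expand_version(template, when):
    """Expand strftime() directives, then {year}/{month}/{day} tokens."""
    tokens = {
        "year": f"{when.year:04d}",
        "month": f"{when.month:02d}",
        "day": f"{when.day:02d}",
    }
    # strftime() leaves the {token} placeholders alone; str.format()
    # then fills them in.
    return when.strftime(template).format(**tokens)


# Example timestamp; in practice this would come from the snapshot's commit.
WHEN = datetime(2020, 11, 4, 10, 47, 35, tzinfo=timezone.utc)

print(expand_version("{year}-{month}-{day}", WHEN))
print(expand_version("%Y%m%dT%H%M%SZ", WHEN))
```

A fixed version string with no directives or tokens passes through
unchanged, so specifying a particular version by hand keeps working.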

> > >     • We could configure @REPO_DIR@ in catalyst.conf and let catalyst
> > > do the replacement, or we could just make the field relative to some
> > > path specified in catalyst.conf?
> > >
> >
> > While nice to have, I don't agree with locking users into a particular
> > repository layout.
>
> Can you explain what you mean? I don't know how what I said would
> require a particular repository layout.
>
> Perhaps you're confused by the @REPO_DIR@ name? It is the path to the
> releng.git repository (containing the .specs and the /etc/portage/
> files) on the build machine and is not in any way connected with the
> ebuild repositories.
>

I was just thinking that there could be more files outside of @REPO_DIR@
or /var/tmp/catalyst (or wherever) that may need to be referenced.
In practice, this might be limited; I have been wanting a feature like
this to exist - as long as it's configurable enough.
For me, I'd really just like paths to be relative to the current working
directory...
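
Something like the following is what I have in mind. It is only a
sketch: "base_dir" stands in for a hypothetical catalyst.conf setting,
and the function name is made up. Absolute paths are left alone, so
files outside the base directory can still be referenced.

```python
from pathlib import Path


def resolve_spec_path(value, base_dir=None):
    """Resolve a path taken from a spec file.

    Absolute paths are returned unchanged. Relative paths are anchored
    at base_dir (a hypothetical catalyst.conf setting) or, when that is
    unset, at the current working directory.
    """
    path = Path(value)
    if path.is_absolute():
        return path
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return base / path


print(resolve_spec_path("specs/stage1.spec", "/srv/releng"))
print(resolve_spec_path("/etc/portage"))
print(resolve_spec_path("specs/stage1.spec"))
```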

> The name predates my involvement, so don't blame me :)
>
> > > • In the current automation scripts, we generate a value for
> > > @TIMESTAMP@ from the git HEAD used in creating the snapshot.
> > >   • Would be nice to remove the dependence on the squashfs snapshot
> > > generation—not difficult to do
> > >
> >
> > I have no comment on this.
> >
> > > • Can we generate and upload a .build file with replacements done to
> > > make stage builds more easily reproducible? Seems easy.
> > >
> >
> > These can just be artifacts from the build.
>
> Yes, that's what I'm thinking too.
>


Gentoo Archives: gentoo-catalyst