Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev <gentoo-dev@l.g.o>
Subject: [gentoo-dev] [RFC pre-GLEP] Gentoo Git Workflow
Date: Tue, 25 Jul 2017 08:05:19
Message-Id: 1500969906.1206.1.camel@gentoo.org
1 Hi, everyone.
2
3 There have been multiple attempts at grasping this but none so far
4 resulted in something official and indisputable. At the same time, we
5 end up having to point our users at semi-official guides which change
6 in unpredictable ways.
7
8 Here's the current draft:
9 https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git
10
11 The basic idea is that the GLEP provides basic guidelines for using git,
12 and then we write a proper manual on top of it (right now, all the pages
13 about it end up as a mix of requirements and a partial git manual).
14
15 What do you think about it? Is there anything else that needs to be
16 covered?
17
18 Copy of the markup for inline comments follows.
19
20 ---
21
22 {{GLEP
23 |Number=xx
24 |Title=Gentoo Git Workflow
25 |Type=Standards Track
26 |Status=Draft
27 |Author=Michał Górny <mgorny@g.o>
28 }}
29
30 ==Abstract==
31 This GLEP specifies basic standards and recommendations for using git
32 with the Gentoo ebuild repository. It covers only Gentoo-specific
33 policies, and is not meant to be a complete guide.
34
35 ==Motivation==
36 Although the main Gentoo repository has been using git for two years already,
37 developers still lack official documentation on how to use git
38 consistently. Most of the developers learn unwritten standards from others
39 and follow them. This eventually brings consistency to some extent but
40 is suboptimal. Furthermore, it results in users having to learn things
41 the hard way instead of having proper documentation to follow.
42
43 There have been a few attempts to standardize git use over time. Most
44 noteworthy are the [[Gentoo git workflow]] and [[Gentoo GitHub]] articles.
45 However, they are not official standards of any kind, and their focus is too
46 broad for them to become one. There was also an initial GLEP attempt but it
47 never even reached the draft stage.
48
49 This GLEP aims to finally provide basic standardization for the use of
50 git in the Gentoo repository. It aims to focus purely on Gentoo-specific
51 standards and not git usage in general. It doesn't mean to be a complete
52 guide but a formal basis on top of which official guides could be
53 created.
54
55 ==Specification==
56 ===Branching model===
57 The main branch of the Gentoo repository is the <kbd>master</kbd>
58 branch. All Gentoo developers push their work straight to the master
59 branch, provided that the commits meet the minimal quality standards.
60 The master branch is also used straight for continuous user repository
61 deployment.
62
63 Since multiple developers work on master concurrently, they may be
64 required to rebase multiple times before being able to push. Developers
65 are requested not to use workflows that could prevent others from
66 pushing, e.g. pushing single commits frequently instead of staging them
67 and using a single push.
68
69 Developers can use additional branches to facilitate review and testing
70 of long-term projects of larger scale. However, since git fetches all
71 branches by default, they should be used sparingly. For smaller projects,
72 local branches or repository forks are preferred.
73
74 Unless stated otherwise, the rules set by this specification apply to
75 the master branch only. The development branches can use relaxed rules.
76
77 Rewriting history (i.e. force pushes) of the master branch is forbidden.
78
79 ===Merge commits===
80 The use of merge commits in the Gentoo repository is strongly
81 discouraged. Usually it is preferable to rebase instead. However, the
82 developers are allowed to use merge commits in justified cases. Merge
83 commits can only be used to merge additional branches; the use of
84 implicit <kbd>git pull</kbd> merges is entirely forbidden.
85
86 In a merge commit that is committed straight to the Gentoo repository,
87 the first parent is expected to reference an actual Gentoo commit
88 preceding the merge, while the remaining parents can be used to
89 reference external repositories. The commits following the first parent
90 are required to conform to this specification like regular Gentoo
91 commits. The additional commits following other parents can use relaxed
92 rules.
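
One way to picture the parent rule is as a walk over the commit graph. The
following is a minimal Python sketch, assuming a hypothetical in-memory
mapping from commit id to its ordered list of parents; the commit ids and
helper names are made up for illustration and are not part of git or of
any Gentoo tooling. It reads "the commits following the first parent" as
the first-parent ancestry of the merge, which is the same history that
<kbd>git log --first-parent</kbd> shows.

```python
def first_parent_chain(parents, head):
    """Return the commit ids reached from head by following only first parents."""
    chain = []
    current = head
    while current is not None:
        chain.append(current)
        parent_list = parents.get(current, [])
        current = parent_list[0] if parent_list else None
    return chain


def relaxed_commits(parents, head):
    """Return commits reachable from head that are NOT on the first-parent chain."""
    strict = set(first_parent_chain(parents, head))
    seen = set()
    stack = [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen - strict


# Hypothetical history: merge commit "M" brings the external branch
# (x1, x2) into the Gentoo history (g1, g2). The first parent of M is g2.
HISTORY = {
    "g1": [],
    "g2": ["g1"],
    "x1": ["g1"],
    "x2": ["x1"],
    "M": ["g2", "x2"],
}

strict = first_parent_chain(HISTORY, "M")   # must follow this specification
relaxed = relaxed_commits(HISTORY, "M")     # may use relaxed rules
```

In this toy history the strict set is M, g2 and g1, while x1 and x2 fall
under the relaxed rules.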
93
94 ===OpenPGP signatures===
95 Each commit in the Gentoo repository must be signed using the
96 committer's OpenPGP key. Furthermore, each push to the repository must
97 be signed using the key belonging to the developer performing the push
98 (matched via the SSH key).
99
100 The requirements for OpenPGP keys are covered by [[GLEP:63|GLEP 63]].
101
102 ===Splitting commits===
103 Git commits are lightweight, and the developers are encouraged to split
104 their commits to improve readability and the ability to revert
105 specific sub-changes. When choosing how to split the commits, the
106 developers should consider the following three rules:
107 # Use atomic commits — one commit per logical change.
108 # Split commits at logical unit (package, eclass, profile…) boundaries.
109 # Avoid creating commits that are 'broken' — e.g. are incomplete, have
110 uninstallable packages.
111
112 It is technically impossible to always respect all three rules,
113 so developers have to balance between them at their own discretion. Side
114 changes that are implied by another change (e.g. revbump due to some
115 change) should be included in the first commit requiring them. Commits
116 should be ordered to avoid breakage, and follow logical ordering
117 whenever possible.
118
119 Examples:
120 * When doing a version bump, it is usually not reasonable to split every
121 necessary logical change into a separate commit since the interim commits
122 would correspond to a broken package. However, if the package has a live
123 ebuild, it ''might'' be reasonable to make the logical changes separately on
124 the live ebuild, then create a release as another logical step.
125 * When doing one or more changes that require a revision bump, bump the
126 revision in the commit including the first change. Split the changes
127 into multiple logical commits without further revision bumps — since
128 they are going to be pushed in a single push, the user will not be
129 exposed to the interim state.
130 * When adding a new version of a package that should be masked, you can
131 include the {{Path|package.mask}} edit in the commit adding it.
132 Alternatively, you can add the mask in a split commit ''preceding'' the
133 bump.
134 * When doing a minor change to a large number of packages, it is
135 reasonable to do so in a single commit. However, when doing a major
136 change (e.g. a version bump), it is better to split commits on package
137 boundaries.
138
139 ===Commit messages===
140 A standard git commit message consists of three parts, in order: a
141 summary line, an optional body and an optional set of tags. The parts
142 are separated by a single empty line.
143
144 The summary line is included in the short logs (<kbd>git log
145 --oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
146 provide a short yet accurate description of the change. The summary line
147 starts with a logical unit name, followed by a colon, a space and a
148 short description of the most important changes. If a bug is associated
149 with a change, then it should be included in the summary line as
150 <kbd>#nnnnnn</kbd> or likewise. The summary line must not exceed 69
151 characters, and must not be wrapped.
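
The summary rules above lend themselves to a mechanical check. A minimal
Python sketch follows, assuming a hypothetical helper named
<kbd>check_summary</kbd>; it is not an existing repoman or git feature, and
it only covers the length limit, the no-wrapping rule and the
"unit: description" shape.

```python
MAX_SUMMARY_LENGTH = 69  # limit stated in the specification


def check_summary(summary):
    """Return a list of problems found in a commit summary line (empty if none)."""
    problems = []
    if "\n" in summary:
        problems.append("summary must not be wrapped")
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append("summary exceeds %d characters" % MAX_SUMMARY_LENGTH)
    # The logical unit name is everything before the first colon-and-space.
    unit, separator, description = summary.partition(": ")
    if not separator or not unit or not description.strip():
        problems.append("summary should read 'unit: description'")
    return problems


# Hypothetical summaries for illustration.
good = check_summary("dev-libs/foo: Bump to 1.2.3, #123456")
bad = check_summary("Bump foo")
```

Here <kbd>good</kbd> is an empty list, while <kbd>bad</kbd> reports the
missing logical unit name.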
152
153 The suggested logical unit name formats are:
154 * for a package, <kbd>category/package: …</kbd>;
155 * for an eclass, <kbd>name.eclass: …</kbd>;
156 * for other directories or files, their path or filename (as long as a
157 developer reading the commit messages is able to figure out what it is)
158 — e.g. <kbd>licenses/foo: …</kbd>, <kbd>package.mask: …</kbd>.
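
As an illustration of the formats listed above, the following Python sketch
derives a logical unit name from the path of a changed file. The helper
name <kbd>logical_unit</kbd> and the set of top-level non-package
directories are assumptions made for this example; the specification
itself leaves the choice of name to the developer's judgement.

```python
# Top-level directories assumed (for this sketch) not to be package categories.
NON_PACKAGE_DIRS = {"eclass", "licenses", "profiles", "metadata", "scripts"}


def logical_unit(path):
    """Suggest a logical unit name for a repository-relative file path."""
    parts = path.split("/")
    # eclass/name.eclass -> name.eclass
    if parts[0] == "eclass" and len(parts) == 2:
        return parts[1]
    # Other non-package files: fall back to the path itself.
    if parts[0] in NON_PACKAGE_DIRS or len(parts) < 3:
        return path
    # category/package/... -> category/package
    return "/".join(parts[:2])


# Hypothetical paths for illustration.
prefix = logical_unit("dev-libs/foo/foo-1.0.ebuild") + ": "
```

With these inputs, a package file yields <kbd>dev-libs/foo</kbd>, an eclass
yields its file name, and <kbd>licenses/foo</kbd> is returned unchanged.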
159
160 The body is included in the full commit log (<kbd>git log</kbd>,
161 detailed commit info on gitweb/GitHub, mail body). It is optional, and
162 it can be used to describe the commit in more detail if the summary line
163 is not sufficient. It is generally a good idea to repeat the information
164 contained in the summary (except for the logical unit) since the summary
165 is frequently formatted as a title. The body should be wrapped at 72
166 characters. It can contain multiple paragraphs, separated by empty
167 lines.
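
The body wrapping rule can be sketched with Python's standard
<kbd>textwrap</kbd> module. The helper name <kbd>wrap_body</kbd> is
hypothetical. Long words such as URLs are deliberately left unbroken here,
so a line holding one may still exceed the limit; that is a design choice
of this sketch, not something the specification states.

```python
import textwrap

BODY_WIDTH = 72  # wrapping column stated in the specification


def wrap_body(body):
    """Re-wrap each paragraph of a commit body at BODY_WIDTH characters."""
    paragraphs = [p for p in body.split("\n\n") if p.strip()]
    wrapped = [
        textwrap.fill(
            " ".join(p.split()),       # collapse existing line breaks
            width=BODY_WIDTH,
            break_long_words=False,    # do not cut URLs in half
            break_on_hyphens=False,
        )
        for p in paragraphs
    ]
    # Paragraphs stay separated by a single empty line.
    return "\n\n".join(wrapped)


# Hypothetical body text for illustration.
example = wrap_body(("word " * 40).strip() + "\n\nsecond paragraph")
```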
168
169 The tag part is included in the full commit log as an extension to the
170 body. It consists of one or more lines, each consisting of a key followed by a
171 colon and a space, followed by a value. Git does not enforce any
172 standardization of the keys, and the tag format is ''not'' meant for
173 machine processing.
174
175 A few tags of common use are:
176 * user-related tags:
177 ** <kbd>Acked-by: Full Name <email@×××××××.com></kbd> — commit approved
178 by another person (usually without detailed review),
179 ** <kbd>Reported-by: Full Name <email@×××××××.com></kbd>,
180 ** <kbd>Reviewed-by: Full Name <email@×××××××.com></kbd> — usually
181 indicates full review,
182 ** <kbd>Signed-off-by: Full Name <email@×××××××.com></kbd> — DCO
183 approval (not used in Gentoo right now),
184 ** <kbd>Suggested-by: Full Name <email@×××××××.com></kbd>, 
185 ** <kbd>Tested-by: Full Name <email@×××××××.com></kbd>.
186 * commit-related tags:
187 ** <kbd>Fixes: commit-id (commit message)</kbd> — to indicate fixing a
188 previous commit,
189 ** <kbd>Reverts: commit-id (commit message)</kbd> — to indicate
190 reverting a previous commit,
191 * bug tracker-related tags:
192 ** <kbd>Bug: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd>; — to
193 reference a bug,
194 ** <kbd>Closes: <nowiki>https://github.com/gentoo/gentoo/pull/NNNN</nowiki></kbd>; — to
195 automatically close a GitHub pull request,
196 ** <kbd>Fixes: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd>; —
197 to indicate a fixed bug,
198 * package manager tags:
199 ** <kbd>Package-Manager: …</kbd> — used by repoman to indicate Portage
200 version,
201 ** <kbd>RepoMan-Options: …</kbd> — used by repoman to indicate repoman
202 options.
203
204 The bug tracker-related tags can be used to extend the body message.
205 However, they should be skipped if the bug number is already provided in
206 the summary and there is no explicit body.
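
To illustrate the three-part layout described in this section, here is a
minimal Python sketch that splits a message into summary, body and tags.
The function name <kbd>split_message</kbd> and the sample message are
hypothetical. Since the tag format is not meant for machine processing,
this is a heuristic for illustration only: a final body paragraph whose
lines all happen to look like "Key: value" would be mistaken for tags.

```python
import re

# A tag line: key, colon, space, value.
TAG_LINE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*): (.+)$")


def split_message(message):
    """Split a commit message into (summary, body, tags)."""
    paragraphs = message.strip().split("\n\n")
    summary = paragraphs[0]
    rest = paragraphs[1:]
    tags = []
    if rest:
        # Treat the last paragraph as tags only if every line looks like one.
        matches = [TAG_LINE.match(line) for line in rest[-1].splitlines()]
        if matches and all(matches):
            tags = [(m.group(1), m.group(2)) for m in matches]
            rest = rest[:-1]
    body = "\n\n".join(rest)
    return summary, body, tags


# Hypothetical commit message for illustration.
MESSAGE = (
    "dev-libs/foo: Fix build with GCC 7\n"
    "\n"
    "Fix the build failure caused by a missing include.\n"
    "\n"
    "Bug: https://bugs.gentoo.org/123456\n"
    "Reviewed-by: Full Name <email@example.com>\n"
)

summary, body, tags = split_message(MESSAGE)
```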
207
208 ==Rationale==
209 ===Branching model===
210 The model of multiple developers pushing concurrently to the repository
211 containing all packages is preserved from CVS. The developers have
212 discussed the possibility of using other models, in particular of using
213 multiple branches for developers that are afterwards automatically
214 merged into the master branch. However, it was determined that there is
215 no need to use a more complex model at the moment and the potential
216 problems with them outweighed the benefits.
217
218 The necessity of rebasing is a natural consequence of concurrent work,
219 along with the ban on reverse merge commits. Since rebasing a number of
220 commits can take a few seconds or even more, another developer sometimes
221 commits during that time, forcing another rebase.
222
223 In the past, there were cases of developers using automated scripts
224 which created single commits, ran repoman and pushed them straight to
225 the repository. This resulted in pushes from a single developer every
226 10-15 seconds which made it impossible for other developers to rebase
227 larger commit batches. This kind of workflow is therefore strongly
228 discouraged.
229
230 Creating multiple short-lived branches is discouraged as it implies
231 additional transfer for users cloning the repository and additional
232 maintenance burden. Since the git migration, the developers have created
233 a few branches on the repository and did not maintain them. Infra
234 had to query the developers about the state of the branches and clean
235 them up. Keeping branches local or hosting them outside Gentoo Infra
236 (e.g. on GitHub) reduces the burden on our users, even if the developers
237 do not clean up after themselves.
238
239 ===Merge commits===
240 Merge commits have been debated multiple times in various media, in
241 particular IRC. They have very vocal opponents whose main argument is
242 that they make history unreadable. At the same time, it has been
243 frequently pointed out that merge commits have valid use cases. To
244 satisfy both groups, this specification strongly discourages merge
245 commits but allows their use in justified cases.
246
247 Most importantly, the implicit merge commits created by <kbd>git
248 pull</kbd> are forbidden. Those merges have no real value or justified
249 use case, and since they are created implicitly by default there have
250 been historical cases where developers pushed them unintentionally. They
251 are banned explicitly to emphasize that developers need to adjust their
252 git configuration.
253
254 When processing merge commits, it is important to explicitly distinguish
255 the parent that represents 'real' Gentoo history from the one(s) that
256 represent external branches. The former can either be an existing Gentoo
257 commit or a commit that the developer has prepared (on top of existing
258 Gentoo history) before merging the branch. For this reason, it is
259 important to enforce the full set of Gentoo policies on this parent and
260 the commits preceding it. On the other hand, the external branches can
261 be treated similarly to development branches. Relaxing the rules for
262 external branches also makes it possible to merge user contributions
263 with original user OpenPGP signatures, while adding a final developer
264 signature on top of the merge commit.
265
266 When using <kbd>git merge ''foo''</kbd>, the first parent represents the
267 current <kbd>HEAD</kbd> and the second one the merged branch. This is
268 the model used by the specification.
269
270 ===OpenPGP signatures===
271 The signature requirements strictly correspond to the git setup deployed
272 by the Infrastructure team.
273
274 The commit signatures provide an ability to verify the authenticity of
275 all commits throughout the Gentoo repository history (to the point of
276 git conversion). The push signatures mostly serve the purpose of
277 additional authentication for the developer pushing a specific set of
278 commits.
279
280 ===Splitting commits===
281 The goal of the commit splitting rules is to make the best use of git
282 while avoiding enforcing too much overhead on the developer and
283 optimizing to avoid interim broken commits.
284
285 Splitting commits by logical changes improves the readability and makes
286 it easier to revert a specific change while preserving the remaining
287 (irrelevant) changes. The changes done by a developer are easier to
288 comprehend when the reviewer can follow them in the specific order done
289 by the author, rather than combined with other changes.
290
291 Splitting commits on logical unit boundaries has been done since CVS times.
292 Mostly it improves readability by making it possible to include the
293 unit (package, eclass…) name in the commit message — so that developers
294 perceive what specific packages are affected by the change without
295 having to look into diffstat.
296
297 Requiring commits to be non-'broken' is meant to preserve a good quality
298 git history of the repository. This means that the users can check out an
299 interim commit without risking a major problem such as a missing
300 dependency that is being added by the commit following it. It also makes
301 it safer to revert the most recent changes with reduced risk of exposing
302 a breakage.
303
304 Those rules partially overlap, and if that is the case, the developers
305 are expected to use common sense to determine the course of action that
306 gives the best result. Furthermore, requiring the strict following of
307 the rules would mean a lot of additional work for developers and a lot
308 of additional commits for no real benefit.
309
310 The examples are provided to make it possible for the developers to get
311 a 'feeling' for how to work with the rules.
312
313 ===Commit messages===
314 The basic commit message format is similar to the one used by other
315 projects, and provides for reasonably predictable display of results.
316
317 The summary line is meant to provide a good concise summary of the
318 changes. It is included in the short logs, and should include all the
319 information to help a developer determine whether they are interested in
320 looking into the commit details. Including the logical unit name
321 accounts for the fact that most of the Gentoo commits are specific to
322 those units (e.g. packages). The length limit is meant to avoid wrapping
323 the shortlog, which could result in unreadable <kbd>git log
324 --oneline</kbd> or ugly mid-word ellipsis on GitHub.
325
326 The body is meant to provide the detailed information for a commit. It
327 is usually displayed verbatim, and the use of paragraphs along with line
328 wrapping is meant to improve readability. The body should include the
329 information contained in the summary since the two are sometimes really
330 disjoint, and expecting the user to read the body as a continuation of
331 the summary is confusing. For example, in <kbd>git send-email</kbd>, the
332 summary line is used to construct the mail's subject and is therefore
333 disjoint from the body.
334
335 The tag section is a traditional way of expressing quasi-machine-
336 readable data. However, the commit messages are not really suited for
337 machine use and only a few tags are actually processed by scripts. The
338 specification tries to provide a concise set of potentially useful tags
339 collected from various projects (the Linux kernel, X.org). Those tags
340 can be used interchangeably with plaintext explanation in the body.
341
342 The only tag defined by git itself is the <kbd>Signed-off-by</kbd> line,
343 that is created by <kbd>git commit -s</kbd>. However, Gentoo does not
344 currently enforce a DCO consistently, and therefore it is meaningless.
345
346 The only tag subject to machine processing is the <kbd>Closes</kbd> line
347 that is used by GitHub to automatically close pull requests (and issues
348 — however, Gentoo does not use GitHub's issue tracker).
349
350 All the remaining tags serve purely as a user convenience.
351
352 Historically, Gentoo used a few tags starting with
353 <kbd>X-</kbd>. However, this practice was abandoned once it was pointed
354 out that git does not enforce any standard set of tags, and therefore
355 indicating non-standard tags is meaningless.
356
357 Gentoo developers still frequently use the <kbd>Gentoo-Bug</kbd> tag,
358 sometimes followed by <kbd>Gentoo-Bug-URL</kbd>. Using both
359 simultaneously is meaningless (they are redundant), and using the former
360 has no advantages over using the classic <kbd>#nnnnnn</kbd> form in the
361 summary or the body.
362
363 ==Backwards Compatibility==
364 Most of the new policy will apply to the commits following its approval.
365 Backwards compatibility is not relevant there.
366
367 One particular point that affects commits retroactively is the OpenPGP
368 signing. However, it has been an obligatory requirement enforced by the
369 infrastructure since the git switch. Therefore, all the git history
370 conforms to that.
371
372 ==Reference implementation==
373 All of the elements requiring explicit implementation on the git
374 infrastructure are implemented already. In particular this includes:
375 * blocking force pushes on the <kbd>master</kbd> branch,
376 * requiring signed commits on the <kbd>master</kbd> branch,
377 * requiring signed pushes to the repository.
378
379 The remaining elements are either non-obligatory or non-enforceable at
380 infrastructure level.
381
382 RepoMan has suggested starting the commit message with the package name
383 since commit
384 [https://gitweb.gentoo.org/proj/portage.git/commit/?id=46dafadff58da0220511f20480b73ad09f913430
385 46dafadff58da0220511f20480b73ad09f913430].
386
387 ==Acknowledgements==
388 Most of the foundations for this specification were laid out by
389 [[User:Hasufell|Julian Ospald (hasufell)]] in his initial version of
390 [[Gentoo git workflow]] article.
391
392 ==Copyright==
393
394 This work is licensed under the Creative Commons Attribution-ShareAlike
395 3.0 Unported License. To view a copy of this license, visit
396 http://creativecommons.org/licenses/by-sa/3.0/.
397
398 --
399 Best regards,
400 Michał Górny
