Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev <gentoo-dev@l.g.o>
Subject: [gentoo-dev] GLEP 66 [git workflow] for another review round
Date: Wed, 11 Oct 2017 21:19:39
Message-Id: 1507756766.29616.9.camel@gentoo.org
1 Hi,
2
3 Since the git pre-GLEP was approved by the Council, I've converted it to
4 reST and did a few cosmetic changes (rewording, clarification). Please
5 take a one more look at the current wording:
6
7 Text: https://dev.gentoo.org/~mgorny/tmp/glep-0066.rst
8 HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0066.html
9
10 Changes since the previous version ([1]):
11
12 - reworded the 'merge commits' section, attempting to use more
13 unambiguous terms ('ancestry path') and clarifying that we are concerned
14 about ancestors up to the previous merge commit,
15
16 - switched to 'bug #nnnnnn' for summary line variant to trigger
17 willikins,
18
19 - removed the suggestion that Bug/Closes should be skipped if there's no
20 body and bugno is in summary -- as that obviously wouldn't trigger
21 the hooks,
22
23 - added more rationale for using full URLs in various bug-related tags
24 (as established by the most recent discussion).
25
26 A copy of the text is included below to facilitate inline comments.
27
28 [1]:https://wiki.gentoo.org/index.php?title=User:MGorny/GLEP:Git&oldid=674320
29
30
31 ---
32 GLEP: 66
33 Title: Gentoo Git Workflow
34 Author: Michał Górny <mgorny@g.o>
35 Type: Standards Track
36 Status: Draft
37 Version: 1
38 Created: 2017-07-24
39 Last-Modified: 2017-10-09
40 Post-History: 2017-07-25, 2017-09-28, 2017-10-11
41 Content-Type: text/x-rst
42 Requires:
43 Replaces:
44 ---
45
46 Abstract
47 ========
48 This GLEP specifies basic standards and recommendations for using git
49 with the Gentoo ebuild repository. It covers only Gentoo-specific
50 policies, and is not meant to be a complete guide.
51
52
53 Motivation
54 ==========
55 Although the main Gentoo repository is using git for two years already,
56 developers still lack official documentation on how to use git
57 consistently. Most of the developers learn spoken standards from others
58 and follow them. This eventually brings consistency to some extent
59 but is suboptimal. Furthermore, it results in users having to learn
60 things the hard way instead of having proper documentation to follow.
61
62 There were a few attempts to standardize git use over the time. Most
63 noteworthy are Gentoo git workflow [#GENTOO_GIT_WORKFLOW]_
64 and Gentoo GitHub [#GENTOO_GITHUB]_ articles. However, they are not any
65 kind of official standards, and they have too broad focus to become one.
66 There was also an initial GLEP attempt but it never even reached
67 the draft stage.
68
69 This GLEP aims to finally provide basic standardization for the use of
70 git in the Gentoo repository. It aims to focus purely
71 on Gentoo-specific standards and not git usage in general. It doesn't
72 mean to be a complete guide but a formal basis on top of which official
73 guides could be created.
74
75
76 Specification
77 =============
78
79 Branching model
80 ---------------
81 The main branch of the Gentoo repository is the ``master`` branch. All
82 Gentoo developers push their work straight to the master branch,
83 provided that the commits meet the minimal quality standards.
84 The master branch is also used straight for continuous user repository
85 deployment.
86
87 Since multiple developers work on master concurrently, they may be
88 required to rebase multiple times before being able to push. Developers
89 are requested not to use workflows that could prevent others from
90 pushing, e.g. pushing single commits frequently instead of staging them
91 and using a single push.
92
93 Developers can use additional branches to facilitate review and testing
94 of long-term projects of larger scale. However, since git fetches all
95 branches by default, they should be used scarcely. For smaller
96 projects, local branches or repository forks are preferred.
97
98 Unless stated otherwise, the rules set by this specification apply to
99 the master branch only. The development branches can use relaxed rules.
100
101 Rewriting history (i.e. force pushes) of the master branch is forbidden.
102
103
104 Merge commits
105 -------------
106 The use of merge commits in the Gentoo repository is strongly
107 discouraged. Usually it is preferable to rebase instead. However,
108 the developers are allowed to use merge commits in justified cases.
109 Merge commits can be only used to merge additional branches, the use
110 of implicit ``git pull`` merges is entirely forbidden.
111
112 In a merge commit that is committed straight to the Gentoo repository,
113 the first parent is expected to reference an actual Gentoo commit
114 preceding the merge, while the remaining parents can be used to
115 reference commits originating from external repositories. The commits
116 on the ancestry path of the first parent (up to the next merge commit)
117 are required to conform to this specification. The commits on remaining
118 ancestry paths can use relaxed rules.
119
120
121 OpenPGP signatures
122 ------------------
123 Each commit in the Gentoo repository must be signed using
124 the committer's OpenPGP key. Furthermore, each push to the repository
125 must be signed using the key belonging to the developer performing
126 the push (matched via the SSH key).
127
128 The requirements for OpenPGP keys are covered by GLEP 63 [#GLEP63]_.
129
130
131 Splitting commits
132 -----------------
133 Git commits are lightweight, and the developers are encouraged to split
134 their commits to improve readability and the ability of reverting
135 specific sub-changes. When choosing how to split the commits,
136 the developers should consider the following three rules:
137
138 1. Use atomic commits — one commit per logical change.
139 2. Split commits at logical unit (package, eclass, profile…) boundaries.
140 3. Avoid creating commits that are 'broken' — e.g. are incomplete, have
141 uninstallable packages.
142
143 It is technically impossible to always respect all of the three rules,
144 so developers have to balance between them at their own discretion.
145 Side changes that are implied by other change (e.g. revbump due to some
146 change) should be included in the first commit requiring them. Commits
147 should be ordered to avoid breakage, and follow logical ordering
148 whenever possible.
149
150 Examples:
151
152 - When doing a version bump, it is usually not reasonable to split every
153 necessary logical change into separate commit since the interim
154 commits would correspond to a broken package. However, if the package
155 has a live ebuild, it *might* be reasonable to perform split logical
156 changes on the live ebuild, then create a release as another logical
157 step.
158
159 - When doing one or more changes that require a revision bump, bump
160 the revision in the commit including the first change. Split
161 the changes into multiple logical commits without further revision
162 bumps — since they are going to be pushed in a single push, the user
163 will not be exposed to interim state.
164
165 - When adding a new version of a package that should be masked, you can
166 include the ``package.mask`` edit in the commit adding it.
167 Alternatively, you can add the mask in a split commit *preceding*
168 the bump.
169
170 - When doing a minor change to a large number of packages, it is
171 reasonable to do so in a single commit. However, when doing a major
172 change (e.g. a version bump), it is better to split commits on package
173 boundaries.
174
175
176 Commit messages
177 ---------------
178 A standard git commit message consists of three parts, in order:
179 a summary line, an optional body and an optional set of tags. The parts
180 are separated by a single empty line.
181
182 The summary line is included in the short logs (``git log --oneline``,
183 gitweb, GitHub, mail subject) and therefore should provide a short yet
184 accurate description of the change. The summary line starts with
185 a logical unit name, followed by a colon, a space and a short
186 description of the most important changes. If a bug is associated with
187 a change, then it can be included in the summary line
188 as ``bug #nnnnnn``. The summary line must not exceed 69 characters,
189 and must not be wrapped.
190
191 The suggested logical unit name formats are:
192
193 - for a package, ``category/package: …``;
194 - for an eclass, ``name.eclass: …``;
195 - for other directories or files, their path or filename (as long
196 as a developer reading the commit messages is able to figure out what
197 it is) — e.g. ``licenses/foo: …``, ``package.mask: …``.
198
199 The body is included in the full commit log (``git log``, detailed
200 commit info on gitweb/GitHub, mail body). It is optional, and it can be
201 used to describe the commit in more detail if the summary line is not
202 sufficient. It is generally a good idea to repeat the information
203 contained in the summary (except for the logical unit) since the summary
204 is frequently formatted as a title and not adjacent to the body.
205 The body should be wrapped at 72 characters. It can contain multiple
206 paragraphs, separated by empty lines.
207
208 The tag part is included in the full commit log as an extension to
209 the body. It consists of one or more lines consisting of a key,
210 followed by a colon and a space, followed by value. Git does not
211 enforce any standardization of the keys, and the tag format is *not*
212 meant for machine processing.
213
214 A few tags of common use are:
215
216 - user-related tags:
217
218 - ``Acked-by: Full Name <email@×××××××.com>`` — commit approved
219 by another person (usually without detailed review),
220 - ``Reported-by: Full Name <email@×××××××.com>`` — usually indicates
221 full review,
222 - ``Signed-off-by: Full Name <email@×××××××.com>`` — DCO approval (not
223 used in Gentoo right now),
224 - ``Suggested-by: Full Name <email@×××××××.com>``,
225 - ``Tested-by: Full Name <email@×××××××.com>``.
226
227 - commit-related tags:
228
229 - ``Fixes: commit-id (commit message)`` — to indicate fixing
230 an earlier commit,
231 - ``Reverts: commit-id (commit message)`` — to indicate reverting
232 an earlier commit,
233
234 - bug tracker-related tags:
235
236 - ``Bug: https://bugs.gentoo.org/NNNNNN`` — to reference a bug;
237 the commit will be linked in a comment,
238 - ``Closes: https://bugs.gentoo.org/NNNNNN`` — to automatically close
239 a Gentoo bug (RESOLVED/FIXED, linking the commit),
240 - ``Closes: https://github.com/gentoo/gentoo/pull/NNNN``
241 to automatically close a pull request on GitHub, GitLab, BitBucket
242 or a compatible service (where the commits are mirrored),
243
244 - package manager tags:
245
246 - ``Package-Manager: …`` — used by repoman to indicate Portage
247 version,
248 - ``RepoMan-Options: …`` — used by repoman to indicate repoman
249 options.
250
251
252 Rationale
253 =========
254
255 Branching model
256 ---------------
257 The model of multiple developers pushing concurrently to the repository
258 containing all packages is preserved from CVS. The developers have
259 discussed the possibility of using other models, in particular of using
260 multiple branches for developers that are afterwards automatically
261 merged into the master branch. However, it was determined that there is
262 no need to use a more complex model at the moment and the potential
263 problems with them outweighed the benefits.
264
265 The necessity of rebasing is a natural consequence of concurrent work,
266 along with the ban of reverse merge commits. Since rebasing a number
267 of commits can take a few seconds or even more, another developer
268 sometimes commits during that time, enforcing another rebase.
269
270 In the past, there were cases of developers using automated scripts
271 which created single commits, ran repoman and pushed them straight to
272 the repository. This resulted in pushes from a single developer every
273 10-15 seconds which made it impossible for other developers to rebase
274 larger commit batches. This kind of workflow is therefore strongly
275 discouraged.
276
277 Creating multiple short-time branches is discouraged as it implies
278 additional transfer for users cloning the repository and additional
279 maintenance burden. Since the git migration, the developers have
280 created a few branches on the repository, and did not maintain them.
281 The Infra team had to query the developers about the state
282 of the branches and clean them up. Keeping branches local or hosting
283 them outside Gentoo Infra (e.g. on GitHub) reduces the burden on our
284 users, even if the developers do not clean after themselves.
285
286
287 Merge commits
288 -------------
289 Merge commits have been debated multiple times in various media,
290 in particular IRC. They have very verbose opponents whose main argument
291 is that they make history unreadable. At the same time, it has been
292 frequently pointed out that merge commits have valid use cases.
293 To satisfy both groups, this specification strongly discourages merge
294 commits but allows their use in justified cases.
295
296 Most importantly, the implicit merge commits created by ``git pull``
297 are forbidden. Those merges have no real value or justified use case,
298 and since they are created implicitly by default there have been
299 historical cases where developers pushed them unintentionally. They are
300 banned explicitly to emphasize the necessity of adjusting git
301 configuration to the developers.
302
303 When processing merge commits, it is important to explicitly distinguish
304 the parent that represents 'real' Gentoo history from the one(s) that
305 represent external branches. The former can either be an existing
306 Gentoo commit or a commit that the developer has prepared (on top of
307 existing Gentoo history) before merging the branch. For this reason, it
308 is important to enforce the full set of Gentoo policies on this parent
309 and the commits preceding it. On the other hand, the external branches
310 can be treated similarly to development branches. Relaxing the rules
311 for external branches also makes it possible to merge user contributions
312 with original user OpenPGP signatures, while adding a final developer
313 signature on top of the merge commit.
314
315 When using ``git merge foo``, the first parent represents the current
316 ``HEAD`` and the second one the merged branch. This is the model
317 used by the specification.
318
319
320 OpenPGP signatures
321 ------------------
322 The signature requirements strictly correspond to the git setup deployed
323 by the Infrastructure team.
324
325 The commit signatures provide an ability to verify the authenticity
326 of all commits throughout the Gentoo repository history (to the point
327 of git conversion). The push signatures mostly serve the purpose
328 of additional authentication for the developer pushing a specific set
329 of commits.
330
331
332 Splitting commits
333 -----------------
334 The goal of the commit splitting rules is to make the best use of git
335 while avoiding enforcing too much overhead on the developer
336 and optimizing to avoid interim broken commits.
337
338 Splitting commits by logical changes improves the readability and makes
339 it easier to revert a specific change while preserving the remaining
340 (irrelevant) changes. The changes done by a developer are easier
341 to comprehend when the reviewer can follow them in the specific order
342 done by the author, rather than combined with other changes.
343
344 Splitting commits on logical unit boundary was used since CVS times.
345 Mostly it improves readability via making it possible to include
346 the unit (package, eclass…) name in the commit message — so that
347 developers perceive what specific packages are affected by the change
348 without having to look into diffstat.
349
350 Requiring commits to be non-'broken' is meant to preserve a good quality
351 git history of the repository. This means that the users can check
352 an interim commit out without risking a major problem such as a missing
353 dependency that is being added by the commit following it. It also
354 makes it safer to revert the most recent changes with reduced risk
355 of exposing a breakage.
356
357 Those rules partially overlap, and if that is the case, the developers
358 are expected to use common sense to determine the course of action that
359 gives the best result. Furthermore, requiring the strict following
360 of the rules would mean a lot of additional work for developers
361 and a lot of additional commits for no real benefit.
362
363 The examples are provided to make it possible for the developers to get
364 a 'feeling' how to work with the rules.
365
366
367 Commit messages
368 ---------------
369 The basic commit message format is similar to the one used by other
370 projects, and provides for reasonably predictable display of results.
371
372 The summary line is meant to provide a good concise summary
373 of the changes. It is included in the short logs, and should include
374 all the information to help developer determine whether he is interested
375 in looking into the commit details. Including the logical unit name
376 accounts for the fact that most of the Gentoo commits are specific
377 to those units (e.g. packages). The length limit is meant to avoid
378 wrapping the shortlog — which could result in unreadable ``git log
379 --oneline`` or ugly mid-word ellipsis on GitHub.
380
381 The body is meant to provide the detailed information for a commit.
382 It is usually displayed verbatim, and the use of paragraphs along with
383 line wrapping is meant to improve readability. The body should include
384 the information contained in the summary since the two are sometimes
385 really disjoint, and expecting the user to read body as a continuation
386 of summary is confusing. For example, in ``git send-email``,
387 the summary line is used to construct the mail's subject
388 and is therefore disjoint from the body.
389
390 The tag section is a traditional way of expressing quasi-machine-
391 readable data. However, the commit messages are not really suited
392 for machine use and only a few tags are actually processed by scripts.
393 The specification tries to provide a concise set of potentially useful
394 tags collected from various projects (the Linux kernel, X.org). Those
395 tags can be used interchangeably with plaintext explanation in the body.
396
397 The only tag defined by git itself is the ``Signed-off-by`` line,
398 that is created by ``git commit -s``. However, Gentoo does not
399 currently enforce a DCO consistently, and therefore it is meaningless.
400
401 The tags subject to machine processing are the ``Bug`` and ``Closes``
402 lines. Both are used by git.gentoo.org to handle Gentoo Bugzilla
403 and the latter is also used by GitHub to automatically close pull
404 requests (and issues — however, Gentoo does not use GitHub's issue
405 tracker). GitHub, GitLab, Bitbucket, git.gentoo.org also support
406 ``Fixes`` and ``Resolves`` tags (and the first three also some
407 variations of them), however ``Closes`` has been already established
408 in Gentoo and is used for consistency.
409
410 All the remaining tags serve purely as a user convenience.
411
412 Historically, Gentoo has been using a few tags starting with ``X-``.
413 However, this practice was abandoned once it has been pointed out that
414 git does not enforce any standard set of tags, and therefore indicating
415 non-standard tags is meaningless.
416
417 Gentoo developers are still frequently using ``Gentoo-Bug`` tag,
418 sometimes followed by ``Gentoo-Bug-URL``. Using both
419 simultaneously is meaningless (they are redundant), and using the former
420 has no advantages over using the classic ``#nnnnnn`` form in the summary
421 or the body.
422
423 Using full URLs in ``Closes`` is necessary to properly namespace
424 the action to the Gentoo services and avoid accidentally closing
425 incorrect issues or pull requests when the commit is mirrored or cherry-
426 picked into another repository. For consistency, they are also used
427 for ``Bug`` and should be used for any future tags that might be
428 introduced. This also ensures that the URLs are automatically converted
429 into hyperlinks by various tools.
430
431 Including the bug number in the summary of the commit message causes
432 willikins to automatically expand on the bug on ``#gentoo-commits``.
433
434
435 Backwards Compatibility
436 =======================
437 Most of the new policy will apply to the commits following its approval.
438 Backwards compatibility is not relevant there.
439
440 One particular point that affects commits retroactively is the OpenPGP
441 signing. However, it has been an obligatory requirement enforced by
442 the infrastructure since the git switch. Therefore, all the git history
443 conforms to that.
444
445
446 Reference Implementation
447 ========================
448 All of the elements requiring explicit implementation on the git
449 infrastructure are implemented already. In particular this includes:
450
451 - blocking force pushes on the ``master`` branch,
452 - requiring signed commits on the ``master`` branch,
453 - requiring signed pushes to the repository.
454
455 The remaining elements are either non-obligatory or non-enforceable
456 at infrastructure level.
457
458 RepoMan suggests starting the commit message with package name since
459 commit 46dafadff58da0 [#REPOMAN_PKG_NAME_COMMIT]_.
460
461 Acknowledgements
462 ================
463 Most of the foundations for this specification were laid out by Julian
464 Ospald (hasufell) in his initial version of Gentoo git workflow
465 [#GENTOO_GIT_WORKFLOW]_ article.
466
467
468 References
469 ==========
470 .. [#GENTOO_GIT_WORKFLOW] Gentoo Git Workflow (on Gentoo Wiki)
471 https://wiki.gentoo.org/wiki/Gentoo_git_workflow
472
473 .. [#GENTOO_GITHUB] Gentoo GitHub (on Gentoo Wiki)
474 https://wiki.gentoo.org/wiki/Gentoo_GitHub
475
476 .. TODO: verify this
477 .. [#GLEP63] GLEP 63: Gentoo GPG key policies
478 https://www.gentoo.org/glep/glep-0063.html
479
480 .. [#REPOMAN_PKG_NAME_COMMIT]
481 https://gitweb.gentoo.org/proj/portage.git/commit/?id=46dafadff58da0220511f20480b73ad09f913430
482
483
484 Copyright
485 =========
486 This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
487 Unported License. To view a copy of this license, visit
488 http://creativecommons.org/licenses/by-sa/3.0/.
489
490
491 --
492 Best regards,
493 Michał Górny

Replies

Subject Author
Re: [gentoo-dev] GLEP 66 [git workflow] for another review round Mike Pagano <mpagano@g.o>