1 |
Hi, |
2 |
|
3 |
Since the git pre-GLEP was approved by the Council, I've converted it to |
4 |
reST and did a few cosmetic changes (rewording, clarification). Please |
5 |
take a one more look at the current wording: |
6 |
|
7 |
Text: https://dev.gentoo.org/~mgorny/tmp/glep-0066.rst |
8 |
HTML: https://dev.gentoo.org/~mgorny/tmp/glep-0066.html |
9 |
|
10 |
Changes since the previous version ([1]): |
11 |
|
12 |
- reworded the 'merge commits' section, attempting to use more |
13 |
unambiguous terms ('ancestry path') and clarifying that we are concerned |
14 |
about ancestors up to the previous merge commit, |
15 |
|
16 |
- switched to 'bug #nnnnnn' for summary line variant to trigger |
17 |
willikins, |
18 |
|
19 |
- removed the suggestion that Bug/Closes should be skipped if there's no |
20 |
body and bugno is in summary -- as that obviously wouldn't trigger |
21 |
the hooks, |
22 |
|
23 |
- added more rationale for using full URLs in various bug-related tags |
24 |
(as established by the most recent discussion). |
25 |
|
26 |
A copy of the text is included below to facilitate inline comments. |
27 |
|
28 |
[1]:https://wiki.gentoo.org/index.php?title=User:MGorny/GLEP:Git&oldid=674320 |
29 |
|
30 |
|
31 |
--- |
32 |
GLEP: 66 |
33 |
Title: Gentoo Git Workflow |
34 |
Author: Michał Górny <mgorny@g.o> |
35 |
Type: Standards Track |
36 |
Status: Draft |
37 |
Version: 1 |
38 |
Created: 2017-07-24 |
39 |
Last-Modified: 2017-10-09 |
40 |
Post-History: 2017-07-25, 2017-09-28, 2017-10-11 |
41 |
Content-Type: text/x-rst |
42 |
Requires: |
43 |
Replaces: |
44 |
--- |
45 |
|
46 |
Abstract |
47 |
======== |
48 |
This GLEP specifies basic standards and recommendations for using git |
49 |
with the Gentoo ebuild repository. It covers only Gentoo-specific |
50 |
policies, and is not meant to be a complete guide. |
51 |
|
52 |
|
53 |
Motivation |
54 |
========== |
55 |
Although the main Gentoo repository is using git for two years already, |
56 |
developers still lack official documentation on how to use git |
57 |
consistently. Most of the developers learn spoken standards from others |
58 |
and follow them. This eventually brings consistency to some extent |
59 |
but is suboptimal. Furthermore, it results in users having to learn |
60 |
things the hard way instead of having proper documentation to follow. |
61 |
|
62 |
There were a few attempts to standardize git use over the time. Most |
63 |
noteworthy are Gentoo git workflow [#GENTOO_GIT_WORKFLOW]_ |
64 |
and Gentoo GitHub [#GENTOO_GITHUB]_ articles. However, they are not any |
65 |
kind of official standards, and they have too broad focus to become one. |
66 |
There was also an initial GLEP attempt but it never even reached |
67 |
the draft stage. |
68 |
|
69 |
This GLEP aims to finally provide basic standardization for the use of |
70 |
git in the Gentoo repository. It aims to focus purely |
71 |
on Gentoo-specific standards and not git usage in general. It doesn't |
72 |
mean to be a complete guide but a formal basis on top of which official |
73 |
guides could be created. |
74 |
|
75 |
|
76 |
Specification |
77 |
============= |
78 |
|
79 |
Branching model |
80 |
--------------- |
81 |
The main branch of the Gentoo repository is the ``master`` branch. All |
82 |
Gentoo developers push their work straight to the master branch, |
83 |
provided that the commits meet the minimal quality standards. |
84 |
The master branch is also used straight for continuous user repository |
85 |
deployment. |
86 |
|
87 |
Since multiple developers work on master concurrently, they may be |
88 |
required to rebase multiple times before being able to push. Developers |
89 |
are requested not to use workflows that could prevent others from |
90 |
pushing, e.g. pushing single commits frequently instead of staging them |
91 |
and using a single push. |
92 |
|
93 |
Developers can use additional branches to facilitate review and testing |
94 |
of long-term projects of larger scale. However, since git fetches all |
95 |
branches by default, they should be used scarcely. For smaller |
96 |
projects, local branches or repository forks are preferred. |
97 |
|
98 |
Unless stated otherwise, the rules set by this specification apply to |
99 |
the master branch only. The development branches can use relaxed rules. |
100 |
|
101 |
Rewriting history (i.e. force pushes) of the master branch is forbidden. |
102 |
|
103 |
|
104 |
Merge commits |
105 |
------------- |
106 |
The use of merge commits in the Gentoo repository is strongly |
107 |
discouraged. Usually it is preferable to rebase instead. However, |
108 |
the developers are allowed to use merge commits in justified cases. |
109 |
Merge commits can be only used to merge additional branches, the use |
110 |
of implicit ``git pull`` merges is entirely forbidden. |
111 |
|
112 |
In a merge commit that is committed straight to the Gentoo repository, |
113 |
the first parent is expected to reference an actual Gentoo commit |
114 |
preceding the merge, while the remaining parents can be used to |
115 |
reference commits originating from external repositories. The commits |
116 |
on the ancestry path of the first parent (up to the next merge commit) |
117 |
are required to conform to this specification. The commits on remaining |
118 |
ancestry paths can use relaxed rules. |
119 |
|
120 |
|
121 |
OpenPGP signatures |
122 |
------------------ |
123 |
Each commit in the Gentoo repository must be signed using |
124 |
the committer's OpenPGP key. Furthermore, each push to the repository |
125 |
must be signed using the key belonging to the developer performing |
126 |
the push (matched via the SSH key). |
127 |
|
128 |
The requirements for OpenPGP keys are covered by GLEP 63 [#GLEP63]_. |
129 |
|
130 |
|
131 |
Splitting commits |
132 |
----------------- |
133 |
Git commits are lightweight, and the developers are encouraged to split |
134 |
their commits to improve readability and the ability of reverting |
135 |
specific sub-changes. When choosing how to split the commits, |
136 |
the developers should consider the following three rules: |
137 |
|
138 |
1. Use atomic commits — one commit per logical change. |
139 |
2. Split commits at logical unit (package, eclass, profile…) boundaries. |
140 |
3. Avoid creating commits that are 'broken' — e.g. are incomplete, have |
141 |
uninstallable packages. |
142 |
|
143 |
It is technically impossible to always respect all of the three rules, |
144 |
so developers have to balance between them at their own discretion. |
145 |
Side changes that are implied by other change (e.g. revbump due to some |
146 |
change) should be included in the first commit requiring them. Commits |
147 |
should be ordered to avoid breakage, and follow logical ordering |
148 |
whenever possible. |
149 |
|
150 |
Examples: |
151 |
|
152 |
- When doing a version bump, it is usually not reasonable to split every |
153 |
necessary logical change into separate commit since the interim |
154 |
commits would correspond to a broken package. However, if the package |
155 |
has a live ebuild, it *might* be reasonable to perform split logical |
156 |
changes on the live ebuild, then create a release as another logical |
157 |
step. |
158 |
|
159 |
- When doing one or more changes that require a revision bump, bump |
160 |
the revision in the commit including the first change. Split |
161 |
the changes into multiple logical commits without further revision |
162 |
bumps — since they are going to be pushed in a single push, the user |
163 |
will not be exposed to interim state. |
164 |
|
165 |
- When adding a new version of a package that should be masked, you can |
166 |
include the ``package.mask`` edit in the commit adding it. |
167 |
Alternatively, you can add the mask in a split commit *preceding* |
168 |
the bump. |
169 |
|
170 |
- When doing a minor change to a large number of packages, it is |
171 |
reasonable to do so in a single commit. However, when doing a major |
172 |
change (e.g. a version bump), it is better to split commits on package |
173 |
boundaries. |
174 |
|
175 |
|
176 |
Commit messages |
177 |
--------------- |
178 |
A standard git commit message consists of three parts, in order: |
179 |
a summary line, an optional body and an optional set of tags. The parts |
180 |
are separated by a single empty line. |
181 |
|
182 |
The summary line is included in the short logs (``git log --oneline``, |
183 |
gitweb, GitHub, mail subject) and therefore should provide a short yet |
184 |
accurate description of the change. The summary line starts with |
185 |
a logical unit name, followed by a colon, a space and a short |
186 |
description of the most important changes. If a bug is associated with |
187 |
a change, then it can be included in the summary line |
188 |
as ``bug #nnnnnn``. The summary line must not exceed 69 characters, |
189 |
and must not be wrapped. |
190 |
|
191 |
The suggested logical unit name formats are: |
192 |
|
193 |
- for a package, ``category/package: …``; |
194 |
- for an eclass, ``name.eclass: …``; |
195 |
- for other directories or files, their path or filename (as long |
196 |
as a developer reading the commit messages is able to figure out what |
197 |
it is) — e.g. ``licenses/foo: …``, ``package.mask: …``. |
198 |
|
199 |
The body is included in the full commit log (``git log``, detailed |
200 |
commit info on gitweb/GitHub, mail body). It is optional, and it can be |
201 |
used to describe the commit in more detail if the summary line is not |
202 |
sufficient. It is generally a good idea to repeat the information |
203 |
contained in the summary (except for the logical unit) since the summary |
204 |
is frequently formatted as a title and not adjacent to the body. |
205 |
The body should be wrapped at 72 characters. It can contain multiple |
206 |
paragraphs, separated by empty lines. |
207 |
|
208 |
The tag part is included in the full commit log as an extension to |
209 |
the body. It consists of one or more lines consisting of a key, |
210 |
followed by a colon and a space, followed by value. Git does not |
211 |
enforce any standardization of the keys, and the tag format is *not* |
212 |
meant for machine processing. |
213 |
|
214 |
A few tags of common use are: |
215 |
|
216 |
- user-related tags: |
217 |
|
218 |
- ``Acked-by: Full Name <email@×××××××.com>`` — commit approved |
219 |
by another person (usually without detailed review), |
220 |
- ``Reported-by: Full Name <email@×××××××.com>`` — usually indicates |
221 |
full review, |
222 |
- ``Signed-off-by: Full Name <email@×××××××.com>`` — DCO approval (not |
223 |
used in Gentoo right now), |
224 |
- ``Suggested-by: Full Name <email@×××××××.com>``, |
225 |
- ``Tested-by: Full Name <email@×××××××.com>``. |
226 |
|
227 |
- commit-related tags: |
228 |
|
229 |
- ``Fixes: commit-id (commit message)`` — to indicate fixing |
230 |
an earlier commit, |
231 |
- ``Reverts: commit-id (commit message)`` — to indicate reverting |
232 |
an earlier commit, |
233 |
|
234 |
- bug tracker-related tags: |
235 |
|
236 |
- ``Bug: https://bugs.gentoo.org/NNNNNN`` — to reference a bug; |
237 |
the commit will be linked in a comment, |
238 |
- ``Closes: https://bugs.gentoo.org/NNNNNN`` — to automatically close |
239 |
a Gentoo bug (RESOLVED/FIXED, linking the commit), |
240 |
- ``Closes: https://github.com/gentoo/gentoo/pull/NNNN`` — |
241 |
to automatically close a pull request on GitHub, GitLab, BitBucket |
242 |
or a compatible service (where the commits are mirrored), |
243 |
|
244 |
- package manager tags: |
245 |
|
246 |
- ``Package-Manager: …`` — used by repoman to indicate Portage |
247 |
version, |
248 |
- ``RepoMan-Options: …`` — used by repoman to indicate repoman |
249 |
options. |
250 |
|
251 |
|
252 |
Rationale |
253 |
========= |
254 |
|
255 |
Branching model |
256 |
--------------- |
257 |
The model of multiple developers pushing concurrently to the repository |
258 |
containing all packages is preserved from CVS. The developers have |
259 |
discussed the possibility of using other models, in particular of using |
260 |
multiple branches for developers that are afterwards automatically |
261 |
merged into the master branch. However, it was determined that there is |
262 |
no need to use a more complex model at the moment and the potential |
263 |
problems with them outweighed the benefits. |
264 |
|
265 |
The necessity of rebasing is a natural consequence of concurrent work, |
266 |
along with the ban of reverse merge commits. Since rebasing a number |
267 |
of commits can take a few seconds or even more, another developer |
268 |
sometimes commits during that time, enforcing another rebase. |
269 |
|
270 |
In the past, there were cases of developers using automated scripts |
271 |
which created single commits, ran repoman and pushed them straight to |
272 |
the repository. This resulted in pushes from a single developer every |
273 |
10-15 seconds which made it impossible for other developers to rebase |
274 |
larger commit batches. This kind of workflow is therefore strongly |
275 |
discouraged. |
276 |
|
277 |
Creating multiple short-time branches is discouraged as it implies |
278 |
additional transfer for users cloning the repository and additional |
279 |
maintenance burden. Since the git migration, the developers have |
280 |
created a few branches on the repository, and did not maintain them. |
281 |
The Infra team had to query the developers about the state |
282 |
of the branches and clean them up. Keeping branches local or hosting |
283 |
them outside Gentoo Infra (e.g. on GitHub) reduces the burden on our |
284 |
users, even if the developers do not clean after themselves. |
285 |
|
286 |
|
287 |
Merge commits |
288 |
------------- |
289 |
Merge commits have been debated multiple times in various media, |
290 |
in particular IRC. They have very verbose opponents whose main argument |
291 |
is that they make history unreadable. At the same time, it has been |
292 |
frequently pointed out that merge commits have valid use cases. |
293 |
To satisfy both groups, this specification strongly discourages merge |
294 |
commits but allows their use in justified cases. |
295 |
|
296 |
Most importantly, the implicit merge commits created by ``git pull`` |
297 |
are forbidden. Those merges have no real value or justified use case, |
298 |
and since they are created implicitly by default there have been |
299 |
historical cases where developers pushed them unintentionally. They are |
300 |
banned explicitly to emphasize the necessity of adjusting git |
301 |
configuration to the developers. |
302 |
|
303 |
When processing merge commits, it is important to explicitly distinguish |
304 |
the parent that represents 'real' Gentoo history from the one(s) that |
305 |
represent external branches. The former can either be an existing |
306 |
Gentoo commit or a commit that the developer has prepared (on top of |
307 |
existing Gentoo history) before merging the branch. For this reason, it |
308 |
is important to enforce the full set of Gentoo policies on this parent |
309 |
and the commits preceding it. On the other hand, the external branches |
310 |
can be treated similarly to development branches. Relaxing the rules |
311 |
for external branches also makes it possible to merge user contributions |
312 |
with original user OpenPGP signatures, while adding a final developer |
313 |
signature on top of the merge commit. |
314 |
|
315 |
When using ``git merge foo``, the first parent represents the current |
316 |
``HEAD`` and the second one the merged branch. This is the model |
317 |
used by the specification. |
318 |
|
319 |
|
320 |
OpenPGP signatures |
321 |
------------------ |
322 |
The signature requirements strictly correspond to the git setup deployed |
323 |
by the Infrastructure team. |
324 |
|
325 |
The commit signatures provide an ability to verify the authenticity |
326 |
of all commits throughout the Gentoo repository history (to the point |
327 |
of git conversion). The push signatures mostly serve the purpose |
328 |
of additional authentication for the developer pushing a specific set |
329 |
of commits. |
330 |
|
331 |
|
332 |
Splitting commits |
333 |
----------------- |
334 |
The goal of the commit splitting rules is to make the best use of git |
335 |
while avoiding enforcing too much overhead on the developer |
336 |
and optimizing to avoid interim broken commits. |
337 |
|
338 |
Splitting commits by logical changes improves the readability and makes |
339 |
it easier to revert a specific change while preserving the remaining |
340 |
(irrelevant) changes. The changes done by a developer are easier |
341 |
to comprehend when the reviewer can follow them in the specific order |
342 |
done by the author, rather than combined with other changes. |
343 |
|
344 |
Splitting commits on logical unit boundary was used since CVS times. |
345 |
Mostly it improves readability via making it possible to include |
346 |
the unit (package, eclass…) name in the commit message — so that |
347 |
developers perceive what specific packages are affected by the change |
348 |
without having to look into diffstat. |
349 |
|
350 |
Requiring commits to be non-'broken' is meant to preserve a good quality |
351 |
git history of the repository. This means that the users can check |
352 |
an interim commit out without risking a major problem such as a missing |
353 |
dependency that is being added by the commit following it. It also |
354 |
makes it safer to revert the most recent changes with reduced risk |
355 |
of exposing a breakage. |
356 |
|
357 |
Those rules partially overlap, and if that is the case, the developers |
358 |
are expected to use common sense to determine the course of action that |
359 |
gives the best result. Furthermore, requiring the strict following |
360 |
of the rules would mean a lot of additional work for developers |
361 |
and a lot of additional commits for no real benefit. |
362 |
|
363 |
The examples are provided to make it possible for the developers to get |
364 |
a 'feeling' how to work with the rules. |
365 |
|
366 |
|
367 |
Commit messages |
368 |
--------------- |
369 |
The basic commit message format is similar to the one used by other |
370 |
projects, and provides for reasonably predictable display of results. |
371 |
|
372 |
The summary line is meant to provide a good concise summary |
373 |
of the changes. It is included in the short logs, and should include |
374 |
all the information to help developer determine whether he is interested |
375 |
in looking into the commit details. Including the logical unit name |
376 |
accounts for the fact that most of the Gentoo commits are specific |
377 |
to those units (e.g. packages). The length limit is meant to avoid |
378 |
wrapping the shortlog — which could result in unreadable ``git log |
379 |
--oneline`` or ugly mid-word ellipsis on GitHub. |
380 |
|
381 |
The body is meant to provide the detailed information for a commit. |
382 |
It is usually displayed verbatim, and the use of paragraphs along with |
383 |
line wrapping is meant to improve readability. The body should include |
384 |
the information contained in the summary since the two are sometimes |
385 |
really disjoint, and expecting the user to read body as a continuation |
386 |
of summary is confusing. For example, in ``git send-email``, |
387 |
the summary line is used to construct the mail's subject |
388 |
and is therefore disjoint from the body. |
389 |
|
390 |
The tag section is a traditional way of expressing quasi-machine- |
391 |
readable data. However, the commit messages are not really suited |
392 |
for machine use and only a few tags are actually processed by scripts. |
393 |
The specification tries to provide a concise set of potentially useful |
394 |
tags collected from various projects (the Linux kernel, X.org). Those |
395 |
tags can be used interchangeably with plaintext explanation in the body. |
396 |
|
397 |
The only tag defined by git itself is the ``Signed-off-by`` line, |
398 |
that is created by ``git commit -s``. However, Gentoo does not |
399 |
currently enforce a DCO consistently, and therefore it is meaningless. |
400 |
|
401 |
The tags subject to machine processing are the ``Bug`` and ``Closes`` |
402 |
lines. Both are used by git.gentoo.org to handle Gentoo Bugzilla |
403 |
and the latter is also used by GitHub to automatically close pull |
404 |
requests (and issues — however, Gentoo does not use GitHub's issue |
405 |
tracker). GitHub, GitLab, Bitbucket, git.gentoo.org also support |
406 |
``Fixes`` and ``Resolves`` tags (and the first three also some |
407 |
variations of them), however ``Closes`` has been already established |
408 |
in Gentoo and is used for consistency. |
409 |
|
410 |
All the remaining tags serve purely as a user convenience. |
411 |
|
412 |
Historically, Gentoo has been using a few tags starting with ``X-``. |
413 |
However, this practice was abandoned once it has been pointed out that |
414 |
git does not enforce any standard set of tags, and therefore indicating |
415 |
non-standard tags is meaningless. |
416 |
|
417 |
Gentoo developers are still frequently using ``Gentoo-Bug`` tag, |
418 |
sometimes followed by ``Gentoo-Bug-URL``. Using both |
419 |
simultaneously is meaningless (they are redundant), and using the former |
420 |
has no advantages over using the classic ``#nnnnnn`` form in the summary |
421 |
or the body. |
422 |
|
423 |
Using full URLs in ``Closes`` is necessary to properly namespace |
424 |
the action to the Gentoo services and avoid accidentally closing |
425 |
incorrect issues or pull requests when the commit is mirrored or cherry- |
426 |
picked into another repository. For consistency, they are also used |
427 |
for ``Bug`` and should be used for any future tags that might be |
428 |
introduced. This also ensures that the URLs are automatically converted |
429 |
into hyperlinks by various tools. |
430 |
|
431 |
Including the bug number in the summary of the commit message causes |
432 |
willikins to automatically expand on the bug on ``#gentoo-commits``. |
433 |
|
434 |
|
435 |
Backwards Compatibility |
436 |
======================= |
437 |
Most of the new policy will apply to the commits following its approval. |
438 |
Backwards compatibility is not relevant there. |
439 |
|
440 |
One particular point that affects commits retroactively is the OpenPGP |
441 |
signing. However, it has been an obligatory requirement enforced by |
442 |
the infrastructure since the git switch. Therefore, all the git history |
443 |
conforms to that. |
444 |
|
445 |
|
446 |
Reference Implementation |
447 |
======================== |
448 |
All of the elements requiring explicit implementation on the git |
449 |
infrastructure are implemented already. In particular this includes: |
450 |
|
451 |
- blocking force pushes on the ``master`` branch, |
452 |
- requiring signed commits on the ``master`` branch, |
453 |
- requiring signed pushes to the repository. |
454 |
|
455 |
The remaining elements are either non-obligatory or non-enforceable |
456 |
at infrastructure level. |
457 |
|
458 |
RepoMan suggests starting the commit message with package name since |
459 |
commit 46dafadff58da0 [#REPOMAN_PKG_NAME_COMMIT]_. |
460 |
|
461 |
Acknowledgements |
462 |
================ |
463 |
Most of the foundations for this specification were laid out by Julian |
464 |
Ospald (hasufell) in his initial version of Gentoo git workflow |
465 |
[#GENTOO_GIT_WORKFLOW]_ article. |
466 |
|
467 |
|
468 |
References |
469 |
========== |
470 |
.. [#GENTOO_GIT_WORKFLOW] Gentoo Git Workflow (on Gentoo Wiki) |
471 |
https://wiki.gentoo.org/wiki/Gentoo_git_workflow |
472 |
|
473 |
.. [#GENTOO_GITHUB] Gentoo GitHub (on Gentoo Wiki) |
474 |
https://wiki.gentoo.org/wiki/Gentoo_GitHub |
475 |
|
476 |
.. TODO: verify this |
477 |
.. [#GLEP63] GLEP 63: Gentoo GPG key policies |
478 |
https://www.gentoo.org/glep/glep-0063.html |
479 |
|
480 |
.. [#REPOMAN_PKG_NAME_COMMIT] |
481 |
https://gitweb.gentoo.org/proj/portage.git/commit/?id=46dafadff58da0220511f20480b73ad09f913430 |
482 |
|
483 |
|
484 |
Copyright |
485 |
========= |
486 |
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 |
487 |
Unported License. To view a copy of this license, visit |
488 |
http://creativecommons.org/licenses/by-sa/3.0/. |
489 |
|
490 |
|
491 |
-- |
492 |
Best regards, |
493 |
Michał Górny |