Hi, everyone.

There have been multiple attempts at tackling this but none so far has
resulted in something official and indisputable. At the same time, we
end up having to point our users at semi-official guides which change
in unpredictable ways.

Here's the current draft:
https://wiki.gentoo.org/wiki/User:MGorny/GLEP:Git

The basic idea is that the GLEP provides basic guidelines for using git,
and then we write a proper manual on top of it (right now, all the pages
about it end up as a mix of requirements and a partial git manual).

What do you think about it? Is there anything else that needs to be
covered?

Copy of the markup for inline comments follows.

---

{{GLEP
|Number=xx
|Title=Gentoo Git Workflow
|Type=Standards Track
|Status=Draft
|Author=Michał Górny <mgorny@g.o>
}}

==Abstract==
This GLEP specifies basic standards and recommendations for using git
with the Gentoo ebuild repository. It covers only Gentoo-specific
policies, and is not meant to be a complete guide.

==Motivation==
Although the main Gentoo repository has been using git for two years
already, developers still lack official documentation on how to use git
consistently. Most developers learn unwritten standards from others
and follow them. This eventually brings consistency to some extent but
is suboptimal. Furthermore, it results in users having to learn things
the hard way instead of having proper documentation to follow.

There have been a few attempts to standardize git use over time. Most
noteworthy are the [[Gentoo git workflow]] and [[Gentoo GitHub]]
articles. However, they are not official standards of any kind, and
their focus is too broad for them to become one. There was also an
initial GLEP attempt but it never even reached the draft stage.

This GLEP aims to finally provide basic standardization for the use of
git in the Gentoo repository. It focuses purely on Gentoo-specific
standards and not on git usage in general. It is not meant to be a
complete guide but a formal basis on top of which official guides could
be created.

==Specification==
===Branching model===
The main branch of the Gentoo repository is the <kbd>master</kbd>
branch. All Gentoo developers push their work straight to the master
branch, provided that the commits meet the minimal quality standards.
The master branch is also used directly for continuous deployment of
the user repository.

Since multiple developers work on master concurrently, they may be
required to rebase multiple times before being able to push. Developers
are requested not to use workflows that could prevent others from
pushing, e.g. pushing single commits frequently instead of staging them
and using a single push.

Developers can use additional branches to facilitate review and testing
of long-term projects of larger scale. However, since git fetches all
branches by default, they should be used sparingly. For smaller
projects, local branches or repository forks are preferred.

Unless stated otherwise, the rules set by this specification apply to
the master branch only. The development branches can use relaxed rules.

Rewriting history (i.e. force pushes) of the master branch is forbidden.

===Merge commits===
The use of merge commits in the Gentoo repository is strongly
discouraged. Usually it is preferable to rebase instead. However,
developers are allowed to use merge commits in justified cases. Merge
commits can only be used to merge additional branches; the use of
implicit <kbd>git pull</kbd> merges is entirely forbidden.

In a merge commit that is committed straight to the Gentoo repository,
the first parent is expected to reference an actual Gentoo commit
preceding the merge, while the remaining parents can be used to
reference external repositories. The commits following the first parent
are required to conform to this specification like regular Gentoo
commits. The additional commits following other parents can use relaxed
rules.

===OpenPGP signatures===
Each commit in the Gentoo repository must be signed using the
committer's OpenPGP key. Furthermore, each push to the repository must
be signed using the key belonging to the developer performing the push
(matched via the SSH key).

The requirements for OpenPGP keys are covered by [[GLEP:63|GLEP 63]].

===Splitting commits===
Git commits are lightweight, and developers are encouraged to split
their commits to improve readability and make it easier to revert
specific sub-changes. When choosing how to split the commits,
developers should consider the following three rules:
# Use atomic commits: one commit per logical change.
# Split commits at logical unit (package, eclass, profile…) boundaries.
# Avoid creating commits that are 'broken', e.g. are incomplete or leave
packages uninstallable.

It is technically impossible to always respect all three rules,
so developers have to balance between them at their own discretion. Side
changes that are implied by another change (e.g. a revbump due to some
change) should be included in the first commit requiring them. Commits
should be ordered to avoid breakage, and follow logical ordering
whenever possible.

Examples:
* When doing a version bump, it is usually not reasonable to split every
necessary logical change into a separate commit since the interim
commits would correspond to a broken package. However, if the package
has a live ebuild, it ''might'' be reasonable to make the logical
changes separately on the live ebuild, then create the release as
another logical step.
* When doing one or more changes that require a revision bump, bump the
revision in the commit including the first change. Split the changes
into multiple logical commits without further revision bumps: since
they are going to be pushed in a single push, the user will not be
exposed to the interim state.
* When adding a new version of a package that should be masked, you can
include the {{Path|package.mask}} edit in the commit adding it.
Alternatively, you can add the mask in a separate commit ''preceding''
the bump.
* When doing a minor change to a large number of packages, it is
reasonable to do so in a single commit. However, when doing a major
change (e.g. a version bump), it is better to split commits on package
boundaries.

===Commit messages===
A standard git commit message consists of three parts, in order: a
summary line, an optional body and an optional set of tags. The parts
are separated by a single empty line.

The summary line is included in the short logs (<kbd>git log
--oneline</kbd>, gitweb, GitHub, mail subject) and therefore should
provide a short yet accurate description of the change. The summary line
starts with a logical unit name, followed by a colon, a space and a
short description of the most important changes. If a bug is associated
with a change, then it should be included in the summary line as
<kbd>#nnnnnn</kbd> or similar. The summary line must not exceed 69
characters, and must not be wrapped.

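The summary line rules above could be checked mechanically. What follows
is only a minimal sketch, not part of the draft or of any existing
Gentoo tool: the function name <kbd>check_summary</kbd> and the regular
expression are assumptions made for illustration, and the "unit:
description" check is deliberately loose because the unit name formats
are only suggestions.

```python
import re

MAX_SUMMARY_LENGTH = 69  # limit stated in the specification

# Hypothetical pattern (an assumption, not taken from the GLEP):
# a logical unit without whitespace, then ": ", then some text.
SUMMARY_RE = re.compile(r"^(?P<unit>\S+): (?P<description>\S.*)$")


def check_summary(summary):
    """Return a list of problems found in a commit summary line."""
    problems = []
    if "\n" in summary:
        problems.append("summary must be a single, unwrapped line")
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append(
            f"summary is {len(summary)} characters long "
            f"(limit: {MAX_SUMMARY_LENGTH})"
        )
    if SUMMARY_RE.match(summary) is None:
        problems.append("summary should look like 'unit: description'")
    return problems


if __name__ == "__main__":
    # Made-up example summaries; an empty list means no problems were found.
    print(check_summary("dev-libs/foo: Bump to 1.2.3, #123456"))
    print(check_summary("Bump foo"))
```

An empty list means the summary passed all three checks. A hook built
on something like this would probably want to warn rather than reject.
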
The suggested logical unit name formats are:
* for a package, <kbd>category/package: …</kbd>;
* for an eclass, <kbd>name.eclass: …</kbd>;
* for other directories or files, their path or filename (as long as a
developer reading the commit messages is able to figure out what it is),
e.g. <kbd>licenses/foo: …</kbd>, <kbd>package.mask: …</kbd>.

The body is included in the full commit log (<kbd>git log</kbd>,
detailed commit info on gitweb/GitHub, mail body). It is optional, and
it can be used to describe the commit in more detail if the summary line
is not sufficient. It is generally a good idea to repeat the information
contained in the summary (except for the logical unit) since the summary
is frequently formatted as a title. The body should be wrapped at 72
characters. It can contain multiple paragraphs, separated by empty
lines.

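As an illustration of the wrapping rule, here is a small sketch that
re-wraps each paragraph of a body at 72 characters using Python's
standard <kbd>textwrap</kbd> module. The helper name
<kbd>wrap_body</kbd> is hypothetical. Long words such as URLs are left
unbroken here, which is a choice of this sketch rather than something
the draft specifies, so a line holding a very long URL can still exceed
the limit.

```python
import textwrap

BODY_WIDTH = 72  # wrap width stated in the specification


def wrap_body(body):
    """Re-wrap each paragraph of a commit body at BODY_WIDTH characters."""
    # Paragraphs are separated by empty lines; skip empty fragments.
    paragraphs = [p for p in body.split("\n\n") if p.strip()]
    wrapped = [
        textwrap.fill(
            " ".join(p.split()),      # collapse existing line breaks
            width=BODY_WIDTH,
            break_long_words=False,   # keep URLs in one piece (sketch's choice)
            break_on_hyphens=False,
        )
        for p in paragraphs
    ]
    return "\n\n".join(wrapped)
```

Most editors can do the same thing interactively; the point is only
that paragraphs are wrapped independently and stay separated by a single
empty line.
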
The tag part is included in the full commit log as an extension to the
body. It consists of one or more lines, each made up of a key, followed
by a colon and a space, followed by a value. Git does not enforce any
standardization of the keys, and the tag format is ''not'' meant for
machine processing.

A few tags of common use are:
* user-related tags:
** <kbd>Acked-by: Full Name <email@×××××××.com></kbd>: commit approved
by another person (usually without detailed review),
** <kbd>Reported-by: Full Name <email@×××××××.com></kbd>,
** <kbd>Reviewed-by: Full Name <email@×××××××.com></kbd>: usually
indicates full review,
** <kbd>Signed-off-by: Full Name <email@×××××××.com></kbd>: DCO
approval (not used in Gentoo right now),
** <kbd>Suggested-by: Full Name <email@×××××××.com></kbd>,
** <kbd>Tested-by: Full Name <email@×××××××.com></kbd>.
* commit-related tags:
** <kbd>Fixes: commit-id (commit message)</kbd>: to indicate fixing a
previous commit,
** <kbd>Reverts: commit-id (commit message)</kbd>: to indicate
reverting a previous commit,
* bug tracker-related tags:
** <kbd>Bug: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd>: to
reference a bug,
** <kbd>Closes: <nowiki>https://github.com/gentoo/gentoo/pull/NNNN</nowiki></kbd>:
to automatically close a GitHub pull request,
** <kbd>Fixes: <nowiki>https://bugs.gentoo.org/NNNNNN</nowiki></kbd>:
to indicate a fixed bug,
* package manager tags:
** <kbd>Package-Manager: …</kbd>: used by repoman to indicate the
Portage version,
** <kbd>RepoMan-Options: …</kbd>: used by repoman to indicate repoman
options.

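To make the "key, colon, space, value" layout concrete, the sketch below
pulls tag lines out of the last block of a commit message. It is purely
illustrative, since the draft stresses that tags are not meant for
machine processing. The name <kbd>parse_tags</kbd> and the pattern for
keys (letters and hyphens) are assumptions of this sketch, and the
heuristic would misread a final body paragraph whose every line happens
to look like a tag.

```python
import re

# Hypothetical pattern for a tag line: "Key: value", where the key
# consists of letters and hyphens (an assumption, not from the GLEP).
TAG_RE = re.compile(r"^(?P<key>[A-Za-z][A-Za-z-]*): (?P<value>\S.*)$")


def parse_tags(message):
    """Return (key, value) pairs from the trailing tag block of a message."""
    # Message parts are separated by empty lines.
    blocks = re.split(r"\n\s*\n", message.strip())
    if len(blocks) < 2:
        return []  # only a summary line, so there is no tag block
    matches = [TAG_RE.match(line) for line in blocks[-1].splitlines()]
    if not all(matches):
        return []  # the last block is ordinary body text
    return [(m.group("key"), m.group("value")) for m in matches]


if __name__ == "__main__":
    # Made-up commit message used only for demonstration.
    example = (
        "dev-libs/foo: Fix build with a newer compiler\n"
        "\n"
        "Add a missing include.\n"
        "\n"
        "Bug: https://bugs.gentoo.org/123456\n"
        "Reviewed-by: Full Name <email@example.com>\n"
    )
    for key, value in parse_tags(example):
        print(key, "->", value)
```
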
The bug tracker-related tags can be used to extend the body message.
However, they should be skipped if the bug number is already provided in
the summary and there is no explicit body.

==Rationale==
===Branching model===
The model of multiple developers pushing concurrently to the repository
containing all packages is inherited from CVS. The developers have
discussed the possibility of using other models, in particular of using
multiple branches for developers that are afterwards automatically
merged into the master branch. However, it was determined that there is
no need to use a more complex model at the moment and that the potential
problems with the alternatives outweighed the benefits.

The necessity of rebasing is a natural consequence of concurrent work,
along with the ban on reverse merge commits. Since rebasing a number of
commits can take a few seconds or even more, another developer sometimes
commits during that time, forcing another rebase.

In the past, there were cases of developers using automated scripts
which created single commits, ran repoman and pushed them straight to
the repository. This resulted in pushes from a single developer every
10-15 seconds, which made it impossible for other developers to rebase
larger commit batches. This kind of workflow is therefore strongly
discouraged.

Creating multiple short-lived branches is discouraged as it implies
additional transfer for users cloning the repository and additional
maintenance burden. Since the git migration, developers have created
a few branches on the repository and did not maintain them. Infra
had to query the developers about the state of the branches and clean
them up. Keeping branches local or hosting them outside Gentoo Infra
(e.g. on GitHub) reduces the burden on our users, even if the developers
do not clean up after themselves.

===Merge commits===
Merge commits have been debated multiple times in various media, in
particular IRC. They have very vocal opponents whose main argument is
that they make history unreadable. At the same time, it has been
frequently pointed out that merge commits have valid use cases. To
satisfy both groups, this specification strongly discourages merge
commits but allows their use in justified cases.

Most importantly, the implicit merge commits created by <kbd>git
pull</kbd> are forbidden. Those merges have no real value or justified
use case, and since they are created implicitly by default there have
been historical cases where developers pushed them unintentionally. They
are banned explicitly to emphasize that developers need to adjust their
git configuration.

When processing merge commits, it is important to explicitly distinguish
the parent that represents 'real' Gentoo history from the one(s) that
represent external branches. The former can either be an existing Gentoo
commit or a commit that the developer has prepared (on top of existing
Gentoo history) before merging the branch. For this reason, it is
important to enforce the full set of Gentoo policies on this parent and
the commits preceding it. On the other hand, the external branches can
be treated similarly to development branches. Relaxing the rules for
external branches also makes it possible to merge user contributions
with original user OpenPGP signatures, while adding a final developer
signature on top of the merge commit.

When using <kbd>git merge ''foo''</kbd>, the first parent represents the
current <kbd>HEAD</kbd> and the second one the merged branch. This is
the model used by the specification.

===OpenPGP signatures===
The signature requirements strictly correspond to the git setup deployed
by the Infrastructure team.

The commit signatures provide an ability to verify the authenticity of
all commits throughout the Gentoo repository history (back to the point
of the git conversion). The push signatures mostly serve the purpose of
additional authentication for the developer pushing a specific set of
commits.

===Splitting commits===
The goal of the commit splitting rules is to make the best use of git
while avoiding too much overhead for the developer and avoiding interim
broken commits.

Splitting commits by logical changes improves readability and makes
it easier to revert a specific change while preserving the remaining
(unrelated) changes. The changes done by a developer are easier to
comprehend when the reviewer can follow them in the specific order done
by the author, rather than combined with other changes.

Splitting commits on logical unit boundaries has been used since CVS
times. Mostly it improves readability by making it possible to include
the unit (package, eclass…) name in the commit message, so that
developers can see which specific packages are affected by the change
without having to look into the diffstat.

Requiring commits to be non-'broken' is meant to preserve a good quality
git history of the repository. This means that users can check out an
interim commit without risking a major problem such as a missing
dependency that is being added by the commit following it. It also makes
it safer to revert the most recent changes with reduced risk of exposing
a breakage.

Those rules partially overlap, and if that is the case, the developers
are expected to use common sense to determine the course of action that
gives the best result. Furthermore, requiring the rules to be followed
strictly would mean a lot of additional work for developers and a lot
of additional commits for no real benefit.

The examples are provided to help developers get a feel for how to work
with the rules.

===Commit messages===
The basic commit message format is similar to the one used by other
projects, and provides for reasonably predictable display of results.

The summary line is meant to provide a good concise summary of the
changes. It is included in the short logs, and should include all the
information needed to help a developer determine whether they are
interested in looking into the commit details. Including the logical
unit name accounts for the fact that most of the Gentoo commits are
specific to those units (e.g. packages). The length limit is meant to
avoid wrapping the shortlog, which could result in unreadable
<kbd>git log --oneline</kbd> output or an ugly mid-word ellipsis on
GitHub.

The body is meant to provide the detailed information for a commit. It
is usually displayed verbatim, and the use of paragraphs along with line
wrapping is meant to improve readability. The body should include the
information contained in the summary since the two are sometimes
displayed separately, and expecting the user to read the body as a
continuation of the summary is confusing. For example, in
<kbd>git send-email</kbd>, the summary line is used to construct the
mail's subject and is therefore disjoint from the body.

The tag section is a traditional way of expressing
quasi-machine-readable data. However, commit messages are not really
suited for machine use and only a few tags are actually processed by
scripts. The specification tries to provide a concise set of potentially
useful tags collected from various projects (the Linux kernel, X.org).
Those tags can be used interchangeably with plaintext explanation in the
body.

The only tag defined by git itself is the <kbd>Signed-off-by</kbd> line,
which is created by <kbd>git commit -s</kbd>. However, Gentoo does not
currently enforce a DCO consistently, and therefore the tag is
meaningless there.

The only tag subject to machine processing is the <kbd>Closes</kbd> line
that is used by GitHub to automatically close pull requests (and issues,
although Gentoo does not use GitHub's issue tracker).

All the remaining tags serve purely as a user convenience.

Historically, Gentoo has been using a few tags starting with
<kbd>X-</kbd>. However, this practice was abandoned once it was pointed
out that git does not enforce any standard set of tags, and therefore
indicating non-standard tags is meaningless.

Gentoo developers still frequently use the <kbd>Gentoo-Bug</kbd> tag,
sometimes followed by <kbd>Gentoo-Bug-URL</kbd>. Using both
simultaneously is meaningless (they are redundant), and using the former
has no advantages over using the classic <kbd>#nnnnnn</kbd> form in the
summary or the body.

==Backwards Compatibility==
Most of the new policy will apply to the commits following its approval.
Backwards compatibility is not relevant there.

One particular point that affects commits retroactively is the OpenPGP
signing. However, it has been an obligatory requirement enforced by the
infrastructure since the git switch. Therefore, all of the git history
conforms to it.

==Reference implementation==
All of the elements requiring explicit implementation on the git
infrastructure are implemented already. In particular this includes:
* blocking force pushes on the <kbd>master</kbd> branch,
* requiring signed commits on the <kbd>master</kbd> branch,
* requiring signed pushes to the repository.

The remaining elements are either non-obligatory or non-enforceable at
infrastructure level.

RepoMan has suggested starting the commit message with the package name
since commit
[https://gitweb.gentoo.org/proj/portage.git/commit/?id=46dafadff58da0220511f20480b73ad09f913430 46dafadff58da0220511f20480b73ad09f913430].

==Acknowledgements==
Most of the foundations for this specification were laid out by
[[User:Hasufell|Julian Ospald (hasufell)]] in his initial version of
the [[Gentoo git workflow]] article.

==Copyright==

This work is licensed under the Creative Commons Attribution-ShareAlike
3.0 Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/.

--
Best regards,
Michał Górny