1 |
Hello fellow Gentoo developers and subscribers of the gentoo-dev mailing list, |
2 |
|
3 |
I've been wanting to write this email for a while but for some reason never got |
4 |
round to doing it due to lack of motivation and time. |
5 |
|
6 |
I will be discussing many topics in this email revolving around git |
7 |
essentially. I first want to go over some basic concepts about git and GitHub |
8 |
and why we should be doing things differently if we want to avoid cluttering up |
9 |
our repository with useless stuff. |
10 |
|
11 |
* Background |
12 |
|
13 |
As you know, a little while ago we've migrated the main tree to git, the |
14 |
revision control tool which needs no introduction. A few months after the |
15 |
migration, our repository was mirrored over to GitHub to give the project a bit |
16 |
more exposure to what some developers refers to as "the GitHub generation". The |
17 |
response from the community was extordinary and as a result, a massive number |
18 |
of Pull Requests came our way. We soon then started to lend ourselves to the |
19 |
duty of PR triaging and merging, and started to make what's called in the git |
20 |
world "merge commits". Understanding merge commits requires understanding how |
21 |
GitHub considers a contribution. |
22 |
|
23 |
When a contributor sends a PR via GitHub, he will essentially be making a |
24 |
different branch, start working on it and eventually file it. For those of you |
25 |
familiar with git or who've already filed PRs on GitHub, this is old news. |
26 |
However, there's a number of different way to deal with PRs on the receiving |
27 |
end (us) in order to keep a sane log history (graph actually). |
28 |
|
29 |
When we first started working with git, and GitHub, the tendency was to rely on |
30 |
merge commits to merge contributions back into the main repo. In my opinion, |
31 |
this was, and still is, a bad idea. What's so special about merge commits? |
32 |
|
33 |
* A short walk through merge commits |
34 |
|
35 |
As you may know, merging one branch into another often results in creating a |
36 |
new commit. This commit is called a "merge commit" in git jargon. Let's pick |
37 |
for instance cf4cce36684de5e449ec60bde3421fa0e27bac74. I'm not trying to put |
38 |
the blame on a particular developer, we've all used merge commits at one point |
39 |
or another and I was one of the first! In the log graph, this commit is |
40 |
displayed as such: |
41 |
|
42 |
$ git log --graph --oneline master |
43 |
[snip] |
44 |
* | | cf4cce3 Merge remote-tracking branch 'github/pr/1845' |
45 |
|\ \ \ |
46 |
| * | | abf61de net-im/ejabberd: require <dev-lang/erlang-19 |
47 |
* | | | 72c688f app-cdr/xcdroast: remove old revisions |
48 |
* | | | ced099c package.mask: update xcdroast p.mask |
49 |
[snip] |
50 |
|
51 |
The problem here is two fold. First off, we've created a commit which is |
52 |
pretty much meaningless. Merge commits often tell a story which says nothing |
53 |
interesting: Merge remote-tracking branch 'github/pr/1845'. OK, that's great |
54 |
but we care? Not really. |
55 |
|
56 |
The second problem stems from the very nature of merge commits. Indeed, the |
57 |
first parent of a merge commit is the tip of master right when the branch is |
58 |
created, in our case this is when the contributor created his branch and |
59 |
started working on his contribution. However, git log also displays on the left |
60 |
hand side what I shall call "rails" (no, I'm not a Ruby developer). A rail is |
61 |
essentially a path leading back to the parent of a merge commit. It is a meant |
62 |
to be a visual aid to help you work out when two branches veered off and |
63 |
enventually got merged back together. As you might have noticed by running git |
64 |
log yourself in the Gentoo git repo and looking back 6 months or a year ago, |
65 |
there are rails all over the place and overlapping each others. Why does it |
66 |
happen? |
67 |
|
68 |
As I just explained above, the parent of a merge commit is the tip of master. |
69 |
But because PRs i.e. branches are each created at a different time, the tip of |
70 |
master is different for each of them. When merging by using a merge commit, git |
71 |
tries really hard to put this information back together by working out the |
72 |
parent of each merge commit. This results in a gigantic and entangled mess |
73 |
shown by git log. I often joke that it looks like as messy version of the |
74 |
London Tube map: colourful yet upside down. |
75 |
|
76 |
In some open source projects, it makes sense to leverage merge commits. The |
77 |
Linux kernel comes to mind for instance. In this case, merge commits are a good |
78 |
way to track changes coming from a different branch. Given the sheer amount on |
79 |
contributors working on the Linux kernel, this is useful information for |
80 |
someone new willing to tackle a new area of the kernel. Figuring out changes |
81 |
made to a file across several releases is extremely helpful and merge commits |
82 |
definitely fill this gap. Also, the Linux kernel doesn't have to deal with PRs |
83 |
since diffs are sent directly to a mailing list. |
84 |
|
85 |
In the case of Gentoo though, it makes no sense. We should strive for keeping a |
86 |
clean and linear history. I have yet to witness developers creating branches |
87 |
in the Gentoo main repository. Even though the GitHub model considers PRs as |
88 |
branches, they are in fact casual contributions and should be treated as such. |
89 |
|
90 |
By avoiding merge commits, we make sure the history stay linear with no |
91 |
parent/child commits all over the place. It leads us to the two remaining |
92 |
solutions for dealing with PRs in a clean fashion: cherry-picking and git am. |
93 |
These two solutions really shine at keeping a sane history. |
94 |
|
95 |
Cherry-picking is not my go-to solution as far as I'm concerned. It requires a |
96 |
bit of setup and is clearly tedious: you must know in advance the full SHA-1 of |
97 |
commit(s) you want to cherry-pick. You must also set up remote repositories, |
98 |
pull from them every now and then, etc. For a Git newbie, it can be daunting. A |
99 |
few developers often opt for this solution (hi kensington!) which I do not |
100 |
vouch for. |
101 |
|
102 |
Eventually, we're left with git am. My favourite choice if you ask me, since it |
103 |
requires very little to do compared to cherry-picking or making merge commits. |
104 |
You may or may not know about it but a PR can be fetched as a git am-compatible |
105 |
patch. If you've ever read emails sent by the GitHub bots, they point to this |
106 |
URL: |
107 |
|
108 |
https://github.com/gentoo/gentoo/pull/1234.patch |
109 |
|
110 |
Once fetched, using your favourite web crawler, the patch can be directly |
111 |
applied via the git am command onto HEAD of the repository you're dealing with. |
112 |
There's this common idiom for fetching AND applying at patch all at once: |
113 |
|
114 |
$ curl https://github.com/gentoo/gentoo/pull/1234.patch | git am |
115 |
|
116 |
* This is where I'm meant to sell you my solution |
117 |
|
118 |
Ultimately, I've decided to write a tool to leverage this way of fetching PRs |
119 |
and merging them. The tool is called Gentoo-App-Pram and is available in the |
120 |
tree: |
121 |
|
122 |
# emerge Gentoo-App-Pram |
123 |
|
124 |
It is written in Perl, works fairly well and has been used by a fair (growing?) |
125 |
number of developers so far. The tool is CLI-based so you will need to feel at |
126 |
home with the command line. |
127 |
|
128 |
Once emerged, cd into your Gentoo git repo and type `pram' followed by the PR |
129 |
number you wish to merge: |
130 |
|
131 |
$ cd /home/patrice/gentoo |
132 |
$ pram 1234 |
133 |
|
134 |
pram will then fetch the PR as a patch and display it to you in your |
135 |
favourite $EDITOR. At this point, you can make any change to the PR i.e. |
136 |
editing commit message(s), changing code in-line, etc. |
137 |
|
138 |
pram also leverages the "Closes:" header. This header is recognised by GitHub, |
139 |
and Larry the Cow, and will automatically close a PR when parsing it in the |
140 |
body of a commit message. So for instance, the following header will |
141 |
automatically close PR 1234: "Closes: https://github.com/gentoo/gentoo/pull/1234". |
142 |
You don't need to manually add it as pram will do this for you. |
143 |
|
144 |
After saving and getting out of $EDITOR, pram will ask you whether the PR needs |
145 |
merging by asking a yes/no question. "y" will launch git am and merge the |
146 |
patch whereas "n" will abort the operation and clean it up. |
147 |
|
148 |
That's pretty much it. Make sure to read the man page since there are other |
149 |
options available (pram --man). |
150 |
|
151 |
pram wouldn't have been possible without Kent Fredric's help. He's assisted me |
152 |
in releasing the package on CPAN and contributed a few patches. Kudos to him! |
153 |
|
154 |
To wrap up: |
155 |
- Please stop making merge commits. This strategy is not useful in the case of |
156 |
Gentoo and does more harm than good. |
157 |
- Cherry-pick or git-am external contributions such as PRs. |
158 |
- Better yet, use Gentoo-App-Pram. :-) |
159 |
|
160 |
If you want to contribute to Gentoo-App-Pram, send me a PR on GitHub at |
161 |
https://github.com/monsieurp/Gentoo-App-Pram or file a bug report at |
162 |
https://bugs.gentoo.org and assign it to me. |
163 |
|
164 |
Comments and suggestions welcome. |
165 |
|
166 |
Cheers, |
167 |
|
168 |
-- |
169 |
Patrice Clement |
170 |
Gentoo Linux developer |
171 |
http://www.gentoo.org |