Gentoo Archives: gentoo-dev

From: Patrice Clement <monsieurp@g.o>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] Dealing with GitHub Pull Requests the easy way
Date: Tue, 18 Oct 2016 21:13:38
Message-Id: 20161018211326.GB30579@ultrachro.me
1 Hello fellow Gentoo developers and subscribers of the gentoo-dev mailing list,
2
3 I've been wanting to write this email for a while but for some reason never got
4 round to doing it due to lack of motivation and time.
5
6 I will be discussing many topics in this email revolving around git
7 essentially. I first want to go over some basic concepts about git and GitHub
8 and why we should be doing things differently if we want to avoid cluttering up
9 our repository with useless stuff.
10
11 * Background
12
13 As you know, a little while ago we've migrated the main tree to git, the
14 revision control tool which needs no introduction. A few months after the
15 migration, our repository was mirrored over to GitHub to give the project a bit
16 more exposure to what some developers refers to as "the GitHub generation". The
17 response from the community was extordinary and as a result, a massive number
18 of Pull Requests came our way. We soon then started to lend ourselves to the
19 duty of PR triaging and merging, and started to make what's called in the git
20 world "merge commits". Understanding merge commits requires understanding how
21 GitHub considers a contribution.
22
23 When a contributor sends a PR via GitHub, he will essentially be making a
24 different branch, start working on it and eventually file it. For those of you
25 familiar with git or who've already filed PRs on GitHub, this is old news.
26 However, there's a number of different way to deal with PRs on the receiving
27 end (us) in order to keep a sane log history (graph actually).
28
29 When we first started working with git, and GitHub, the tendency was to rely on
30 merge commits to merge contributions back into the main repo. In my opinion,
31 this was, and still is, a bad idea. What's so special about merge commits?
32
33 * A short walk through merge commits
34
35 As you may know, merging one branch into another often results in creating a
36 new commit. This commit is called a "merge commit" in git jargon. Let's pick
37 for instance cf4cce36684de5e449ec60bde3421fa0e27bac74. I'm not trying to put
38 the blame on a particular developer, we've all used merge commits at one point
39 or another and I was one of the first! In the log graph, this commit is
40 displayed as such:
41
42 $ git log --graph --oneline master
43 [snip]
44 * | | cf4cce3 Merge remote-tracking branch 'github/pr/1845'
45 |\ \ \
46 | * | | abf61de net-im/ejabberd: require <dev-lang/erlang-19
47 * | | | 72c688f app-cdr/xcdroast: remove old revisions
48 * | | | ced099c package.mask: update xcdroast p.mask
49 [snip]
50
51 The problem here is two fold. First off, we've created a commit which is
52 pretty much meaningless. Merge commits often tell a story which says nothing
53 interesting: Merge remote-tracking branch 'github/pr/1845'. OK, that's great
54 but we care? Not really.
55
56 The second problem stems from the very nature of merge commits. Indeed, the
57 first parent of a merge commit is the tip of master right when the branch is
58 created, in our case this is when the contributor created his branch and
59 started working on his contribution. However, git log also displays on the left
60 hand side what I shall call "rails" (no, I'm not a Ruby developer). A rail is
61 essentially a path leading back to the parent of a merge commit. It is a meant
62 to be a visual aid to help you work out when two branches veered off and
63 enventually got merged back together. As you might have noticed by running git
64 log yourself in the Gentoo git repo and looking back 6 months or a year ago,
65 there are rails all over the place and overlapping each others. Why does it
66 happen?
67
68 As I just explained above, the parent of a merge commit is the tip of master.
69 But because PRs i.e. branches are each created at a different time, the tip of
70 master is different for each of them. When merging by using a merge commit, git
71 tries really hard to put this information back together by working out the
72 parent of each merge commit. This results in a gigantic and entangled mess
73 shown by git log. I often joke that it looks like as messy version of the
74 London Tube map: colourful yet upside down.
75
76 In some open source projects, it makes sense to leverage merge commits. The
77 Linux kernel comes to mind for instance. In this case, merge commits are a good
78 way to track changes coming from a different branch. Given the sheer amount on
79 contributors working on the Linux kernel, this is useful information for
80 someone new willing to tackle a new area of the kernel. Figuring out changes
81 made to a file across several releases is extremely helpful and merge commits
82 definitely fill this gap. Also, the Linux kernel doesn't have to deal with PRs
83 since diffs are sent directly to a mailing list.
84
85 In the case of Gentoo though, it makes no sense. We should strive for keeping a
86 clean and linear history. I have yet to witness developers creating branches
87 in the Gentoo main repository. Even though the GitHub model considers PRs as
88 branches, they are in fact casual contributions and should be treated as such.
89
90 By avoiding merge commits, we make sure the history stay linear with no
91 parent/child commits all over the place. It leads us to the two remaining
92 solutions for dealing with PRs in a clean fashion: cherry-picking and git am.
93 These two solutions really shine at keeping a sane history.
94
95 Cherry-picking is not my go-to solution as far as I'm concerned. It requires a
96 bit of setup and is clearly tedious: you must know in advance the full SHA-1 of
97 commit(s) you want to cherry-pick. You must also set up remote repositories,
98 pull from them every now and then, etc. For a Git newbie, it can be daunting. A
99 few developers often opt for this solution (hi kensington!) which I do not
100 vouch for.
101
102 Eventually, we're left with git am. My favourite choice if you ask me, since it
103 requires very little to do compared to cherry-picking or making merge commits.
104 You may or may not know about it but a PR can be fetched as a git am-compatible
105 patch. If you've ever read emails sent by the GitHub bots, they point to this
106 URL:
107
108 https://github.com/gentoo/gentoo/pull/1234.patch
109
110 Once fetched, using your favourite web crawler, the patch can be directly
111 applied via the git am command onto HEAD of the repository you're dealing with.
112 There's this common idiom for fetching AND applying at patch all at once:
113
114 $ curl https://github.com/gentoo/gentoo/pull/1234.patch | git am
115
116 * This is where I'm meant to sell you my solution
117
118 Ultimately, I've decided to write a tool to leverage this way of fetching PRs
119 and merging them. The tool is called Gentoo-App-Pram and is available in the
120 tree:
121
122 # emerge Gentoo-App-Pram
123
124 It is written in Perl, works fairly well and has been used by a fair (growing?)
125 number of developers so far. The tool is CLI-based so you will need to feel at
126 home with the command line.
127
128 Once emerged, cd into your Gentoo git repo and type `pram' followed by the PR
129 number you wish to merge:
130
131 $ cd /home/patrice/gentoo
132 $ pram 1234
133
134 pram will then fetch the PR as a patch and display it to you in your
135 favourite $EDITOR. At this point, you can make any change to the PR i.e.
136 editing commit message(s), changing code in-line, etc.
137
138 pram also leverages the "Closes:" header. This header is recognised by GitHub,
139 and Larry the Cow, and will automatically close a PR when parsing it in the
140 body of a commit message. So for instance, the following header will
141 automatically close PR 1234: "Closes: https://github.com/gentoo/gentoo/pull/1234".
142 You don't need to manually add it as pram will do this for you.
143
144 After saving and getting out of $EDITOR, pram will ask you whether the PR needs
145 merging by asking a yes/no question. "y" will launch git am and merge the
146 patch whereas "n" will abort the operation and clean it up.
147
148 That's pretty much it. Make sure to read the man page since there are other
149 options available (pram --man).
150
151 pram wouldn't have been possible without Kent Fredric's help. He's assisted me
152 in releasing the package on CPAN and contributed a few patches. Kudos to him!
153
154 To wrap up:
155 - Please stop making merge commits. This strategy is not useful in the case of
156 Gentoo and does more harm than good.
157 - Cherry-pick or git-am external contributions such as PRs.
158 - Better yet, use Gentoo-App-Pram. :-)
159
160 If you want to contribute to Gentoo-App-Pram, send me a PR on GitHub at
161 https://github.com/monsieurp/Gentoo-App-Pram or file a bug report at
162 https://bugs.gentoo.org and assign it to me.
163
164 Comments and suggestions welcome.
165
166 Cheers,
167
168 --
169 Patrice Clement
170 Gentoo Linux developer
171 http://www.gentoo.org

Attachments

File name MIME type
signature.asc application/pgp-signature

Replies