Gentoo Archives: gentoo-portage-dev

From: Matt Turner <mattst88@g.o>
To: gentoo-portage-dev@l.g.o
Cc: git@×××××××××××.org
Subject: [gentoo-portage-dev] Re: [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword
Date: Tue, 22 Dec 2020 22:40:47
Message-Id: CAEdQ38E9Fepp9gmidcf_HvFMacwPZBr0XgPT5HFs8bHw-SJDZQ@mail.gmail.com
In Reply to: [gentoo-portage-dev] [RFC PATCH gentoolkit] bin: Add merge-driver-ekeyword by Matt Turner
tl;dr:

I want to handle conflicts automatically on lines like

> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"

where conflicts are frequently caused by adding/removing ~ before the
architecture names or adding/removing whole architectures. I don't
know if I should use a custom git merge driver or a custom git merge
strategy.


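To make the desired resolution concrete, here is a minimal sketch of a
per-architecture three-way merge of a KEYWORDS line. It is not what the
patch below does (the patch replays the changes through ekeyword); the
names parse_keywords and merge_keywords are hypothetical, and plain
alphabetical sorting is an assumption standing in for ekeyword's ordering.

```python
from typing import Dict, Optional


def parse_keywords(line: str) -> Dict[str, str]:
    """Map each arch to its prefix ('', '~' or '-') in a KEYWORDS="..." line."""
    start = line.find('"') + 1
    end = line.rfind('"')
    result: Dict[str, str] = {}
    for kw in line[start:end].split():
        prefix = kw[0] if kw[0] in '~-' else ''
        result[kw[len(prefix):]] = prefix
    return result


def merge_keywords(base: str, ours: str, theirs: str) -> Optional[str]:
    """Three-way merge per arch; return None when both sides disagree."""
    b, o, t = (parse_keywords(x) for x in (base, ours, theirs))
    merged: Dict[str, str] = {}
    for arch in sorted(set(b) | set(o) | set(t)):
        vb, vo, vt = b.get(arch), o.get(arch), t.get(arch)
        if vo == vt:
            value = vo          # both sides agree (or neither changed it)
        elif vo == vb:
            value = vt          # only the other branch changed this arch
        elif vt == vb:
            value = vo          # only our branch changed this arch
        else:
            return None         # both changed it differently: real conflict
        if value is not None:   # None means the arch was dropped
            merged[arch] = value
    return 'KEYWORDS="' + ' '.join(p + a for a, p in merged.items()) + '"'
```

With the line above as the base, one side dropping ~alpha and the other
stabilizing hppa merge cleanly; if both sides change the same
architecture in different ways, the sketch returns None and the conflict
is left for a human.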
So the program in the patch below works, but it's not ideal, because
it rejects any hunks that don't touch the KEYWORDS=... assignment.

As I understand it, a custom git merge driver is intended to be used
to merge whole file formats, like JSON. As a result, you configure it
via gitattributes on a per-extension basis.

I really just want to make the default recursive git merge handle
KEYWORDS=... conflicts automatically, and I don't expect to be able to
make a git merge driver that can handle arbitrary conflicts in
*.ebuild files. The merge driver returns non-zero if it was unable
to resolve the conflicts, but when it does so git evidently doesn't
fall back and insert the typical <<< HEAD ... === ... >>> markers.
Maybe I could make my merge driver insert those like git normally
does? Seems like git's logic is probably a bit better about handling
some conflicts than my tool would be.

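One possible way to get the usual markers, sketched under the assumption
that the driver may simply delegate to git merge-file whenever it cannot
resolve the KEYWORDS line itself. git merge-file <current> <base> <other>
writes its result, conflict markers included, into <current> and exits
with the number of conflicts, which matches what git expects from a
merge driver. The helper names and the label strings are hypothetical.

```python
import subprocess
from typing import List


def merge_file_command(base: str, ours: str, theirs: str) -> List[str]:
    """Build the git merge-file invocation for the driver's %O %A %B files."""
    return [
        'git', 'merge-file',
        # Labels shown in the conflict markers: current, base, other
        '-L', 'HEAD', '-L', 'base', '-L', 'theirs',
        ours, base, theirs,   # argument order is <current> <base> <other>
    ]


def fallback_merge(base: str, ours: str, theirs: str) -> int:
    """Run git's own three-way file merge; the result is written into `ours`.

    Returns 0 for a clean merge and non-zero otherwise, so the value can be
    passed straight to sys.exit() by the merge driver.
    """
    return subprocess.run(merge_file_command(base, ours, theirs)).returncode
```

In the driver below, this would replace the bare sys.exit(-1) taken when
no KEYWORDS-only change is found, e.g. sys.exit(fallback_merge(O, A, B)).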
So... is a git merge strategy the thing I want? I don't know. There
doesn't seem to really be any documentation on writing git merge
strategies. I've only found [1] and [2].

Cc'ing git@×××××××××××.org, since I expect that's where the experts
are. Hopefully they have suggestions.


[1] https://stackoverflow.com/questions/23140240/git-how-do-i-add-a-custom-merge-strategy
[2] https://stackoverflow.com/questions/54528824/any-documentation-for-writing-a-custom-git-merge-strategy


On Sun, Dec 20, 2020 at 10:44 PM Matt Turner <mattst88@g.o> wrote:
>
> Since the KEYWORDS=... assignment is a single line, git struggles to
> handle conflicts. When rebasing a series of commits that modify the
> KEYWORDS=... assignment, it's usually easier to throw them away and
> reapply on the new tree than it is to manually handle conflicts
> during the rebase.
>
> git allows a 'merge driver' program to handle conflicts; this program
> handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
> with these keywords:
>
> KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"
>
> One developer drops the ~alpha keyword and pushes to gentoo.git, and
> another developer stabilizes hppa. Without this merge driver, git
> requires the second developer to manually resolve the conflict. With
> the custom merge driver, it automatically resolves the conflict.
>
> gentoo.git/.git/config:
>
> [core]
>     ...
>     attributesfile = ~/.gitattributes
> [merge "keywords"]
>     name = KEYWORDS merge driver
>     driver = merge-driver-ekeyword %O %A %B
>
> ~/.gitattributes:
>
> *.ebuild merge=keywords
>
> Signed-off-by: Matt Turner <mattst88@g.o>
> ---
> One annoying wart in the program is due to the fact that ekeyword
> won't work on any file not named *.ebuild. I make a symlink (and set up
> an atexit handler to remove it) to work around this. I'm not sure we
> could make ekeyword handle arbitrary filenames given its complex multi-
> argument parameter support. git merge files are named .merge_file_XXXXX
> according to git-unpack-file(1), so we could allow those. Thoughts?
>
>  bin/merge-driver-ekeyword | 125 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 125 insertions(+)
>  create mode 100755 bin/merge-driver-ekeyword
>
> diff --git a/bin/merge-driver-ekeyword b/bin/merge-driver-ekeyword
> new file mode 100755
> index 0000000..6e645a9
> --- /dev/null
> +++ b/bin/merge-driver-ekeyword
> @@ -0,0 +1,125 @@
> +#!/usr/bin/python
> +#
> +# Copyright 2020 Gentoo Authors
> +# Distributed under the terms of the GNU General Public License v2 or later
> +
> +"""
> +Custom git merge driver for handling conflicts in KEYWORDS assignments
> +
> +See https://git-scm.com/docs/gitattributes#_defining_a_custom_merge_driver
> +"""
> +
> +import atexit
> +import difflib
> +import os
> +import shutil
> +import sys
> +
> +from typing import List, Optional, Tuple
> +
> +from gentoolkit.ekeyword import ekeyword
> +
> +
> +def keyword_array(keyword_line: str) -> List[str]:
> +    # Find indices of string inside the double-quotes
> +    i1: int = keyword_line.find('"') + 1
> +    i2: int = keyword_line.rfind('"')
> +
> +    # Split into array of KEYWORDS
> +    return keyword_line[i1:i2].split(' ')
> +
> +
> +def keyword_line_changes(old: str, new: str) -> List[Tuple[Optional[str],
> +                                                           Optional[str]]]:
> +    a: List[str] = keyword_array(old)
> +    b: List[str] = keyword_array(new)
> +
> +    s = difflib.SequenceMatcher(a=a, b=b)
> +
> +    changes = []
> +    for tag, i1, i2, j1, j2 in s.get_opcodes():
> +        if tag == 'replace':
> +            changes.append((a[i1:i2], b[j1:j2]),)
> +        elif tag == 'delete':
> +            changes.append((a[i1:i2], None),)
> +        elif tag == 'insert':
> +            changes.append((None, b[j1:j2]),)
> +        else:
> +            assert tag == 'equal'
> +    return changes
> +
> +
> +def keyword_changes(ebuild1: str, ebuild2: str) -> List[Tuple[Optional[str],
> +                                                              Optional[str]]]:
> +    with open(ebuild1) as e1, open(ebuild2) as e2:
> +        lines1 = e1.readlines()
> +        lines2 = e2.readlines()
> +
> +    diff = difflib.unified_diff(lines1, lines2, n=0)
> +    assert next(diff) == '--- \n'
> +    assert next(diff) == '+++ \n'
> +
> +    hunk: int = 0
> +    old: str = ''
> +    new: str = ''
> +
> +    for line in diff:
> +        if line.startswith('@@ '):
> +            if hunk > 0: break
> +            hunk += 1
> +        elif line.startswith('-'):
> +            if old or new: break
> +            old = line
> +        elif line.startswith('+'):
> +            if not old or new: break
> +            new = line
> +    else:
> +        if 'KEYWORDS=' in old and 'KEYWORDS=' in new:
> +            return keyword_line_changes(old, new)
> +    return None
> +
> +
> +def apply_keyword_changes(ebuild: str,
> +                          changes: List[Tuple[Optional[str],
> +                                              Optional[str]]]) -> int:
> +    # ekeyword will only modify files named *.ebuild, so make a symlink
> +    ebuild_symlink = ebuild + '.ebuild'
> +    os.symlink(ebuild, ebuild_symlink)
> +    atexit.register(lambda: os.remove(ebuild_symlink))
> +
> +    for removals, additions in changes:
> +        args = []
> +        for rem in removals or []:
> +            # Drop leading '~' and '-' characters and prepend '^'
> +            i = 1 if rem[0] in ('~', '-') else 0
> +            args.append('^' + rem[i:])
> +        if additions:
> +            args.extend(additions)
> +        args.append(ebuild_symlink)
> +
> +        result = ekeyword.main(args)
> +        if result != 0:
> +            return result
> +    return 0
> +
> +
> +def main(argv):
> +    if len(argv) != 4:
> +        sys.exit(-1)
> +
> +    O = argv[1] # %O - filename of original
> +    A = argv[2] # %A - filename of our current version
> +    B = argv[3] # %B - filename of the other branch's version
> +
> +    # Get changes from %O to %B
> +    changes = keyword_changes(O, B)
> +    if not changes:
> +        sys.exit(-1)
> +
> +    # Apply O -> B changes to A
> +    result: int = apply_keyword_changes(A, changes)
> +    sys.exit(result)
> +
> +
> +if __name__ == "__main__":
> +    main(sys.argv)
> --
> 2.26.2
>
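
Regarding the symlink workaround in apply_keyword_changes: a minimal
sketch of the same idea as a context manager, so the alias is removed as
soon as ekeyword is done instead of at interpreter exit. The name
ebuild_alias is hypothetical, and linking to the absolute path is an
assumption made so the alias also resolves when the merge file is not in
the current directory.

```python
import contextlib
import os
from typing import Iterator


@contextlib.contextmanager
def ebuild_alias(path: str) -> Iterator[str]:
    """Yield a temporary '<path>.ebuild' symlink pointing at `path`."""
    alias = path + '.ebuild'
    # Link to the absolute path so the target does not depend on the
    # directory the symlink happens to live in.
    os.symlink(os.path.abspath(path), alias)
    try:
        yield alias
    finally:
        # Remove only the symlink; the merge file itself is left in place.
        os.remove(alias)
```

Usage would look like "with ebuild_alias(A) as name: ..." around the
ekeyword.main() calls, passing name in place of ebuild_symlink.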