Gentoo Archives: gentoo-dev

From: Michael Orlitzky <mjo@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] ChangeLog - Infra Response; update 2015/11/11, potential impact to 30min rsync cycle
Date: Wed, 18 Nov 2015 17:55:31
Message-Id: 564CBB79.2030006@gentoo.org
In Reply to: Re: [gentoo-dev] ChangeLog - Infra Response; update 2015/11/11, potential impact to 30min rsync cycle by Peter Stuge
On 11/18/2015 09:48 AM, Peter Stuge wrote:
> Peter Stuge wrote:
>> Robin H. Johnson wrote:
>>> However, the largest sticking point, even with parallel threads, is that
>>> it seems the base ChangeLog generation is incredibly slow. It averages
>>> above 350ms per package right now (at 19k packages in a full cycle, it's
>>> a long time), but some packages can take up to 5 seconds so far.
>>
>> Which code is doing this generation? Sorry - ENOOVERVIEW. :\
>
> Bump. Does anyone know where I can take a look at this code?
>

I don't know, but since no one else is answering, I'll try to find out.
There are a few bugs on b.g.o (search "changelog") that suggest
`egencache --update-changelog` is being used. The egencache command is
part of portage, so:

  $ git clone http://anongit.gentoo.org/git/proj/portage.git

Looking at bin/egencache, you'll find a bunch of indirection, but
ultimately the generate_changelog() method of the GenChangeLogs class
is doing the work. The implementation is straightforward. I suspect the
slow part is this:

  # now grab all the commits
  revlist_cmd = ['git', self._work_tree, 'rev-list']
  if self._changelog_reversed:
      revlist_cmd.append('--reverse')
  revlist_cmd.extend(['HEAD', '--', '.'])
  commits = self.grab(revlist_cmd).split()

where

  @staticmethod
  def grab(cmd):
      p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
      return _unicode_decode(p.communicate()[0],
              encoding=_encodings['stdio'],
              errors='strict')

That rev-list call takes about half a second when I run it by hand from
the command line.
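
For anyone who wants to reproduce that measurement, here is a rough sketch
of how the rev-list call could be timed outside of egencache. It is not
portage code: time_command() and build_revlist_cmd() are hypothetical helper
names, and the work-tree argument that egencache passes is left out on the
assumption that the command is run from inside a package directory of a git
checkout.

```python
import subprocess
import sys
import time


def build_revlist_cmd(reverse=False):
    """Build a rev-list command like the one in generate_changelog().

    Assumption: the work-tree argument is omitted, so the command must be
    run with a package directory inside a git checkout as its cwd.
    """
    cmd = ['git', 'rev-list']
    if reverse:
        cmd.append('--reverse')
    cmd.extend(['HEAD', '--', '.'])
    return cmd


def time_command(cmd, cwd=None, repeats=3):
    """Run cmd `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Capture stdout through a pipe, as grab() does, so that reading
        # the output is part of the measured time.
        subprocess.run(cmd, cwd=cwd, stdout=subprocess.PIPE, check=True)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling time_command(build_revlist_cmd(), cwd=pkg_dir), where pkg_dir is
whichever package directory you want to test, returns one timing per run.
The first run is likely to be slower than the rest if the repository is not
yet in the page cache, so it is worth looking at all of the numbers rather
than only the first.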
