Gentoo Archives: gentoo-dev

From: malc <mlashley@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Re: [gentoo-project] Portage repo usage survey and change evaluation
Date: Wed, 02 Mar 2016 19:48:55
Message-Id: CAPkQJpR8jbLhpAJGB=zGVoqDJHoLu1yYjhN5asqsJTT=WydJQA@mail.gmail.com
In Reply to: Re: [gentoo-dev] Re: [gentoo-project] Portage repo usage survey and change evaluation by Ulrich Mueller
I still fail to understand the bikeshedding here - you really don't
need a git checkout to get something akin to a changelog. Use the
GitHub API directly...

The following 1-liner could be trivially productised (maybe even parse
$PWD to set the path argument...)

curl -s 'https://api.github.com/repos/gentoo/gentoo/commits?path=app-admin/eselect' \
  | perl -MJSON -e 'foreach $i (@{decode_json(join("", <STDIN>))})
  { print "$i->{commit}->{author}->{name} - ",
          "$i->{commit}->{author}->{date}\n\n $i->{commit}->{message}\n"; }'
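
A slightly more productised take on the one-liner above, as a minimal
sketch in Python rather than Perl: it derives the category/package path
from the working directory, builds the same API URL, and prints the same
fields. The function names (package_path, commits_url, format_commits,
main) and the /usr/portage fallback for PORTDIR are assumptions made for
illustration, not part of any existing tool.

```python
import json
import os
import urllib.parse
import urllib.request
from pathlib import PurePosixPath

API = "https://api.github.com/repos/gentoo/gentoo/commits"


def package_path(cwd, portdir):
    """Return 'category/package' for a directory inside the tree.

    Raises ValueError if cwd is not at least two levels below portdir.
    """
    rel = PurePosixPath(cwd).relative_to(PurePosixPath(portdir))
    if len(rel.parts) < 2:
        raise ValueError("not inside a category/package directory")
    return "/".join(rel.parts[:2])


def commits_url(path):
    """Build the commits URL, filtered by path as in the one-liner."""
    return API + "?" + urllib.parse.urlencode({"path": path}, safe="/")


def format_commits(commits):
    """Render the decoded JSON list as 'author - date', blank line, message."""
    out = []
    for c in commits:
        author = c["commit"]["author"]
        out.append(
            f"{author['name']} - {author['date']}\n\n {c['commit']['message']}\n"
        )
    return "".join(out)


def main():
    # PORTDIR fallback is an assumption; adjust to the local setup.
    portdir = os.environ.get("PORTDIR", "/usr/portage")
    url = commits_url(package_path(os.getcwd(), portdir))
    with urllib.request.urlopen(url) as resp:
        print(format_commits(json.load(resp)), end="")
```

Calling main() from inside e.g. $PORTDIR/app-admin/eselect would query
the API for that package; the other functions do no network I/O.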

Yeah - it's not quite as pretty as our current ChangeLog, but date,
author/committer, commit-msg etc. are all there and you can filter by
path just the same as you would with native git log...
You could parse the local $PORTDIR/metadata/timestamp* and add an
'until' param to the URL to filter out commits newer than the point a
user has rsync'd up to...
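
The 'until' idea could look roughly like this. It is a sketch under
assumptions that should be checked against a real tree: that
$PORTDIR/metadata/timestamp.chk holds a single RFC 2822 style date line
(e.g. "Wed, 02 Mar 2016 19:30:01 +0000"), and that a missing or "-0000"
offset means UTC. The helper names are hypothetical. The 'until'
parameter of the commits endpoint takes an ISO 8601 timestamp.

```python
import urllib.parse
from datetime import timezone
from email.utils import parsedate_to_datetime
from pathlib import Path

API = "https://api.github.com/repos/gentoo/gentoo/commits"


def until_from_timestamp(text):
    """Convert an RFC 2822 date line to an ISO 8601 UTC timestamp."""
    dt = parsedate_to_datetime(text.strip())
    if dt.tzinfo is None:
        # Assumption: a date without a usable offset is already UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def commits_url_until(path, until):
    """Build the commits URL filtered by path and an upper date bound."""
    query = urllib.parse.urlencode({"path": path, "until": until}, safe="/:")
    return API + "?" + query


def url_for_synced_tree(portdir, path):
    """Read the rsync timestamp (assumed file name) and build the URL."""
    stamp = Path(portdir, "metadata", "timestamp.chk").read_text()
    return commits_url_until(path, until_from_timestamp(stamp))
```

The resulting URL can be fetched and formatted exactly like the
unfiltered one; only the query string changes.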

Cheers,
malc.


On Wed, Mar 2, 2016 at 6:14 PM, Ulrich Mueller <ulm@g.o> wrote:
>>>>>> On Wed, 2 Mar 2016, Ian Stakenvicius wrote:
>
>> On 02/03/16 03:50 AM, Ulrich Mueller wrote:
>>> How is it possible that we have 52 MiB of ChangeLog entries
>>> generated in the 0.5 years since the git conversion, whereas we had
>>> only a total of 103 MiB in the 13.5 years since ChangeLogs were
>>> introduced in 2002? Certainly our commit rate hasn't increased by
>>> more than an order of magnitude in the last half year?
>
>> The content of a changelog entry from git is a lot bigger than it
>> was just from echangelog, isn't it?
>
> Not by a factor of ten.
>
> I've investigated a bit, and the main problem seems to be that for git
> commits that extend over several directories, the commit message is
> duplicated into many ChangeLog entries.
>
> For example, the message of the initial commit 56bd759 appears in some
> 18000 files, which accounts for 25 MiB. Then there is commit eaaface
> and its revert 1bfb585, again appearing in almost all ChangeLog files
> in the tree. These account for another 9 MiB. Last example, commit
> 8849b09, another 2 MiB.
>
> So about 70% of the size is caused by these 4 tree-wide commits alone.
> However, there are many more examples of duplication on a smaller
> scale.
>
> Ulrich
