Gentoo Archives: gentoo-dev

From: Ulrich Mueller <ulm@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Re: [gentoo-project] Portage repo usage survey and change evaluation
Date: Wed, 02 Mar 2016 18:14:28
Message-Id: 22231.11642.809779.509501@a1i15.kph.uni-mainz.de
In Reply to: Re: [gentoo-dev] Re: [gentoo-project] Portage repo usage survey and change evaluation by Ian Stakenvicius
>>>>> On Wed, 2 Mar 2016, Ian Stakenvicius wrote:

> On 02/03/16 03:50 AM, Ulrich Mueller wrote:
>> How is it possible that we have 52 MiB of ChangeLog entries
>> generated in the 0.5 years since the git conversion, whereas we had
>> only a total of 103 MiB in the 13.5 years since ChangeLogs were
>> introduced in 2002? Certainly our commit rate hasn't increased by
>> more than an order of magnitude in the last half year?

> The content of a changelog entry from git is a lot bigger than it
> was just from echangelog, isn't it?

Not by a factor of ten.

I've investigated a bit, and the main problem seems to be that for git
commits that extend over several directories, the commit message is
duplicated into many ChangeLog entries.

For example, the message of the initial commit 56bd759 appears in some
18000 files, which accounts for 25 MiB. Then there is commit eaaface
and its revert 1bfb585, again appearing in almost all ChangeLog files
in the tree. These account for another 9 MiB. Last example, commit
8849b09, another 2 MiB.

So about 70% of the size is caused by these 4 tree-wide commits alone.
However, there are many more examples of duplication on a smaller
scale.
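
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python. It uses only the rounded figures quoted above (52 MiB total, and
25 + 9 + 2 MiB for the tree-wide commits); the variable names are made up
for illustration, and "some 18000 files" is taken as exactly 18000, so
the per-copy size is only a rough estimate.

```python
MIB = 1024 * 1024

# Rounded sizes in MiB, as quoted in the message (not re-measured).
total_since_git = 52
tree_wide = {
    "56bd759 (initial commit)": 25,
    "eaaface and its revert 1bfb585": 9,
    "8849b09": 2,
}

# Share of the post-conversion ChangeLog data caused by these commits.
duplicated = sum(tree_wide.values())
share = duplicated / total_since_git

# Assumption: "some 18000 files" treated as exactly 18000 copies.
files_with_initial = 18000
bytes_per_copy = tree_wide["56bd759 (initial commit)"] * MIB / files_with_initial

print(f"{duplicated} MiB of {total_since_git} MiB = {share:.0%}")
print(f"initial commit: about {bytes_per_copy:.0f} bytes per ChangeLog copy")
```

With these inputs the share comes out at 69%, which matches the "about
70%" above, and each copy of the initial commit's entry works out to
roughly 1.4 KiB.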

Ulrich
