>>>>> On Wed, 2 Mar 2016, Ian Stakenvicius wrote:

> On 02/03/16 03:50 AM, Ulrich Mueller wrote:
>> How is it possible that we have 52 MiB of ChangeLog entries
>> generated in the 0.5 years since the git conversion, whereas we had
>> only a total of 103 MiB in the 13.5 years since ChangeLogs were
>> introduced in 2002? Certainly our commit rate hasn't increased by
>> more than an order of magnitude in the last half year?

> The content of a changelog entry from git is a lot bigger than it
> was just from echangelog, isn't it?

Not by a factor of ten.

I've investigated a bit, and the main problem seems to be that for git
commits that extend over several directories, the commit message is
duplicated into many ChangeLog entries.

For example, the message of the initial commit 56bd759 appears in some
18000 files, which accounts for 25 MiB. Then there is commit eaaface
and its revert 1bfb585, again appearing in almost all ChangeLog files
in the tree. These account for another 9 MiB. Last example, commit
8849b09, another 2 MiB.
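
A count like the ones above could be reproduced roughly as follows.
This is only a sketch, not the method used for the figures in this
mail. The helper name count_changelogs_containing is hypothetical, and
it assumes that the files of interest have names starting with
"ChangeLog" and that the duplicated text (a commit id or a line of the
commit message) appears verbatim in them.

```python
import os


def count_changelogs_containing(root, needle):
    """Return (file_count, duplicated_bytes) for ChangeLog files under root.

    needle is a bytes object, e.g. a line of a commit message.
    Assumption: relevant files have names starting with "ChangeLog".
    """
    file_count = 0
    duplicated_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith("ChangeLog"):
                continue
            with open(os.path.join(dirpath, name), "rb") as fh:
                data = fh.read()
            hits = data.count(needle)
            if hits:
                file_count += 1
                # Bytes taken up by the repeated text itself.
                duplicated_bytes += hits * len(needle)
    return file_count, duplicated_bytes
```

Called with the tree root and a distinctive line of a commit message,
this returns the number of ChangeLog files containing that line and
the number of bytes spent on the copies.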

So about 70% of the size is caused by these 4 tree-wide commits alone.
However, there are many more examples of duplication on a smaller
scale.
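
The arithmetic behind these figures can be checked with a few lines of
Python. The sketch below uses only the numbers quoted in this thread;
the variable names are illustrative, and the per-file figure is a
plain average that assumes the 25 MiB is spread evenly over the 18000
files.

```python
MIB = 1024 * 1024

# Figures quoted in the thread.
git_mib, git_years = 52, 0.5          # since the git conversion
total_mib, total_years = 103, 13.5    # since ChangeLogs were introduced

git_rate = git_mib / git_years        # 104 MiB per year
old_rate = total_mib / total_years    # about 7.6 MiB per year
rate_ratio = git_rate / old_rate      # about 13.6

# The four tree-wide commits: 56bd759, eaaface plus revert 1bfb585, 8849b09.
tree_wide_mib = 25 + 9 + 2
share = tree_wide_mib / git_mib       # about 0.69

# Average size of the initial commit's entry per ChangeLog file.
bytes_per_file = 25 * MIB / 18000     # about 1456 bytes

print(f"growth rate ratio: {rate_ratio:.1f}")
print(f"share of tree-wide commits: {share:.0%}")
print(f"initial commit, bytes per file: {bytes_per_file:.0f}")
```

The ratio of about 13.6 matches the "more than an order of magnitude"
in the original question, and 36 of 52 MiB gives the 70% above.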

Ulrich