Patrick Lauer <patrick@g.o> said:
> Hi all,
>
> I had this random idea that many of our distfiles are .tar.gz while more
> efficient compression methods exist. So I did some testing for fun:
>
> We have ~15k .tar.gz in distfiles. ~6500 .tar.bz2, ~2000 others.
> A short run over 477 distfiles spanning 833M gave me 586M of .tar.bz2 -
> roughly 30% more efficient!
> A comparison run with 7zip gave me 590M files, so bzip2 seems to be
> quite good.
>
> I don't think repackaging every .tar.gz as .tar.bz2 is a reasonable
> option (breaks MD5 digests, we lose the fallback download from the
> homepage), but maybe this motivates people to save bandwidth and migrate
> their packaging to bzip2.

Patrick,

did you benchmark CPU load? bzip2 often takes 3x as long as gzip to
decompress a package. Often the space savings don't justify the extra
CPU time needed to decompress the archive.
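
For anyone who wants to check this on their own hardware, here is a
minimal sketch, assuming Python 3 and only its standard gzip and bz2
modules. The function names (time_decompress, compare) and the sample
payload are made up for illustration. It times in-memory decompression
only, so disk I/O and tar extraction are not measured.

```python
import bz2
import gzip
import time


def time_decompress(decompress, blob, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` decompressions."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return best


def compare(data, repeats=5):
    """Compress `data` with gzip and bzip2; report compressed size and decompression time."""
    results = {}
    for name, module in (("gzip", gzip), ("bzip2", bz2)):
        blob = module.compress(data)
        # Sanity check: the round trip must reproduce the input exactly.
        assert module.decompress(blob) == data
        results[name] = {
            "size": len(blob),
            "seconds": time_decompress(module.decompress, blob, repeats),
        }
    return results


if __name__ == "__main__":
    # Hypothetical sample payload; substitute the bytes of a real, uncompressed tarball.
    sample = b"".join(b"line %d of some source file\n" % i for i in range(200_000))
    for name, r in compare(sample).items():
        print(f"{name}: {r['size']} bytes, {r['seconds'] * 1000:.2f} ms to decompress")
```

Results will vary with the data and the machine, so it is worth running
this against a representative set of distfiles before drawing
conclusions.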

-ryan