Gentoo Archives: gentoo-dev

From: Gordon Pettey <petteyg359@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression
Date: Mon, 12 May 2014 22:55:30
Message-Id: CAHY5MedkqVsnrFkoJTGVOwdLO90ctWROEMyJb=eapoFbs_sVDw@mail.gmail.com
In Reply to: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression by Alexander Tsoy
On Mon, May 12, 2014 at 5:47 AM, Alexander Tsoy <alexander@××××.me> wrote:

> On Sun, 11 May 2014 18:26:32 -0500
> Gordon Pettey <petteyg359@×××××.com> wrote:
>
> > A lot of small files (e.g. AUTHORS, ChangeLog
> >
> > FWIW: On my system, I have 59M of bz2 files in /usr/share/man and
> > /usr/share/doc. A short script to decompress those and recompress with xz
> > -6e reduced that to 36M.
>
> Very strange o_O
>
> Here are my test results. xz options: "--lzma2=preset=6e,dict=4MiB".
> A larger dictionary size does not improve the compression ratio, and I get
> even worse results with just "-6e" or "-9e". man-bz2 is a full copy of
> my /usr/share/man, man-xz is a recompressed one.
>
> Size comparison:
>
> $ du -s man-bz2/ man-xz/
> 82032 man-bz2/
> 82308 man-xz


Did you skip all the files that weren't bz2 in the first place, and
decompress bz2 before compressing with xz? My comparison script does not
include uncompressed files. It copies all the bz2 files to a new folder,
pipes those through bzip2 -d to xz -6e to files in another new folder, then
compares the total size of those folders. Out of 8576 compressed files,
only 464 were larger in xz than in bz2. A very rough timing test I just did
showed the total decompression time of all the xz files to be half that of
decompressing all the bz2 files. I am working on getting that data per file,
along with averages.
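
For anyone who wants to repeat the comparison, here is a rough sketch of the
procedure described above. It is not the script I actually ran. It is an
illustration in Python, assuming the standard bz2 and lzma modules behave
like bzip2 -d and xz -6e. The names compare_tree and time_decompress are
made up for this example. Note that it counts compressed bytes, not disk
blocks as du does, so the totals will not match du output on a tree of many
small files.

```python
import bz2
import lzma
import time
from pathlib import Path

# Assumption: preset 6 plus the "extreme" flag is the lzma-module
# counterpart of running "xz -6e" on the command line.
XZ_PRESET = 6 | lzma.PRESET_EXTREME


def compare_tree(root):
    """Recompress every *.bz2 file under root with xz, in memory.

    Files that are not bz2 are skipped, as are symlinks, so each
    compressed file is counted once. Returns a tuple:
    (file_count, bz2_bytes, xz_bytes, files_larger_in_xz).
    """
    count = bz2_total = xz_total = larger = 0
    for path in sorted(Path(root).rglob("*.bz2")):
        if path.is_symlink() or not path.is_file():
            continue
        bz2_data = path.read_bytes()
        raw = bz2.decompress(bz2_data)  # decompress first, then recompress
        xz_data = lzma.compress(raw, format=lzma.FORMAT_XZ, preset=XZ_PRESET)
        count += 1
        bz2_total += len(bz2_data)
        xz_total += len(xz_data)
        if len(xz_data) > len(bz2_data):
            larger += 1
    return count, bz2_total, xz_total, larger


def time_decompress(blobs, decompress):
    """Return the wall-clock seconds needed to decompress every blob once."""
    start = time.perf_counter()
    for blob in blobs:
        decompress(blob)
    return time.perf_counter() - start
```

Calling compare_tree("/usr/share/man") would give the file count, both byte
totals, and the number of files that grew under xz. time_decompress can be
fed a list of bz2 blobs with bz2.decompress, then the matching xz blobs with
lzma.decompress, to get a crude total-time comparison. Per-file timings
would need the loop split out.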