Gentoo Archives: gentoo-dev

From: Alexander Tsoy <alexander@××××.me>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression
Date: Mon, 12 May 2014 10:55:54
Message-Id: 20140512145544.499d9545@work.puleglot
In Reply to: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression by Alexander Tsoy
1 On Mon, 12 May 2014 14:47:36 +0400
2 Alexander Tsoy <alexander@××××.me> wrote:
3
4 > On Sun, 11 May 2014 18:26:32 -0500
5 > Gordon Pettey <petteyg359@×××××.com> wrote:
6 >
7 > > A lot of small files (e.g. AUTHORS, ChangeLog
8 > >
9 > > FWIW: On my system, I have 59M of bz2 files in /usr/share/man and
10 > > /usr/share/doc. A short script to decompress those and recompress with xz
11 > > -6e reduced that to 36M.
12 >
13 > Very strange o_O
14 >
15 > Here are my test results. xz options: "--lzma2=preset=6e,dict=4MiB".
16 > A larger dictionary size does not improve the compression ratio; I get
17 > even worse results with just "-6e" or "-9e". man-bz2 is a full copy of
18 > my /usr/share/man; man-xz is a recompressed copy of it.
19 >
20 > Size comparison:
21 >
22 > $ du -s man-bz2/ man-xz/
23 > 82032 man-bz2/
24 > 82308 man-xz/
25
26 Note that many of the files in these directories are uncompressed text
27 files or symlinks:
28
29 $ find man-bz2/ \( ! -name "*.bz2" -o -type l \) -a ! -type d | wc -l
30 8434
31 $ find man-bz2/ -name "*.bz2" -type f | wc -l
32 11243
33 $ find man-xz/ \( ! -name "*.xz" -o -type l \) -a ! -type d | wc -l
34 8434
35 $ find man-xz/ -name "*.xz" -type f | wc -l
36 11243
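
To keep the uncompressed files and symlinks counted above out of a size comparison, one option is to delete them from both scratch copies. The following is only a sketch: the helper names prune_uncompressed and count_compressed are hypothetical, the find expression is the one used above with -delete appended (supported by GNU and BSD find), and it should only ever be run on copies, never on /usr/share/man itself.

```shell
# Hypothetical helper: delete everything under DIR that is not a regular
# file named *.EXT, i.e. uncompressed files and all symlinks.
# Directories are kept. Destructive: use on a scratch copy only.
prune_uncompressed() {
    dir=$1
    ext=$2
    find "$dir" \( ! -name "*.$ext" -o -type l \) -a ! -type d -delete
}

# Hypothetical helper: count the regular compressed files that remain.
count_compressed() {
    find "$1" -name "*.$2" -type f | wc -l
}
```

Assuming the directory names from this thread, it would be used as:

$ prune_uncompressed man-bz2 bz2
$ prune_uncompressed man-xz xz
$ count_compressed man-bz2 bz2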
37
38 After removing those files and adding the -b option to du:
39
40 $ du -bs man-bz2/ man-xz/
41 32158286 man-bz2/
42 32550305 man-xz/
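
For reference, the difference between the two totals can be worked out with a small helper. This is a sketch only; size_delta is a hypothetical name, and the result is in whatever unit du printed (bytes for the -b figures above).

```shell
# Hypothetical helper: print NEW - OLD and that difference as a
# percentage of OLD. A positive result means NEW is larger.
size_delta() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        printf "%d (%.2f%%)\n", new - old, (new - old) * 100 / old
    }'
}

# Byte totals from the du -bs output above: man-bz2 first, man-xz second.
size_delta 32158286 32550305   # prints: 392019 (1.22%)
```

So in this test the xz copy comes out about 1.2% larger than the bz2 copy, not smaller.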
43
44 --
45 Alexander Tsoy