Gentoo Archives: gentoo-dev

From: Alexander Tsoy <alexander@××××.me>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression
Date: Sun, 11 May 2014 19:37:52
Message-Id: 20140511233738.447f07cb@home.puleglot
In Reply to: [gentoo-dev] RFC: using .xz for doc/man/info compression by "Michał Górny"
1 On Sun, 11 May 2014 19:46:50 +0200
2 Michał Górny <mgorny@g.o> wrote:
3
4 > Hello, developers.
5 >
6 > I'd like to raise the following item for discussion: making .xz
7 > the default compressor used by portage for documentation, man pages
8 > and info files. That is, the equivalent of:
9 >
10 > PORTAGE_COMPRESS=xz
11 >
12 > in make.globals.
13 >
14 > Rationale: xz-utils is quite widespread nowadays and it is part
15 > of the @system set. It can achieve a better compression ratio than
16 > bzip2, and faster decompression at the same time.
17
18 I tried this recently. For doc/man/info and other relatively small
19 text files, xz actually has a worse compression ratio than bzip2.
19 See also:
20
21 https://bugs.gentoo.org/show_bug.cgi?id=372653
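
The claim about small text files is easy to check locally. Below is a minimal sketch using only the Python standard library; the function name `compressed_sizes` and the chosen levels (bzip2 -9, xz -6 and -6e) are assumptions made for illustration, not what Portage itself runs. Results depend entirely on the files tested, so no expected numbers are given here.

```python
import bz2
import lzma
import sys


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` raw and under each compressor."""
    return {
        "raw": len(data),
        # bzip2 at compression level 9
        "bzip2 -9": len(bz2.compress(data, compresslevel=9)),
        # xz container, default preset
        "xz -6": len(lzma.compress(data, format=lzma.FORMAT_XZ, preset=6)),
        # xz container, default preset with the "extreme" flag
        "xz -6e": len(
            lzma.compress(
                data, format=lzma.FORMAT_XZ, preset=6 | lzma.PRESET_EXTREME
            )
        ),
    }


if __name__ == "__main__":
    # Usage: python compare.py /path/to/uncompressed/manpage.1 ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, compressed_sizes(f.read()))
```

Running it over a set of uncompressed man pages and summing each column gives a rough per-compressor total to compare.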
22
23 >
24 > I have confirmed that both sys-apps/man and sys-apps/man-db can
25 > handle .xz compressed man pages, and sys-apps/texinfo can handle .xz
26 > compressed info pages. Major text editors and pagers support .xz
27 > just like .bz2 (i.e. usually they support both or neither :)).
28 >
29 > The additional question is: what preset to use? To help discuss
30 > this, I'd like to quote the tables from 'man xz':
31 >
32 > Preset   DictSize   CompCPU   CompMem   DecMem
33 >   -0     256 KiB       0        3 MiB    1 MiB
34 >   -1       1 MiB       1        9 MiB    2 MiB
35 >   -2       2 MiB       2       17 MiB    3 MiB
36 >   -3       4 MiB       3       32 MiB    5 MiB
37 >   -4       4 MiB       4       48 MiB    5 MiB
38 >   -5       8 MiB       5       94 MiB    9 MiB
39 >   -6       8 MiB       6       94 MiB    9 MiB
40 >   -7      16 MiB       6      186 MiB   17 MiB
41 >   -8      32 MiB       6      370 MiB   33 MiB
42 >   -9      64 MiB       6      674 MiB   65 MiB
43 >
44 > Preset   DictSize   CompCPU   CompMem   DecMem
45 >   -0e    256 KiB       8        4 MiB    1 MiB
46 >   -1e      1 MiB       8       13 MiB    2 MiB
47 >   -2e      2 MiB       8       25 MiB    3 MiB
48 >   -3e      4 MiB       7       48 MiB    5 MiB
49 >   -4e      4 MiB       8       48 MiB    5 MiB
50 >   -5e      8 MiB       7       94 MiB    9 MiB
51 >   -6e      8 MiB       8       94 MiB    9 MiB
52 >   -7e     16 MiB       8      186 MiB   17 MiB
53 >   -8e     32 MiB       8      370 MiB   33 MiB
54 >   -9e     64 MiB       8      674 MiB   65 MiB
55 >
56 > I'd like to note here that increasing the dictionary size beyond the
57 > file size does not improve compression. However, the options involved
58 > in CompCPU may.
59 >
60 > Depending on how much complexity is acceptable, I'd go for either:
61 >
62 > 1) -6e (or -6, the default) -- max CompCPU, reasonable use of memory,
63 > and dictionary larger than most (or all?) documents that are going to
64 > be compressed,
65 >
66 > 2) -Ne with minimal 'N' for CompCPU==8 and DictSize > filesize --
67 > still max compression ratio while keeping lowest memory requirements
68 > possible.
69 >
70 > Your thoughts?
71 >
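
Option 2 in the quoted proposal (smallest 'N' such that -Ne has CompCPU==8 and DictSize > filesize) can be sketched as a simple table lookup. The table below is copied from the quoted 'man xz' excerpt; the names `EXTREME_PRESETS` and `minimal_extreme_preset`, and the fallback to 9 for files larger than every dictionary, are assumptions made for illustration and are not part of Portage or xz.

```python
KIB = 1024
MIB = 1024 * KIB

# preset N -> (DictSize in bytes, CompCPU) for "-Ne", per the quoted table
EXTREME_PRESETS = {
    0: (256 * KIB, 8),
    1: (1 * MIB, 8),
    2: (2 * MIB, 8),
    3: (4 * MIB, 7),
    4: (4 * MIB, 8),
    5: (8 * MIB, 7),
    6: (8 * MIB, 8),
    7: (16 * MIB, 8),
    8: (32 * MIB, 8),
    9: (64 * MIB, 8),
}


def minimal_extreme_preset(file_size: int, want_cpu: int = 8) -> int:
    """Return the smallest N where -Ne has CompCPU == want_cpu and a
    dictionary strictly larger than file_size (in bytes)."""
    for n in sorted(EXTREME_PRESETS):
        dict_size, cpu = EXTREME_PRESETS[n]
        if cpu == want_cpu and dict_size > file_size:
            return n
    # Assumption: no dictionary is large enough, so use the largest preset.
    return 9


if __name__ == "__main__":
    for size in (10 * KIB, 300 * KIB, 3 * MIB):
        print(size, "-> -%de" % minimal_extreme_preset(size))
```

Note that -3e and -5e are skipped because the table lists CompCPU 7 for them, so a 3 MiB file maps to -4e rather than -3e.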
72
73 --
74 Alexander Tsoy