Gentoo Archives: gentoo-dev

From: Pacho Ramos <pacho@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression
Date: Sun, 11 May 2014 21:27:41
Message-Id: 1399843650.13920.5.camel@belkin5
In Reply to: [gentoo-dev] RFC: using .xz for doc/man/info compression by "Michał Górny"
On Sun, 11-05-2014 at 19:46 +0200, Michał Górny wrote:
> Hello, developers.
>
> I'd like to raise the following item for discussion: making xz
> the default compressor used by portage for documentation, man pages
> and info files. That is, the equivalent of:
>
> PORTAGE_COMPRESS=xz
>
> in make.globals.
>
> Rationale: xz-utils is quite widespread nowadays and is part of
> the @system set. It can achieve a better compression ratio than
> bzip2, and faster decompression at the same time.
>
> I have confirmed that both sys-apps/man and sys-apps/man-db can
> handle .xz compressed man pages, and sys-apps/texinfo can handle .xz
> compressed info pages. Major text editors and pagers support .xz
> just as they support .bz2 (i.e. they usually support both or
> neither :)).
>
> The additional question is: which preset to use? To help discuss
> this, I'd like to quote the tables from 'man xz':
>
> Preset   DictSize   CompCPU   CompMem   DecMem
>   -0     256 KiB       0        3 MiB    1 MiB
>   -1       1 MiB       1        9 MiB    2 MiB
>   -2       2 MiB       2       17 MiB    3 MiB
>   -3       4 MiB       3       32 MiB    5 MiB
>   -4       4 MiB       4       48 MiB    5 MiB
>   -5       8 MiB       5       94 MiB    9 MiB
>   -6       8 MiB       6       94 MiB    9 MiB
>   -7      16 MiB       6      186 MiB   17 MiB
>   -8      32 MiB       6      370 MiB   33 MiB
>   -9      64 MiB       6      674 MiB   65 MiB
>
> Preset   DictSize   CompCPU   CompMem   DecMem
>   -0e    256 KiB       8        4 MiB    1 MiB
>   -1e      1 MiB       8       13 MiB    2 MiB
>   -2e      2 MiB       8       25 MiB    3 MiB
>   -3e      4 MiB       7       48 MiB    5 MiB
>   -4e      4 MiB       8       48 MiB    5 MiB
>   -5e      8 MiB       7       94 MiB    9 MiB
>   -6e      8 MiB       8       94 MiB    9 MiB
>   -7e     16 MiB       8      186 MiB   17 MiB
>   -8e     32 MiB       8      370 MiB   33 MiB
>   -9e     64 MiB       8      674 MiB   65 MiB
>
> I'd like to note here that increasing the dictionary size beyond the
> file size does not improve compression. However, the options
> reflected in CompCPU may.
>
> Depending on how much complexity is acceptable, I'd go for either:
>
> 1) -6e (or -6, the default) -- max CompCPU, reasonable use of memory,
> and a dictionary larger than most (or all?) documents that are going
> to be compressed,
>
> 2) -Ne with the minimal 'N' for which CompCPU==8 and DictSize >
> filesize -- still the max compression ratio while keeping memory
> requirements as low as possible.
>
> Your thoughts?
>
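To make option 2 concrete, here is a rough Python sketch. It is not something Portage implements; the function name and the idea of picking a preset per file are assumptions for illustration only. It takes the DictSize and CompCPU columns from the quoted -Ne table and returns the smallest preset with CompCPU == 8 whose dictionary is larger than the file:

```python
KIB = 1024
MIB = 1024 * KIB

# (preset number, DictSize in bytes, CompCPU), copied from the quoted -Ne table.
EXTREME_PRESETS = [
    (0, 256 * KIB, 8),
    (1, 1 * MIB, 8),
    (2, 2 * MIB, 8),
    (3, 4 * MIB, 7),
    (4, 4 * MIB, 8),
    (5, 8 * MIB, 7),
    (6, 8 * MIB, 8),
    (7, 16 * MIB, 8),
    (8, 32 * MIB, 8),
    (9, 64 * MIB, 8),
]


def minimal_extreme_preset(file_size):
    """Hypothetical helper: smallest -Ne with CompCPU == 8 and DictSize > file_size."""
    for preset, dict_size, comp_cpu in EXTREME_PRESETS:
        if comp_cpu == 8 and dict_size > file_size:
            return "-%de" % preset
    # No dictionary in the table exceeds the file size: use the largest preset.
    return "-9e"
```

For example, a 3 MiB file skips -3e (CompCPU 7) and gets -4e, while a file of 64 MiB or more falls back to -9e, the largest preset in the table.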

Per:
https://bugs.gentoo.org/show_bug.cgi?id=372653

Looks like bzip2 was still better for small files :/
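
One way to check this locally is a small Python sketch using the standard bz2 and lzma modules. It only approximates what the bzip2 and xz command-line tools would produce, and the helper name and sample text are made up for illustration; the xz preset mirrors the -6e suggestion quoted above.

```python
import bz2
import lzma


def compressed_sizes(data):
    """Hypothetical helper: return (bzip2_size, xz_size) in bytes for one document."""
    bz = bz2.compress(data, compresslevel=9)
    # preset 6 combined with PRESET_EXTREME corresponds to "xz -6e".
    xz = lzma.compress(data, format=lzma.FORMAT_XZ,
                       preset=6 | lzma.PRESET_EXTREME)
    # Sanity check: both outputs must decompress back to the input.
    assert bz2.decompress(bz) == data
    assert lzma.decompress(xz) == data
    return len(bz), len(xz)


if __name__ == "__main__":
    # Placeholder text standing in for a short man page.
    sample = b".TH EXAMPLE 1\n.SH NAME\nexample \\- placeholder page\n" * 20
    bz_size, xz_size = compressed_sizes(sample)
    print("input=%d bzip2=%d xz=%d" % (len(sample), bz_size, xz_size))
```

Feeding it real man pages of various sizes from /usr/share/man should show where, if anywhere, the crossover lies on a given system.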
