Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] RFC: using .xz for doc/man/info compression
Date: Sun, 11 May 2014 17:47:08
Message-Id: 20140511194650.6e39739b@pomiot.lan
Hello, developers.

I'd like to raise the following item for discussion: making .xz
the default compressor used by portage for documentation, man pages
and info files. That is, the equivalent of:

PORTAGE_COMPRESS=xz

in make.globals.

Rationale: xz-utils is widespread nowadays and is part of the
@system set. It can achieve a better compression ratio than bzip2,
and faster decompression at the same time.
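
For anyone who wants to check the ratio and decompression-speed claim on their own files, here is a minimal sketch using Python's standard bz2 and lzma modules. It is not how portage invokes the compressor; the function name compare_compressors and the sample text are made up for illustration, and both modules are used at their library defaults (bzip2 level 9, xz preset 6). Results on very small inputs can differ from those on large ones, so it is worth running this on real man pages.

```python
import bz2
import lzma
import time


def compare_compressors(data: bytes) -> dict:
    """Return compressed size and decompression time for bzip2 and xz."""
    results = {}
    for name, module in (("bzip2", bz2), ("xz", lzma)):
        packed = module.compress(data)  # library default level/preset
        start = time.perf_counter()
        unpacked = module.decompress(packed)
        elapsed = time.perf_counter() - start
        assert unpacked == data  # sanity check: lossless round trip
        results[name] = {"size": len(packed), "decompress_s": elapsed}
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a man page; replace with real file contents.
    sample = b".TH EXAMPLE 1\n.SH NAME\nexample \\- a placeholder page\n" * 200
    for name, stats in compare_compressors(sample).items():
        print(f"{name}: {stats['size']} bytes, {stats['decompress_s']:.6f} s")
```

A single timing run is noisy; averaging over many files and repetitions would give a fairer picture.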

I have confirmed that both sys-apps/man and sys-apps/man-db can
handle .xz compressed man pages, and sys-apps/texinfo can handle .xz
compressed info pages. Major text editors and pagers support .xz
just as they support .bz2 (i.e. usually they support both or neither :)).

The additional question is: which preset to use? To help discuss
this, I'd like to quote the tables from 'man xz':

Preset   DictSize   CompCPU   CompMem   DecMem
  -0     256 KiB       0        3 MiB    1 MiB
  -1       1 MiB       1        9 MiB    2 MiB
  -2       2 MiB       2       17 MiB    3 MiB
  -3       4 MiB       3       32 MiB    5 MiB
  -4       4 MiB       4       48 MiB    5 MiB
  -5       8 MiB       5       94 MiB    9 MiB
  -6       8 MiB       6       94 MiB    9 MiB
  -7      16 MiB       6      186 MiB   17 MiB
  -8      32 MiB       6      370 MiB   33 MiB
  -9      64 MiB       6      674 MiB   65 MiB

Preset   DictSize   CompCPU   CompMem   DecMem
  -0e    256 KiB       8        4 MiB    1 MiB
  -1e      1 MiB       8       13 MiB    2 MiB
  -2e      2 MiB       8       25 MiB    3 MiB
  -3e      4 MiB       7       48 MiB    5 MiB
  -4e      4 MiB       8       48 MiB    5 MiB
  -5e      8 MiB       7       94 MiB    9 MiB
  -6e      8 MiB       8       94 MiB    9 MiB
  -7e     16 MiB       8      186 MiB   17 MiB
  -8e     32 MiB       8      370 MiB   33 MiB
  -9e     64 MiB       8      674 MiB   65 MiB

I'd like to note here that increasing the dictionary size beyond the
file size does not improve compression. However, the encoder settings
reflected in CompCPU may.
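
The dictionary-size point can be checked with a small sketch using Python's lzma module, which lets an LZMA2 filter take a preset and override dict_size separately. The helper names xz_size and make_sample and the synthetic input are made up for illustration. The encoder's internal tables also scale with the dictionary size, so the two sizes may differ by a few bytes rather than match exactly.

```python
import lzma
import random

MIB = 1 << 20


def xz_size(data: bytes, preset: int, dict_size: int) -> int:
    """Compressed size of data as .xz with the given preset and dictionary."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": preset, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


def make_sample(n_words: int = 20000, seed: int = 0) -> bytes:
    """Deterministic pseudo-text, far smaller than 1 MiB (illustrative only)."""
    rng = random.Random(seed)
    vocab = [f"word{i}" for i in range(500)]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode()


if __name__ == "__main__":
    data = make_sample()
    print("input bytes:      ", len(data))
    # Same preset, two dictionaries that both exceed the input size.
    print("-6, dict 1 MiB:   ", xz_size(data, 6, 1 * MIB))
    print("-6, dict 8 MiB:   ", xz_size(data, 6, 8 * MIB))
    # Same dictionary, extreme variant of the preset (more CPU spent).
    print("-6e, dict 8 MiB:  ", xz_size(data, 6 | lzma.PRESET_EXTREME, 8 * MIB))
```

Whether the extreme variant actually wins on a given file is something to measure rather than assume.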

Depending on how much complexity we are willing to accept, I'd either
go for:

1) -6e (or -6, the default) -- max CompCPU, reasonable use of memory,
and a dictionary larger than most (or all?) documents that are going to
be compressed,

2) -Ne with the smallest 'N' for which CompCPU==8 and DictSize > filesize
-- still max compression ratio while keeping memory requirements as low
as possible.
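
Option 2 could be sketched roughly as below. This is only an illustration of the selection rule, not a proposed portage patch: the table is copied from the 'man xz' excerpt above, while the name pick_extreme_preset and the fallback to -9e for files larger than every dictionary are assumptions of mine.

```python
KIB = 1024
MIB = 1024 * KIB

# (N, DictSize in bytes, CompCPU) for presets -0e .. -9e, from 'man xz'.
EXTREME_PRESETS = [
    (0, 256 * KIB, 8),
    (1, 1 * MIB, 8),
    (2, 2 * MIB, 8),
    (3, 4 * MIB, 7),
    (4, 4 * MIB, 8),
    (5, 8 * MIB, 7),
    (6, 8 * MIB, 8),
    (7, 16 * MIB, 8),
    (8, 32 * MIB, 8),
    (9, 64 * MIB, 8),
]


def pick_extreme_preset(filesize: int) -> str:
    """Smallest -Ne with CompCPU == 8 and DictSize > filesize."""
    for n, dict_size, comp_cpu in EXTREME_PRESETS:
        if comp_cpu == 8 and dict_size > filesize:
            return f"-{n}e"
    # Assumption: nothing fits, so fall back to the largest dictionary.
    return "-9e"


if __name__ == "__main__":
    for size in (20 * KIB, 300 * KIB, 3 * MIB, 5 * MIB):
        print(size, pick_extreme_preset(size))
```

Note that -3e and -5e are skipped because the table lists CompCPU 7 for them, so a 3 MiB file lands on -4e and a 5 MiB file on -6e.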

Your thoughts?

--
Best regards,
Michał Górny
