Hello,

On Mon, 12 May 2014 14:47:36 +0400 Alexander Tsoy wrote:
> On Sun, 11 May 2014 18:26:32 -0500
> Gordon Pettey <petteyg359@×××××.com> wrote:
>
> > A lot of small files (e.g. AUTHORS, ChangeLog
> >
> > FWIW: On my system, I have 59M of bz2 files in /usr/share/man and
> > /usr/share/doc. A short script to decompress those and recompress with xz
> > -6e reduced that to 36M.
>
> Very strange o_O
>
> Here is my test results. xz options: "--lzma2=preset=6e,dict=4MiB".
> Larger dictionary size does not improve compression ratio, I get
> even worse results with just "-6e" or "-9e". man-bz2 is a full copy of
> my /usr/share/man, man-xz is a recompressed one.
>
> Size comparison:
>
> $ du -s man-bz2/ man-xz/
> 82032 man-bz2/
> 82308 man-xz/

Please note that by default du reports disk usage in allocated blocks, not
the apparent size in bytes. That means that if a file is actually 1234 bytes
long, without -b it will still be accounted as 4096 bytes on a filesystem
with 4K blocks.
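
To illustrate the difference, here is a minimal sketch, assuming GNU coreutils (the -b and -B options are GNU du extensions). The temporary directory and the file name "sample" are made up for the example, and the allocated size you see depends on the filesystem, so 4096 is only what a typical 4K-block filesystem would report.

```shell
#!/bin/sh
# Create a 1234-byte file in a throwaway directory (hypothetical name "sample").
tmpdir=$(mktemp -d)
head -c 1234 /dev/zero > "$tmpdir/sample"

# Apparent size: the number of bytes stored in the file (GNU du -b).
apparent=$(du -b "$tmpdir/sample" | cut -f1)

# Allocated size in bytes: whole blocks as accounted by the filesystem.
# This value is filesystem-dependent; it is not asserted anywhere here.
allocated=$(du -B1 "$tmpdir/sample" | cut -f1)

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"

rm -r "$tmpdir"
```

With many small compressed files, the gap between the two numbers can easily dominate any difference between compressors.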

Here are my results:

1. With bzip2 -9:
find -O3 /usr/share/man -type f -name "*.bz2" -print0 | du -bhc --files0-from -
63M
find -O3 /usr/share/man -type f -name "*.bz2" -print0 | du -hc --files0-from -
146M

find -O3 /usr/share/doc -type f -name "*.bz2" -print0 | du -bhc --files0-from -
151M total
find -O3 /usr/share/doc -type f -name "*.bz2" -print0 | du -hc --files0-from -
249M total

2. With xz -9e:
find -O3 /usr/share/man -type f -name "*.xz" -print0 | du -bhc --files0-from -
64M
find -O3 /usr/share/man -type f -name "*.xz" -print0 | du -hc --files0-from -
146M

find -O3 /usr/share/doc -type f -name "*.xz" -print0 | du -bhc --files0-from -
147M total
find -O3 /usr/share/doc -type f -name "*.xz" -print0 | du -hc --files0-from -
245M total

As one can see, for man pages xz is slightly worse on apparent file sizes
and makes no difference in real disk usage. For docs xz is better on both counts.

As for decompression speed, xz is about twice as fast as bzip2 for large man
pages (bash, mplayer, cmake, zshall). However, this speed gain still needs to be
measured directly with the bunzip2 and unxz programs. I'll publish statistically
meaningful results later. Both scripting and testing require time.
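
For anyone who wants to try a direct measurement in the meantime, a rough sketch could look like the one below. It is not the test procedure referred to above, only an illustration. The function name time_decompress and the file names in the usage comments are hypothetical, and date +%s.%N assumes GNU date.

```shell
#!/bin/sh
# time_decompress CMD FILE [RUNS]
# Runs "CMD FILE" RUNS times (default 20), discards the output and prints
# the total elapsed wall-clock time in seconds.
time_decompress() {
    cmd=$1
    file=$2
    runs=${3:-20}
    start=$(date +%s.%N)    # %N (nanoseconds) is a GNU date extension
    i=0
    while [ "$i" -lt "$runs" ]; do
        # $cmd is intentionally unquoted so that "bzip2 -dc" splits into words
        $cmd "$file" > /dev/null || return 1
        i=$((i + 1))
    done
    end=$(date +%s.%N)
    echo "$end $start" | awk '{ printf "%.3f\n", $1 - $2 }'
}

# Hypothetical usage, with the same man page compressed both ways:
#   time_decompress "bzip2 -dc" bash.1.bz2
#   time_decompress "xz -dc"    bash.1.xz
```

A single pass like this measures wall-clock time only and is sensitive to caching, so several repetitions over many files would be needed before drawing conclusions.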

Best regards,
Andrew Savchenko