Gentoo Archives: gentoo-user

From: David Haller <gentoo@×××××××.de>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] OT Best way to compress files with digits
Date: Fri, 31 Oct 2014 18:58:17
Message-Id: 20141031185545.GA536@grusum.endjinn.de
In Reply to: Re: [gentoo-user] OT Best way to compress files with digits by Rich Freeman
1 Hello,
2
3 On Fri, 31 Oct 2014, Rich Freeman wrote:
4 >On Fri, Oct 31, 2014 at 11:59 AM, <meino.cramer@×××.de> wrote:
5 >> I am currently checking the compression tools I know of for the
6 >> best compression ratio. But I will definitely miss those I don't
7 >> know...
8 >> And sometimes one can do magic with options and switches of that
9 >> kind of tools that I also don't know of.
10
11 With 100k pseudo-random digits from bash's $RANDOM % 10 and a
12 linebreak every 100 digits (in t.lst) I get this (each with --best /
13 -9 / -m5 (rar) compression-level option):
14
15 $ du -b * | sort -rn
16 101000 t.lst
17 61544 t.lzop
18 50733 t.zoo
19 49696 t.zip
20 49609 t.lha
21 49554 t.gz
22 48907 t.Z
23 44942 t.rar
24 44661 t.rzip
25 44638 t.7z
26 44592 t.xz
27 44572 t.bz2
28 44546 t.lzma
29 44543 t.lzip
30
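For anyone who wants to repeat the comparison, here is a minimal sketch of how a file like t.lst could be rebuilt and compressed. The loop structure, the temporary directory, and the choice of gzip/bzip2/xz are assumptions for illustration, not the exact commands used for the listing above. Note that $RANDOM % 10 is very slightly biased, since bash's $RANDOM covers 32768 values, which is not a multiple of 10.

```shell
#!/usr/bin/env bash
# Sketch: build 100,000 pseudo-random digits with a linebreak every
# 100 digits, then compress with whichever tools are installed.
set -eu

workdir=$(mktemp -d)   # scratch directory (assumption, to keep things tidy)
cd "$workdir"

# 1000 lines of 100 digits each = 100,000 digits + 1000 newlines
for ((line = 0; line < 1000; line++)); do
    row=
    for ((col = 0; col < 100; col++)); do
        row+=$((RANDOM % 10))
    done
    printf '%s\n' "$row"
done > t.lst

# Highest compression level; write to t.<tool> so the output files are
# named after the binary that created them.
for tool in gzip bzip2 xz; do
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool" -9 -c t.lst > "t.$tool"
    fi
done

# Same listing as above: size in bytes, largest first (du -b is GNU du)
du -b * | sort -rn
```

The exact byte counts will differ from run to run because the input is random, but the ordering of the tools should look similar.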
31 What I find remarkable is that both gzip and good old compress (.Z)
32 are rather good ;) And above is probably a quite comprehensive list,
33 and except .Z, .gz and .bz2 all are named after the binaries used to
34 create them.
35
36 I'd use bzip2/xz/lz as there are e.g. [blx]z(e)(grep|cat|less), but
37 not e.g. 7zgrep, and I guess they can ease access to those archives
38 quite a bit.
39
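To illustrate what those helper tools buy you, here is a small sketch that searches and reads a compressed digit file without unpacking it first. The file name digits.lst and its contents are made up for the example, and the fallback to gzip/zgrep/zcat is only there so the sketch also runs on systems without bzip2 or xz.

```shell
#!/usr/bin/env bash
# Sketch: grep and cat a compressed file directly with the *grep / *cat
# wrappers that ship with the compressor.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Tiny stand-in for the real digit file (hypothetical content)
printf '%s\n' 3141592653 5897932384 6264338327 > digits.lst

# Pick the first compressor that is installed
for tool in bzip2 xz gzip; do
    command -v "$tool" >/dev/null 2>&1 && break
done
case $tool in
    bzip2) ext=bz2; grep_cmd=bzgrep; cat_cmd=bzcat ;;
    xz)    ext=xz;  grep_cmd=xzgrep; cat_cmd=xzcat ;;
    *)     ext=gz;  grep_cmd=zgrep;  cat_cmd=zcat  ;;
esac

"$tool" -9 -c digits.lst > "digits.lst.$ext"

# Count the lines containing "9793" straight from the archive
matches=$("$grep_cmd" -c 9793 "digits.lst.$ext")
echo "lines containing 9793: $matches"

# Look at the first line, again without unpacking to disk
first_line=$("$cat_cmd" "digits.lst.$ext" | head -n 1)
echo "first line: $first_line"
```

With 7z or rar you would have to extract to a pipe or a temporary file by hand to get the same result.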
40 >I can't imagine that any tool will do much better than something like
41 >lzo, gzip, xz, etc. You'll definitely benefit from compression though
42 >- your text files full of digits are encoding 3.3 bits of information
43 >in an 8-bit ascii character and even if the order of digits in pi can
44 >be treated as purely random just about any compression algorithm is
45 >going to get pretty close to that 3.3 bits per digit figure.
46
47 Good estimate:
48
49 $ calc '101000/(8/3.3)'
50 41662.5
51 and I get from (lzip)
52 $ calc 44543*8/101000
53 3.528... (bits/digit)
54 to zip:
55 $ calc 49696*8/101000
56 ~3.93 (bits/digit)
57
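The 3.3 bits figure is log2(10), the information content of one uniformly random decimal digit. A small sketch that recomputes the limit and the bits-per-character numbers above with awk instead of calc; the helper name bits_per_char is made up for this example, and the sizes are the ones from the du listing.

```shell
#!/usr/bin/env bash
# Sketch: theoretical limit vs. measured bits per character.
set -eu
export LC_ALL=C   # make sure awk prints a decimal point, not a comma

# bits_per_char COMPRESSED_BYTES ORIGINAL_CHARS (hypothetical helper)
bits_per_char() {
    awk -v c="$1" -v n="$2" 'BEGIN { printf "%.3f\n", c * 8 / n }'
}

# log2(10): minimum bits needed per uniformly random decimal digit
limit=$(awk 'BEGIN { printf "%.4f\n", log(10) / log(2) }')
echo "limit: $limit bits/digit"

# Smallest possible size in bytes for 100,000 such digits
# (the 1000 newlines are ignored, they carry almost no information)
min_bytes=$(awk 'BEGIN { printf "%.0f\n", 100000 * log(10) / log(2) / 8 }')
echo "minimum size: $min_bytes bytes"

# Measured values, same division by 101000 bytes as in the calc lines
lzip_bits=$(bits_per_char 44543 101000)
zip_bits=$(bits_per_char 49696 101000)
echo "lzip: $lzip_bits  zip: $zip_bits"
```

Using the exact log2(10) instead of the rounded 3.3 moves the lower bound slightly, but the conclusion stays the same: the best tools land within a few percent of the limit.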
58 HTH,
59 -dnh
60
61 --
62 Q: Hobbies?
63 A: Hating music. -- Marvin
