On 2014-10-31, Rich Freeman <rich0@g.o> wrote:
> On Fri, Oct 31, 2014 at 2:55 PM, David Haller <gentoo@×××××××.de> wrote: 
>> 
>> On Fri, 31 Oct 2014, Rich Freeman wrote: 
>> 
>>>I can't imagine that any tool will do much better than something like 
>>>lzo, gzip, xz, etc. You'll definitely benefit from compression though 
>>>- your text files full of digits are encoding 3.3 bits of information
>>>in an 8-bit ascii character and even if the order of digits in pi can
>>>be treated as purely random just about any compression algorithm is 
>>>going to get pretty close to that 3.3 bits per digit figure. 
>> 
>> Good estimate: 
>> 
>> $ calc '101000/(8/3.3)' 
>> 41662.5 
>> and I get from (lzip) 
>> $ calc 44543*8/101000 
>> 3.528... (bits/digit) 
>> to zip: 
>> $ calc 49696*8/101000 
>> ~3.93 (bits/digit) 
> 
> Actually, I'm surprised how far off of this the various methods are. 
> I was expecting SOME overhead, but not this much. 
> 
> A fairly quick algorithm would be to encode every possible set of 96 
> digits into a 40 byte code (that is just a straight decimal-binary
> conversion). Then read a "word" at a time and translate it. This 
> will only waste 0.011 bits per digit. 

You're cheating. The algorithm you tested will compress strings of 
arbitrary 8-bit values. The algorithm you proposed will only compress
strings of bytes where each byte can have only one of 10 values. 
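
To make the difference concrete, here is a minimal Python sketch of the
fixed-width packing proposed above, assuming the input is a string of
ASCII decimal digits whose length is a multiple of 96. The names
pack_digits, unpack_digits and wasted_bits_per_digit are made up for
illustration; nothing like them appears in this thread.

```python
import math

DIGITS_PER_WORD = 96  # decimal digits per input "word" (from the proposal)
BYTES_PER_WORD = 40   # bytes per packed word (from the proposal)


def pack_digits(digits: str) -> bytes:
    """Pack decimal digits into 40-byte words, 96 digits at a time."""
    if len(digits) % DIGITS_PER_WORD != 0:
        raise ValueError("this sketch needs a multiple of 96 digits")
    if any(c not in "0123456789" for c in digits):
        # This is the restriction: only 10 of the 256 byte values are legal.
        raise ValueError("input may contain only the digits 0-9")
    out = bytearray()
    for i in range(0, len(digits), DIGITS_PER_WORD):
        word = int(digits[i:i + DIGITS_PER_WORD])
        # 10**96 < 2**320, so every 96-digit value fits in 40 bytes.
        out += word.to_bytes(BYTES_PER_WORD, "big")
    return bytes(out)


def unpack_digits(packed: bytes) -> str:
    """Inverse of pack_digits; restores leading zeros in each word."""
    if len(packed) % BYTES_PER_WORD != 0:
        raise ValueError("packed data must be a multiple of 40 bytes")
    parts = []
    for i in range(0, len(packed), BYTES_PER_WORD):
        word = int.from_bytes(packed[i:i + BYTES_PER_WORD], "big")
        parts.append(str(word).zfill(DIGITS_PER_WORD))
    return "".join(parts)


def wasted_bits_per_digit() -> float:
    """Bits spent per digit minus the log2(10) bits a digit can carry."""
    return BYTES_PER_WORD * 8 / DIGITS_PER_WORD - math.log2(10)
```

The sketch spends 320/96 = 3.33 bits per digit, which is where the
0.011 figure comes from, but it raises an error on any byte that is not
a digit. lzip and zip have to accept all 256 byte values, so the two
are not solving the same problem.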

-- 
Grant Edwards               grant.b.edwards        Yow! I want another
                                  at               RE-WRITE on my CEASAR
                              gmail.com            SALAD!!