Gentoo Archives: gentoo-dev

From: Andrew Savchenko <bircoph@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression
Date: Wed, 14 May 2014 02:39:25
Message-Id: 20140514063852.c85dafc02b951814b379e16e@gmail.com
In Reply to: Re: [gentoo-dev] RFC: using .xz for doc/man/info compression by Rich Freeman
On Tue, 13 May 2014 08:18:25 -0400 Rich Freeman wrote:
> On Tue, May 13, 2014 at 7:01 AM, Andrew Savchenko <bircoph@×××××.com> wrote:
> >
> > If we are trying to consider all possible cases, some filesystems
> > may benefit even from compression of very small files (e.g. from
> > 140 to 100 bytes) due to packing of multiple small files in the
> > same inode. ReiserFS is a good example, but more may be somewhere
> > there.
> >
>
> Btrfs also supports file inlining, so every byte saved on small files
> does actually help (I believe the data structure that stores the
> inlined data doesn't have a fixed record size). Then again, btrfs
> also supports lzo compression and I believe this is fairly widely
> used, so I'm not sure that the impact of not compressing small files
> will be felt.

I did not mean inlining. I was talking about block suballocation,
which allows small files to be stored in the underused blocks of
other files:
http://en.wikipedia.org/wiki/Block_suballocation
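To make the difference concrete, here is a rough sketch of the
arithmetic. It is a toy model and not how ReiserFS actually lays out
data: the 4096-byte block size, the function names, and the assumption
that tails pack perfectly with no metadata overhead are all mine.

```python
import math

BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_without_suballocation(sizes, block_size=BLOCK_SIZE):
    """Each file occupies whole blocks; the rest of its last block is wasted."""
    return sum(math.ceil(size / block_size) for size in sizes)


def blocks_with_suballocation(sizes, block_size=BLOCK_SIZE):
    """Idealized tail packing: full blocks stay per file, tails share blocks."""
    full_blocks = sum(size // block_size for size in sizes)
    tail_bytes = sum(size % block_size for size in sizes)
    return full_blocks + math.ceil(tail_bytes / block_size)


if __name__ == "__main__":
    # 1000 small files, before (140 bytes) and after (100 bytes) compression.
    for size in (140, 100):
        files = [size] * 1000
        print(size,
              blocks_without_suballocation(files),
              blocks_with_suballocation(files))
```

In this model, shrinking 1000 files from 140 to 100 bytes saves nothing
without suballocation (each file still takes one block), while with
ideal tail packing the total drops from 35 blocks to 25.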

> I don't think ext4 supports inlining, but I see some discussions of
> attempts to add it.

Ext4 supports inlining for files up to 59 bytes:
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inline_Data

> For VERY small files I would think that overhead would become an issue.
>
> Unless we have a bunch of 30-byte man pages I'd think that both
> simplicity and some potential for utility would lead us to use the
> best algorithm possible.

Agreed, though performance should still be considered. I doubt
paq8l -9 will be used for this task, though it compresses text files
about 1.5 times better than xz -9e, even small ones like man pages;
on large files it is at least 2 times better.
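For anyone who wants to check the small-file overhead on their own man
pages, a minimal sketch using only the compressors in the Python
standard library follows. The helper name compressed_sizes is
hypothetical, and paq8l is left out because the standard library has
no binding for it, so this says nothing about the paq8l figures above.

```python
import bz2
import gzip
import lzma
import sys


def compressed_sizes(data):
    """Return the size in bytes of data as-is and under each compressor."""
    return {
        "raw": len(data),
        "gzip -9": len(gzip.compress(data, compresslevel=9)),
        "bzip2 -9": len(bz2.compress(data, compresslevel=9)),
        # preset 9 with the extreme flag corresponds to xz -9e
        "xz -9e": len(lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)),
    }


if __name__ == "__main__":
    # Usage: python sizes.py /path/to/uncompressed/page.1 [more files...]
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, compressed_sizes(handle.read()))
```

On inputs of a few dozen bytes every container format adds more
header and trailer bytes than it saves, so the output should show
where the break-even point lies for a given set of files.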

Best regards,
Andrew Savchenko