Gentoo Archives: gentoo-dev

From:	Duncan <1i5t5.duncan@×××.net>
To:	gentoo-dev@l.g.o
Subject:	[gentoo-dev] Re: RFC: using .xz for doc/man/info compression
Date:	Tue, 13 May 2014 17:27:41
Message-Id:	`pan$8dbef$4f96a0f6$eda0c1b3$7f9f5663@cox.net`
In Reply to:	Re: [gentoo-dev] RFC: using .xz for doc/man/info compression by Rich Freeman

1	Rich Freeman posted on Tue, 13 May 2014 08:18:25 -0400 as excerpted:
2
3	> Btrfs also supports file inlining, so every byte saved on small files
4	> does actually help (I believe the data structure that stores the inlined
5	> data doesn't have a fixed record size).
6
7	There's a mount option for it (max_inline), altho I've not messed with it
8	and don't know the default without looking it up.
9
10	The overall metadata node size (set at mkfs.btrfs time) originally
11	defaulted to the filesystem block size, which is the memory page size,
12	thus 4096 bytes on x86/amd64 and I believe arm. However, the metadata
13	node size default recently changed to 16KiB (or page size where that is
14	larger than 16KiB). I'd guess 4KiB node size users are still the majority
15	due to all the legacy btrfs out there, but 16KiB will certainly overtake
16	that at some point.
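
A minimal sketch of the node size default rule as described above. The function name is made up for illustration and this is not how mkfs.btrfs is implemented; it just restates "16KiB, or page size where that is larger".

```python
def default_nodesize(page_size: int = 4096) -> int:
    """Newer default metadata node size as described in the post.

    Hypothetical helper: 16 KiB, or the page size where that is larger.
    The older default was simply the page size itself.
    """
    return max(16 * 1024, page_size)


if __name__ == "__main__":
    # 4 KiB pages (x86/amd64) versus a 64 KiB page system.
    for page in (4096, 65536):
        print(page, "->", default_nodesize(page))
```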
17
18	Individual file inline size is certainly smaller than metadata node size,
19	but again, I've not messed with that so don't know the actual default for
20	it.
21
22	> Then again, btrfs also supports lzo compression and I believe this is
23	> fairly widely used, so I'm not sure that the impact of not compressing
24	> small files will be felt.
25
26	Of course there's zlib (gzip-class) compression as well, and it's the (now
27	legacy) default if compression is specified but not type, altho lzo is
28	recommended as faster with "good enough" compression.
29
30	The other factor to consider is replication mode. On a single device
31	filesystem data replication mode is single by default, with metadata dup
32	(two copies), except on detected ssd, where the metadata default is
33	(somewhat controversially) single due to some ssds doing internal
34	deduplication. On multi-device filesystems the metadata default is raid1
35	(two copies, regardless of the number of devices), while the data default
36	remains single.
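
The defaults just described can be written down as a small lookup. This is only a sketch of the rules as stated in this post; the function name and return shape are hypothetical, and other btrfs-progs versions may pick different defaults.

```python
def default_profiles(num_devices: int, ssd_detected: bool = False) -> dict:
    """Default data/metadata replication modes as described in the post.

    Hypothetical helper, not part of any btrfs tool.
    """
    if num_devices < 1:
        raise ValueError("need at least one device")
    if num_devices == 1:
        # Single device: metadata is dup, except single on a detected ssd.
        metadata = "single" if ssd_detected else "dup"
    else:
        # Multi-device: two-copy raid1 metadata regardless of device count.
        metadata = "raid1"
    # The data default is single in every case.
    return {"data": "single", "metadata": metadata}


if __name__ == "__main__":
    print(default_profiles(1))
    print(default_profiles(1, ssd_detected=True))
    print(default_profiles(3))
```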
37
38	So from a size perspective, assuming defaults of single data, dup or
39	raid1 metadata, uncompressed, the cutover should be near 2048 bytes.
40	Under that, the two inlined metadata copies together still take less than
41	one 4096 byte data block, while over that, sticking the file in a single-
42	mode data extent should be more efficient.
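
The arithmetic behind the 2048 byte figure, as a rough sketch. It assumes two metadata copies (dup or raid1), single-copy data, a 4096 byte data block, and it ignores per-item metadata overhead, so the real cutover would sit somewhat lower. All names are made up for illustration.

```python
BLOCK_SIZE = 4096  # data block size from the post (x86/amd64 page size)


def inline_cost(file_size: int, metadata_copies: int = 2) -> int:
    """Bytes used when the file is inlined in metadata (dup/raid1 = 2 copies)."""
    return file_size * metadata_copies


def extent_cost(file_size: int, data_copies: int = 1,
                block_size: int = BLOCK_SIZE) -> int:
    """Bytes used when stored as a data extent, rounded up to whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size * data_copies


def inline_is_smaller(file_size: int) -> bool:
    """True when two inlined copies take less room than the data extent."""
    return inline_cost(file_size) < extent_cost(file_size)


if __name__ == "__main__":
    for size in (1000, 2047, 2048, 3000):
        print(size, inline_cost(size), extent_cost(size), inline_is_smaller(size))
```

Under these assumptions inlining stops being the smaller option once the file reaches 2048 bytes, since two copies of it then fill a whole 4096 byte block.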
43
44	Bottom line, there's enough btrfs variables (inlining size, data vs.
45	metadata replication modes, metadata node size, and whether and how
46	compression is used), and the chance that gentoo btrfs users are tweaking
47	at least one of them is high enough, that I'm not sure a generic ideal
48	cutover makes a lot of sense. To the extent that there is one, it's
49	likely to be near 2048 bytes.
50
51	FWIW I believe I'm still using portage bzip2 docs compression by default
52	here, altho in the context of this thread I should really examine that
53	since I use compress=lzo at the filesystem level. Both data and metadata
54	are raid1 here, so inlining doesn't matter for replication. But AFAIK
55	inlined data is NOT compressed while data extents can be. So portage
56	level compression likely makes even less difference when bzip2 shrinks a
57	file enough to be inlined, since the same file left uncompressed by
58	portage would be too big to inline and would instead get btrfs-level
59	transparent lzo compression as a data extent.
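
To make that last case concrete, here is a hedged sketch of which path a doc file would take with and without portage-level bzip2. The inline threshold is a plain parameter because the actual default isn't established here, the "at or under the threshold" comparison is an assumption, and the whole thing takes the "inlined data is not compressed" premise at face value. The function name is hypothetical.

```python
import bz2


def storage_path(raw: bytes, max_inline: int, portage_bzip2: bool) -> str:
    """Return "inline" or "extent" for a doc file under the post's assumptions.

    Hypothetical helper: a file at or under max_inline bytes is inlined (and,
    per the post's AFAIK, not compressed by btrfs); anything larger becomes a
    data extent that compress=lzo could then compress.
    """
    payload = bz2.compress(raw) if portage_bzip2 else raw
    return "inline" if len(payload) <= max_inline else "extent"


if __name__ == "__main__":
    doc = b"some repetitive man page text\n" * 400  # about 12 KB uncompressed
    limit = 2048  # assumed threshold, purely for illustration
    print("with portage bzip2:   ", storage_path(doc, limit, True))
    print("without portage bzip2:", storage_path(doc, limit, False))
```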
60
61	--
62	Duncan - List replies preferred. No HTML msgs.
63	"Every nonfree program has a lord, a master --
64	and if you use the program, he is your master." Richard Stallman
