Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Portage spokes again...
Date: Thu, 22 Dec 2016 04:03:42
Message-Id: CAGfcS_mHiPQHDkTWM+kURkxBasmxnYBFOKeppUYAxoJfB7TdVA@mail.gmail.com
In Reply to: Re: [gentoo-user] Portage spokes again... by Daniel Campbell
1 On Wed, Dec 21, 2016 at 7:46 PM, Daniel Campbell <zlg@g.o> wrote:
2 >
3 > How does a file take up less than a single FS block? An inode has to be
4 > allocated _somewhere_, does it not?
5 >
6
7 So, the details are going to be filesystem-specific, but typically
8 inodes go into some kind of area of the disk reserved for metadata, so
9 that many of them can be stored in a single disk block. They're also
10 fixed-length so storing them in groups lets you address them as an
11 array. Likewise for directory trees, allocation tracking, and so on.
12
13 In ext4 inodes are 256 bytes by default. So, obviously you can fit 16
14 of those in a single 4k disk block.
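
To make the array addressing concrete, here is a rough Python sketch of the arithmetic. The helper name and the single flat table are my own simplification for illustration; real ext4 keeps a separate inode table per block group and numbers inodes from 1, so don't read this as the actual on-disk lookup.

```python
INODE_SIZE = 256   # ext4 default mentioned above
BLOCK_SIZE = 4096  # a 4k disk block

INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE


def locate_inode(index):
    """Return (block, byte offset) for a 0-based slot in a flat inode table.

    Hypothetical helper: assumes one contiguous table of fixed-length
    records, which is a simplification of what ext4 really does.
    """
    block, slot = divmod(index, INODES_PER_BLOCK)
    return block, slot * INODE_SIZE


print(INODES_PER_BLOCK)   # 16
print(locate_inode(37))   # (2, 1280)
```

Because every record is the same length, finding inode N is just a division and a multiplication, with no searching involved.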
15
16 60 of those bytes are used to map the inode to the extents on the disk
17 that contain the file's data. If the data fits in those 60 bytes then
18 ext4 can store it inside the inode itself, since the mapping isn't
19 needed in that case. This is how short symlink targets are stored, and
20 regular files get the same treatment when the inline_data feature is
enabled. That saves a whole block.
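
If you want to check what your own filesystem does with a tiny file, something like the following should work on Linux. It is only a sketch: the function name is made up, and the result depends entirely on the filesystem and its options (on ext4, for example, regular files are only stored inline when the inline_data feature is turned on).

```python
import os
import tempfile


def size_vs_allocated(data):
    """Write data to a temp file; return (apparent size, allocated bytes)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Force allocation now; with delayed allocation the block
            # count may otherwise still read as 0 right after the write.
            os.fsync(f.fileno())
        st = os.stat(path)
        # On Linux st_blocks counts 512-byte units, whatever the
        # filesystem's own block size is.
        return st.st_size, st.st_blocks * 512
    finally:
        os.unlink(path)


print(size_vs_allocated(b"hello\n"))
```

An allocated size of 0 suggests the data lives in metadata; a full block (typically 4096) means the filesystem gave the file a block of its own.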
21
22 Other filesystems do things differently. I don't profess to be a
23 general expert, but I have read a fair bit on btrfs. Btrfs allocates
24 space in b+ tree nodes that contain fixed-length records on one side
25 (which would store things like inodes and other metadata records), and
26 a heap full of variable-length records on the other. The latter can
27 be used to store the content of small files. I believe btrfs can also
28 use metadata space to store small regions of files as well (such as if
29 you have a file that is just a few bytes larger than the next block
30 boundary, or when you overwrite 1 byte of a large file which in btrfs
31 does not get done in-place).
32
33 The optimization of storing small bits of data without using entire
34 blocks is a pretty common one. Another common optimization is dealing
35 with large runs of zeros in files. If you leave a gigabyte-sized hole
36 in a file (seeking past it rather than writing the zeros out), most
37 filesystems will not allocate a gigabyte for it, even without compression.
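
A quick way to see this for yourself is to make a file that is nothing but a hole and compare its length to what is actually allocated. Again, just a sketch: the 64 MiB size and the function name are arbitrary choices of mine, and a filesystem without sparse-file support would report the full size as allocated.

```python
import os
import tempfile

SIZE = 64 * 1024 * 1024  # 64 MiB, picked arbitrarily for the demo


def make_sparse(size):
    """Create a file of `size` bytes without writing any data.

    Returns (apparent size, allocated bytes).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)  # extends the file, leaving one big hole
        os.fsync(fd)
        st = os.fstat(fd)
        # st_blocks is in 512-byte units on Linux
        return st.st_size, st.st_blocks * 512
    finally:
        os.close(fd)
        os.unlink(path)


apparent, allocated = make_sparse(SIZE)
print(apparent, allocated)
```

Reading the file back returns zeros for the whole range; the filesystem just never stored them.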
38
39 And of course you have stuff that consumes nothing but inodes, like
40 short symlinks and device nodes and such.
41
42 It isn't surprising that these optimizations are widespread on
43 unix-like filesystems since small files for configuration/etc are so
44 common. Not only does it save a ton of space, but it also saves a
45 seek when the file is read.
46
47 Finally, I'll just comment that if you're interested in brushing up on
48 data structures, the documentation for any of the modern filesystems
49 is a great thing to read up on. Since disk seeks are incredibly
50 expensive but disks are very large, a great deal of thought goes into
51 how the data gets stored.
52
53 --
54 Rich