On Wed, Dec 21, 2016 at 7:46 PM, Daniel Campbell <zlg@g.o> wrote:
>
> How does a file take up less than a single FS block? An inode has to be
> allocated _somewhere_, does it not?
>

So, the details are filesystem-specific, but typically inodes go into
an area of the disk reserved for metadata, so that many of them can be
stored in a single disk block. They're also fixed-length, so storing
them in groups lets you address them as an array. The same goes for
directory trees, allocation tracking, and so on.

In ext4, inodes are 256 bytes by default, so you can fit 16 of them in
a single 4k disk block (4096 / 256 = 16).

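To make the array addressing concrete, here is a rough sketch in
Python. It assumes the 256-byte inodes and 4k blocks mentioned above.
The function name and the zero-based index are made up for
illustration, and real ext4 splits its inode tables across block
groups, which this ignores.

```python
INODE_SIZE = 256    # ext4 default inode size, in bytes
BLOCK_SIZE = 4096   # a 4k filesystem block

INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE


def locate_inode(index):
    """Return (block, byte offset) of an inode within a flat inode table.

    `index` is a zero-based position in the table (hypothetical; real
    inode numbers start at 1 and are split across block groups).
    """
    block, slot = divmod(index, INODES_PER_BLOCK)
    return block, slot * INODE_SIZE


print(INODES_PER_BLOCK)   # 16
print(locate_inode(37))   # (2, 1280)
```

Because every record has the same length, finding inode N is just a
division and a multiplication, with no searching involved.
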
60 of those bytes are used to map the inode to the extents on disk
that contain the file's data. If the file's data fits in those 60
bytes, then ext4 can store it inside the inode itself (this depends on
the inline_data feature being enabled), since the mapping isn't needed
in that case. That saves a whole block.

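As a toy model of that decision, here is a short Python sketch. It is
not ext4's actual logic: the function name is invented, and it only
considers the 60-byte mapping area described above, treating anything
that fits there as stored in the inode.

```python
BLOCK_SIZE = 4096      # a 4k filesystem block
INLINE_CAPACITY = 60   # bytes in the inode normally used for the extent map


def data_blocks_needed(file_size):
    """Data blocks a file of `file_size` bytes needs in this toy model."""
    if file_size <= INLINE_CAPACITY:
        return 0   # contents live in the inode itself
    # Round up to whole blocks using integer arithmetic.
    return (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE


for size in (0, 60, 61, 4096, 4097):
    print(size, data_blocks_needed(size))
```

The jump from 0 blocks to 1 block right after the inline limit is the
whole block being saved for tiny files.
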
Other filesystems do things differently. I don't profess to be a
general expert, but I have read a fair bit on btrfs. Btrfs allocates
space in b+ tree nodes that contain fixed-length records on one side
(which would store things like inodes and other metadata records), and
a heap full of variable-length records on the other. The latter can
be used to store the content of small files. I believe btrfs can also
use metadata space to store small regions of files (such as when a
file runs just a few bytes past a block boundary, or when you
overwrite 1 byte of a large file, which in btrfs does not get done
in-place).

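The two-ended node layout is easier to see in code. This is a toy
Python model, not btrfs's on-disk format: the class name, the 16-byte
record size, and the 4k node size are all assumptions picked for
illustration. Fixed-length records take space from the front of the
node, variable-length data grows from the back, and the node is full
when the two meet.

```python
NODE_SIZE = 4096    # assumed node size, in bytes
RECORD_SIZE = 16    # assumed size of one fixed-length record


class ToyLeaf:
    """A node with fixed-size records at the front and a data heap at the back."""

    def __init__(self):
        # Records are kept in a Python list; only their size is
        # charged against the node, they are not serialized into buf.
        self.records = []             # (key, offset, length)
        self.heap_start = NODE_SIZE   # the heap grows downward from the end
        self.buf = bytearray(NODE_SIZE)

    def free_space(self):
        """Bytes left between the record area and the heap."""
        return self.heap_start - len(self.records) * RECORD_SIZE

    def insert(self, key, data):
        """Store `data` in the heap and add a record pointing at it."""
        if RECORD_SIZE + len(data) > self.free_space():
            return False              # a real tree would split the node here
        self.heap_start -= len(data)
        self.buf[self.heap_start:self.heap_start + len(data)] = data
        self.records.append((key, self.heap_start, len(data)))
        return True

    def lookup(self, key):
        """Return the bytes stored under `key`, or None if absent."""
        for k, offset, length in self.records:
            if k == key:
                return bytes(self.buf[offset:offset + length])
        return None


leaf = ToyLeaf()
leaf.insert("motd", b"hello\n")
print(leaf.lookup("motd"))   # b'hello\n'
print(leaf.free_space())     # 4074
```

A six-byte file costs 22 bytes of the node here instead of a whole 4k
block, which is the point of keeping small contents next to the
metadata.
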
The optimization of storing small bits of data without using entire
blocks is a pretty common one. Another common optimization is dealing
with large runs of zeros in files. If a file has a gigabyte-long hole
in it (a region that was skipped over rather than written, which reads
back as zeros), most filesystems won't spend a gigabyte of space on
it, even if the filesystem does not otherwise use compression. Zeros
that you explicitly write out are usually stored like any other data,
though.

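The cheapest case to demonstrate is a hole: a region of a file that is
never written and simply reads back as zeros. Here is a Python sketch
for unix-like systems. The helper names and the 64 MiB size are
arbitrary, st_blocks is assumed to count 512-byte units (as it does on
Linux), and the allocated figure you get depends on the filesystem.

```python
import os
import tempfile


def make_sparse_file(path, size):
    """Create a file of `size` bytes without writing any data to it."""
    with open(path, "wb") as f:
        f.truncate(size)   # extends the file; the new region is a hole


def sizes(path):
    """Return (apparent size, bytes actually allocated) for `path`."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "zeros.bin")
    make_sparse_file(path, 64 * 1024 * 1024)
    apparent, allocated = sizes(path)
    print("apparent:", apparent, "allocated:", allocated)
```

On a filesystem that supports holes, the allocated figure should come
out far smaller than the apparent size. The same comparison is what
"ls -l" versus "du" shows you.
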
And of course you have stuff that consumes nothing but inodes, like
(short) symlinks, device nodes, and such.

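A short symlink is an easy one to check. This Python sketch (unix-like
systems only; the function name and the target path are placeholders)
creates one and returns what os.lstat() reports for the link itself.
On many Linux filesystems the block count comes back as 0, but that
depends on the filesystem and on how long the target is.

```python
import os
import tempfile


def symlink_usage(target):
    """Create a symlink to `target` and return its (st_size, st_blocks)."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "link")
        os.symlink(target, link)   # the target does not need to exist
        st = os.lstat(link)        # stat the link itself, not the target
        return st.st_size, st.st_blocks


size, blocks = symlink_usage("/etc/hostname")
print("size:", size, "blocks:", blocks)
```
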
It isn't surprising that these optimizations are widespread in
filesystems for unix-like systems, since small files for
configuration, etc. are so common. Not only do they save a ton of
space, but they also save a seek when the file is read.

Finally, I'll just comment that if you're interested in brushing up on
data structures, the documentation for any of the modern filesystems
is a great read. Since disk seeks are incredibly expensive but disks
are very large, a great deal of thought goes into how the data gets
stored.

--
Rich