On Sun, Jun 22, 2014 at 7:44 AM, Kai Krakow <hurikhan77@×××××.com> wrote:
> I don't see where you could lose the volume management features. You just
> add the device on top of the bcache device after you initialized the raw
> device with a bcache superblock and attached it. The rest works the same,
> just that you use bcacheX instead of sdX devices.

Ah, didn't realize you could attach/remove devices to bcache later.
Presumably it handles device failures gracefully, i.e., exposing them to
the underlying filesystem so that it can properly recover?
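
For reference, the workflow Kai describes might look roughly like this
(a sketch only -- device names and the cache-set UUID are placeholders,
and make-bcache comes from bcache-tools):

```shell
# Give the raw backing device a bcache superblock.
make-bcache -B /dev/sdb

# Format the SSD as a cache device; note the cache-set UUID it reports.
make-bcache -C /dev/nvme0n1

# Attach the backing device to the cache set (UUID is a placeholder).
echo <cset-uuid> > /sys/block/bcache0/bcache/attach

# From here on you use /dev/bcache0 instead of /dev/sdb, e.g. to grow
# a btrfs pool:
btrfs device add /dev/bcache0 /mnt/pool
```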

>
> From that point of view, I don't think something like ZIL should be
> implemented in btrfs itself but as a generic approach like bcache, so
> every component in Linux can make use of it. Hot data relocation OTOH is
> interesting from another point of view and may become part of a future
> btrfs, as it benefits from knowledge about the filesystem itself, using a
> generic interface like "hot data tracking" in VFS - so other components
> can make use of that, too.
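
As a toy illustration of the "hot data tracking" idea (this is not any
actual VFS interface, just a userspace sketch over a hypothetical access
log):

```shell
# Hypothetical access log: one path per recorded file access.
printf '%s\n' /var/a /var/b /var/a /var/c /var/a /var/b > access.log

# Rank files by access count; the top entries are the "hot" data a
# relocation policy would migrate to the fast device.
sort access.log | uniq -c | sort -rn
# /var/a comes out on top with 3 accesses
```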

The only problem with doing stuff like this at a lower level (both
write and read caching) is that it isn't RAID-aware. If you write
10GB of data, you use 20GB of cache to do it if you're mirrored,
because the cache doesn't know about mirroring. Offhand I'm not sure
whether there are performance penalties as well around the need for
barriers etc., since the cache can't be relied on to do the right
thing in terms of what gets written out - also, the data isn't
redundant while it is in the cache, unless you mirror the cache.
Granted, if you're using it for write-intent logging, then there isn't
much getting around that.
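
The arithmetic behind that cache-amplification claim, assuming the
cache sits below the RAID layer and sees each mirrored copy as an
independent write:

```shell
# 10GB written through a mirror: the block-level cache sees two copies,
# one per mirror leg, because it has no idea they carry the same
# logical data.
data_gb=10
copies=2                         # RAID1/mirroring above the cache
cache_gb=$((data_gb * copies))
echo "${cache_gb}GB of cache consumed"   # 20GB of cache consumed
```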

> Having to prepare devices for bcache is kind of a show-stopper because it
> is not a drop-in component that way. But OTOH I like that approach better
> than dm-cache because it protects you from using the backing device
> without going through the caching layer, which could otherwise severely
> damage your data, and you get along with fewer devices and don't need to
> size a meta device (which probably needs to grow later if you add devices,
> I don't know).

And this is the main thing keeping me away from it. It is REALLY
painful to migrate to/from. Having it integrated into the filesystem
would deliver the same benefit of not being able to mount it without
the cache present.

Now excuse me while I go fix my btrfs (I tried re-enabling snapper, and
it again got the filesystem into a worked-up state after trying to
clean up half a dozen snapshots at the same time - it works fine until
you go and try to write a lot of data to it, then it stops syncing,
though you don't necessarily notice until a few hours later, when the
write cache exhausts RAM and on reboot your disk reverts back a few
hours). I suspect that if I just treat it gently for a few hours
btrfs will clean up the mess and it will work normally again, but the
damage apparently persists across a reboot if you go heavy on the disk
too quickly...
|
Rich |