Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] snapshots?
Date: Tue, 05 Jan 2016 23:19:21
Message-Id: CAGfcS_kTgXVNnO5Ke=QNuVmu2wbtrqXjvC+O7QF5NkLFdCiq=g@mail.gmail.com
In Reply to: Re: [gentoo-user] snapshots? by lee
On Tue, Jan 5, 2016 at 5:16 PM, lee <lee@××××××××.de> wrote:
> Rich Freeman <rich0@g.o> writes:
>
>>
>> I would run btrfs on bare partitions and use btrfs's raid1
>> capabilities. You're almost certainly going to get better
>> performance, and you get more data integrity features.
>
> That would require me to set up software raid with mdadm as well, for
> the swap partition.

Correct, if you don't want a panic when a single swap drive fails.

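For what it's worth, the swap setup is only a couple of commands (the
device names are placeholders, adjust for your partition layout):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
  mkswap /dev/md0
  swapon /dev/md0

Plus the usual fstab entry for /dev/md0 so it comes back at boot.
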
>
>> If you have a silent corruption with mdadm doing the raid1 then btrfs
>> will happily warn you of your problem and you're going to have a
>> really hard time fixing it,
>
> BTW, what do you do when you have silent corruption on a swap partition?
> Is that possible, or does swapping use its own checksums?

If the kernel pages in data from the good mirror, nothing happens. If
the kernel pages in data from the bad mirror, then whatever data
happens to be there is what will get loaded and used or executed. If
you're lucky the modified data will be part of unused heap or
something. If not, well, just about anything could happen.

Nothing in this scenario will check that the data is correct, except
for a forced scrub of the disks. A scrub would probably detect the
error, but I don't think mdadm has any ability to recover it. Your
best bet is probably to reboot immediately and save what you can. A
less risky option, assuming you have nothing critical in RAM, is an
immediate hard reset, so that there is no risk of bad data getting
swapped in and overwriting good data on your normal filesystems.

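If you want to force that scrub by hand, the md sysfs interface is
roughly (md0 assumed):

  echo check > /sys/block/md0/md/sync_action
  cat /sys/block/md0/md/mismatch_cnt    # nonzero means the mirrors disagree
  echo repair > /sys/block/md0/md/sync_action

Note that "repair" just makes the copies consistent again - md has no
checksums, so it can't tell which copy was the good one.
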
> It's still odd. I already have two different file systems and the
> overhead of one kind of software raid while I would rather stick to one
> file system. With btrfs, I'd still have two different file systems ---
> plus mdadm and the overhead of three different kinds of software raid.

I'm not sure why you'd need two different filesystems. Just btrfs for
your data. I'm not sure where you're counting three types of software
raid either - you'd just have mdadm for your swap. And I don't think
any of this involves significant overhead, other than configuration.

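To be concrete, the layout I have in mind is just (device names and
mount point are placeholders):

  mkfs.btrfs -d raid1 -m raid1 /dev/sda3 /dev/sdb3
  btrfs scrub start /mnt    # checksums let btrfs repair from the good copy

plus the small mdadm raid1 for swap sketched earlier.
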
>
> How would it be so much better to triple the software raids and to still
> have the same number of file systems?

Well, the difference would be more data integrity as far as hardware
failure goes, but certainly more risk of logical errors (IMO).

>
>>> When you use hardware raid, it
>>> can be disadvantageous compared to btrfs-raid --- and when you use it
>>> anyway, things are suddenly much more straightforward because everything
>>> is on raid to begin with.
>>
>> I'd stick with mdadm. You're never going to run mixed
>> btrfs/hardware-raid on a single drive,
>
> A single disk doesn't make for a raid.

64
65 You misunderstood my statement. If you have two drives, you can't run
66 both hardware raid and btrfs raid across them. Hardware raid setups
67 don't generally support running across only part of a drive, and in
68 this setup you'd have to run hardware raid on part of each of two
69 single drives.
70
>
>> and the only time I'd consider
>> hardware raid is with a high quality raid card. You'd still have to
>> convince me not to use mdadm even if I had one of those lying around.
>
> From my own experience, I can tell you that mdadm already does have
> significant overhead when you use a raid1 of two disks and a raid5 with
> three disks. This overhead may be somewhat due to the SATA controller
> not being as capable as one would expect --- yet that doesn't matter
> because one thing you're looking at, besides reliability, is the overall
> performance. And the overall performance very noticeably increased when
> I migrated from mdadm raids to hardware raids, with the same disks and
> the same hardware, except that the raid card was added.

Well, sure, the raid card probably had battery-backed cache if it was
decent, so Linux could complete its commits to the card's RAM and not
have to wait for the disks.

>
> And that was only 5 disks. I also know that the performance with a ZFS
> mirror with two disks was disappointingly poor. Those disks aren't
> exactly fast, but still. I haven't tested yet if it changed after
> adding 4 mirrored disks to the pool. And I know that the performance of
> another hardware raid5 with 6 disks was very good.

You're probably going to find the performance of a COW filesystem to
be inferior to that of an overwrite-in-place filesystem, simply
because the latter has to do less work.

>
> Thus I'm not convinced that software raid is the way to go. I wish they
> would make hardware ZFS (or btrfs, if it ever becomes reliable)
> controllers.

I doubt it would perform any better. What would that controller do
that your CPU wouldn't do? Well, other than have battery-backed
cache, which would help in any circumstance. If you stuck 5 raid
cards in your PC, put one drive on each card, and ran mdadm or ZFS
across all five, it would almost certainly perform better, because
you're adding battery-backed cache.

>
> The relevant advantage of btrfs is being able to make snapshots. Is
> that worth all the (potential) trouble? Snapshots are worthless when
> the file system destroys them with the rest of the data.

And that is why I wouldn't use btrfs on a production system unless the
use case mitigated this risk and there was benefit from the snapshots.
Of course you're taking on more risk using an experimental filesystem.

>>
>> btrfs does not support swap files at present.
>
> What happens when you try it?

No idea. Should be easy to test in a VM. I suspect either an error
or a kernel bug/panic/etc.

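Something like this in a throwaway VM would answer it (my guess,
unverified, is that swapon just fails with an error rather than
panicking):

  dd if=/dev/zero of=/swapfile bs=1M count=512
  mkswap /swapfile
  swapon /swapfile    # expect a failure here on btrfs, not working swap
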
>
>> When it does you'll need to disable COW for them (using chattr)
>> otherwise they'll be fragmented until your system grinds to a halt. A
>> swap file is about the worst case scenario for any COW filesystem -
>> I'm not sure how ZFS handles them.
>
> Well, then they need to make special provisions for swap files in btrfs
> so that we can finally get rid of the swap partitions.

I'm sure they'll happily accept patches. :)

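If/when swap files land, the chattr step would presumably look like
the NOCOW setup people already use for VM images (the size is
arbitrary):

  touch /swapfile
  chattr +C /swapfile    # +C only takes effect while the file is empty
  dd if=/dev/zero of=/swapfile bs=1M count=2048
  chmod 600 /swapfile
  mkswap /swapfile
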
>
>> If I had done that in the past I think I would have completely avoided
>> that issue that required me to restore from backups. That happened in
>> the 3.15/3.16 timeframe and I'd have never even run those kernels.
>> They were stable kernels at the time, and a few versions in when I
>> switched to them (I was probably just following gentoo-sources stable
>> keywords back then), but they still had regressions (fixes were
>> eventually backported).
>
> How do you know if an old kernel you pick because you think the btrfs
> part works well enough is the right pick? You can either encounter a
> bug that has been fixed or a regression that hasn't been
> discovered/fixed yet. That way, you can't win.

You read the lists closely. If you want to be bleeding-edge it will
take more work than if you just go with the flow. That's why I'm not
on 4.1 yet - I read the lists and am not quite sure it's ready yet.

>
>> I think btrfs is certainly usable today, though I'd be hesitant to run
>> it on production servers depending on the use case (I'd be looking for
>> a use case that actually has a significant benefit from using btrfs,
>> and which somehow mitigates the risks).
>
> There you go, it's usable, and the risk of using it is too high.

That is a judgement that everybody has to make based on their
requirements. The important thing is to make an informed decision. I
don't get paid if you pick btrfs.

>
>> Right now I keep a daily rsnapshot (rsync on steroids - it's in the
>> Gentoo repo) backup of my btrfs filesystems on ext4. I occasionally
>> debate whether I still need it, but I sleep better knowing I have it.
>> This is in addition to my daily duplicity cloud backups of my most
>> important data (so, /etc and /home are in the cloud, and mythtv's
>> /var/video is just on a local rsync backup).
>
> I wouldn't give my data out of my hands.

Somehow I doubt the folks at Amazon are going to break RSA anytime soon.

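For the record, the cloud copies are just GPG-encrypted duplicity
archives, roughly along these lines (the key ID and bucket name are
made up):

  duplicity --encrypt-key 0x12345678 /etc s3+http://my-backup-bucket/etc

Amazon only ever sees ciphertext.
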
>
> Snapper? I've never heard of that ...
>

http://snapper.io/

Basically snapshots+crontab and some wrappers to set retention
policies and such. That and some things like package-manager plugins
so that you get snapshots before you install stuff.

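Typical usage is something along these lines (the config name is
arbitrary):

  snapper -c root create-config /                    # manage the / subvolume
  snapper -c root create --description "pre-emerge"  # one-off snapshot
  snapper -c root list
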
>
> Queuing up the data when there's more data than the system can deal with
> only works when the system has sufficient time to catch up with the
> queue. Otherwise, you have to block something at some point, or you
> must drop the data. At that point, it doesn't matter how you arrange
> the contents of the queue within it.

Absolutely true. You need to throttle the data before it gets into
the queue, so that the busyness of the queue is exposed to the
applications and they can behave appropriately (falling back to
lower-bandwidth alternatives, etc). In my case, if mythtv's write
buffers are filling up and I'm also running an emerge install phase,
the correct answer (per ionice) is for emerge to block so that my
realtime video capture buffers are safely flushed. What you don't
want is for the kernel to let emerge dump a few GB of low-priority
data into the write cache alongside my 5Mbps HD recording stream.
Granted, it isn't as big a problem as it used to be now that RAM sizes
have increased.

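Concretely, that just means running the batch job in the idle I/O
class, e.g. (the package atom is a placeholder):

  ionice -c 3 emerge --oneshot app-misc/foo

The idle class only gets disk time when nothing else wants it; portage
can also be told to do this globally via PORTAGE_IONICE_COMMAND in
make.conf.
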
>
> Gentoo /is/ fire-and-forget in that it works fine. Btrfs is not in that
> it may work or not.
>

Well, we certainly must have come a long way then. :) I still
remember the last time the glibc ABI changed and I was basically
rebuilding everything from single-user mode, holding my breath.


--
Rich
