On Wed, Dec 7, 2011 at 8:58 PM, Mike Frysinger <vapier@g.o> wrote:
> i have no sympathy for broken userspace code
|
I define broken userspace code as anything that uses fsync except for
transactional synchronization with external sources.
|
My system is a bit beefier now, but one of the biggest performance
issues before the upgrade was that mythtv did an fsync all the time.
What's the point in having RAM and buffers if some process is going to
tell the kernel that it is worth taking a huge hit on everything
else just to write one file a little faster? The irony in the end was
that the result usually was dropped data in mythtv on another tuner,
since you'd end up with two processes stepping all over each other,
when the kernel probably could have easily flushed all that data if it
were allowed to keep more than a few MB of data in the cache between
disk seeks.
|
Filesystems need to fail gracefully. If I write data to disk and the
power fails then the file should contain either what it did before the
write, or the correct data after the write. As a compromise, since COW
is still new, I'm fine with it even being partially modified, but
things like zero-length files or other odd behavior are just dumb. And
the correct userspace implementation is to call write, and close when
you're done. I don't want every process on my system doing syncs just
because some ext3 purist thinks that data is disposable by default, or
lives in the magical world where every system is on a UPS, hardware
never fails, and kernels never panic. I might live dangerously for a 50%
performance gain, but when we're talking about shaving an extra 0.1%
off of some benchmark then let's go ahead and write the data out in the
right order.
|
If we do want to have an API that tells the kernel that data isn't
disposable, or that a file should be safely overwritten in place, then
we should invent a new system call for that, and not try to overload
some other behavior (like a combination of fsyncing and moves). The
problem with the latter is that it often results in extra disk churn as
the filesystem does more than is strictly necessary, and since every
filesystem author has their own religious preference about the right
series of system calls to use, programs end up only working correctly
on a handful of filesystems.
|
And of course the real solution is COW once that becomes stable. And
the next step after that is real transaction support. If only btrfs
were ready for production use...
|
Fortunately for the moment it seems Linus is reasonably sane and will
fix filesystems if the upstream maintainers are unwilling to do so.
:)
|
Ok, off the soap box. My issue is that people seem to talk about
"correct userspace behavior" as if there were some kind of standard we
were referring to. By all means get the crazy behavior added to POSIX
and then we can all go on and on about it (and then once COW comes
along scratch our heads as we do all kinds of extra unnecessary file
manipulation to do something that just works by default). If there is
such a standard that has been endorsed by all the major unix
filesystems, please do point it out.
|
And sorry, Mike, not really directed at you in particular - I know
that many people feel the same as you do...
|
Rich |