On Wed, Dec 7, 2011 at 8:58 PM, Mike Frysinger <firstname.lastname@example.org> wrote:
> i have no sympathy for broken userspace code

I define broken userspace code as anything that uses fsync except for
transactional synchronization with external sources.
My system is a bit beefier now, but one of the biggest performance
issues before the upgrade was that mythtv did an fsync all the time.
What's the point in having RAM and buffer if some process is going to
tell the kernel that it is worth taking a huge impact on everything
else just to write one file a little faster? The irony in the end was
that the result usually was dropped data in mythtv on another tuner
since you'd end up with two processes stepping all over each other and
the kernel probably could have easily flushed all that data if it were
allowed to have more than a few MB of data in the cache between disk
Filesystems need to fail gracefully. If I write data to disk and the
power fails then the file should either contain what it did before the
write, or the correct data after the write. As a compromise since COW
is still new I'm fine with it even being partially modified, but
things like zero-length files or other odd behavior are just dumb. And
the correct userspace implementation is to call write, and close when
you're done. I don't want every process on my system doing syncs just
because some ext3 purist thinks that data is disposable by default, or
in the magical world where every system is on a UPS, hardware never
fails, and kernels never panic. I might live dangerously for a 50%
performance gain, but when we're talking about shaving an extra 0.1%
off of some benchmark then let's go ahead and write the data out in the
If we do want to have an API that tells the kernel that data isn't
disposable or to safely overwrite a file in place then we should
invent a new system call for that, and not try to overload some other
behavior (like a combination of fsyncing and moves). The problem with
the latter is that it often results in extra disk churn as the
filesystem does more than is strictly necessary, and since every
filesystem author has their own religious preference about the right
series of system calls to use, programs end up only working correctly
on a handful of filesystems.
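
To make the two approaches concrete, here is a rough sketch of what I mean,
in Python for brevity. The function names (plain_write, replace_with_fsync)
are made up for illustration, and the second one is only my reading of the
commonly cited "fsync and move" idiom, not a sequence that any standard or
filesystem author has blessed.

```python
import os
import tempfile


def plain_write(path: str, data: bytes) -> None:
    """Write and close with no explicit sync."""
    with open(path, "wb") as f:
        f.write(data)
    # Flushing to disk is left to the kernel's normal writeback.


def replace_with_fsync(path: str, data: bytes) -> None:
    """Write a temp file, fsync it, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    # Hypothetical choice: keep the temp file in the same directory so the
    # rename never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push it to disk
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Some programs additionally fsync the containing directory after the rename.
Whether that is needed varies by filesystem, which is exactly the kind of
disagreement I am complaining about.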
And of course the real solution is COW once that becomes stable. And
the next step after that is real transaction support. If only btrfs
were ready for production use...
Fortunately for the moment it seems Linus is reasonably sane and will
fix filesystems if the upstream maintainers are unwilling to do so.
Ok, off the soap box. My issue is that people seem to talk about
"correct userspace behavior" as if there were some kind of standard we
were referring to. By all means get the crazy behavior added to POSIX
and then we can all go on and on about it (and then once COW comes
along scratch our heads as we do all kinds of extra unnecessary file
manipulation to do something that just works by default). If there is
such a standard that has been endorsed by all the major unix
filesystems, please do point it out.
And sorry, Mike, not really directed at you in particular - I know
that many people feel the same as you do...