On Tuesday 14 May 2002 05:44 pm, Bill Kenworthy wrote:

> The question came up on a local lug as well, with reiserfs and ext3
> seeming to be the top choices, and from memory xfs seemed to be bagged
> (can't remember why - something to do with the linux implementation?).

The patch touches a great many files. Those who want to apply other patches
typically find the xfs patch means doing some work by hand, which puts most
people off. XFS is rock solid in my experience, even in the face of power
outages, untimely shutdowns, and the like, but I am conservative and only run
it patched against stock kernels (e.g. xfs-sources, which is 2.4.18 plus the
xfs patches only). Those wanting to play with other experimental patches
generally avoid XFS for this reason (as do I on the machines where I do that
sort of thing).

> Reiserfs did get some bad press in the early days, and I think that may
> be a hangover that affects people's thinking. I think this is a case of
> YMMV, and as far as I am concerned, gentoo is the odd one out by not
> recommending reiserfs, because there seems to be little documentation
> to back up its point of view, but a fair bit of experience saying
> reiserfs is reasonably stable.

YMMV is reason enough not to recommend a filesystem, when your mileage is
varying with respect to spontaneous filesystem corruption! Gentoo may be the
odd one out on this, but in my opinion that says a great deal positively
about the technical expertise and caution of the Gentoo developers,
particularly in light of my own experiences.

These will be my final comments on the subject. Looking at my notes and log
entries, I had a total of 7 machines (out of 9 deployed) go south with
Reiserfs on them (unrecoverable filesystem corruption, including lost
directories, strangely null files, and in one case oddly corrupt
files/filenames that were undeletable). None were due to kernel oopses,
untimely shutdowns, or any other cause that would lead one to expect
filesystem trouble; they were all apparently spontaneous, and most happened
within 6 months of deployment. The last two machines were migrated off
of Reiser (onto ext2) before they could screw up, having been in use only
about three months.

The corruptions happened between April and August of last year (2001). Five
machines were running Mandrake, one Red Hat, and one Debian. (~3 months
testing, ~8 months deployed. I was incorrect in an earlier post when I said
none lasted more than 6 months: one machine survived 9 months before
problems arose, and another didn't suffer filesystem corruption until 7
months after deployment. All of these machines are on 24/7.)

My friend had his Suse Reiserfs go south (the entire directory tree
spontaneously vanished, but filesystem usage remained the same and even
continued to grow) six weeks ago (April 2002), so this is by no means an
early development glitch that is now ancient history we can comfortably
dismiss and forget about. (I do not know how long he had the machine
deployed, but I can find out if anyone is really interested.)

XFS is annoying because the patch is big, and sometimes one must wait a week
or two after a kernel is released before a patch for xfs exists. In the case
of gentoo, where multiple cool patches are being applied to an experimental
pre-release of 2.4.19, we had to wait a week or two before some kind,
enterprising soul managed to work the patch into the OS (testing on -r5 is
looking very good, fwiw). That having been said, unless one is desperate for
a particular fix, one is generally wise to wait a week or two after a kernel
release before deploying it in a production environment anyway, so this
(admittedly minor) irritation is mostly mitigated.

I've beaten on XFS under just about every condition imaginable (minus LVS,
which I do not use) and have yet to be able to make it corrupt the
filesystem. I've even deliberately caused kernel oopses by trying to compile
glibc on a high-mem kernel on a machine with 1 GB RAM, and been unable
to damage the filesystem. It appears to be very solid and does not
corrupt spontaneously. (About 3.5 years of fairly rigorous testing and very
rigorous usage, including 2 enterprise NFS servers on large RAID devices and
several developer workstations. Of all the filesystems I've tested, this one
has been tested the most thoroughly, except of course for ext2, which I've
been using a great deal longer.)

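For anyone who wants to do their own beating, the simplest layer of such testing (write known data, sync, verify it survived) can be sketched in plain POSIX shell. This is only a crude sketch of the idea, not the tooling I actually used; `/tmp/fstest` is a hypothetical scratch path that you would point at a directory on the filesystem under test:

```shell
#!/bin/sh
# Crude integrity-check sketch: write files of known, reproducible
# content to the filesystem under test, sync, then re-read and verify.
# For real abuse, run it repeatedly while forcing resets/power-offs
# between runs and fsck'ing in between.
DIR=/tmp/fstest     # assumed scratch dir on the filesystem under test
mkdir -p "$DIR"

# Write 50 files whose content is derived from their index.
i=0
while [ $i -lt 50 ]; do
    printf 'payload %d\n' $i > "$DIR/file$i"
    i=$((i + 1))
done
sync

# Verify each file against the content it should contain.
i=0
bad=0
while [ $i -lt 50 ]; do
    expected=$(printf 'payload %d\n' $i)
    actual=$(cat "$DIR/file$i" 2>/dev/null)
    [ "$actual" = "$expected" ] || { echo "corrupt: file$i"; bad=1; }
    i=$((i + 1))
done
[ $bad -eq 0 ] && echo "all files intact"
```

On a healthy filesystem this prints "all files intact"; any mismatch after an unclean shutdown points at exactly the sort of damage discussed above.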
JFS I've done less testing with. It appears to be pretty good, but others
have reported LVS corruption which may have been caused by JFS. I haven't
beaten on it nearly as hard as I have XFS, so I cannot say with certainty
that it is reliable, but thus far I've yet to have it screw up. Not exactly
a ringing endorsement, but a cautious "it looks ok so far." (~3 months
casual testing)

Ditto ext3. It needs more testing. It seems to do alright thus far, but I
tend to treat it as I would an ext2 filesystem. (~5 months casual testing)

ext2 is very solid, provided it is treated correctly (no improper shutdowns
or power-offs), or buffering is turned off. It does not corrupt
spontaneously, ever. BUT, and this is a big BUT, it can and does become
corrupted if it is not shut down properly (and this can happen due to system
hangs, e.g. X with Nvidia drivers on some configurations, power outages,
an impatient user hitting the reset switch, etc.). Most of the time it will
recover through fsck, but not always, and I echo others who have lost ext2
filesystems that have been unrecoverably corrupted in this way. This is why
I prefer journalled filesystems and have gone to deploying XFS where
possible and practical (often, but not always, the case due to the patch's
size and complexity), and why I am keeping an eye on JFS and others.

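If you want to get comfortable with the fsck step before you need it in anger, you can practice on a throwaway ext2 image in a regular file, no root and no spare disk required. A minimal sketch, assuming e2fsprogs (`mkfs.ext2`, `e2fsck`) is installed and using `/tmp/ext2demo.img` as a hypothetical scratch path:

```shell
#!/bin/sh
# Build a small ext2 filesystem inside a plain file and check it with
# e2fsck, the same tool that runs after an unclean shutdown.
IMG=/tmp/ext2demo.img

# 4 MB of zeroes, then mkfs on top.  -F forces mkfs.ext2 to accept a
# regular file instead of a block device; -q keeps it quiet.
dd if=/dev/zero of="$IMG" bs=1024 count=4096 2>/dev/null
mkfs.ext2 -q -F "$IMG"

# -f forces a full check even though the fs is marked clean; -n opens
# it read-only and answers "no" to every repair prompt, so this is a
# report-only pass.  Exit status 0 means no errors were found.
if e2fsck -f -n "$IMG" > /dev/null 2>&1; then
    echo "filesystem clean"
else
    echo "filesystem has errors"
fi
```

The same `e2fsck -f` invocation against a real (unmounted!) device is what does the actual recovery work; drop the `-n` only when you are ready to let it repair.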
It is important to note that filesystem corruption due to untimely
shutdowns, which both ext2 and reiser have suffered from, is a completely
different animal from the apparent spontaneous loss of data that is my major
complaint with Reiser, and why I am so vocal in defending Gentoo's word of
caution regarding it.

Kernel oopses: all bets are off for any filesystem (though I've yet to be
able to get XFS to corrupt from this, it is theoretically possible AFAICT,
since something might be going on within the kernel's vfs layer when the
hang happens). This is the only situation in which I find filesystem
corruption in a journaled filesystem to be at all forgivable.

The only filesystem I have ever experienced that has corrupted itself during
normal operations, with no unexpected reboots, kernel oopses, or other
mitigating circumstances to explain the corruption, is Reiserfs, and these
experiences are all within the last 14 months.

The only filesystem I've been unable to corrupt has been XFS. (JFS and ext3
do not count; I haven't beaten on them the way I have ext2, XFS, and
Reiser.)

So while people should continue to experiment with Reiser (after all, that
is how these sorts of bugs will be found and fixed), a word of caution is
IMHO certainly in order, regardless of whether that makes Gentoo "the odd
man out".

Jean.