Gentoo Archives: gentoo-user

From: Volker Armin Hemmann <volkerarmin@××××××××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] SSDs, swap, caching, other unusual uses
Date: Mon, 01 Aug 2011 17:28:06
Message-Id: 1977507.ogDxyn5tH7@localhost
In Reply to: Re: [gentoo-user] SSDs, swap, caching, other unusual uses by Michael Mol
On Sunday, 31 July 2011, 19:11:06, Michael Mol wrote:
> On Sun, Jul 31, 2011 at 6:37 PM, Volker Armin Hemmann
>
> <volkerarmin@××××××××××.com> wrote:
> > On Sunday, 31 July 2011, 10:44:28, Michael Mol wrote:
> >> While I take your point about write-cycle limitations, and I would
> >> *assume* you're familiar with the various improvements on
> >> wear-leveling technique that have happened over the past *ten years*
> >
> > yeah, I am. Or let me phrase it differently:
> > I know what is claimed.
> >
> > The problem is, the best wear leveling does not help you if your disk is
> > pretty filled up and you still do a lot of writing. 1 000 000 write
> > cycles aren't much.
>
> Ok; I wasn't certain, but it sounded like you'd had your head in the
> sand (if you'll pardon the expression). It's clear you didn't. I'm
> sorry.
>
> >> since those concerns were first raised, I could probably raise an
> >> argument that a fresh SSD is likely to last longer as a swap device
> >> than as a filesystem.
> >
> > depends - because thanks to wear leveling that 'swap partition' is just
> > something the firmware makes the kernel believe to be there.
> >
> >> Swap is only touched as-needed, while there's been an explosion in
> >> programs and user software which demands synchronous writes to disk
> >> for data integrity purposes. (Firefox uses sqlite in such a way, for
> >> example; I discovered this when I was using sqlite heavily in my *own*
> >> application, and Firefox hung for a couple minutes during every batch
> >> insert.)
> >
> > which is another good reason not to use firefox - but
> >              total       used       free     shared    buffers     cached
> > Mem:       8182556    7373736     808820          0      56252    2197064
> > -/+ buffers/cache:    5120420    3062136
> > Swap:     23446848      82868   23363980
> >
> > even with lots of ram, you will hit swap. And since you are using the
> > wear-leveling of the drive's firmware it does not matter that your
> > swap resides on its own partition - every page written means a
> > block-rewrite somewhere. Really not good for your ssd.
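
For what it's worth, a rough way to watch how much traffic actually hits
swap over time is vmstat - the si/so columns show memory swapped in and
out per second, and anything consistently non-zero there is write traffic
the flash has to absorb:

    vmstat 5        # one line of stats every 5 seconds; watch si/so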
>
> Fair enough.
>
> It Would Be Nice(tm) if the SSD's block size and alignment matched
> that of the kernel's pagesize. Not certain if it's possible to tune
> those settings (reliably) in the kernel.
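
I don't know of a way to tune it on the kernel side, but you can at least
check how the two line up. Something along these lines (the SSD is assumed
to be /dev/sda here, adjust as needed):

    getconf PAGESIZE                               # kernel page size, typically 4096
    cat /sys/block/sda/queue/logical_block_size    # block sizes the drive reports
    cat /sys/block/sda/queue/physical_block_size
    parted /dev/sda align-check optimal 1          # is partition 1 aligned?

That only covers the reported block sizes, not the (much larger) erase
block, which the drive usually doesn't advertise at all.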
>
> Also, my stats, from three different systems (they appear to be using
> trivial amounts of swap, though my Gentoo box doesn't appear to be
> using any)
>
> (Desktop box)
> shortcircuit:1@serenity~
> Sun Jul 31 07:03 PM
> !499 #1 j0 ?0 $ free -m
>              total       used       free     shared    buffers     cached
> Mem:          5975       3718       2256          0        617       1106
> -/+ buffers/cache:       1994       3980
> Swap:         9993          0       9993
>
> (laptop)
> shortcircuit@saffron:~$ free -m
>              total       used       free     shared    buffers     cached
> Mem:          1995       1732        263          0        169        913
> -/+ buffers/cache:        648       1347
> Swap:         3921          3       3918
>
> (server)
> shortcircuit@×××××××××××××××××××××.com~
> 23:05:34 $ free -m
>              total       used       free     shared    buffers     cached
> Mem:          2048       2000         47          0        285        488
> -/+ buffers/cache:       1225        822
> Swap:          511          1        510
>
> >> Also, despite the MTBF data provided by the manufacturers, there's
> >> more empirical evidence that the drives expire faster than expected,
> >> anyway. I'm aware of this, and not particularly concerned about it.
> >
> > well, it is your money to burn.
>
> Best evidence I've read lately is that the drives last about a year
> under heavy use. I was going to include a reference in the last email,
> but I can't find a link to the post. I thought it was something Joel
> Spolsky (or *someone* at StackOverflow) wrote, but I was unable to
> find it quickly.
>
> My parts usually last 3-5 years, so that's pretty low. Still, having
> my swap partition drop (and the entire system halt) would be generally
> less damaging to me than having real data on the drive.
>
> >> False dichotomy. Yes, it increases the wear on the device. That says
> >> nothing of its impact on system performance, which was the nature of
> >> my point.
> >
> > if you are so concerned about swap performance you should probably go
> > with a smaller ssd, get more ram and let the few mb of swap you need
> > be handled by several swap partitions.
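
For what it's worth, a sketch of what that could look like in /etc/fstab
(device names made up; with equal pri= values the kernel spreads pages
across the swap areas round-robin):

    /dev/sda2    none    swap    sw,pri=5    0 0
    /dev/sdb2    none    swap    sw,pri=5    0 0

The same thing can be done at runtime with swapon -p 5 /dev/sdX2.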
>
> This is where I get back to my original, 'prohibitively expensive'
> bit. I can get 16GB of RAM into my system for about $200. The use
> cases where I've been contemplating this have been where I wanted to
> have 60GB to 80GB of data quickly accessible in a random-access
> fashion, but where that type of load wasn't what I normally spent my
> time doing. (Hence the idea to have a broader improvement from
> something such as the file cache)
>
> And, really, the whole point of the thread was for thought
> experiments. Posits are occasionally required.
>
> >> As for a filecache not being that important, that's only the case if
> >> your data of interest exists on the filesystem you put on the SSD.
> >>
> >> Let's say you're someone like me, who would tend to go with 60GB for /
> >> and 3TB for /home. At various times, I'll be doing HDR photo
> >> processing, some video transcoding, some random non-portage compile
> >> jobs, web browsing, coding, etc.
> >
> > 60gb for /, 75gb for /var, and 2.5tb data...
> > my current setup.
>
> Handy; we'll have common frames of reference.
>
> >> If I take a 160GB SSD, I could put / (or, at least, /var/ and /usr),
> >> and have some space left over for scratch--but it's going to be a pain
> >> trying to figure out which of my 3TB of /home data I want in that fast
> >> scratch.
> >>
> >> File cache is great, because it caches your most-used data from
> >> *anywhere* and keeps it in a fast-access datastore. I could have a 3
> >> *petabyte* volume, not be particularly concerned about data
> >> distribution, and get just as fast a response from the file cache as
> >> if I had a mere 30GB volume. Putting a filesystem on an SSD simply
> >> cannot scale that way.
> >
> > true, but all those microseconds saved with swap on ssd won't offset the
> > pain when the ssd dies earlier.
>
> It really depends on the quantity and nature of the pain. When the
> things I'm toying around with have projected completion times of a
> *week* rather than an hour or two, and when I don't normally need so
> much memory, it wouldn't be too much of a hassle to remove the dead
> drive from fstab and boot back up. (after fsck, etc, natch). In the
> words of the Architect, "There are levels of existence we are prepared
> to accept..."
>
> >> Actually, this conversation reminds me of another idea I'd had at one
> >> point...putting ext3/ext4's journal on an SSD, while keeping the bulk
> >> of the data on large, dense spinning platters.
> >
> > which sounds nice in theory.
>
> Yet would potentially run afoul of the SSD's write block resolution.
> And, of course, having the journal fail out from under me would be a
> fair bit worse than the kernel panicking during a swap operation.
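
If someone wants to try it anyway, the rough recipe would be to create a
journal device on the SSD and point the big filesystem at it - untested by
me, device names invented, and the block sizes of the two have to match:

    mke2fs -O journal_dev -b 4096 /dev/ssd1              # journal device on the SSD
    mkfs.ext4 -b 4096 -J device=/dev/ssd1 /dev/bigdisk1

    # or for an existing, unmounted ext3/ext4 filesystem:
    tune2fs -O ^has_journal /dev/bigdisk1
    tune2fs -J device=/dev/ssd1 /dev/bigdisk1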
>
> >> Did you miss the last week's worth of discussion of memory limits on
> >> tmpfs?
> >
> > probably. Because I have been using tmpfs for /var/tmp/portage for ages
> > and the only problematic package is openoffice/libreoffice.
>
> I ran into trouble with Thunderbird a couple months ago, which is why
> I had to stop using tmpfs. (Also, I compile with -ggdb in CFLAGS,
> so I expect my build sizes bloat a bit more than most)
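
For reference, what I use is roughly a line like this in /etc/fstab (the
size is made up - pick something your RAM can actually cover, and the big
office-suite builds may still have to go to disk):

    tmpfs    /var/tmp/portage    tmpfs    size=6G,noatime    0 0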
>
> Anyway, the edge cases and caveats like the ones discussed are why I
> ask about what people have tried, and what mitigations, workarounds and
> technological improvements people have been working on.
--
#163933