Gentoo Archives: gentoo-cluster

From: "John R. Dunning" <jrd@××××××××.com>
To: gentoo-cluster@l.g.o
Subject: [gentoo-cluster] examples of (large) Gentoo clusters
Date: Fri, 08 Dec 2006 14:19:31
Message-Id: 17785.29757.683233.632715@gs105.sicortex.com
In Reply to: Re: [gentoo-cluster] examples of (large) Gentoo clusters by Bryan Green
1 From: Bryan Green <bgreen@××××××××.gov>
2 Date: Thu, 07 Dec 2006 19:56:46 -0800
3 [...]
4 By comparison, 'mount -t lustre' pretty much characterizes the
5 simplicity of 1.6.
6
7 Agreed.
8
9 > Are you worrying about the kernel patching and other software installation
10 > issues, or about how to set up the fs itself once you've got the software
11 > together?
12
13 Kernel patching. For software installation, the lustre ebuild that was put on
14 this list recently seemed to do the trick for me, and setup was pretty easy.
15
16 Yeah, I think that ebuild came from us.
17
18 I was able to patch the kernel, but the server was somewhat unstable.
19
20 Do you remember how it was unstable? That's the kind of thing I'd very much
21 like to understand, as we're proposing to depend heavily on it. If there are
22 issues, whether specifically tied to our patches or not, I'd love to know
23 about them.
24
25 Actually,
26 my memory is hazy. I used the 'lustre-sources' ebuild, which effectively packaged
27 up the patches. It was a 2.6.15 kernel. I also tried to make a custom kernel for
28 lustre 1.4, but ultimately hit too many roadblocks. I did learn a bit about how
29 to use 'quilt' though.
30
31 Hmmm. Maybe not. Our stuff ditches quilt.
32
33 >
34 > Very briefly, the kernel-patching issue is an ongoing headache. Lustre
35 > patches vfs in non-trivial ways. Unfortunately, everybody else does too. It
36 > becomes a fairly ugly patch-merging problem. If you want, I can detail the
37 > process I've settled on for coming up with a kernel patchset, but you won't
38 > like it. There are similar issues around ldiskfs and other bits, but they're
39 > simpler, at least by comparison.
40
41 I'd be interested in some of the details - off-list if that is more appropriate,
42 though it might be of interest to others on the list as well. Once you download a
43 1.6 beta, how do you produce a kernel for Gentoo? Do you patch a gentoo-sources
44 kernel, a vanilla-sources kernel, or something else? The ideal would perhaps be
45 to have a 'lustre-sources' ebuild in the gentoo-science overlay. :)
46
47 We can start here and if people get sick of hearing about it, take it
48 someplace else.
49
50 The approach taken by most of the patches in lustre/kernel_patches/patches is,
51 for any particular base kernel, go through and add the datastructures and
52 logic to implement the lustre-specific functionality, which involves changes
53 to core vfs datastructures, sometimes changes in locking strategy, changes to
54 arglists etc. The patchsets generally start from RHEL or SLES kernels. There are a
55 couple of problems for the rest of us with that: (a) the RHEL and SLES kernels
56 tend to be a bit antiquated, and (b) the vendors also tend to make quite free
57 with the patches to core datastructures. Some of the latter is actually due
58 to the former; because they're using antique kernels, but they want some bits
59 of the latest and greatest fixes, they selectively import more modern stuff
60 as their own vendor patches.
61
62 The result of all this is that the layer of patches to implement lustre
63 functionality, when viewed from the point of view of an unpatched kernel,
64 makes no sense at all. If you try to install such a patchset on a vanilla-ish
65 kernel, even if you get the right base version, you'll get tons of rejects,
66 and when you look at them, it's obvious that they depend on stuff that's not
67 there.
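
To make the reject problem concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the three "trees" are a few invented lines each, the names i_vendor_flags, vendor_hint and i_fs_private are hypothetical, and the matching is a strict, no-fuzz simplification of what patch(1) does. It only shows the shape of the problem: hunks generated against a vendor tree carry vendor lines as context, so they find nothing to attach to on vanilla.

```python
import difflib

def hunks(old, new, context=1):
    """Yield (old_side, new_side) line lists for each group of changes."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for group in matcher.get_grouped_opcodes(context):
        i1, i2 = group[0][1], group[-1][2]
        j1, j2 = group[0][3], group[-1][4]
        yield old[i1:i2], new[j1:j2]

def find(lines, block):
    """Index of the first verbatim occurrence of block in lines, or -1."""
    for i in range(len(lines) - len(block) + 1):
        if lines[i:i + len(block)] == block:
            return i
    return -1

def apply_hunks(target, old, new, context=1):
    """Apply the old->new changes to target; return (result, rejects)."""
    result, rejects = list(target), []
    for old_side, new_side in hunks(old, new, context):
        at = find(result, old_side)
        if at < 0:
            rejects.append(old_side)   # context not present: a reject
        else:
            result[at:at + len(old_side)] = new_side
    return result, rejects

# Hypothetical stand-ins for the same source file in three trees.
vanilla = [
    "struct inode {",
    "    int i_mode;",
    "    int i_size;",
    "};",
    "/* helpers */",
    "int unrelated(void);",
    "/* namei */",
    "int lookup(struct inode *dir, char *name);",
]
vendor = [
    "struct inode {",
    "    int i_mode;",
    "    int i_vendor_flags;",
    "    int i_size;",
    "};",
    "/* helpers */",
    "int unrelated(void);",
    "/* namei */",
    "int lookup(struct inode *dir, char *name, int vendor_hint);",
]
vendor_lustre = [
    "struct inode {",
    "    int i_mode;",
    "    int i_vendor_flags;",
    "    void *i_fs_private;",
    "    int i_size;",
    "};",
    "/* helpers */",
    "int unrelated(void);",
    "/* namei */",
    "int lookup(struct inode *dir, char *name, int vendor_hint, void *intent);",
]

# The "lustre patch" is the vendor -> vendor+lustre delta.
patched_vendor, vendor_rejects = apply_hunks(vendor, vendor, vendor_lustre)
patched_vanilla, vanilla_rejects = apply_hunks(vanilla, vendor, vendor_lustre)

print(len(vendor_rejects), "rejects on the vendor tree")
print(len(vanilla_rejects), "rejects on the vanilla tree")
for old_side in vanilla_rejects:
    print("context not found:", old_side)
```

Each rejected hunk names a vendor-only line in its context, which is the "depends on stuff that's not there" symptom in miniature.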
68
69 The way I settled on getting to a patchset which doesn't depend on all kinds
70 of RHEL or SLES was to essentially build a RHEL (fc5, if I recall) kernel,
71 then "subtract out" the RHEL-ness, then take the resultant kernel and diff it
72 against a virgin one. That description covers a multitude of sins.
73 Subtracting out the RHEL-ness (by essentially doing patch -R, then cleaning up
74 the mess) has the inverse of many of the same problems that you get trying to
75 patch lustre on top of a vanilla kernel; arglists don't match etc. The only
76 piece of good news is that at that point you have three datapoints to work
77 with: vanilla, RHEL, and RHEL+lustre, so it's rather easier (though not
78 exactly easy) to divine what the intention of the lustre patches is, and work
79 out how to do the analogous thing without RHEL. Even at that, I had to be
80 wary of some bits of code which disappeared in the RHEL transition, but came
81 back when I backed out RHEL, which needed to be given the same treatment as
82 other analogous bits of code which were still there.
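
The "subtract out, then diff against virgin" procedure can be sketched on toy data as follows. This is only an illustration under stated assumptions: the three trees are invented line lists, the identifiers (i_vendor_flags, vendor_hint, i_fs_private, fs.h) are hypothetical, and reverse_apply is a simplified stand-in for `patch -R` that matches lines verbatim with no context or fuzz. The hand-fix step stands for the manual cleanup, where the third datapoint tells you which part of a changed line was vendor and which was lustre.

```python
import difflib

def locate(lines, block):
    """Index of the first verbatim occurrence of block in lines, or -1."""
    for i in range(len(lines) - len(block) + 1):
        if lines[i:i + len(block)] == block:
            return i
    return -1

def reverse_apply(tree, base, patched):
    """Undo the base->patched changes inside tree (a crude `patch -R`).

    Returns (result, rejects). A reject is a (patched_lines, base_lines)
    pair that could not be undone automatically.
    """
    result, rejects = list(tree), []
    matcher = difflib.SequenceMatcher(a=patched, b=base, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = patched[i1:i2], base[j1:j2]
        # Lines the vendor deleted have no anchor here: leave them for a human.
        at = locate(result, old) if old else -1
        if at < 0:
            rejects.append((old, new))
        else:
            result[at:at + len(old)] = new
    return result, rejects

# Hypothetical stand-ins for the same source file in three trees.
vanilla = [
    "struct inode {",
    "    int i_mode;",
    "    int i_size;",
    "};",
    "int lookup(struct inode *dir, char *name);",
]
vendor = [
    "struct inode {",
    "    int i_mode;",
    "    int i_vendor_flags;",
    "    int i_size;",
    "};",
    "int lookup(struct inode *dir, char *name, int vendor_hint);",
]
vendor_lustre = [
    "struct inode {",
    "    int i_mode;",
    "    int i_vendor_flags;",
    "    void *i_fs_private;",
    "    int i_size;",
    "};",
    "int lookup(struct inode *dir, char *name, int vendor_hint, void *intent);",
]

# Step 1: subtract the vendor changes from the vendor+lustre tree.
stripped, rejects = reverse_apply(vendor_lustre, vanilla, vendor)
for old, new in rejects:
    print("needs manual cleanup:", old)

# Step 2: resolve the reject by hand, keeping the lustre argument only.
hand_fixed = [
    "int lookup(struct inode *dir, char *name, void *intent);"
    if "vendor_hint" in line else line
    for line in stripped
]

# Step 3: diff the cleaned tree against vanilla to get the new patch.
patch = list(difflib.unified_diff(vanilla, hand_fixed,
                                  "vanilla/fs.h", "lustre/fs.h", lineterm=""))
print("\n".join(patch))
```

The struct change reverses cleanly, while the changed argument list is rejected because lustre touched the same line, which is the "arglists don't match" case in miniature.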
83
84 The bottom line is that you have to understand enough about what the lustre
85 patches are accomplishing that you can come up with analogous patches for the
86 kernel of your choice, which happens not to be one of the ones cfs ships
87 patchsets for.
88
89 The first time I did all that stuff it took something like 3-4 weeks, with
90 numerous false starts. The most recent time I did it, it was something like a
91 couple of days, though that's misleading, because it was very close to the
92 previous version I was upgrading from. If I had to do it today, starting from
93 scratch, I'd estimate 3-5 days.
94
95 You'll note that nowhere in that set of stuff did I utter the word "gentoo".
96 The kernel we're using is not really a gentoo kernel. We're mips-based, so
97 we're starting from something that's perilously close to the mainline
98 linux-mips kernel, then building it up from there. Thankfully, the linux-mips
99 guys don't go in for heavy-duty patching of non-platform-related stuff, so
100 from the point of view of adding lustre to it, it's virtually identical to a
101 vanilla kernel.org kernel. I believe we may have pulled in a small number of
102 the gentoo kernel patches, but I'm not the kernel wizard, so don't know off
103 the top of my head. From my point of view, it looks vanilla.
104
105 It's not clear to me how you'd go about making lustre installation work in a
106 more gentoo-ish kind of way, at least not without a very large amount of work.
107 I guess I think the most likely path forward would be to work with cfs to try
108 to get them to support more vanilla kernels, then try to work on the rest of
109 the gentoo kernel patches to make them fit better. Unfortunately, I suspect
110 that that still isn't going to be easy, as you've got the classic
111 patch-collision problem happening all over the place. I suspect that
112 following that approach would end up with two parallel streams of patches, one
113 for lustre kernels and one for non-lustre kernels. Unless you can get the
114 gentoo community to roll lustre in as a standard part of the gentoo patchset.
115 That probably requires that somebody do a lustre patchset for every kernel
116 version. Unlikely.
117
118 You could, of course, invert the problem and layer lustre on top, but until
119 such time as gentoo is much more prevalent, I doubt you'll get cfs to do that,
120 which means that somebody in the gentoo community gets signed up for the task
121 of re-doing the process I outlined above, for every gentoo kernel which comes
122 down the pike. I'm not holding my breath for that one either.
123
124 A longer term solution is to do some combination of remodularizing vfs and
125 recasting the lustre stuff so as to depend less on getting its fingers into
126 the guts. I once spent some time looking into that, and I do believe it's
127 possible, but it would take some work, and would really need to be done in
128 concert with the rest of the core kernel guys, and I ran out of time to pursue
129 it. In the meantime, the more the gentoo community can resist the temptation
130 to patch the kernel (at least the vfs parts of it), the easier it will be to
131 add lustre.
132
133 Separate from the core kernel patching issues (Hah! you thought I was done,
134 didn't you?) there's stuff around ldiskfs. The strategy used by lustre is to
135 grab a copy of ext3, cart it off to the side, change all the names, insert a
136 few other strategically placed patches, and call it ldiskfs. That then
137 becomes the basic facility by which actual bits are stored on block devices.
138 The issue there is roughly similar to the core kernel one, but not as severe:
139 any given patchset depends heavily on which specific version of ext3 you
140 started from. Update the kernel, and if it contained fixes to ext3, you've
141 got a problem.
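
The "change all the names" step might look roughly like the Python sketch below. It is an assumption about the general idea only, not the actual lustre build logic; the function name ext3_to_ldiskfs and the sample declarations are invented for illustration.

```python
import re

def ext3_to_ldiskfs(source: str) -> str:
    """Rename ext3 identifiers to ldiskfs ones, preserving upper/lower case."""
    def swap(match):
        return "LDISKFS" if match.group(0).isupper() else "ldiskfs"
    return re.sub(r"ext3|EXT3", swap, source)

# Illustrative input only; not copied from any particular kernel version.
sample = (
    "#define EXT3_SUPER_MAGIC 0xEF53\n"
    "int ext3_get_block(struct inode *inode, long block);\n"
)
renamed = ext3_to_ldiskfs(sample)
print(renamed)
```

The rename itself is mechanical; the version sensitivity comes from the extra patches layered on top, which assume one specific ext3 as their starting point.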
142
143 In practice, this issue tends to be swamped by the core kernel one, i.e. getting
144 lustre going on a specific kernel binds you so tightly to that kernel that you
145 don't have to worry too much about changing ext3. But at such time as the
146 kernel integration issue becomes easier to deal with, this one will have to be
147 addressed as well. My preferred solution would be to simply snag a copy of
148 ext3 that works, do the foozling once, then make that code be a permanent part
149 of the lustre distribution, rather than relying on constructing it on the
150 fly. But that's up to cfs.
151
152 So anyhow, the short answer is that there's no real rocket science involved in
153 getting lustre to work on a gentoo system, but it does take some work, and if
154 you do it the way I did it, you end up with a system which is more constrained
155 than a normal gentoo system, because you're no longer free to update the
156 kernel using the stock tools. For us it's not a huge deal, but I suspect that
157 some of the gentoo community will balk at that.
158 [...]
159
160 Are you considering getting support from CFS at some point?
161
162 We are working with cfs. That doesn't mean they're doing all our work for us
163 :-}
164
165 Honestly, a big part of it is just plain old market sensitivity. Cfs is
166 paying attention to where their bread and butter is. So far, that's not
167 gentoo. Perhaps if sicortex is wildly successful we'll be able to change that
168 equation :-}
169
170 Sorry, you don't have
171 to answer if that is a sensitive question. But part of this thread has been the
172 topic of encouraging CFS to support Gentoo. Interestingly, my colleague, who is
173 in charge of installing Lustre (1.4) on our test system, is talking to CFS about
174 supporting a vanilla kernel configuration. The reason? We can't make the system
175 stable with a SLES kernel. It was stable for a long time with Gentoo.
176
177 I have not observed stability problems; it pretty much just works. If you can
178 say any more about what issues you ran across, I'd love to hear it.
179
180 Now they
181 seem to have gotten it stable with SLES plus a vanilla 2.6.19 kernel (which of
182 course does not have the Lustre patches). So they want Suse to provide a newer
183 SLES kernel with the Lustre patches, and CFS to support that configuration.
184
185 Well, ok, I dunno what to tell you about working with the vendors on that
186 one.
187
188 We actually did consider running RHEL or SLES kernels, but remember we're
189 mips, and looking at the state of the mips support in those kernels, it was
190 not a pretty picture. We also didn't really want to be in the game of having
191 that much of a frankenstein system. So our approach has boiled down to
192
193 1. Stick close to vanilla
194 2. Make mips work
195 3. Do whatever we need to do to make lustre layer on top of that
196
197 Based on what you've said, I wouldn't fool around with SLES, I'd just figure
198 out what close-to-vanilla kernel you want to start from (picking one you think
199 you can live with for a while) and do some part of what I described above.
200 You might have a somewhat easier time of it if you started with 2.6.18, as I
201 believe there's a cfs-supplied patchset for that one. If you want to start
202 from a gentoo 2.6.18 one, I suspect your task will be to start with vanilla,
203 make that work, then work out how to re-apply the gentoo patches. Re getting
204 cfs to help, my bet would be that you'll have an easier time getting the
205 gentoo community to create patches that are amenable to going on top of a
206 lustre-ized vanilla kernel (and relying on cfs to support vanilla kernels)
207 than you will getting cfs to generate patches to go on top of gentoo. If you
208 watch the lustre lists, you'll see more people asking for vanilla than are
209 asking for gentoo.
210
211 Under no circumstances would I advocate getting a kernel working at some
212 level, then trying to use the kernel.org patches, or anybody else's, to move
213 it forward. I tried that a few times, and while I actually did find a couple
214 of combinations that worked, most of the ones I tried blew up in my face.
215 It's the same problem; there's all kinds of activity going on in vfs. I hope
216 that situation doesn't continue indefinitely, but that's the way it seems to
217 be right now.
218
219 I've gone on long enough for now. Feel free to dig deeper if you dare :-}
220 --
221 gentoo-cluster@g.o mailing list
