From: Bryan Green <bgreen@××××××××.gov>
Date: Thu, 07 Dec 2006 19:56:46 -0800
[...]
By comparison, 'mount -t lustre' pretty much characterizes the
simplicity of 1.6.

Agreed.

> Are you worrying about the kernel patching and other software installation
> issues, or about how to set up the fs itself once you've got the software
> together?

Kernel patching. For software installation, the lustre ebuild that was put on
this list recently seemed to do the trick for me, and setup was pretty easy.

Yeah, I think that ebuild came from us.

I was able to patch the kernel, but the server was somewhat unstable.

Do you remember how it was unstable? That's the kind of thing I'd very much
like to understand, as we're proposing to depend heavily on it. If there are
issues, whether specifically tied to our patches or not, I'd love to know
about them.

Actually, my memory is hazy. I used the 'lustre-sources' ebuild, which
effectively packaged up the patches. It was a 2.6.15 kernel. I also tried to
make a custom kernel for lustre 1.4, but ultimately hit too many roadblocks.
I did learn a bit about how to use 'quilt' though.

Hmmm. Maybe not. Our stuff ditches quilt.

>
> Very briefly, the kernel-patching issue is an ongoing headache. Lustre
> patches vfs in non-trivial ways. Unfortunately, everybody else does too. It
> becomes a fairly ugly patch-merging problem. If you want, I can detail the
> process I've settled on for coming up with a kernel patchset, but you won't
> like it. There are similar issues around ldiskfs and other bits, but they're
> simpler, at least by comparison.

I'd be interested in some of the details - off-list if that is more appropriate,
though it might be of interest to others on the list as well. Once you download a
1.6 beta, how do you produce a kernel for Gentoo? Do you patch a gentoo-sources
kernel, a vanilla-sources kernel, or something else? The ideal would perhaps be
to have a 'lustre-sources' ebuild in the gentoo-science overlay. :)

We can start here and if people get sick of hearing about it, take it
someplace else.

The approach taken by most of the patches in lustre/kernel_patches/patches is,
for any particular base kernel, to go through and add the datastructures and
logic to implement the lustre-specific functionality, which involves changes
to core vfs datastructures, sometimes changes in locking strategy, changes to
arglists etc. They generally start with RHEL or SLES kernels. There are a
couple of problems for the rest of us with that: (a) the RHEL and SLES kernels
tend to be a bit antiquated, and (b) the vendors also tend to make quite free
with the patches to core datastructures. Some of the latter is actually due
to the former: because they're using antique kernels, but they want some bits
of the latest and greatest fixes, they selectively import more modern stuff
as their own vendor patches.

The result of all this is that the layer of patches to implement lustre
functionality, when viewed from the point of view of an unpatched kernel,
makes no sense at all. If you try to install such a patchset on a vanilla-ish
kernel, even if you get the right base version, you'll get tons of rejects,
and when you look at them, it's obvious that they depend on stuff that's not
there.
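
To make the reject problem concrete, here is a minimal Python sketch, not anything taken from the lustre tree. It assumes a hunk applies only if its context lines appear as a contiguous run in the target file; real `patch` is more forgiving (offsets and fuzz), so this is stricter than the actual tool. The function name `find_context` and all the source lines are made up for illustration.

```python
def find_context(target, context):
    """Return the index where `context` occurs as a contiguous run of lines
    in `target`, or -1 if it does not occur (the hunk would reject)."""
    n = len(context)
    for i in range(len(target) - n + 1):
        if target[i:i + n] == context:
            return i
    return -1


# Toy, invented source lines standing in for a vendor-patched file.
vendor_tree = [
    "int toy_create(struct inode *dir)",
    "{",
    "    vendor_audit(dir);",
    "    return do_create(dir);",
    "}",
]

# The same file in a vanilla tree: the vendor-only line is absent.
vanilla_tree = [
    "int toy_create(struct inode *dir)",
    "{",
    "    return do_create(dir);",
    "}",
]

# Context that a hypothetical lustre hunk, written against the vendor
# tree, expects to find around its insertion point.
hunk_context = [
    "    vendor_audit(dir);",
    "    return do_create(dir);",
]

# Found in the vendor tree, so the hunk can apply there.
print(find_context(vendor_tree, hunk_context))
# Missing from the vanilla tree, so the same hunk rejects.
print(find_context(vanilla_tree, hunk_context))
```

The second lookup fails purely because the hunk leans on a vendor-added line, which is the "depends on stuff that's not there" situation.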

The way I settled on getting to a patchset which doesn't depend on all kinds
of RHEL or SLES changes was to essentially build a RHEL (fc5, if I recall)
kernel, then "subtract out" the RHEL-ness, then take the resultant kernel and
diff it against a virgin one. That description covers a multitude of sins.
Subtracting out the RHEL-ness (by essentially doing patch -R, then cleaning up
the mess) has the inverse of many of the same problems that you get trying to
patch lustre on top of a vanilla kernel: arglists don't match, etc. The only
piece of good news is that at that point you have three datapoints to work
with: vanilla, RHEL, and RHEL+lustre, so it's rather easier (though not
exactly easy) to divine what the intention of the lustre patches is, and work
out how to do the analogous thing without RHEL. Even at that, I had to be
wary of some bits of code which disappeared in the RHEL transition, but came
back when I backed out RHEL, which needed to be given the same treatment as
other analogous bits of code which were still there.
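
The three-datapoint comparison can be sketched in Python with the standard `difflib` module. This is only an illustration under stated assumptions: the helper names `line_changes` and `three_way_summary` are hypothetical, the comparison is line-based rather than hunk-based, and it is not the tooling actually used for the patchset. It separates what the vendor added, what lustre added on top of the vendor tree, and which vanilla lines the vendor dropped (the code that comes back once the vendor changes are backed out).

```python
import difflib


def line_changes(old, new):
    """Return (removed, added) lines when going from `old` to `new`."""
    removed, added = [], []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new[j1:j2])
    return removed, added


def three_way_summary(vanilla, vendor, vendor_lustre):
    """Summarize the three datapoints: vanilla, vendor, vendor+lustre."""
    dropped_by_vendor, vendor_added = line_changes(vanilla, vendor)
    _, lustre_added = line_changes(vendor, vendor_lustre)
    return {
        # Vendor-only code that a vanilla-based patchset must not rely on.
        "vendor_added": vendor_added,
        # The lustre intent: what has to be re-expressed against vanilla.
        "lustre_added": lustre_added,
        # Code that reappears when the vendor changes are reversed and may
        # need the same lustre treatment as its neighbours.
        "reappears_without_vendor": dropped_by_vendor,
    }


# Toy, invented lines; not real kernel code.
vanilla = ["{", "    legacy_check(dir);", "    return do_create(dir);", "}"]
vendor = ["{", "    vendor_audit(dir);", "    return do_create(dir);", "}"]
vendor_lustre = [
    "{",
    "    vendor_audit(dir);",
    "    toy_lustre_hook(dir);",
    "    return do_create(dir);",
    "}",
]

for key, lines in three_way_summary(vanilla, vendor, vendor_lustre).items():
    print(key, lines)
```

A summary like this only narrows down where to look. Working out the analogous change for a vanilla tree still means reading the code.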

The bottom line is that you have to understand enough about what the lustre
patches are accomplishing that you can come up with analogous patches for the
kernel of your choice, which happens not to be one of the ones cfs ships
patchsets for.

The first time I did all that stuff it took something like 3-4 weeks, with
numerous false starts. The most recent time I did it, it was something like a
couple of days, though that's misleading, because it was very close to the
previous version I was upgrading from. If I had to do it today, starting from
scratch, I'd estimate 3-5 days.

You'll note that nowhere in that set of stuff did I utter the word "gentoo".
The kernel we're using is not really a gentoo kernel. We're mips-based, so
we're starting from something that's perilously close to the mainline
linux-mips kernel, then building it up from there. Thankfully, the linux-mips
guys don't go in for heavy-duty patching of non-platform-related stuff, so
from the point of view of adding lustre to it, it's virtually identical to a
vanilla kernel.org kernel. I believe we may have pulled in a small number of
the gentoo kernel patches, but I'm not the kernel wizard, so I don't know off
the top of my head. From my point of view, it looks vanilla.

It's not clear to me how you'd go about making lustre installation work in a
more gentoo-ish kind of way, at least not without a very large amount of work.
I guess I think the most likely path forward would be to work with cfs to try
to get them to support more vanilla kernels, then try to work on the rest of
the gentoo kernel patches to make them fit better. Unfortunately, I suspect
that that still isn't going to be easy, as you've got the classic
patch-collision problem happening all over the place. I suspect that
following that approach would end up with two parallel streams of patches, one
for lustre kernels and one for non-lustre kernels. Unless you can get the
gentoo community to roll lustre in as a standard part of the gentoo patchset.
That probably requires that somebody do a lustre patchset for every kernel
version. Unlikely.

You could, of course, invert the problem and layer lustre on top, but until
such time as gentoo is much more prevalent, I doubt you'll get cfs to do that,
which means that somebody in the gentoo community gets signed up for the task
of re-doing the process I outlined above, for every gentoo kernel which comes
down the pike. I'm not holding my breath for that one either.

A longer-term solution is to do some combination of remodularizing vfs and
recasting the lustre stuff so as to depend less on getting its fingers into
the guts. I once spent some time looking into that, and I do believe it's
possible, but it would take some work, and would really need to be done in
concert with the rest of the core kernel guys, and I ran out of time to pursue
it. In the meantime, the more the gentoo community can resist the temptation
to patch the kernel (at least the vfs parts of it), the easier it will be to
add lustre.

Separate from the core kernel patching issues (Hah! you thought I was done,
didn't you?) there's stuff around ldiskfs. The strategy used by lustre is to
grab a copy of ext3, cart it off to the side, change all the names, insert a
few other strategically placed patches, and call it ldiskfs. That then
becomes the basic facility by which actual bits are stored on block devices.
The issue there is roughly similar to the core kernel, but not as severe, i.e.
any given patchset depends heavily on which specific version of ext3 you
started from. Update the kernel, and if it contained fixes to ext3, you've
got a problem.
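
The "change all the names" step can be pictured with a short Python sketch. The rename rule here (every `ext3` becomes `ldiskfs`, every `EXT3` becomes `LDISKFS`) and the function name `ext3_to_ldiskfs` are assumptions for illustration, not a description of the actual lustre build logic, and the sample lines are invented.

```python
# Assumed rename rule: lower-case and upper-case spellings only.
_RENAMES = (("ext3", "ldiskfs"), ("EXT3", "LDISKFS"))


def ext3_to_ldiskfs(source: str) -> str:
    """Rewrite ext3 identifiers and paths in `source` to ldiskfs names."""
    for old, new in _RENAMES:
        source = source.replace(old, new)
    return source


# Invented sample lines, shaped like filesystem source but not copied from it.
sample = "\n".join([
    '#include "fs/ext3/toy_header.h"',
    "static int ext3_toy_op(struct inode *inode);",
    "#define EXT3_TOY_FLAG 0x1",
])

print(ext3_to_ldiskfs(sample))
```

The rename itself is mechanical. The version sensitivity comes from the patches layered on afterwards, which are written against one specific snapshot of the ext3 source.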

In practice, this issue tends to be swamped by the core kernel one, i.e. getting
lustre going on a specific kernel binds you so tightly to that kernel that you
don't have to worry too much about changing ext3. But at such time as the
kernel integration issue becomes easier to deal with, this one will have to be
addressed as well. My preferred solution would be to simply snag a copy of
ext3 that works, do the foozling once, then make that code be a permanent part
of the lustre distribution, rather than relying on constructing it on the
fly. But that's up to cfs.

So anyhow, the short answer is that there's no real rocket science involved in
getting lustre to work on a gentoo system, but it does take some work, and if
you do it the way I did it, you end up with a system which is more constrained
than a normal gentoo system, because you're no longer free to update the
kernel using the stock tools. For us it's not a huge deal, but I suspect that
some of the gentoo community will balk at that.
[...]

Are you considering getting support from CFS at some point?

We are working with cfs. That doesn't mean they're doing all our work for us
:-}

Honestly, a big part of it is just plain old market sensitivity. Cfs is
paying attention to where their bread and butter is. So far, that's not
gentoo. Perhaps if sicortex is wildly successful we'll be able to change that
equation :-}

Sorry, you don't have to answer if that is a sensitive question. But part of
this thread has been the topic of encouraging CFS to support Gentoo.
Interestingly, my colleague, who is in charge of installing Lustre (1.4) on
our test system, is talking to CFS about supporting a vanilla kernel
configuration. The reason? We can't make the system stable with a SLES
kernel. It was stable for a long time with Gentoo.

I have not observed stability problems; it pretty much just works. If you can
say any more about what issues you ran across, I'd love to hear it.

Now they seem to have gotten it stable with SLES plus a vanilla 2.6.19 kernel
(which of course does not have the Lustre patches). So they want Suse to
provide a newer SLES kernel with the Lustre patches, and CFS to support that
configuration.

Well, ok, I dunno what to tell you about working with the vendors on that
one.

We actually did consider running RHEL or SLES kernels, but remember we're
mips, and looking at the state of the mips support in those kernels, it was
not a pretty picture. We also didn't really want to be in the game of having
that much of a frankenstein system. So our approach has boiled down to:

1. Stick close to vanilla
2. Make mips work
3. Do whatever we need to do to make lustre layer on top of that

Based on what you've said, I wouldn't fool around with SLES, I'd just figure
out what close-to-vanilla kernel you want to start from (picking one you think
you can live with for a while) and do some part of what I described above.
You might have a somewhat easier time of it if you started with 2.6.18, as I
believe there's a cfs-supplied patchset for that one. If you want to start
from a gentoo 2.6.18 one, I suspect your task will be to start with vanilla,
make that work, then work out how to re-apply the gentoo patches. Re getting
cfs to help, my bet would be that you'll have an easier time getting the
gentoo community to create patches that are amenable to going on top of a
lustre-ized vanilla kernel (and relying on cfs to support vanilla kernels)
than you will getting cfs to generate patches to go on top of gentoo. If you
watch the lustre lists, you'll see more people asking for vanilla than are
asking for gentoo.

Under no circumstances would I advocate getting a kernel working at some
level, then trying to use the kernel.org patches, or anybody else's, to move
it forward. I tried that a few times, and while I actually did find a couple
of combinations that worked, most of the ones I tried blew up in my face.
It's the same problem; there's all kinds of activity going on in vfs. I hope
that situation doesn't continue indefinitely, but that's the way it seems to
be right now.

I've gone on long enough for now. Feel free to dig deeper if you dare :-}
--
gentoo-cluster@g.o mailing list