Gentoo Archives: gentoo-dev

From: Stuart Herbert <stuart@g.o>
To: Max Kalika <max@g.o>, Troy Dack <tad@g.o>, gentoo-dev@g.o
Subject: Re: [gentoo-dev] [GLEP] Web Application Installation
Date: Mon, 04 Aug 2003 22:18:24
Message-Id: 200308042316.19549.stuart@gentoo.org
In Reply to: Re: [gentoo-dev] [GLEP] Web Application Installation by Max Kalika
1 On Monday 04 August 2003 6:11 pm, Max Kalika wrote:
2 > Good morning!
3
4 Evening, Max.
5
6 > See, I don't think that running two (or more) copies of an application is
7 > supporting virtual hosts. Upgrading all the files is a pain. Why not have
8 > one installation of the core files and multiple installations of the config
9 > files? This is, of course, very application specific -- most of them will
10 > still need database upgrades, but that has to be done by the sysadmin.
11
12 Because that won't quite work, unless the app is aware of virtual hosting -
13 and many (if not most) aren't. Think about it.
14
15 Here's an example. Imagine hosting (oh, I don't know) www.iammax.com and
16 www.maxisgreat.com on the same physical box. Document roots for each are
17 /var/www/<host>/public_html/ for argument's sake.
18
19 Now imagine running phpBB on both domains, to provide separate forums. As far
20 as phpBB is concerned, when it accesses, say, login.php on www.iammax.com,
21 the URL is http://www.iammax.com/phpbb/login.php, which translates through as
22 /var/www/www.iammax.com/public_html/phpbb/login.php. Similarly, login.php on
23 www.maxisgreat.com translates through as
24 /var/www/www.maxisgreat.com/public_html/phpbb/login.php
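The mapping described above can be sketched in a few lines. This is only an illustration of the example layout (/var/www/<host>/public_html/); the function names `document_root` and `url_to_path` are made up for this sketch and are not part of any eclass or web server.

```python
from pathlib import PurePosixPath


def document_root(host: str) -> PurePosixPath:
    # Layout taken from the example: /var/www/<host>/public_html/
    return PurePosixPath("/var/www") / host / "public_html"


def url_to_path(host: str, url_path: str) -> PurePosixPath:
    # Drop the leading slash so the URL path is joined under the docroot
    # instead of replacing it.
    return document_root(host) / url_path.lstrip("/")


for host in ("www.iammax.com", "www.maxisgreat.com"):
    print(url_to_path(host, "/phpbb/login.php"))
```

The same URL path ends up under a different directory for each host, which is why each domain appears to need its own copy of the phpBB files.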
25
26 Given this scenario, how do you make these two sites share the same phpBB code
27 files? Here are a few possible ways, and the problems that they cause. Chip
28 in with others - because this is the core problem.
29
30 a) A bit of mod_alias magic, and you make the /phpbb/ directories aliases for
31 /usr/share/webapps/phpBB-<version>/files/. If you do this, though, how do
32 you get each copy of phpBB to use a separate set of configuration files?
33 What happens when the app needs write-access to the directories? The
34 directories are on /usr, and we're all agreed that /usr should be mountable
35 read-only. And does every webserver have something like mod_alias in the
36 first place?
37
38 If I've understood your eclass correctly, this is what it tries to do, yes?
39
40 b) A bit of .htaccess magic, and you have the directory structure of the
41 webapp on /var, but directives telling PHP where to find the .php files by
42 setting the include_path. The config file problem is solved, because you can
43 drop in local config files, and there are real directories that can be made
44 writable. Only problem with this approach is that not every type of web
45 server has the equivalent of .htaccess files - and the PHP SAPI for each type
46 of web server doesn't necessarily support configuration directives in config
47 files either. And that's before we think about Perl, Python and other ways
48 that webapps can be implemented.
49
50 I believe that this is Robin's basic idea, if I've understood correctly. It's
51 a neat solution, but perhaps not a universal one.
52
53 c) A bit of find and cp -l, and you've got the directory structure of the
54 webapp on /var, plus links back to the original on /usr. Again, the config
55 file problem is solved, because local copies can be parachuted in, and again
56 there are real directories that can be made writable. No webserver-specific
57 tricks are required - as far as the web server is concerned, each domain has
58 its own installation of phpBB.
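As a rough sketch of option (c), here is the "find and cp -l" idea written in Python: real directories are created for the instance, and every file is hard-linked back to the master copy. The function name `link_tree` and the `local_files` parameter are hypothetical, invented for this illustration. Note that hard links only work when the master copy and the instance live on the same filesystem, so this assumes /usr and /var are not separate mounts.

```python
import os
from pathlib import Path


def link_tree(master: Path, instance: Path, local_files=()) -> None:
    """Mirror master under instance: real directories, hard-linked files.

    Paths listed in local_files (relative to master) are skipped, so a
    per-site copy (e.g. a config file) can be dropped in afterwards.
    """
    skip = {Path(p) for p in local_files}
    for dirpath, _dirnames, filenames in os.walk(master):
        rel = Path(dirpath).relative_to(master)
        # Directories are created for real, so they can be made writable.
        (instance / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            if rel / name in skip:
                continue
            # Hard link: fails if master and instance are on
            # different filesystems.
            os.link(Path(dirpath) / name, instance / rel / name)
```

A call such as `link_tree(Path("/usr/share/webapps/phpBB"), Path("/var/www/www.iammax.com/public_html/phpbb"), local_files=["config.php"])` would then leave room for a site-specific config.php; both paths here are examples only.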
59
60 d) ?? There must be other solutions to the problem that haven't been discussed
61 in this thread. Please - contribute!
62
63 > How do they handle db updates? How do they handle config file updates?
64
65 I've never asked, and I don't want to know ;-) Maybe Debian just doesn't
66 change that much from year to year <grin>.
67
68 > > 1) Your eclass doesn't correctly (as I understand correct usage to be!)
69 > > support the apache2 USE flag. Easy enough to do - see my webapp-apache
70 > > eclass for an example.
71 >
72 > Why base it on the flag? If the webserver is installed and is supported,
73 > configure for it.
74
75 Because that's what USE flags are there for. If the user puts '-apache2' in
76 their USE flags, it's the job of the ebuild to respect that. Otherwise, the
77 ebuild is broken - and probably in breach of policy too.
78
79 > > 2) Your eclass doesn't specify the permissions that the source files
80 > > should be installed under. Again, easy enough to fix.
81 >
82 > This is very straightforward to add.
83
84 Agreed.
85
86 > > 3) Your eclass doesn't provide support for running multiple copies of the
87 > > same app on the same machine. This is a showstopper.
88 >
89 > This whole thing needs a separate discussion. There's no portage support
90 > for dealing with more than one installation of the same version of the same
91 > package.
92
93 Why a separate discussion? If this GLEP isn't here to address this
94 fundamental issue, then I'd say that the GLEP has the wrong scope.
95
96 You're right - Portage currently can't do all of this on its own. Perhaps it
97 never will be able to. As I understand it, that's why Robin's volunteered to
98 write and maintain additional tools to bridge the gap.
99
100 > > 4) Your eclass requires the admin to stop and start Apache as part of the
101 > > install. This is a showstopper. Not every site will want to stop and
102 > > start their web server just because they've installed a new app.
103 > > Imagine a site hosting hundreds of domains, and having to take them
104 > > *all* offline at the same time just because phpBB's been upgraded (for
105 > > example!). Robin's idea of creating .htaccess files under the document
106 > > root deals with this much more manageably - although I think we're gonna
107 > > end up using symlinks, as that'll make it easy to support multiple web
108 > > servers.
109 >
110 > .htaccess files only work in already web-accessible directories (in your
111 > case, DocumentRoot). If we're putting applications in /usr/share/webapps,
112 > the webserver has no idea to look there. Which is what the Alias directive
113 > does in all the .conf files that are generated. However, Alias is not
114 > allowed in .htaccess AFAIK.
115
116 See above for why I think .htaccess isn't the way to go anyway ;-)
117
118 > http://httpd.apache.org/docs-2.0/mod/mod_alias.html#alias
119 >
120 > Besides that, the server doesn't have to be restarted -- just HUPed.
121
122 Fair enough ;-)
123
124 > > 6) I *like* the idea that check_php() is in this
125 > > eclass, because that check is specific to mod_php under Apache. I'm
126 > > gonna steal that and add it to my webapp-apache eclass ;-)
127 >
128 > Steal away!
129
130 Done ;-)
131
132 > Keep in mind it is not completely accurate if the per-package
133 > USE flags go into portage (http://bugs.gentoo.org/show_bug.cgi?id=13616).
134 > Things in PUSE will also have to be taken into account. But this is a
135 > completely different matter and will be easy to integrate.
136
137 It's a nice start though. Ideally, we need a programmatic interface to
138 portage, an API we can use to handle these types of queries. karltk is
139 working on one ;-)
140
141 > > 7) Your variable names are not generic enough for my liking ;-)
142 > > AWEB_CFG, for example, might be better off being WEBAPP_CFG.
143 >
144 > Hey! I originally had them as WEBAPP_* but was told to change the eclass
145 > to apache-webapp to distinguish from other servers so I wanted the
146 > variables to reflect the name of the eclass. I'm as flexible on this as
147 > playdoh. :-)
148
149 /me bites back his comment about the person who told you to do that ;-)
150
151 > > 8) Instead of trying to supply an all-encompassing
152 > > apache-webapp_src_install(), relying as it does on defining global
153 > > variables, I'd have supplied a number of individual functions to do each
154 > > bit. Say, a webapp-install-appconfig, webapp-install-serverconfig each
155 > > taking parameters (this is off the top of my head here ;-) This is a
156 > > personal preference thing.
157 >
158 > That's just more to call from the ebuild. My goal was to have very small
159 > ebuilds -- just declare a few variables and inherit the eclass.
160
161 Hmm ... there's a tradeoff here. Smaller ebuilds and less flexible (and
162 re-usable) eclasses. Or larger ebuilds, but more re-usable eclasses. I
163 guess I prefer the latter.
164
165 > > 9) If I'm not mistaken, your eclass does nothing to ensure that the
166 > > webapp can find the configuration files you've moved into
167 > > /etc/webapps/$PN/. Personally, I'm coming to the conclusion that
168 > > /etc/webapps/$PN/ isn't a good idea, because again it doesn't support
169 > > the idea of running multiple copies of the same app on the same machine.
170 >
171 > I'd say we should be shooting for an easy upgrade path first.
172
173 Good point ;-) But what's the point of installing and upgrading configuration
174 files that the application is never going to look at? ;-) (Sorry, it's 22:30,
175 and still ridiculously hot here)
176
177 > If config
178 > files aren't stored in /etc/webapps/${PN}, then we need to have a way of
179 > generating an env.d/${PN} which contains a CONFIG_PROTECT line, otherwise
180 > we're forcing sysadmins to reconfigure the application at every upgrade.
181
182 Yeah, but if each instance of the installed app has its own config files, then
183 what's the relevance of /etc/webapps at all?
184
185 > And as I mentioned before, there's no way for portage to handle
186 > multiple-copy installs, so I'm not sure the best way to go about achieving
187 > this goal.
188
189 This is the problem that I think we should be solving - how to support the
190 installation, configuration, and upgrading of multiple-copy installs.
191
192 > > 10) I'm coming to the conclusion that 'emerge -u <webapp>' shouldn't
193 > > overwrite the older version, but should always install alongside, in a
194 > > different slot, so that sites can easily run different versions of apps
195 > > as required. Perhaps this should be configurable somehow? Your eclass
196 > > doesn't make this possible.
197 >
198 > Why should this be handled any differently than the rest of the apps handled by
199 > portage?
200
201 Because most of the rest of the apps handled by portage don't run in a virtual
202 domain environment. Webapps really are a different beast.
203
204 > If a sysadmin doesn't want older versions removed, just add
205 > AUTOCLEAN=no to make.conf. Granted, the current eclass doesn't use the
206 > full ${PF} in the target directory, but that is easily changed.
207
208 Here's an example. Imagine you're running your own hosting firm, and you have
209 a non-trivial number of customers using the same webapp (say, phpBB for
210 argument's sake). A new version of phpBB comes out. Some customers will ask
211 for the upgrade, and some explicitly will ask you not to upgrade. So, in
212 this situation, you need to have two copies of phpBB installed on the same
213 box at the same time.
214
215 Now let's look at what happens when you run 'emerge -u phpBB', with the
216 appropriate ACCEPT_KEYWORDS of course. Portage goes and installs the new
217 version of phpBB over the top of the old phpBB files. The old version of
218 phpBB gets overwritten, yes? I don't see how AUTOCLEAN will prevent that
219 from happening.
220
221 The whole point of SLOTing apps (as I understand it) is to allow you to have
222 multiple versions installed alongside each other. This is the mechanism that
223 Portage offers us.
224
225 > Ok. Hopefully we can get this thing hashed out and out the door so things
226 > can start getting into shape soon.
227
228 Hell yes.
229
230 > > 1) Apache1/Apache2 conundrum
231 > >
232 > > My eclass uses the detection technique adopted for mod_php, and no-one
233 > > has complained about that. If this eclass is invalid, then so's the
234 > > ebuild for mod_php. And I don't think it is.
235 >
236 > Ok, this one is a keeper.
237
238 Kudos to Robin - it's his algorithm.
239
240 > > 2) Support for multiple DocumentRoot configurations, and also
241 > > 3) Binary packages installing on machines with different DocumentRoot
242 > > values
243 > >
244 > > Until the GLEP is firmed up and approved, we don't have an agreed
245 > > solution to implement.
246 >
247 > Ah! Ok, so before we have these issues ironed out, the current status-quo
248 > will just be maintained. Fair enough.
249
250 Agreed.
251
252 > > Please excuse me, but I don't want to put support for existing ebuilds on
253 > > hold while we debate the GLEP. I believe that we *have* to continue
254 > > support until we're ready and able to switch. Stopping maintenance
255 > > activities is *not* an option.
256 >
257 > Since I'm the laziest person I know, I didn't want to do the work twice.
258 > I'll just hold off on introducing more ebuilds which will have to be
259 > converted later. :-)
260
261 That's up to you. If you're maintaining webapp ebuilds currently in portage,
262 though, I'd urge you not to stop maintaining them just because you're waiting
263 for a design solution via this GLEP.
264
265 > > 4) Which user/group to use
266 > >
267 > > My class uses Robin's suggestion, and assumes that Apache is running with
268 > > the default settings of apache.apache.
269 >
270 > Right, but as we already agreed, not all apps need to have their files
271 > owned by apache:apache. This should be configurable in the eclass.
272 > Correct?
273
274 Do we need a new user/group to own most of the files? And then we just make
275 the files that need write-access owned by the webserver?
276
277 > > 5) DocumentRoot pointing to a read-only mount
278 > >
279 > > As far as I'm concerned, that's like trying to do 'make bzlilo' with
280 > > /boot not mounted, or run an 'emerge' with /usr mounted read-only. It's
281 > > the sysadmin's job to make sure that any necessary filesystems are
282 > > mounted read/write before an installation is attempted. This is not a
283 > > problem unique to web applications.
284 >
285 > The issue is not whether DocumentRoot is a read-only mount during the
286 > install, but during the day-to-day operations. Running 'emerge' with /usr
287 > mounted read-only is not the same as having apache running a webapp that
288 > needs to write to /usr -- one is done seldom, the other is all the time.
289
290 In his email in that thread you pointed me at, Robin was explicitly talking
291 about /var being a read-only NFS mount at install time.
292
293 > > 6) "it's weak"
294 > >
295 > > That's not an argument, it's an opinion ;-) Anyway, I've taken 5-10
296 > > lines of broken and incorrectly duplicated code from a number of
297 > > ebuilds, and moved them all into one place where they can be re-used and
298 > > maintained for now. Reduced defects is a strong argument, not a weak
299 > > one.
300 >
301 > I never made the "weak" argument
302
303 Again, this is in response to the comments in the archived thread - not this
304 one ;-) You didn't make the comment - someone else did.
305
306 > -- I don't believe in it myself. Things
307 > need to be justified a bit better than that, so I'm completely with you on
308 > this one.
309
310 :)
311
312 > Ok, agreed. Troy, is there any chance for another draft with some of these
313 > things incorporated? (Thanks a bezillion, btw, for putting up with me) :-)
314
315 Who's "putting up" with you? I don't feel like I am! I'm just grateful that
316 we're both interested in finding a solution to this problem. Now, if just a
317 few more people would chip in and help these discussions ... ;-)
318
319 > And a successful one at that from what I saw in my emerge --sync this
320 > morning. :-) Thanks!
321
322 By the time I got on Saturday evening, it was all but over. Didn't see a single
323 person in #gentoo-bugs who I could help :(
324
325 > If you say it is very easy to do, I'm on board.
326
327 If it's not easy to do, then we'll scrap it and come up with something better.
328
329 > I can't personally speak
330 > for other webservers, so I'll leave the decision of whether/how to support
331 > others to those with the experience.
332
333 Fair enough.
334
335 > So in any case, we have to pull the
336 > apache-specific things out into a separate framework. Therefore things
337 > like DocumentRoot can't even be considered. So a central location for
338 > webapps must be once again taken into account.
339
340 Yep. How does this sound as the design of the central layout? Let's agree a
341 design, so that it can be added to the GLEP.
342
343 * /usr/webapps/<app-name> as the main directory.
344 * /usr/webapps/<app-name>/public_html/ for files served by the web server
345 * /usr/webapps/<app-name>/cgi-bin/ for CGI-BIN files
346 * /etc/webapps/<app-name>/ to hold the box-default config files
347 * <app-name> is ${PN} for non-slotted packages
348 * <app-name> is ${P} for slotted packages
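To make the proposed layout concrete, here is a small sketch that computes the directories for a given package. It relies on the Portage convention that ${P} is ${PN}-${PV}; the function name `webapp_layout`, the dictionary keys and the version number in the usage note are assumptions made for this illustration only.

```python
from pathlib import PurePosixPath


def webapp_layout(pn: str, pv: str, slotted: bool) -> dict:
    # <app-name> is ${P} (i.e. ${PN}-${PV}) for slotted packages,
    # and plain ${PN} otherwise.
    app_name = f"{pn}-{pv}" if slotted else pn
    base = PurePosixPath("/usr/webapps") / app_name
    return {
        "base": base,                        # main directory
        "public_html": base / "public_html",  # files served by the web server
        "cgi-bin": base / "cgi-bin",          # CGI-BIN files
        "config": PurePosixPath("/etc/webapps") / app_name,  # box-default config
    }
```

With a slotted package, `webapp_layout("phpBB", "2.0.5", True)["public_html"]` gives `/usr/webapps/phpBB-2.0.5/public_html`, so two versions can sit side by side without overwriting each other.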
349
350 > ok, I didn't realize that it wasn't a permanent solution. All is ok.
351
352 Neat.
353
354 > >> How is mod_php related to the way applications are installed?
355 > >
356 > > Erm, how about the whole 'do I use Apache 1 or Apache 2' conundrum? See
357 > > the mod_php ebuild for details.
358 >
359 > Whichever is installed gets configured. Either or both get touched,
360 > depending on what is detected. It is up to the sysadmin to start one or
361 > the other automatically with rc-update. I don't see a problem if both are
362 > configured (if detected), but only one is running.
363
364 Yeah, but as discussed earlier:
365
366 a) you need a standard way of detecting which one is installed, and
367 b) you need to honour the 'apache2' USE flag
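One way the two requirements could fit together is sketched below. This is not the detection algorithm from the mod_php ebuild, which is not reproduced in this thread; it is a hypothetical illustration in which the USE flag picks the target and detection only confirms that the target is present. The name `pick_apache` and the "apache1"/"apache2" labels are assumptions.

```python
def pick_apache(use_flags: set, installed: set) -> str:
    # (b) The 'apache2' USE flag decides which server the webapp targets;
    # a user who sets '-apache2' never gets Apache 2 configured.
    wanted = "apache2" if "apache2" in use_flags else "apache1"
    # (a) Detection is only used to confirm the chosen server is present.
    if wanted not in installed:
        raise RuntimeError(f"{wanted} selected via USE but not installed")
    return wanted
```

The design point is that an installed Apache 2 is ignored unless the flag asks for it, which differs from the "configure whatever is detected" approach quoted above.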
368
369 Take care,
370 Stu
371 --
372 Stuart Herbert stuart@g.o
373 Gentoo Developer http://www.gentoo.org/
374 Beta packages for download http://dev.gentoo.org/~stuart/packages/
375
376 GnuPG key id# F9AFC57C available from http://pgp.mit.edu
377 Key fingerprint = 31FB 50D4 1F88 E227 F319 C549 0C2F 80BA F9AF C57C
378 --

Replies

Subject Author
Re: [gentoo-dev] [GLEP] Web Application Installation Michael Cummings <mcummings@g.o>