On Friday, December 21, 2012 12:02:34 PM Michał Górny wrote:
> On Fri, 21 Dec 2012 11:24:45 +0100
> 
> "J. Roeleveld" <joost@××××××××.org> wrote:
> > On Friday, December 21, 2012 09:57:25 AM Michał Górny wrote:
> > > Just let me know when you have to maintain a lot of such systems
> > > and upgrade, say, glibc. Then maybe you'll understand.
> > 
> > A shared /usr means I need to update ALL the systems at once.
> > When /usr is not shared, I can update a group at a time.
> 
> Yes, and this is what disqualifies it for the general case. If you
> can't update one at some point, you can't update the others, or it is
> likely going to break in a random manner.
|
Yes, but do you want to find that out when the entire production environment is
down? Or would you rather do the upgrades in steps and only risk having to
rebuild a few nodes, with lower performance during that time?
There is a big difference between 50% performance and 0%.
|
> > To save time, a shared filesystem containing binary packages can easily be
> > used and this is what I use myself.
> > I have one VM that is used to rebuild the packages when I want to do an
> > update and the real host then simply uses the binary packages.
> > The configuration items needed for emerge are synchronized between the
> > build system and the actual server.
> 
> Wait, wait. So you have introduced even more hackery to get it working?
> Good to hear. That's really a good reason to support your arguments.
> 'I got it working with a lot of hackery, so it is a good solution!'
|
Please explain: what is hackery about having a single host do all the
compiling for multiple servers?
The only thing I need to synchronize between the "real" host and the "compile"
host is "/etc/portage" and "/var/lib/portage/world".

I don't need either of those to keep the environment running. They are only
needed during the install/update steps.
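
In case anyone wants to replicate this, a minimal sketch of the setup (the
package directory and the "server1" hostname are examples, not my exact
configuration):

  # /etc/portage/make.conf on the build VM: build a binary package for
  # everything that gets compiled, into a directory the servers can mount
  FEATURES="buildpkg"
  PKGDIR="/mnt/binpkgs"

  # pull the target's emerge configuration over before building
  rsync -a server1:/etc/portage/ /etc/portage/
  rsync -a server1:/var/lib/portage/world /var/lib/portage/world

  # on the "real" host afterwards: update from the pre-built packages only
  emerge --usepkgonly --update --deep --newuse @world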
|
> > The main reason why I would never share an OS filesystem between multiple
> > systems is to avoid the situation where a failed upgrade takes down the
> > entire environment.
> 
> And this doesn't happen in your case because...? Because as far as I
> can see:
> 
> 1) failed upgrade in /usr takes down the entire environment,
> 
> 2) failed upgrade in / may take down the machine,
> 
> 3) failed hackery you're doing to get it all working may have even more
> unpredictable results.
> 
> And yes, I prefer to take down the entire environment and fix it in one
> step. That sounds much better than trying to get it back up and re-sync
> all the machines which got into some mid-broken state.
|
With shared OS filesystems, that is what you will get.
With non-shared OS filesystems, the other nodes will keep working.
|
> > And a shared OS filesystem also introduces a very nice Single Point of
> > Failure. What will happen when the NFS-server (or whatever is used) goes
> > down for whatever reason?
> 
> And what is the difference now? Is it another argument like 'hey, i can
> still see the command-line, so it's better. not that i can do anything
> useful with it.'
|
Actually, with a shared OS-filesystem:
When it goes down: "Oops, we lost the entire environment."

With non-shared:
One node goes down: "Oops, we need to fix this node; performance will be down
while we fix it."
Or: "This and that app won't work, but the rest still does."

That's the difference between a major outage impacting the entire company and
one that only affects a few departments.
|
> > In other words, to make an environment that has a very nice single point of
> > failure possible, existing working environments are classed as "broken".
> 
> NFS-shared system does classify as 'a single point of failure'.

If a single shared filesystem is necessary for the entire environment to be
usable, then yes.
|
--
Joost