On Fri, May 17, 2019 at 6:28 AM Mick <michaelkintzios@×××××.com> wrote:
>
> Count yourself lucky. You could have discovered your disk wouldn't spin up
> again, your PSU packed up, or even the MoBo chipset decided to retire from
> active service. Eventually, any of these hardware problems would manifest
> themselves, but a reboot could reveal their demise sooner and hopefully at a
> point where you were somewhat prepared for it.
>

++

You can't completely prevent reboots (not unless you are willing to
spend big and go mainframe or something like that - and those create a
different set of issues). What you can do is take steps to reduce the
risk that an unplanned reboot will cause problems.

One of the best ways to ensure you're prepared for disaster is to make
disaster routine. Regular reboots can be a part of this, because you
can do them at a time when you have time to deal with problems, and
when you're looking for problems.

This is largely why I've made the move to containers. I still have a
few processes running on my host, but almost everything has moved into
containers that each do one thing. When I update a container I take a
snapshot, run the updates, shut it down, take another snapshot, start
it up, and test the service it runs. Since each container only does
one thing, I know exactly what to test. If it works I'm good, and if
it doesn't I can roll it back without worrying about what that might
break for the 47 other services running on the same host. Every
update involves an effective reboot for that one service, so I know
that in the event of a host reboot they will generally all come up
fine. I of course update the host itself regularly too, and reboot it
for kernel updates, which seem to come about twice a week these days
anyway.

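That per-container routine is easy to script. A rough sketch follows -
note the author doesn't name a runtime, so LXD's lxc commands, the
container name, and the IMAP port check are all placeholders of mine;
DRY_RUN=1 just prints each step instead of executing it:

```shell
#!/bin/sh
# Sketch of the snapshot/update/test routine, assuming LXD ("lxc") as
# the container runtime. The container name and the IMAP port check
# are placeholders. DRY_RUN=1 prints each step instead of running it.
set -eu

CT=imapd                  # hypothetical container name
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"       # dry run: show the command only
    else
        "$@"
    fi
}

run lxc snapshot "$CT" pre-update             # snapshot before updating
run lxc exec "$CT" -- emerge --update @world  # run updates inside it
run lxc stop "$CT"
run lxc snapshot "$CT" post-update            # snapshot the updated state
run lxc start "$CT"
# Test the one service this container runs; if this fails,
# "lxc restore $CT pre-update" rolls the container back.
run nc -z localhost 143                       # IMAP reachability check
```

Because each container does exactly one thing, that final check is the
whole test plan.
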
Obviously I don't run updates the day before I leave on vacation,
unless they are security-critical, and then I exercise some care.

The downside is that I end up with a lot more hosts to keep up to
date, because I can't just run emerge -u world once on one host and
have every service I run updated. However, I gladly accept the extra
work because the work itself becomes much simpler and more
predictable. If I'm updating my imapd container and imapd still
works, then I'm fine. I don't have to worry about suddenly realizing
two days later that postgrey is bouncing a ton of mail or whatever.
If something obscure like a text editor breaks in my imapd container
and I don't catch it, that might be an annoyance, but it doesn't
really impact me much since it isn't critical to the operation of
that container.

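The extra per-host update work can itself be looped over. Another
sketch under the same assumptions (LXD as the runtime, hypothetical
container names, DRY_RUN=1 printing commands rather than running them):

```shell
#!/bin/sh
# Sketch: repeat the snapshot-then-update step for every container.
# LXD ("lxc") and the container names are assumptions; DRY_RUN=1
# prints the commands instead of executing them.
set -eu
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

for ct in imapd postgrey www; do              # hypothetical names
    run lxc snapshot "$ct" pre-update         # cheap rollback point
    run lxc exec "$ct" -- emerge --update --deep @world
done
```

Each container still gets its own snapshot, so a bad update in one
service rolls back without touching the others.
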
--
Rich