Gentoo Archives: gentoo-user

From: hitachi303 <gentoo-user@××××××××××××××××.de>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] which linux RAID setup to choose?
Date: Mon, 04 May 2020 07:51:38
Message-Id: 49b1d819-0e85-3bb8-a495-417677aaf15e@konstantinhansen.de
In Reply to: Re: [gentoo-user] which linux RAID setup to choose? by Rich Freeman
On 04.05.2020 at 02:46, Rich Freeman wrote:
> On Sun, May 3, 2020 at 6:50 PM hitachi303
> <gentoo-user@××××××××××××××××.de> wrote:
>>
>> The only person I know who is running a really huge raid (I guess 2000+
>> drives) is comfortable with some spare drives. His raid did fail and can
>> fail again. Data will be lost. Everything important has to be stored at a
>> secondary location. But they are using the raid to store data for some
>> days or weeks while a server is calculating stuff. If the raid fails they
>> have to restart the program for the calculation.
>
> So, if you have thousands of drives, you really shouldn't be using a
> conventional RAID solution. Now, if you're just using RAID to refer
> to any technology that stores data redundantly that is one thing.
> However, if you wanted to stick 2000 drives into a single host using
> something like mdadm/zfs, or heaven forbid a bazillion LSI HBAs with
> some kind of hacked-up solution for PCIe port replication plus SATA
> bus multipliers/etc, you're probably doing it wrong. (Really even
> with mdadm/zfs you probably still need some kind of terribly
> non-optimal solution for attaching all those drives to a single host.)
>
> At that scale you really should be using a distributed filesystem. Or
> you could use some application-level solution that accomplishes the
> same thing on top of a bunch of more modest hosts running zfs/etc (the
> Backblaze solution, at least in the past).
>
> The most mainstream FOSS solution at this scale is Ceph. It achieves
> redundancy at the host level. That is, if you have it set up to
> tolerate two failures then you can take two random hosts in the
> cluster and smash their motherboards with a hammer in the middle of
> operation, and the cluster will keep on working and quickly restore
> its redundancy. Each host can have multiple drives, and losing any or
> all of the drives within a single host counts as a single failure.
> You can even do clever stuff like tell it which hosts are attached to
> which circuit breakers and then you could lose all the hosts on a
> single power circuit at once and it would be fine.
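
The "circuit breakers" trick above maps onto CRUSH failure domains. For
anyone curious, a minimal sketch with the stock Ceph CLI follows; the
circuit and host names (circuit-a, host1, ...) are made up, "pdu" is
one of Ceph's default CRUSH bucket types, and mypool is a hypothetical
pool:

    # One CRUSH bucket per power circuit (pdu = power distribution unit)
    ceph osd crush add-bucket circuit-a pdu
    ceph osd crush add-bucket circuit-b pdu
    ceph osd crush move circuit-a root=default
    ceph osd crush move circuit-b root=default

    # Hang each host under the circuit that feeds it
    ceph osd crush move host1 pdu=circuit-a
    ceph osd crush move host2 pdu=circuit-b

    # Replicate across circuits rather than across hosts
    ceph osd crush rule create-replicated by-circuit default pdu
    ceph osd pool set mypool crush_rule by-circuit

With a rule like that, replicas land on different circuits, so losing
every host on one breaker still leaves full copies elsewhere.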
>
> This also has the benefit of covering you when one of your flaky
> drives causes weird bus issues that affect other drives, or one host
> crashes, and so on. The redundancy is entirely at the host level, so
> you're protected against a much larger number of failure modes.
>
> This sort of solution also performs much faster as data requests are
> not CPU/NIC/HBA limited for any particular host. The software is
> obviously more complex, but the hardware can be simpler since if you
> want to expand storage you just buy more servers and plug them into
> the LAN, versus trying to figure out how to cram an extra dozen hard
> drives into a single host with all kinds of port multiplier games.
> You can also do maintenance and just reboot an entire host while the
> cluster stays online as long as you aren't messing with them all at
> once.
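
For the reboot-a-host case, the usual sequence (again only a sketch;
host1 is a made-up hostname) is to set the cluster's "noout" flag
first, so Ceph doesn't start re-replicating data while the host is
briefly down:

    ceph osd set noout      # down OSDs won't be marked "out"/rebalanced
    ssh host1 reboot        # its OSDs show as "down" in the meantime
    # ... wait for host1 and its OSDs to rejoin ...
    ceph osd unset noout    # restore normal down -> out handling
    ceph -s                 # confirm the cluster is back to HEALTH_OK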
>
> I've gone in this general direction because I was tired of having to
> deal with massive cases, being limited to motherboards with 6
> SATA ports, adding LSI HBAs that require an 8x slot and often
> conflict with using an NVMe, and so on.

So you are right, this is the way they do it. I used the term RAID too
broadly. But they still run into limits: the size of the room, what the
air conditioning can handle, and things like that.

Anyway, I only wanted to point out that there are different approaches
in industry, and saving the data at any price is not always necessary.
