Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Rasp-Pi-4 Gentoo servers
Date: Fri, 28 Feb 2020 12:56:53
Message-Id: CAGfcS_k0u=4mbkiOPPMxD4mGqemfaCMep6==o=XLiioB5R5QZw@mail.gmail.com
In Reply to: Re: [gentoo-user] Rasp-Pi-4 Gentoo servers by Wols Lists
On Fri, Feb 28, 2020 at 6:09 AM Wols Lists <antlists@××××××××××××.uk> wrote:
>
> On 27/02/20 21:49, Rich Freeman wrote:
> > A fairly cheap amd64 system can run a ton of services in containers
> > though, and it is way simpler to maintain that way. I still get quick
> > access to snapshots/etc, but now if I want to run a gentoo container
> > it is no big deal if 99% of the time it uses 25MB of RAM and 1% of one
> > core, but once a month it needs 4GB of RAM and 100% of 6 cores. As
> > long as I'm not doing an emerge -u world on half a dozen containers at
> > once it is no big deal at all.
>
> Do all your containers have the same make options etc? Can't remember
> which directory it is, but I had a shared emerge directory where it
> stored this stuff and I emerged with -bk options (use binary if it's
> there, create binary if it isn't).
>

They're probably not too far off in general, but not exact. I only
run one instance of any particular container, so I haven't tried to do
parallel builds. If portage had support for multiple binary packages
co-existing with different build options I might. If I ever get
really bored for a few weeks I could see playing around with that. It
seems like it ought to be possible to content-hash the list of build
options and stick that hash in the binary package filename, and then
have portage search for suitable packages, using a binary package if
one matches, and doing a new build if not.
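
Roughly the idea, as a quick python sketch - the helper names are made
up, just to illustrate the shape of it, not actual portage code:

import hashlib

def build_options_hash(use_flags, cflags):
    # Hash the settings that affect the resulting binary; sort the flags
    # so equivalent configurations hash the same.
    canonical = "USE=" + " ".join(sorted(use_flags)) + "\nCFLAGS=" + cflags
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

def binpkg_filename(pkg_version, use_flags, cflags):
    # e.g. foo-1.2.3 -> foo-1.2.3-<hash>.tbz2
    return f"{pkg_version}-{build_options_hash(use_flags, cflags)}.tbz2"

# "Use a binary if one matches, build if not" then just becomes: does a
# file with this name already exist in the binary package directory?
print(binpkg_filename("foo-1.2.3", {"ssl", "ipv6"}, "-O2 -pipe"))
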
Many of my containers don't even run Gentoo. I have a few running
Arch, Ubuntu Server, or Debian. If some service is well-supported in
one of those and is poorly supported in Gentoo I will tend to go that
route. I'll package it if reasonable, but some upstreams are just not
very conducive to this.

There was a question about ARM-based NAS in this thread which I'll go
ahead and tackle to save a reply. I'm actually playing around with
lizardfs (I might consider moosefs instead if starting from scratch -
or Ceph if I were scaling up, but that wouldn't be practical on ARM).
I have a mix of chunkservers but my target is to run new ones on ARM.
I'm using RockPro64 SBCs with LSI HBAs (this SBC is fairly unique in
having PCIe). There is an issue in the lizardfs code that hurts
performance on ARM, though I understand they're working on it, so
that could change. I'm using it for multimedia and I care more about
static space than iops, so it is fine for me. The LSI HBA pulls more
power than the SBC does, but overall the setup is very low-power and
fairly inexpensive (used HBAs on ebay). I can in theory get up to 16
drives on one SBC this way. The SBC also supports USB3, so that is
another option with a hub - in fact I'm mostly shucking USB3 drives
anyway.
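
For what it's worth, a chunkserver node doesn't involve much config -
from memory it boils down to roughly this (hostname and paths are just
examples, check the docs for your version):

# /etc/lizardfs/mfschunkserver.cfg
MASTER_HOST = lizardfs-master.example.lan

# /etc/lizardfs/mfshdd.cfg - one data directory/mount point per line
/mnt/chunk1
/mnt/chunk2
/mnt/chunk3
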
Main issue with ARM SBCs in general is that they don't have much RAM,
so IMO that makes Ceph a non-starter. Otherwise that would probably
be my preferred option. Bad things can happen on rebuilds if you
don't have the 1GB of RAM per TB of storage they suggest, and even
with the relatively under-utilized servers I have now that would be a
LOT of RAM for ARM (really, it would be expensive even on amd64).
Lizardfs/moosefs chunkservers barely use any RAM at all. The master
server does need more - I have shadow masters running on the SBCs,
but since I'm using this for multimedia the metadata server only uses
about 100MB of RAM, and that figure includes the other processes,
libraries, and random minimal service daemons like sshd on the box.
I'm running my master on amd64 though to get optimal performance,
shadowed on the chunkservers so that I can fail over if needed, though
in truth the amd64 box with ECC is the least likely thing to die and
runs all the stuff that uses the storage right now anyway.
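
To put rough numbers on that (drive counts here are just an example,
not my actual layout):

# Rough math on Ceph's ~1GB of RAM per TB of storage guideline.
drives_per_node = 3
drive_tb = 12

tb_per_node = drives_per_node * drive_tb   # 36 TB of raw storage per node
ram_needed_gb = tb_per_node * 1.0          # ~36 GB of RAM suggested

print(f"{tb_per_node} TB per node -> ~{ram_needed_gb:.0f} GB RAM")
# The RockPro64 tops out at 4GB, so you'd be roughly an order of magnitude
# short - versus a lizardfs chunkserver that idles in the tens of MB.
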
The other suggestion to consider USB3 instead of SATA for storage
isn't a bad idea, though going that route means wall warts and drives
as far as the eye can see. Might still be less messy than my setup,
which has a couple of cheap ATX PSUs with ATX power switches, 16x PCIe
powered risers for the HBAs (they pull too much power for the SBC),
and rockwell drive cages to stack the drives in (they're meant for a
server chassis but they're reasonably priced and basically give you an
open enclosure with a fan). I'd definitely have a lot fewer PCBs
showing if I used USB3 instead. I'm not sure how well that would
perform though - the HBA has plenty of bandwidth to spare if the node
got busy (PCIe v2 x4 connectivity on the SAS9200-16E), while with USB3
it would all go through 1-2 ports. Though I doubt I'd ever get THAT
many drives on a node, and if I needed more space I'd probably expand
up to 5 chunkservers before putting more than about 3 drives on each -
you get better performance and more fault-tolerance that way.
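
The back-of-the-envelope bandwidth comparison looks roughly like this
(theoretical peaks, real throughput will be lower):

# PCIe 2.0 is 5 GT/s per lane with 8b/10b encoding, so ~500 MB/s per lane.
pcie2_lane_mb_s = 500
hba_mb_s = 4 * pcie2_lane_mb_s     # x4 slot -> ~2000 MB/s to the HBA

# USB 3.0 is 5 Gbps, also 8b/10b, so ~500 MB/s per port before protocol overhead.
usb3_port_mb_s = 5000 * 0.8 / 8

print(f"PCIe 2.0 x4 HBA: ~{hba_mb_s} MB/s")
print(f"USB 3.0 port:    ~{usb3_port_mb_s:.0f} MB/s")
# So one or two USB3 ports is roughly a quarter of what the HBA link can move,
# which only matters if you actually hang a lot of busy drives off one node.
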
One big reason I went with the distributed filesystem approach was
that I was getting tired of trying to cram as many drives as I could
into a single host and then dealing with some of the inflexibilities
of zfs. The inflexibility bit is improving somewhat with removable
vdevs, though I'm not sure how much residue those leave behind if you
do it often. But zfs is still limited to however many drives you can
cram into one host, while a distributed filesystem lets you expand
outwards. Plus it is fault-tolerant at the host level instead of the
drive level.

--
Rich