Re: [gentoo-user] VRFs / Jails / Containers - gentoo-user

From:	Grant Taylor <gtaylor@×××××××××××××××××××××.net>
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] VRFs / Jails / Containers
Date:	Sun, 03 Feb 2019 17:26:37
Message-Id:	`4da02096-0c6b-7589-b4f4-1badb5b70603@spamtrap.tnetconsulting.net`
In Reply to:	Re: [gentoo-user] VRFs / Jails / Containers by Rich Freeman

On 2/3/19 5:37 AM, Rich Freeman wrote:
> Nothing wrong with that approach. I use systemd-nspawn to run a bunch
> of containers, hosted in Gentoo, and many of which run Gentoo. However,
> these all run systemd and I don't believe you can run nspawn without a
> systemd host (the guest/container can be anything). These are containers
> running full distros with systemd in my case, not just single-process
> containers, in my case. However, nspawn does support single-process
> containers, and that includes with veth, but nspawn WON'T initialize
> networking in those containers (ie DHCP/etc), leaving this up to the guest
> (it does provide a config file for systemd-networkd inside the guest if
> it is in use to autoconfigure DHCP).

ACK

That makes me think that systemd-nspawn is less of a fit for what I'm
wanting to do.

> I'm not exactly certain what you're trying to accomplish, but namespaces
> are just a kernel system call when it comes down to it (two of them I
> think offhand). Two util-linux programs provide direct access to them
> for shell scripts: unshare and nsenter. If you're just trying to run a
> process in a separate namespace so that it can use veth/etc then you could
> probably initialize that in a script run from unshare. If you don't need
> more isolation you could run it right from the host filesystem without
> a separate mount or process namespace. Or you could create a new mount
> namespace but only modify specific parts of it like /var/lib or whatever.

That's quite close to what I'm doing. I'm actually using unshare to
create a mount / network / UTS namespace (set) and then running some
commands in them.
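
As a rough illustration of that kind of namespace set (a sketch only, not the actual script in use here): the router name "r1", the run() dry-run helper, and the choice to set the host name and enable IPv4 forwarding are all assumptions made for the example. It only prints the command by default, since really running it needs root.

```shell
#!/bin/sh
# Sketch only. Prints commands by default; set DRY_RUN=0 to really run
# them (needs root). run() is a helper invented for this example.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start a shell in a new mount / network / UTS namespace set.
# $1 is a made-up router name that becomes the host name inside.
# (The dry-run print does not show the quoting of the -c argument.)
enter_router() {
    run unshare --mount --net --uts sh -c \
        "hostname $1 && ip link set lo up && sysctl -w net.ipv4.ip_forward=1 && exec sh"
}

enter_router r1
```

Namespaces made this way go away once the last process in them exits, unless something else keeps them pinned, so anything long-lived has to be started from inside.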

The namespaces are functioning as routers. I have an OvS switch
connected to the main / default (unnamed) namespace and nine (internal)
OvS ports, each one in a different namespace, thus forming a backbone
between the ten network namespaces.

Each of the nine network namespaces then has a veth pair that connects
back to the main network namespace as an L2 interface that VirtualBox
(et al) can glom onto as necessary.
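
A sketch of how such a backbone could be wired. It uses iproute2's named namespaces (ip netns) instead of unshare purely to keep the example short, and the names bb0, rN, bb-rN, vmN and vmN-r are invented for illustration. Addressing is left out entirely. Like any such script it needs root, so it only prints the commands by default.

```shell
#!/bin/sh
# Sketch only. Prints commands by default; set DRY_RUN=0 to really run
# them (needs root). run() is a helper invented for this example.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create router namespace r$1 with one backbone leg and one VM-facing leg.
wire_router() {
    n=$1
    run ip netns add "r$n"
    # Backbone leg: an internal OvS port moved into the namespace.
    run ovs-vsctl add-port bb0 "bb-r$n" -- set interface "bb-r$n" type=internal
    run ip link set "bb-r$n" netns "r$n"
    run ip netns exec "r$n" ip link set "bb-r$n" up
    # VM leg: a veth pair. vm$n stays in the main namespace as the L2
    # interface for VirtualBox (et al); vm$n-r goes into the router.
    run ip link add "vm$n" type veth peer name "vm$n-r"
    run ip link set "vm$n-r" netns "r$n"
    run ip link set "vm$n" up
    run ip netns exec "r$n" ip link set "vm$n-r" up
}

# The bridge's own local port (bb0) is the main namespace's backbone leg.
run ovs-vsctl add-br bb0
run ip link set bb0 up
for n in 1 2 3 4 5 6 7 8 9; do
    wire_router "$n"
done
```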

This way I can easily have nine completely different networks that VMs
can use. My main home network has a route to these networks via my
workstation. (I'm actually using routing protocols to distribute this.)

So the main use of the network namespaces is as a basic IP router.
There don't /need/ to be any processes running in them. I do run BIRD
in the network namespaces for simplicity's sake, but that's more
ancillary.

I don't strictly need the mount namespaces for what I'm currently doing.
That's left over from when I was running Quagga and /needed/ to alter
some mounts to run multiple instances of Quagga on the same machine.

I do like the UTS namespace so that each "router" has a different host
name when I enter it.

Maybe this helps explain /what/ I'm doing. As for /why/ I'm doing it,
well, because reasons. Maybe not even good reasons. But I'm still doing
it. ¯\_(ツ)_/¯ I'm happy to discuss this in a private thread if anyone
is really curious.

> People generally equate containers with docker but as you seem to get
> you can do a lot with namespaces without basically running completely
> independent distros.

Yep. I feel like independent distros, plus heavier weight management
daemons on top, are a LOT more than I want.

As stated, I don't really /need/ to run processes in the containers. I
do because it's easy. The only thing I /need/ is the separate IP stack
/ configuration.

> Now, I will point out that there are good reasons for keeping things
> separate - they may or may not apply to your application. If you just
> want to run a single daemon on 14 different IPs and have each of those
> daemons see the same filesystem minus /var/lib and /etc that is something
> you could certainly do with namespaces and the only resource cost would
> be the storage of the extra /var/lib and /etc directories (they could
> even use the same shared libraries in RAM, and indeed the same process
> image itself I think).

Yep.

> The only gotcha is that I'm not sure how much of it is already done, so
> you may have to roll your own. If you find generic solutions for running
> services in partially-isolated namespaces with network initialization
> taken care of for you I'd be very interested in hearing about it.

I think there are a LOT of solutions for creating and managing
containers. (I'm using the term "container" loosely here.) The thing
is that many of them are their own heavyweight entity. I have yet
to find any that integrate well with OS init scripts.

I feel like what I want to do can /almost/ be done with netifrc. Or
that netifrc could be extended to do it with what (I think is) /little/
additional work.

I don't know that network namespaces are strictly required. I've been
using them for years. That being said, the current incarnation of
Virtual Routing and Forwarding (VRF) provided by l3mdev seems to be very
promising. I expect that I could make VRF (l3mdev) do what I want to
do too. At least the part that I /need/. I'm not sure how to launch
processes associated with the VRF (l3mdev). I'm confident it's
possible, but I've not done it.
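
Going by the iproute2 documentation, "ip vrf exec" looks like the way to launch a process inside a VRF. A minimal, untested sketch, assuming a reasonably recent kernel and iproute2 (as far as I know it needs cgroup v2 and BPF support); the VRF name vrf-r1, table number 101, interface eth1 and the ping target are all made up for the example. It prints the commands by default because running them needs root.

```shell
#!/bin/sh
# Sketch only. Prints commands by default; set DRY_RUN=0 to really run
# them (needs root). run() is a helper invented for this example.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a VRF device bound to a routing table and enslave an interface.
make_vrf() {
    vrf=$1
    table=$2
    iface=$3
    run ip link add "$vrf" type vrf table "$table"
    run ip link set dev "$vrf" up
    run ip link set dev "$iface" master "$vrf"
}

# Launch a command with its sockets bound to the given VRF.
vrf_exec() {
    vrf=$1
    shift
    run ip vrf exec "$vrf" "$@"
}

make_vrf vrf-r1 101 eth1
vrf_exec vrf-r1 ping -c 1 192.0.2.1
```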

But even VRF (l3mdev) is not supported by netifrc. I feel like even the
Policy Based Routing (PBR) support is a kludge that largely consists of
(parts of) the ip / tc commands being put into the /etc/conf.d/net file.
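
To make that concrete, a PBR setup in /etc/conf.d/net looks roughly like the fragment below. This is from memory of netifrc's net.example, so treat the variable names (routes_*, rules_*) as something to verify against your installed version; the interface, the addresses (documentation ranges) and the table number are invented.

```shell
# /etc/conf.d/net (fragment, sketch only; names and numbers are made up)
config_eth1="192.0.2.1/24"
# Arguments handed to "ip route": a default route placed in table 100.
routes_eth1="default via 192.0.2.254 table 100"
# Arguments handed to "ip rule": traffic from this network uses table 100.
rules_eth1="from 192.0.2.0/24 table 100"
```

The values really are just the tail end of the corresponding ip command lines, which is what makes it feel bolted on.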

I feel like bridging / bonding / VLANs have better support than PBR
does. All of which are way better supported than VRF (l3mdev) which is
better supported than network namespaces.

Though, I'm not really surprised. All of the init scripts that I've
seen seem to be designed around the premise of a singular system and
have no knowledge that there might be other (virtual) systems. What
little I know about Docker is that even its configuration is
single-system in nature and still only applies to the instance that it's
working on. I've not seen any OS init scripts that are aware of the
fact that they might be working on other systems. I think the closest
I've seen is FreeBSD jails. But even that is separate init scripts,
which are again somewhat focused on the jail.

I need to do some thinking about /what/ /specifically/ I want to do
before I start thinking about /how/ to go about doing it.

That being said, I think it would be really nice to have various
interfaces tagged with what NetNS they belong to and use the same
net.$interface type init scripts for them.
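
Purely to illustrate the tagging idea: netifrc has no such setting today, so the netns_* variables and both helper functions below are entirely hypothetical. The sketch only shows how a per-interface tag could turn into an "ip netns exec" prefix on the commands an init script already runs. It assumes interface names that are valid as shell variable suffixes.

```shell
#!/bin/sh
# HYPOTHETICAL: netifrc has no netns_* setting; these tags are invented.
netns_eth1="r1"
netns_eth2="r2"

# Print the command prefix needed to act on an interface inside its
# tagged namespace; prints nothing for untagged interfaces, which stay
# in the main namespace.
ns_prefix() {
    eval "ns=\${netns_$1:-}"
    if [ -n "$ns" ]; then
        echo "ip netns exec $ns"
    fi
}

# Print the command that would bring the interface up.
iface_up_cmd() {
    prefix=$(ns_prefix "$1")
    echo "${prefix:+$prefix }ip link set dev $1 up"
}

iface_up_cmd eth0
iface_up_cmd eth1
```

An untagged eth0 gets the plain command and the tagged eth1 gets the same command wrapped in its namespace, so the same net.$interface logic could in principle serve both.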

Gentoo Archives: gentoo-user