Gentoo Archives: gentoo-user

From: Grant Taylor <gtaylor@×××××××××××××××××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] VRFs / Jails / Containers
Date: Sun, 03 Feb 2019 17:26:37
Message-Id: 4da02096-0c6b-7589-b4f4-1badb5b70603@spamtrap.tnetconsulting.net
In Reply to: Re: [gentoo-user] VRFs / Jails / Containers by Rich Freeman
1 On 2/3/19 5:37 AM, Rich Freeman wrote:
2 > Nothing wrong with that approach. I use systemd-nspawn to run a bunch
3 > of containers, hosted in Gentoo, and many of which run Gentoo. However,
4 > these all run systemd and I don't believe you can run nspawn without a
5 > systemd host (the guest/container can be anything). These are containers
6 > running full distros with systemd in my case, not just single-process
7 > containers, in my case. However, nspawn does support single-process
8 > containers, and that includes with veth, but nspawn WON'T initialize
9 > networking in those containers (ie DHCP/etc), leaving this up to the guest
10 > (it does provide a config file for systemd-networkd inside the guest if
11 > it is in use to autoconfigure DHCP).
12
13 ACK
14
15 That makes me think that systemd-nspawn is less of a fit for what I'm
16 wanting to do.
17
18 > I'm not exactly certain what you're trying to accomplish, but namespaces
19 > are just a kernel system call when it comes down to it (two of them I
20 > think offhand). Two util-linux programs provide direct access to them
21 > for shell scripts: unshare and nsenter. If you're just trying to run a
22 > process in a separate namespace so that it can use veth/etc then you could
23 > probably initialize that in a script run from unshare. If you don't need
24 > more isolation you could run it right from the host filesystem without
25 > a separate mount or process namespace. Or you could create a new mount
26 > namespace but only modify specific parts of it like /var/lib or whatever.
27
28 That's quite close to what I'm doing. I'm actually using unshare to
29 create a mount / network / UTS namespace (set) and then running some
30 commands in them.
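
A minimal sketch of the unshare step described above. It only builds and prints the command line, so it can be read without root. The `--mount`, `--net` and `--uts` options are from util-linux unshare(1). The helper name, the hostname "router1" and the inner command are illustrative assumptions, not details from this thread.

```python
import shlex


def unshare_command(hostname, inner_cmd):
    """Build an unshare(1) call that creates mount, network and UTS namespaces."""
    # Inside the new UTS namespace, set a distinct hostname, then exec the
    # real command so it replaces the helper shell.
    script = f"hostname {shlex.quote(hostname)} && exec {inner_cmd}"
    return ["unshare", "--mount", "--net", "--uts", "sh", "-c", script]


# Hypothetical example: a namespace set whose shell would report "router1".
cmd = unshare_command("router1", "ip link show")
print(shlex.join(cmd))
```

Running the printed command needs root (or suitable capabilities). The new network namespace starts with only a loopback interface that is down.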
31
32 The namespaces are functioning as routers. I have an OvS switch
33 connected to the main / default (unnamed) namespace and nine (internal)
34 OvS ports, each one in a different namespace. Thus forming a backbone
35 between the ten network namespaces.
36
37 Each of the nine network namespaces then has a veth pair that connects
38 back to the main network namespace as an L2 interface that VirtualBox
39 (et al) can glom onto as necessary.
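
The topology above can be sketched as a list of commands. This is a dry-run generator under stated assumptions: it uses named namespaces via `ip netns add` (the setup described here uses unshare instead), and the bridge name `bb0`, the namespace names `r1` to `r9`, and the interface names are all hypothetical.

```python
def backbone_commands(bridge, count):
    """Return command lists for `count` router namespaces on one OvS bridge."""
    cmds = [["ovs-vsctl", "--may-exist", "add-br", bridge]]
    for i in range(1, count + 1):
        ns = f"r{i}"
        bb = f"{ns}-bb"  # internal OvS port that is moved into the namespace
        inside, outside = f"{ns}-lan", f"{ns}-host"  # the two veth ends
        cmds += [
            ["ip", "netns", "add", ns],
            # Create an internal port on the backbone bridge.
            ["ovs-vsctl", "add-port", bridge, bb,
             "--", "set", "interface", bb, "type=internal"],
            ["ip", "link", "set", bb, "netns", ns],
            # veth pair: one end in the router, one left in the main namespace
            # as the L2 interface a hypervisor can attach to.
            ["ip", "link", "add", inside, "type", "veth", "peer", "name", outside],
            ["ip", "link", "set", inside, "netns", ns],
            ["ip", "link", "set", outside, "up"],
        ]
    return cmds


cmds = backbone_commands("bb0", 9)
for c in cmds:
    print(" ".join(c))
```

Addresses and link state inside each namespace are left out. Those would be further `ip netns exec <ns> ip ...` calls.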
40
41 This way I can easily have nine completely different networks that VMs
42 can use. My main home network has a route to these networks via my
43 workstation. (I'm actually using routing protocols to distribute this.)
44
45 So the main use of the network namespaces is as a basic IP router.
46 There don't /need/ to be any processes running in them. I do run BIRD
47 in the network namespaces for simplicity reasons. But that's more
48 ancillary.
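
A sketch of what "basic IP router" might amount to per namespace: forwarding is a per-namespace sysctl, and a routing daemon is optional. The namespace name and the BIRD config and control-socket paths are hypothetical. The `-c` and `-s` options are BIRD's config-file and control-socket options. The commands are only generated, not run.

```python
def router_commands(ns, bird_conf, bird_sock):
    """Return commands that turn namespace `ns` into a router and start BIRD."""
    prefix = ["ip", "netns", "exec", ns]
    return [
        # Forwarding is set per network namespace, not system wide.
        prefix + ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        prefix + ["sysctl", "-w", "net.ipv6.conf.all.forwarding=1"],
        # Optional: one BIRD instance per namespace, each with its own
        # config file and control socket (paths are assumptions).
        prefix + ["bird", "-c", bird_conf, "-s", bird_sock],
    ]


cmds = router_commands("r1", "/etc/bird/r1.conf", "/run/bird/r1.ctl")
for c in cmds:
    print(" ".join(c))
```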
49
50 I don't strictly need the mount namespaces for what I'm currently doing.
51 That's left over from when I was running Quagga and /needed/ to alter
52 some mounts to run multiple instances of Quagga on the same machine.
53
54 I do like the UTS namespace so that each "router" has a different host
55 name when I enter it.
56
57 Maybe this helps explain /what/ I'm doing. As for /why/ I'm doing it,
58 well because reasons. Maybe not even good reasons. But I'm still doing
59 it. ¯\_(ツ)_/¯ I'm happy to discuss this in a private thread if anyone
60 is really curious.
61
62 > People generally equate containers with docker but as you seem to get
63 > you can do a lot with namespaces without basically running completely
64 > independent distros.
65
66 Yep. I feel like independent distros, plus heavier weight management
67 daemons on top are a LOT more than I want.
68
69 As stated, I don't really /need/ to run processes in the containers. I
70 do because it's easy. The only thing I /need/ is the separate IP stack
71 / configuration.
72
73 > Now, I will point out that there are good reasons for keeping things
74 > separate - they may or may not apply to your application. If you just
75 > want to run a single daemon on 14 different IPs and have each of those
76 > daemons see the same filesystem minus /var/lib and /etc that is something
77 > you could certainly do with namespaces and the only resource cost would
78 > be the storage of the extra /var/lib and /etc directories (they could
79 > even use the same shared libraries in RAM, and indeed the same process
80 > image itself I think).
81
82 Yep.
83
84 > The only gotcha is that I'm not sure how much of it is already done, so
85 > you may have to roll your own. If you find generic solutions for running
86 > services in partially-isolated namespaces with network initialization
87 > taken care of for you I'd be very interested in hearing about it.
88
89 I think there are a LOT of solutions for creating and managing
90 containers. (I'm using the term "container" loosely here.) The thing
91 is that many of them are each their own heavyweight entity. I have yet
92 to find any that integrate well with OS init scripts.
93
94 I feel like what I want to do can /almost/ be done with netifrc. Or
95 that netifrc could be extended to do it with what (I think is) /little/
96 additional work.
97
98 I don't know that network namespaces are strictly required. I've been
99 using them for years. That being said, the current incarnation of
100 Virtual Routing and Forwarding (VRF) provided by l3mdev seems to be very
101 promising. I expect that I could make VRF (l3mdev) do what I wanted to
102 do too. At least the part that I /need/. I'm not sure how to launch
103 processes associated with the VRF (l3mdev). I'm confident it's
104 possible, but I've not done it.
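
For reference, a hedged sketch of the VRF (l3mdev) equivalent, including one way to launch a process bound to the VRF. iproute2 provides `ip vrf exec`, which as far as I know relies on cgroup2 and BPF support in the kernel. The VRF name, table number, member interface and daemon command below are hypothetical, and the commands are only generated, not run.

```python
def vrf_commands(vrf, table, members, daemon):
    """Return commands that create a VRF, enslave interfaces, and start a process in it."""
    cmds = [
        # A VRF device is tied to its own routing table.
        ["ip", "link", "add", vrf, "type", "vrf", "table", str(table)],
        ["ip", "link", "set", "dev", vrf, "up"],
    ]
    for dev in members:
        # Enslaving an interface moves its routes into the VRF table.
        cmds.append(["ip", "link", "set", "dev", dev, "master", vrf])
    # Run the daemon with its sockets bound to the VRF.
    cmds.append(["ip", "vrf", "exec", vrf, *daemon])
    return cmds


cmds = vrf_commands("vrf-r1", 101, ["eth1"], ["bird", "-c", "/etc/bird/r1.conf"])
for c in cmds:
    print(" ".join(c))
```

Unlike a network namespace, a VRF separates routing tables only. Interfaces, firewall rules and sysctls remain shared with the rest of the host.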
105
106 But, even VRF (l3mdev) is not supported by netifrc. I feel like even
107 the Policy Based Routing (PBR) support is a kludge and largely consists of
108 (parts of) the ip / tc commands being put into the /etc/conf.d/net file.
109
110 I feel like bridging / bonding / VLANs have better support than PBR
111 does. All of which are way better supported than VRF (l3mdev) which is
112 better supported than network namespaces.
113
114 Though, I'm not really surprised. All of the init scripts that I've
115 seen seem to be designed around the premise of a singular system and
116 have no knowledge that there might be other (virtual) systems. What
117 little I know about Docker is that even its configuration is singular
118 system in nature and still only applies to the instance that it's
119 working on. I've not seen any OS init scripts that are aware of the
120 fact that they might be working on other systems. I think the closest
121 I've seen is FreeBSD jails. But even that is separate init scripts,
122 which are again somewhat focused on the jail.
123
124 I need to do some thinking about /what/ /specifically/ I want to do
125 before I start thinking about /how/ to go about doing it.
126
127 That being said, I think it would be really nice to have various
128 interfaces tagged with what NetNS they belong to and use the same
129 net.$interface type init scripts for them.
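
To make the tagging idea concrete, here is a purely hypothetical sketch. netifrc has no such feature as far as this thread establishes. The `netns_<interface>` variable is invented for illustration, alongside the existing `config_<interface>` style of /etc/conf.d/net. The sketch reads the tags and prefixes a command with `ip netns exec` when the interface is tagged.

```python
import shlex

# Hypothetical conf.d/net fragment; netns_eth1 is an invented variable.
CONF = '''
# eth1 lives in the "router1" namespace, eth0 stays in the main one
netns_eth1="router1"
config_eth1="192.0.2.1/24"
config_eth0="dhcp"
'''


def parse_netns_tags(conf_text):
    """Collect hypothetical netns_<iface>="<namespace>" assignments."""
    tags = {}
    for token in shlex.split(conf_text, comments=True):
        key, sep, value = token.partition("=")
        if sep and key.startswith("netns_"):
            tags[key[len("netns_"):]] = value
    return tags


def wrap(iface, cmd, tags):
    """Prefix `cmd` with `ip netns exec <ns>` if `iface` is tagged."""
    ns = tags.get(iface)
    return ["ip", "netns", "exec", ns, *cmd] if ns else list(cmd)


tags = parse_netns_tags(CONF)
print(wrap("eth1", ["ip", "addr", "add", "192.0.2.1/24", "dev", "eth1"], tags))
print(wrap("eth0", ["ip", "link", "set", "eth0", "up"], tags))
```

The point of the design is that the per-interface script stays the same. Only the command prefix changes with the tag.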