Gentoo Archives: gentoo-user

From: "J. Roeleveld" <joost@××××××××.org>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: File system testing
Date: Fri, 19 Sep 2014 15:02:42
Message-Id: 2818936.5Fl7dU4F5e@andromeda
In Reply to: [gentoo-user] Re: File system testing by James
On Friday, September 19, 2014 01:41:26 PM James wrote:
> J. Roeleveld <joost <at> antarean.org> writes:
> > Out of curiosity, what do you want to simulate?
>
> Subsurface flows in porous media, AKA carbon sequestration
> by injection wells. You know, providing proof that those
> who remove hydrocarbons can actually put the CO2 back
> and significantly mitigate the effects of their ventures.

Interesting topic, but not one I can provide advice on.

> It's like this. I have been struggling with my 17-year-old "genius"
> son, who is a year away from entering medical school, with
> learning responsibility. So I got him a hyperactive, highly
> intelligent (Doberman-mix) puppy to nurture, raise, train, love
> and be responsible for. It's one genius pup teaching another
> pup about being responsible.

Overactive kids, always fun.
I try to keep mine busy without computers and TVs for now. (She's going to
be 3 in November.)

> So goes the earl_bidness.......imho.
>
> > > Many folks are recommending to skip Hadoop/HDFS altogether
> >
> > I agree, Hadoop/HDFS is for data analysis. Like building a profile
> > about people based on the information companies like Facebook,
> > Google, NSA, Walmart, governments, banks,.... collect about their
> > customers/users/citizens/slaves/....
> >
> > > and go straight to mesos/spark. RDD (in-memory) cluster
> > > calculations are at the heart of my needs. The opposite end of the
> > > spectrum, loads of small files and small apps, I dunno about, but
> > > I'm all ears.
> > > In the end, my (3) node scientific cluster will morph and support
> > > the typical myriad of networked applications, but I can take
> > > a few years to figure that out, or just copy what smart guys like
> > > you and Joost do.....
> >
> >
> > Nope, I'm simply following what you do and providing suggestions where
> > I can.
> > Most of the clusters and distributed computing stuff I do is based on
> > adding machines to distribute the load. But the mechanisms for these
> > are implemented in the applications I work with, not in what I design
> > underneath.
> > The filesystems I am interested in are different from the ones you want.
49 >
50 > Maybe. I do not know what I want yet. My vision is very light weight
51 > workstations running lxqt (small memory footprint) or such, and a bad_arse
52 > cluster for the heavy lifting running on whatever heterogenous resoruces I
53 > have. From what I've read, the cluster and the file systems are all
54 > redundant that the cluster level (mesos/spark anyway) regardless of one any
55 > give processor/system is doing. All of Alans fantasies (needs) can be
56 > realized once the cluster stuff is master. (chronos, ansible etc etc).

Alan = your son, or someone else?
I would, from the workstation point of view, keep the cluster as a single
entity, to keep things simple.
A cluster FS for workstation/desktop use is generally not suitable for a
High Performance Cluster (HPC), and vice versa.

> > I need to provide access to software installation files to a VM server,
> > and access to documentation which is created by the users. The
> > VM server is physically next to what I already mentioned as server A.
> > Access to the VM from the remote site will be using remote desktop
> > connections. But to allow faster and easier access to the
> > documentation, I need a server B at the remote site which functions as
> > described. AFS might be suitable, but I need to be able to layer Samba
> > on top of that to allow seamless operation.
> > I don't want the laptops to have their own cache and then have to
> > figure out how to resolve multiple different changes to documents
> > containing layouts (MS Word and OpenDocument files).
>
> Ok, so your customers (hyperactive problem users) interface with your
> cluster to do their work. When finished, you write things out to other
> servers with all of the VM servers. Lots of really cool tools are emerging
> in the cluster space.

Actually, slightly different scenario.
Most work is done at customers' systems. Occasionally we need to test
software versions prior to implementing them at customers. For that, we use
VMs.

The VM-server we have is currently sufficient for this. When it isn't, we'll
need to add a 2nd VM-server.

On the NAS, we store:
- Documentation about customers + howto documents on how best to install
the software.
- Installation files downloaded from vendors (we also deal with older
versions that are no longer available, so we need to keep our own collection
to handle that).

As we are looking into also working from a different location, we need:
- Access to the VM-server (easy, using VPN and remote desktops)
- Access to the files (I prefer to have a local 'cache' at the remote
location)

It's the access-to-files part where I need to have some sort of
"distributed" filesystem.
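
If AFS ends up being the base layer, I imagine the Samba part on server B
would look roughly like this (share name and path are made up, just a
sketch):

  # smb.conf fragment on the remote server, exporting an AFS-backed path
  [docs]
     path = /afs/example.local/docs
     read only = no
     # oplocks can hand out stale data when the same files are also
     # reachable directly via AFS from other machines
     oplocks = no
     level2 oplocks = no

The tricky part stays the same though: conflicting changes to the same
document still have to be resolved somewhere.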

> I think these folks have mesos + spark + samba + nfs all in one box. [1]
> [1]
> http://www.quantaqct.com/en/01_product/02_detail.php?mid=29&sid=162&id=163&q
> s=102

Had a quick look: these use MS Windows Storage 2012, which is only failover
on the storage side. I don't see anything related to what we are working
with.

> Build rather than purchase? We have to figure out what you and Alan need
> on a cluster, because it is what most folks need/want. It's the
> admin_advantage part of a cluster. (There are also the Big Science (me)
> and web-centric needs. Right now they are related projects, but things
> will coalesce, imho.) There is even "Spark SQL" for postgres admins [2].
>
>
> [2] https://spark.apache.org/sql/

Hmm.... that is interesting.

> > > > We use Lustre for our high performance general storage. I don't
> > > > have any numbers, but I'm pretty sure it is *really* fast (10Gbit/s
> > > > over IB sounds familiar, but don't quote me on that).
> > >
> > > At UMich, you guys should test the FhGFS/btrfs combo. The folks
> > > at UCI swear by it, although they are only publishing a wee bit.
> > > (you know, water cooler gossip)...... Surely the Wolverines do not
> > > want those Californians getting up on them?
> > >
> > > Are you guys planning a mesos/spark test?
> > >
> > > > > Personally, I would read up on these and see how they work. Then,
> > > > > based on that, decide if they are likely to assist in the specific
> > > > > situation you are interested in.
> > >
> > > It's a ton of reading. It's not apples-to-apple_cider type of reading.
> > > My head hurts.....
> >
> > Take a walk outside. Fresh air should help you with the headaches :P
>
> Basketball, Boobs and Bourbon used to work quite well. Now it's mostly
> basketball, but I'm working on someone "very cute"......

Cloning? Genetics?
Now that, I am interested in. I could do with a couple of clones. ;)

Btw, there are women who know more about some aspects of IT than you and me
put together. Some of those even manage to look great as well ;)

> > > I'm leaning towards DFS/LFS:
> > > (2) Lustre/btrfs and FhGFS/btrfs
> >
> > I have insufficient knowledge to advise on either of these.
> > One question: why BTRFS instead of ZFS?
>
> I think btrfs has tremendous potential. I tried ZFS a few times,
> but the installs are not part of Gentoo, so they got borked.
> uEFI, grub with UUIDs, etc. etc. were also in the mix. That was almost
> a year ago.

I did a quick test with Gentoo and ZFS. With the current documentation and
ebuilds, it actually is quite simple to get it working, provided you don't
intend to use it for the root filesystem.
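
For what it's worth, the steps were roughly along these lines (the pool name
and disks are just examples, and ZFS-on-root is left out on purpose):

  emerge sys-fs/zfs                            # pulls in the kernel module bits
  zpool create tank mirror /dev/sdb /dev/sdc   # "tank" and the disks are made up
  zfs create tank/storage
  zfs set compression=lz4 tank/storage

Plus adding the ZFS services to the boot runlevel so the pool gets imported
and mounted at boot.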

> For whatever reason, the clustering folks I have
> read and communicated with are using ext4, xfs and btrfs. Prolly
> mostly because those are mostly used in their (systemd-inspired)
> distros....?

I think mostly because they are included natively in the kernel, and when
dealing with HPC, you don't want to use a filesystem that is known to eat
memory for breakfast.
When I switch the NAS over to ZFS, I will be using a dedicated machine with
16GB of memory. Probably going to increase that to 32GB not too long after.
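
If memory does become a problem, my understanding is that the ARC can be
capped with the zfs_arc_max module parameter. The value below is just an
example (8GB):

  # /etc/modprobe.d/zfs.conf
  options zfs zfs_arc_max=8589934592

That way the filesystem cache can't crowd out everything else.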

> > My current understanding is:
> > - ZFS is production ready, but due to licensing issues, not included
> >   in the kernel
> > - BTRFS is included, but not yet production ready with all planned
> >   features.
>
> Yep, the license issue with ZFS is a real killer for me. Besides,
> as an old state-machine, C hack, anything with B-tree is fabulous.
> Prejudices? Yep, but here, I'm sticking with my gut. Multi-port
> ram can do marvelous things with B-tree data structures. The
> rest will become available/stable. Simply, I just trust btrfs, in
> my gut.

I think both are stable and usable, within the limitations I currently see,
as confirmed by Rich.

> > For me, Raid6-like functionality is an absolute requirement, and the
> > latest I know is that that isn't implemented in BTRFS yet. Does anyone
> > know when that will be implemented and reliable? E.g. what time-frame
> > are we talking about?
>
> Now we are "communicating"! We have different visions. I want cheap,
> mirrored HD on small numbers of processors (less than 16 for now).
> I want max ram of the highest performance possible. I want my redundancy
> in my cluster, with my cluster software deciding when/where/how-often
> to write out to HD. If the max_ram is not enough, then SSD will
> be between the ram and HD. Also, know this. The GPU will be assimilated
> into the processors, just like the FPUs were, some decade ago. Remember
> the i386 and the i387 math coprocessor chip? The good folks at OpenGL,
> gcc (GNU) and others will soon (eventually?) give us compilers to
> automagically use the GPU (and all of that blazingly fast ram therein),
> as slave to Alan's admin authority (some bullshit like that).

Yep, and for HPC and VMs, you want to keep as much memory available for
what matters.
For a file storage cluster, memory is there to assist the serving of files
(as that is what matters there).

> So, my "Epiphany" is this. The bitches at systemd are to be renamed
> "StripperD", as they will manage the boot cycle (how fast you can
> go down (save power) and come back up (online)). The Cluster
> will rule off of your hardware, like a "Sheikh", "the ring that rules
> them all", and be the driver of the garbage collection processes.

Aargh, garbage collectors...
They tend to spring into action when least convenient...
Try to be able to control when they start cleaning.
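
With btrfs, for example, I would schedule the scrub runs myself instead of
leaving it to chance. Something along these lines in a crontab (mountpoint
and timing are just examples):

  # scrub /data every Sunday at 03:00, when nobody is working
  0 3 * * 0  /sbin/btrfs scrub start -B /data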

> The cluster
> will be like the "knights of the round table"; each node helping, and
> standing in for those other nodes (nobles) that stumble, always with
> extra resources, triple/quad redundancy and solving problems
> before that kernel-based "piece of" has a chance to do anything
> other than "go down" or "come up" online.

Interesting, I need to parse this slowly over the weekend.

> We shall see just who the master is of my hardware!
> The saddest thing for me is that when I extolled about billion
> dollar companies corrupting the kernel development process, I did
> not even have those {hat wearing losers} in mind. They are
> irrelevant. I was thinking about those semiconductor companies.
> You know, the ones that accept billions of dollars for the NSA
> and private spooks to embed hardware inside of hardware. The ones
> that can use "white noise" as a communications channel. The ones
> that can tap a fiber optic cable, with penetration. Those are
> the ones to focus on. Not a bunch of "silly boyz"......

For that, you need to keep the important sensitive data off the grid.

> My new K_main{} has highlighted a path to neuter systemd.
> But I do like how StripperD moves up and down, very quickly.

I don't care about boot times or shutdown times. If I did, I'd invest in
high-speed ram disks and SSDs.
Having 50 of the fastest SSDs in a Raid-0 config will give more data than
the rest of the system can handle ;)

If you then use that for VMs which can also keep the entire virtual disk in
memory, you really are flying with performance. That's why in-memory
systems are becoming popular again.

> Cool huh?
> It's PARTY TIME!

Parties are nice...

--
Joost