J. Roeleveld <joost <at> antarean.org> writes:

> Out of curiosity, what do you want to simulate?

Subsurface flows in porous media, AKA carbon sequestration
via injection wells. You know, providing proof that those
who remove hydrocarbons can actually put the CO2 back
and significantly mitigate the effects of their ventures.

It's like this. I have been struggling to teach my 17 year old
"genius" son, who is a year away from entering medical school,
about responsibility. So I got him a hyperactive, highly
intelligent (doberman-mix) puppy to nurture, raise, train, love
and be responsible for. It's one genius pup teaching another
pup about being responsible.

So goes the earl_bidness.......imho.

> > Many folks are recommending to skip Hadoop/HDFS all together

> I agree, Hadoop/HDFS is for data analysis. Like building a profile
> about people based on the information companies like Facebook,
> Google, NSA, Walmart, Governments, Banks,.... collect about their
> customers/users/citizens/slaves/....

> > and go straight to mesos/spark. RDD (in-memory) cluster
> > calculations are at the heart of my needs. The opposite end of the
> > spectrum, loads of small files and small apps, I dunno about; but I'm all
> > ears.
> > In the end, my (3) node scientific cluster will morph and support
> > the typical myriad of networked applications, but I can take
> > a few years to figure that out, or just copy what smart guys like
> > you and joost do.....
>
> Nope, I'm simply following what you do and provide suggestions where I can.
> Most of the clusters and distributed computing stuff I do is based on
> adding machines to distribute the load. But the mechanisms for these are
> implemented in the applications I work with, not what I design underneath.

> The filesystems I am interested in are different to the ones you want.

Maybe. I do not know what I want yet. My vision is very lightweight
workstations running lxqt (small memory footprint) or such, and a bad_arse
cluster for the heavy lifting running on whatever heterogeneous resources I
have. From what I've read, the cluster and the file systems are all
redundant at the cluster level (mesos/spark anyway) regardless of what any
given processor/system is doing. All of Alan's fantasies (needs) can be
realized once the cluster stuff is mastered. (chronos, ansible etc etc).

> I need to provide access to software installation files to a VM server
> and access to documentation which is created by the users. The
> VM server is physically next to what I already mentioned as server A.
> Access to the VM from the remote site will be using remote desktop
> connections. But to allow faster and easier access to the
> documentation, I need a server B at the remote site which functions as
> described. AFS might be suitable, but I need to be able to layer Samba
> on top of that to allow a seamless operation.
> I don't want the laptops to have their own cache and then having to
> figure out how to solve the multiple different changes to documents
> containing layouts. (MS Word and OpenDocument files).

Ok, so your customers (hyperactive problem users) interface to your cluster
to do their work. When finished, you write things out to other servers
along with all of the VM servers. Lots of really cool tools are emerging
in the cluster space.

I think these folks have mesos + spark + samba + nfs all in one box. [1]
Build rather than purchase? We have to figure out what you and Alan need on
a cluster, because it is what most folks need/want. It's the admin_advantage
part of clusters. (There are also the Big Science (me) and Web centric
needs. Right now they are related projects, but things will coalesce, imho.)
There is even "Spark SQL" for postgres admins [2].

[1]
http://www.quantaqct.com/en/01_product/02_detail.php?mid=29&sid=162&id=163&qs=102

[2] https://spark.apache.org/sql/

> > > We use Lustre for our high performance general storage. I don't
> > > have any numbers, but I'm pretty sure it is *really* fast (10Gbit/s
> > > over IB sounds familiar, but don't quote me on that).
> >
> > At UMich, you guys should test the FhGFS/btrfs combo. The folks
> > at UCI swear by it, although they are only publishing a wee bit.
> > (you know, water cooler gossip)...... Surely the Wolverines do not
> > want those Californians getting up on them?

> > Are you guys planning a mesos/spark test?

> > > > Personally, I would read up on these and see how they work. Then,
> > > > based on that, decide if they are likely to assist in the specific
> > > > situation you are interested in.

> > It's a ton of reading. It's not apples-to-apple_cider type of reading.
> > My head hurts.....

> Take a walk outside. Clear air should help you with the headaches :P

Basketball, Boobs and Bourbon used to work quite well. Now it's mostly
basketball, but I'm working on someone "very cute"......

> > I'm leaning to DFS/LFS
> > (2) Lustre/btrfs and FhGFS/btrfs

> I have insufficient knowledge to advise on either of these.
> One question, why BTRFS instead of ZFS?

I think btrfs has tremendous potential. I tried ZFS a few times,
but the installs are not part of gentoo, so they got borked;
uEFI, grub, uuids, etc etc were also in the mix. That was almost
a year ago. For whatever reason, the clustering folks I have
read and communicated with are using ext4, xfs and btrfs. Prolly
mostly because those are what's mostly used in their (systemd
inspired) distros....?

> My current understanding is: - ZFS is production ready, but due to
> licensing issues, not included in the kernel - BTRFS is included, but
> not yet production ready with all planned features.

Yep, the license issue with ZFS is a real killer for me. Besides,
as an old state-machine C hack, anything with a B-tree is fabulous.
Prejudices? Yep, but here I'm sticking with my gut. Multi-port
ram can do marvelous things with B-tree data structures. The
rest will become available/stable. Simply put, I just trust btrfs, in
my gut.

> For me, Raid6-like functionality is an absolute requirement and latest I
> know is that that isn't implemented in BTRFS yet. Does anyone know when
> that will be implemented and reliable? Eg. what time-frame are we
> talking about?

Now we are "communicating"! We have different visions. I want cheap,
mirrored HDs on a small number of processors (less than 16 for now).
I want max ram of the highest performance possible. I want my redundancy
in my cluster, with my cluster software deciding when/where/how-often
to write out to HD. If the max_ram is not enough, then SSD will
sit between the ram and HD. Also, know this: the GPU will be assimilated
into the processors, just like the FPUs were, some decades ago. Remember
the i386 and the i387 math coprocessor chip? The good folks at opengl,
gcc (GNU) and others will soon (eventually?) give us compilers to
automagically use the gpu (and all of that blazingly fast ram therein),
as slave to Alan's admin authority (some bullship like that).

So, my "Epiphany" is this. The bitches at systemd are to be renamed
"StripperD", as they will manage the boot cycle: how fast you can
go down (save power) and come back up (online). The Cluster
will rule over your hardware; like "the ring that rules them all",
it will be the driver of the garbage collection processes. The cluster
will be like the "knights of the round table": each node helping, and
standing in for those other nodes (nobles) that stumble, always with
extra resources, triple/quad redundancy, and solving problems
before that kernel-based "piece of" has a chance to do anything
other than "go down" or "come up" online.

We shall see just who the master is of my hardware!
The saddest thing for me is that when I extolled about billion
dollar companies corrupting the kernel development process, I did
not even have those {hat wearing losers} in mind. They are
irrelevant. I was thinking about those semiconductor companies.
You know, the ones that accept billions of dollars from the NSA
and private spooks to embed hardware inside of hardware. The ones
that can use "white noise" as a communications channel. The ones
that can tap a fiber optic cable, with penetration. Those are
the ones to focus on. Not a bunch of "silly boyz"......

My new K_main{} has highlighted a path to neuter systemd.
But I do like how StripperD moves up and down, very quickly.

Cool huh?
It's PARTY TIME!

> Joost
James