J. Roeleveld <joost <at> antarean.org> writes:

> Out of curiosity, what do you want to simulate?

Subsurface flows in porous media, AKA carbon sequestration
via injection wells. You know, providing proof that those
who remove hydrocarbons can actually put the CO2 back
and significantly mitigate the effects of their ventures.

It's like this. I have been struggling to teach my 17 year old
"genius" son, who is a year away from entering medical school,
about responsibility. So I got him a hyperactive, highly
intelligent (doberman-mix) puppy to nurture, raise, train, love
and be responsible for. It's one genius pup teaching another
pup about being responsible.

So goes the earl_bidness.......imho.

> > Many folks are recommending to skip Hadoop/HDFS all together

> I agree, Hadoop/HDFS is for data analysis. Like building a profile
> about people based on the information companies like Facebook,
> Google, NSA, Walmart, Governments, Banks,.... collect about their
> customers/users/citizens/slaves/....

> > and go straight to mesos/spark. RDD (in-memory) cluster
> > calculations are at the heart of my needs. The opposite end of the
> > spectrum, loads of small files and small apps, I dunno about; but I'm all
> > ears.
> > In the end, my (3) node scientific cluster will morph and support
> > the typical myriad of networked applications, but I can take
> > a few years to figure that out, or just copy what smart guys like
> > you and joost do.....
>
> Nope, I'm simply following what you do and provide suggestions where I can.
> Most of the clusters and distributed computing stuff I do is based on
> adding machines to distribute the load. But the mechanisms for these are
> implemented in the applications I work with, not what I design underneath.

> The filesystems I am interested in are different to the ones you want.

Maybe. I do not know what I want yet. My vision is very lightweight
workstations running lxqt (small memory footprint) or such, and a bad_arse
cluster for the heavy lifting running on whatever heterogeneous resources I
have. From what I've read, the cluster and the file systems are all
redundant at the cluster level (mesos/spark anyway) regardless of what any
given processor/system is doing. All of Alan's fantasies (needs) can be
realized once the cluster stuff is mastered. (chronos, ansible etc etc).

> I need to provide access to software installation files to a VM server
> and access to documentation which is created by the users. The
> VM server is physically next to what I already mentioned as server A.
> Access to the VM from the remote site will be using remote desktop
> connections. But to allow faster and easier access to the
> documentation, I need a server B at the remote site which functions as
> described. AFS might be suitable, but I need to be able to layer Samba
> on top of that to allow a seamless operation.
> I don't want the laptops to have their own cache and then having to
> figure out how to solve the multiple different changes to documents
> containing layouts. (MS Word and OpenDocument files).

Ok, so your customers (hyperactive problem users) interface to your cluster
to do their work. When finished, you write things out to other servers
along with all of the VM servers. Lots of really cool tools are emerging
in the cluster space.

I think these folks have mesos + spark + samba + nfs all in one box. [1]
Build rather than purchase? We have to figure out what you and Alan need on
a cluster, because it is what most folks need/want. It's the admin_advantage
part of clusters. (There are also the Big Science (me) and Web centric
needs. Right now they are related projects, but things will coalesce, imho.)
There is even "Spark SQL" for postgres admins [2].

[1]
http://www.quantaqct.com/en/01_product/02_detail.php?mid=29&sid=162&id=163&qs=102

[2] https://spark.apache.org/sql/

> > > We use Lustre for our high performance general storage. I don't
> > > have any numbers, but I'm pretty sure it is *really* fast (10Gbit/s
> > > over IB sounds familiar, but don't quote me on that).
> >
> > At UMich, you guys should test the FhGFS/btrfs combo. The folks
> > at UCI swear by it, although they are only publishing a wee bit.
> > (you know, water cooler gossip)...... Surely the Wolverines do not
> > want those Californians getting up on them?

> > Are you guys planning a mesos/spark test?

> > > > Personally, I would read up on these and see how they work. Then,
> > > > based on that, decide if they are likely to assist in the specific
> > > > situation you are interested in.

> > It's a ton of reading. It's not apples-to-apple_cider type of reading.
> > My head hurts.....

> Take a walk outside. Clear air should help you with the headaches :P

Basketball, Boobs and Bourbon used to work quite well. Now it's mostly
basketball, but I'm working on someone "very cute"......

> > I'm leaning to DFS/LFS
> > (2) Lustre/btrfs and FhGFS/btrfs

> I have insufficient knowledge to advise on either of these.
> One question, why BTRFS instead of ZFS?

I think btrfs has tremendous potential. I tried ZFS a few times,
but the installs are not part of gentoo, so they got borked;
uEFI, grub, uuids, etc etc were also in the mix. That was almost
a year ago. For whatever reason, the clustering folks I have
read and communicated with are using ext4, xfs and btrfs. Prolly
mostly because those are what's mostly used in their (systemd
inspired) distros....?

> My current understanding is: - ZFS is production ready, but due to
> licensing issues, not included in the kernel - BTRFS is included, but
> not yet production ready with all planned features.

Yep, the license issue with ZFS is a real killer for me. Besides,
as an old state-machine C hack, anything with a B-tree is fabulous.
Prejudices? Yep, but here I'm sticking with my gut. Multi-port
ram can do marvelous things with B-tree data structures. The
rest will become available/stable. Simply put, I just trust btrfs, in
my gut.

> For me, Raid6-like functionality is an absolute requirement and latest I
> know is that that isn't implemented in BTRFS yet. Does anyone know when
> that will be implemented and reliable? Eg. what time-frame are we
> talking about?

Now we are "communicating"! We have different visions. I want cheap,
mirrored HDs on a small number of processors (less than 16 for now).
I want max ram of the highest performance possible. I want my redundancy
in my cluster, with my cluster software deciding when/where/how-often
to write out to HD. If the max_ram is not enough, then SSD will
sit between the ram and HD. Also, know this: the GPU will be assimilated
into the processors, just like the FPUs were, some decades ago. Remember
the i386 and the i387 math coprocessor chip? The good folks at opengl,
gcc (GNU) and others will soon (eventually?) give us compilers to
automagically use the gpu (and all of that blazingly fast ram therein),
as slave to Alan's admin authority (some bullship like that).

So, my "Epiphany" is this. The bitches at systemd are to be renamed
"StripperD", as they will manage the boot cycle: how fast you can
go down (save power) and come back up (online). The Cluster
will rule over your hardware; like "the ring that rules them all",
it will be the driver of the garbage collection processes. The cluster
will be like the "knights of the round table": each node helping, and
standing in for those other nodes (nobles) that stumble, always with
extra resources, triple/quad redundancy, and solving problems
before that kernel-based "piece of" has a chance to do anything
other than "go down" or "come up" online.

We shall see just who the master is of my hardware!
The saddest thing for me is that when I extolled about billion
dollar companies corrupting the kernel development process, I did
not even have those {hat wearing losers} in mind. They are
irrelevant. I was thinking about those semiconductor companies.
You know, the ones that accept billions of dollars from the NSA
and private spooks to embed hardware inside of hardware. The ones
that can use "white noise" as a communications channel. The ones
that can tap a fiber optic cable, with penetration. Those are
the ones to focus on. Not a bunch of "silly boyz"......

My new K_main{} has highlighted a path to neuter systemd.
But I do like how StripperD moves up and down, very quickly.

Cool huh?
It's PARTY TIME!

> Joost
James