Gentoo Archives: gentoo-user

From: "J. Roeleveld" <joost@××××××××.org>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: File system testing
Date: Thu, 18 Sep 2014 08:24:29
Message-Id: 2801316.luTLpc0QeJ@andromeda
In Reply to: [gentoo-user] Re: File system testing by James
1 On Wednesday, September 17, 2014 08:56:28 PM James wrote:
2 > Alec Ten Harmsel <alec <at> alectenharmsel.com> writes:
3 > > As far as HDFS goes, I would only set that up if you will use it for
4 > > Hadoop or related tools. It's highly specific, and the performance is
5 > > not good unless you're doing a massively parallel read (what it was
6 > > designed for). I can elaborate why if anyone is actually interested.
7 >
8 > Actually, from my research and my goal (one really big scientific
9 > simulation
10 > running constantly).
11
12 Out of curiosity, what do you want to simulate?
13
14 > Many folks are recommending to skip Hadoop/HDFS
15 > altogether
16
17 I agree, Hadoop/HDFS is for data analysis, like building a profile about
18 people based on the information that companies like Facebook, Google, NSA,
19 Walmart, governments, banks, ... collect about their
20 customers/users/citizens/slaves/....
21
22 > and go straight to mesos/spark. RDD (in-memory) cluster
23 > calculations are at the heart of my needs. The opposite end of the
24 > spectrum, loads of small files and small apps, I dunno about, but I'm all
25 > ears.
26 > In the end, my (3) node scientific cluster will morph and support
27 > the typical myriad of networked applications, but I can take
28 > a few years to figure that out, or just copy what smart guys like
29 > you and joost do.....
30
31 Nope, I'm simply following what you do and providing suggestions where I
32 can.
33 Most of the clusters and distributed computing stuff I do is based on
34 adding machines to distribute the load. But the mechanisms for these are
35 implemented in the applications I work with, not in what I design underneath.
36
37 The filesystems I am interested in are different to the ones you want.
38 I need to provide access to software installation files to a VM server and
39 access to documentation which is created by the users.
40 The VM server is physically next to what I already mentioned as server A.
41 Access to the VM from the remote site will be using remote desktop
42 connections.
43 But to allow faster and easier access to the documentation, I need a
44 server B at the remote site which functions as described.
45 AFS might be suitable, but I need to be able to layer Samba on top of that
46 to allow a seamless operation.
47 I don't want the laptops to have their own cache and then have to figure
48 out how to reconcile conflicting changes to documents containing
49 layouts. (MS Word and OpenDocument files)
50
51 > > We use Lustre for our high performance general storage. I don't have
52 > > any
53 > > numbers, but I'm pretty sure it is *really* fast (10Gbit/s over IB
54 > > sounds familiar, but don't quote me on that).
55 >
56 > At UMich, you guys should test the FhGFS/btrfs combo. The folks
57 > at UCI swear by it, although they are only publishing a wee bit.
58 > (you know, water cooler gossip)...... Surely the Wolverines do not
59 > want those Californians getting one up on them?
60 >
61 > Are you guys planning a mesos/spark test?
62 >
63 > > > Personally, I would read up on these and see how they work. Then,
64 > > > based on that, decide if they are likely to assist in the specific
65 > > > situation you are interested in.
66 >
67 > It's a ton of reading. It's not apples-to-apple_cider type of reading.
68 > My head hurts.....
69
70 Take a walk outside. Fresh air should help you with the headaches :P
71
72 > I'm leaning to DFS/LFS
73 >
74 > (2) Lustre/btrfs and FhGFS/btrfs
75 >
76 > Thoughts/comments?
77
78 I have insufficient knowledge to advise on either of these.
79 One question, why BTRFS instead of ZFS?
80
81 My current understanding is:
82 - ZFS is production ready, but due to licensing issues, not included in the
83 kernel
84 - BTRFS is included, but not yet production ready with all planned features
85
86 For me, Raid6-like functionality is an absolute requirement, and the last I
87 heard, that isn't implemented in BTRFS yet. Does anyone know when
88 that will be implemented and reliable? E.g., what time frame are we talking
89 about?
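
As a rough illustration of what "Raid6-like" costs in capacity, here is a minimal Python sketch. It assumes equal-sized disks, two disks' worth of parity, and the conventional RAID6 minimum of four disks. The function name and the example numbers are made up for illustration and do not come from any ZFS or BTRFS tool; filesystem metadata overhead is ignored.

```python
def double_parity_layout(num_disks: int, disk_size_tb: float) -> dict:
    """Estimate capacity and fault tolerance of a RAID6-like array.

    Assumptions (not from the original mail): all disks have the same
    size, two disks' worth of space goes to parity, and at least four
    disks are required.
    """
    if num_disks < 4:
        raise ValueError("a RAID6-like layout is assumed to need at least 4 disks")
    parity_disks = 2  # double parity: any two disks may fail
    return {
        "raw_tb": num_disks * disk_size_tb,
        "usable_tb": (num_disks - parity_disks) * disk_size_tb,
        "tolerated_failures": parity_disks,
    }


if __name__ == "__main__":
    # Hypothetical example: six disks of 4 TB each.
    print(double_parity_layout(6, 4.0))
```

The point is only that double parity gives up two disks of space in exchange for surviving any two disk failures, whichever filesystem ends up providing it.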
90
91 --
92 Joost
