Gentoo Archives: gentoo-user

From: "J. Roeleveld" <joost@××××××××.org>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: File system testing
Date: Wed, 17 Sep 2014 19:34:02
Message-Id: 15339117.pAj2kdbPAt@andromeda
In Reply to: [gentoo-user] Re: File system testing by James
On Wednesday, September 17, 2014 03:55:56 PM James wrote:
> J. Roeleveld <joost <at> antarean.org> writes:
> > > Distributed File Systems (DFS):
> >
> > > Local (Device) File Systems LFS:
> > Is my understanding correct that the top list all require one of
> > the bottom list?
> > Eg. the "clustering" FSs only ensure the files on the LFSs are
> > duplicated/spread over the various nodes?
> >
> > I would normally expect the clustering FS to be either the full layer
> > or a clustered block-device where an FS can be placed on top.
>
> I have not performed these installations yet. My research indicates
> that first you put the local FS on the drive, just like any installation
> of Linux. Then you put the distributed FS on top of this. Some DFSs might
> not require an LFS, but FhGFS does, and so does HDFS. I will not actually
> be able to accurately answer your questions until I start to build
> up the 3-system cluster (a week or 2 away is my best guess).

Playing around with clusters is on my list, but due to other activities having
a higher priority, I haven't had much time yet.

> > Otherwise it seems more like a network filesystem with caching
> > options (See AFS).
>
> OK, I'll add AFS. You may be correct on this one, or AFS might be both.

Personally, I would read up on these and see how they work. Then, based
on that, decide if they are likely to assist in the specific situation you are
interested in.
AFS, NFS, CIFS,... can be used for clusters, but, apart from NFS, I wouldn't
expect much performance out of them.
If you need it to be fault-tolerant and not overly rely on a single point of
failure, I wouldn't be using any of these. Only AFS, from my original
investigation, showed some fault-tolerance, but it needed too many
resources (disk-space) on the clients.

> > I am also interested in these filesystems, but for a slightly different
>
> > scenario:
> Ok, so as the "test-dummy-crash-victim" I'd be honored to have you,
> Alan, Neil, Mic etc etc back-seat-drive on this adventure! (The more
> I read the more it's time for bourbon, bash, and a bit of cursing
> to get started...)

Good luck, and even though I'd love to join in with the testing, I simply do
not have the time to keep up. I would probably just slow you down.

> > - 2 servers in remote locations (different offices)
> > - 1 of these has all the files stored (server A) at the main office
> > - The other (server B - remote office) needs to "offer" all files
> > from server A. When server B needs to supply a file, it needs to
> > check if the local copy is still the "valid" version.
> > If yes, supply the local copy, otherwise download
> > from server A. When a file is changed, server A needs to be updated.
> > While server B is sharing a file, the file needs to be locked on server A,
> > preventing simultaneous updates.
>
> OOch, file locking (precious tells me that is always tricky).

I need it to be locked on server A while server B has a proper write-lock to
avoid 2 modifications competing with each other.
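
A minimal sketch of that kind of exclusive write-lock, assuming both servers
can reach the master copy through a shared (e.g. NFS) mount; the function
name and paths are made up for illustration:

```python
import fcntl
import os

def edit_with_lock(path, new_data):
    """Replace the file's contents while holding an exclusive advisory lock."""
    # Open without truncating, so the lock is taken BEFORE any change is made.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until no other writer holds it
        os.ftruncate(fd, 0)              # safe to modify only once locked
        os.write(fd, new_data.encode())
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Note that flock() is advisory: it only works if every writer on both servers
goes through the same locking discipline.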

> (psst, systemd is causing fits for the clustering geniuses;
> some are espousing a variety of cgroup gymnastics for phantom kills)

phantom kills?

> Spark is fault tolerant, regardless of node/memory/drive failures
> above the fault tolerance that a file system configuration may support.
> In fact, files lost can be 'regenerated', but it is computationally
> expensive.

Too much for me.

> You have to get your file system(s) set up. Then install
> mesos-0.20.0 and then spark. I have mesos mostly ready. I should
> have spark in alpha-beta this weekend. I'm fairly clueless on the
> DFS/LFS issue, so a DFS that needs no LFS might be a good first choice
> for testing the (3) system cluster.

That, or a 4th node acting like a NAS, sharing the filesystem over NFS.
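
For that NAS option, the export on the 4th node would be a few lines of
/etc/exports; hostnames and paths here are placeholders, not from the thread:

```
# /etc/exports on the NAS node: export the shared area to the 3 cluster
# nodes, read-write, synchronous, with root squashed (the default).
/srv/cluster  node1(rw,sync,no_subtree_check)
/srv/cluster  node2(rw,sync,no_subtree_check)
/srv/cluster  node3(rw,sync,no_subtree_check)
```

followed by `exportfs -ra` on the NAS and something like
`mount -t nfs nas:/srv/cluster /mnt/cluster` on each node.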

> > I prefer not to supply the same amount of storage at server B as
> > server A has. The remote location generally only needs access to 5% of
> > the total amount of files stored on server A. But not always the same 5%.
> > Does anyone know of a filesystem that can handle this?
>
> So in clustering, from what I have read, there are all kinds of files
> passed around between the nodes and the master(s). Many are critical
> files not part of the application or scientific calculations.
> So in time, I think in a clustering environment, all you seek is
> very possible, but it's a hunch, gut feeling, not fact. I'd put
> raid mirrors underneath that system, if it makes sense, for now,
> or just dd the stuff with a script or something kludgy (Alan is the
> king of kludge....)

Hmm... mirroring between servers. Always an option, except it will not work
for me in this case:
1) The remote location will have a domestic ADSL line. I'll be lucky if it
has a 500kbps uplink.
2) Server A currently has around 7TB of data that also needs to
be available on the remote site.
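
A quick back-of-the-envelope check, using the figures from this post, shows
why full mirroring over that line is out of the question:

```python
def transfer_days(size_bytes, link_bps):
    """Days needed to push size_bytes over a link running flat-out at link_bps."""
    return size_bytes * 8 / link_bps / 86400

# ~7 TB over a 500 kbps uplink: roughly 1300 days, i.e. over 3.5 years.
days = transfer_days(7e12, 500e3)
```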

With an 8mbps downlink, waiting for a file to be copied to the remote site is
acceptable. After modifications, the new version can be copied back to
server A slowly during network-idle-time or when server A actually needs it.
If there is constant mirroring between A and B, the 500kbps (if I am
lucky) will be insufficient.
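
The slow copy-back during idle time could be as simple as rsync with its
bandwidth cap; host and paths below are placeholders:

```
# --bwlimit is in KBytes/s: 50 KB/s ~= 400 kbps, safely under the uplink.
rsync -az --bwlimit=50 /srv/cache/modified/ serverA:/srv/data/
```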

> On gentoo planet one of the devs has "Consul" in his overlays. Read
> up on that for ideas that may be relevant to what you need.

Assuming the following is the website:
http://www.consul.io/intro/vs/

Then this seems more a tool to replace Nagios, Puppet and similar. It
doesn't have any magic inside to actually distribute a filesystem in such a
way that when a file is "cached" at the local site, you don't have to wait for
it to download from the remote site, and any changes to the file are copied
to the master store automagically.
What I need is something intelligent enough to invalidate local copies only
when the master copy has changed, and that distributes write-locks to
ensure edits can occur via only 1 server at a time, so every user always
gets the latest version, regardless of where/when it was last edited.
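
The validation step described there (serve the local copy only if it still
matches the master) could be sketched like this; `fetch_from_master` and the
idea of fetching just the master's checksum are assumptions for illustration,
not anything Consul provides:

```python
import hashlib
import os

def sha256_of(path):
    """Checksum a file in chunks so large files don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def serve_file(local_path, master_checksum, fetch_from_master):
    """Return the local copy if still valid, else re-download from server A."""
    if os.path.exists(local_path) and sha256_of(local_path) == master_checksum:
        return local_path              # cache hit: serve the local copy
    fetch_from_master(local_path)      # cache miss or stale: pull from master
    return local_path
```

The point is that only the small checksum crosses the slow link on a cache
hit; the 8mbps download is paid only when the master copy has changed.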

--
Joost

>
> > Joost
>
> James

Replies

Subject Author
Re: [gentoo-user] Re: File system testing Alec Ten Harmsel <alec@××××××××××××××.com>