On Wednesday, September 17, 2014 03:55:56 PM James wrote:
> J. Roeleveld <joost <at> antarean.org> writes:
> > > Distributed File Systems (DFS):
> >
> > > Local (Device) File Systems LFS:
> > Is my understanding correct that the top list all require one of
> > the bottom list?
> > E.g. the "clustering" FSs only ensure the files on the LFSs are
> > duplicated/spread over the various nodes?
> >
> > I would normally expect the clustering FS to be either the full layer
> > or a clustered block-device where an FS can be placed on top.
>
> I have not performed these installations yet. My research indicates
> that first you put the local FS on the drive, just like any installation
> of Linux. Then you put the distributed FS on top of this. Some DFSs might
> not require an LFS, but FhGFS does, and so does HDFS. I will not actually
> be able to answer your questions accurately until I start to build
> up the 3-system cluster. (A week or 2 away is my best guess.)
|
Playing around with clusters is on my list, but due to other activities having
a higher priority, I haven't had much time yet.
|
> > Otherwise it seems more like a network filesystem with caching
> > options (see AFS).
>
> OK, I'll add AFS. You may be correct on this one, or AFS might be both.
|
Personally, I would read up on these and see how they work. Then, based
on that, decide if they are likely to assist in the specific situation you are
interested in.
AFS, NFS, CIFS, ... can be used for clusters, but, apart from NFS, I wouldn't
expect much performance out of them.
If you need it to be fault-tolerant and not overly reliant on a single point of
failure, I wouldn't use any of these. Only AFS, from my original
investigation, showed some fault-tolerance, but it needed too many
resources (disk space) on the clients.
|
> > I am also interested in these filesystems, but for a slightly different
> > scenario:
>
> OK, so I'm the "test-dummy-crash-victim". I'd be honored to have you,
> Alan, Neil, Mic, etc. back-seat-drive on this adventure! (The more
> I read, the more it's time for bourbon, bash, and a bit of cursing
> to get started...)
|
Good luck, and even though I'd love to join in with the testing, I simply do
not have the time to keep up. I would probably just slow you down.
|
> > - 2 servers in remote locations (different offices)
> > - 1 of these has all the files stored (server A) at the main office
> > - The other (server B - remote office) needs to "offer" all files
> > from server A. When server B needs to supply a file, it needs to
> > check if the local copy is still the "valid" version.
> > If yes, supply the local copy; otherwise download
> > from server A. When a file is changed, server A needs to be updated.
> > While server B is sharing a file, the file needs to be locked on
> > server A, preventing simultaneous updates.
>
> OOch, file locking (precious tells me that is always tricky).
|
I need it to be locked on server A while server B has a proper write-lock, to
avoid 2 modifications competing with each other.
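
A minimal sketch of the bookkeeping server A would need for that (all names
hypothetical; a real setup would expose this over the network and handle
client failure and lease expiry, which this skips entirely):

```python
import threading

class LockServer:
    """Central write-lock table as server A might keep it:
    at most one writer per file, enforced in one place."""

    def __init__(self):
        self._mutex = threading.Lock()   # protects the table itself
        self._holders = {}               # path -> id of current lock holder

    def acquire(self, path, holder):
        """Try to take the write-lock on `path`; True on success."""
        with self._mutex:
            current = self._holders.get(path)
            if current is not None and current != holder:
                return False             # someone else is editing: refuse
            self._holders[path] = holder
            return True

    def release(self, path, holder):
        """Drop the lock, but only if `holder` actually owns it."""
        with self._mutex:
            if self._holders.get(path) == holder:
                del self._holders[path]

# Server B takes the lock before letting a user edit;
# edits via server A are refused until B releases it.
server_a = LockServer()
print(server_a.acquire("docs/plan.ods", "serverB"))   # True
print(server_a.acquire("docs/plan.ods", "serverA"))   # False: B holds it
server_a.release("docs/plan.ods", "serverB")
print(server_a.acquire("docs/plan.ods", "serverA"))   # True
```

The point of funnelling every lock through one table on server A is exactly
the "only 1 server at a time" requirement: two offices can never both hold a
write-lock on the same file.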
|
> (psst, systemd is causing fits for the clustering geniuses;
> some are espousing a variety of cgroup gymnastics for phantom kills)

phantom kills?
|
> Spark is fault-tolerant regardless of node/memory/drive failures,
> above the fault tolerance that a filesystem configuration may support.
> In fact, lost files can be 'regenerated', but it is computationally
> expensive.

Too much for me.
|
> You have to get your filesystem(s) set up. Then install
> mesos-0.20.0 and then Spark. I have mesos mostly ready. I should
> have Spark in alpha/beta this weekend. I'm fairly clueless on the
> DFS/LFS issue, so a DFS that needs no LFS might be a good first choice
> for testing the (3) system cluster.

That, or a 4th node acting as a NAS, sharing the filesystem over NFS.
|
> > I prefer not to supply the same amount of storage at server B as
> > server A has. The remote location generally only needs access to 5% of
> > the total amount of files stored on server A. But not always the same
> > 5%.
> > Does anyone know of a filesystem that can handle this?
>
> So in clustering, from what I have read, there are all kinds of files
> passed around between the nodes and the master(s). Many are critical
> files not part of the application or scientific calculations.
> So, in time, I think in a clustering environment all you seek is
> very possible, but it's a hunch, a gut feeling, not fact. I'd put
> RAID mirrors underneath that system, if it makes sense, for now,
> or just dd the stuff with a script or something kludgy (Alan is the
> king of kludge....)
|
Hmm... mirroring between servers. Always an option, except it will not work
for me in this case:
1) The remote location will have a domestic ADSL line. I'll be lucky if it has a
500kbps uplink.
2) Server A currently has around 7TB of data that also needs to
be available at the remote site.

With an 8mbps downlink, waiting for a file to be copied to the remote site is
acceptable. After modifications, the new version can be copied back to
server A slowly during network-idle time, or when server A actually needs it.
If there is constant mirroring between A and B, the 500kbps (if I am
lucky) will be insufficient.
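
Back-of-the-envelope numbers on why a full mirror over that line is hopeless
(pure line-rate arithmetic, ignoring protocol overhead and the line ever
being used for anything else):

```python
def transfer_days(size_bytes, rate_bits_per_sec):
    """Idealised transfer time in days at a given line rate."""
    return size_bytes * 8 / rate_bits_per_sec / 86400

TB = 10**12
# Initial mirror of 7TB over the 500kbps uplink:
print(round(transfer_days(7 * TB, 500_000)))    # ~1296 days, over 3.5 years
# The same data over the 8mbps downlink:
print(round(transfer_days(7 * TB, 8_000_000)))  # ~81 days
```

So even the downlink cannot realistically carry a full mirror; only the
"fetch the 5% actually needed, on demand" approach fits the line.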
|
> On gentoo planet, one of the devs has "Consul" in his overlays. Read
> up on that for ideas that may be relevant to what you need.
|
Assuming the following is the website:
http://www.consul.io/intro/vs/
|
Then this seems more a tool to replace Nagios, Puppet and similar. It
doesn't have any magic inside to actually distribute a filesystem in a way
that, when a file is "cached" at the local site, you don't have to wait for it to
download from the remote site, while any changes to the file are copied
to the master store automagically.
What I need is intelligent enough to invalidate local copies only when the
master copy got changed.
It also has to distribute write-locks to ensure edits can occur via only 1 server
at a time, and every user will always get the latest version, regardless of
where/when it was last edited.
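
The cache-validity part of that, reduced to a sketch (plain dicts stand in for
real storage, and version numbers stand in for whatever change-detection the
filesystem actually uses; both are assumptions, not any particular FS's API):

```python
def serve_file(path, local_cache, master):
    """Serve from the local copy only while the master's version
    matches; otherwise fetch the current version and refresh the
    cache. Both stores map path -> (version, data)."""
    master_version, master_data = master[path]
    cached = local_cache.get(path)
    if cached is not None and cached[0] == master_version:
        return cached[1]          # valid local copy: no download needed
    # stale or missing: download from server A and refresh the cache
    local_cache[path] = (master_version, master_data)
    return master_data

master = {"report.odt": (1, "v1 contents")}
cache = {}
serve_file("report.odt", cache, master)         # first hit: fetched from A
master["report.odt"] = (2, "v2 contents")       # master copy changed
print(serve_file("report.odt", cache, master))  # cache invalidated: "v2 contents"
```

Note the version check still costs a round-trip to server A on every access;
over a 500kbps line that is cheap, since only the data transfer is expensive.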
|
--
Joost