J. Roeleveld <joost <at> antarean.org> writes:


> > Distributed File Systems (DFS):

> > Local (Device) File Systems LFS:

> Is my understanding correct that the top list all require one of
> the bottom list?
> Eg. the "clustering" FSs only ensure the files on the LFSs are
> duplicated/spread over the various nodes?

> I would normally expect the clustering FS to be either the full layer
> or a clustered block-device where an FS can be placed on top.

I have not performed these installations yet. My research indicates
that first you put the Local FS on the drive, just like any installation
of Linux. Then you put the distributed FS on top of this. Some DFS might
not require a LFS, but FhGFS does, and so does HDFS. I will not actually
be able to accurately answer your questions until I start to build
up the 3-system cluster; a week or 2 away is my best guess.


> Otherwise it seems more like a network filesystem with caching
> options (See AFS).

OK, I'll add AFS. You may be correct on this one, or AFS might be both.

> I am also interested in these filesystems, but for a slightly different
> scenario:

OK, so I'm the "test-dummy-crash-victim". I'd be honored to have you,
Alan, Neil, Mic, etc. back-seat-drive on this adventure! (The more
I read, the more it's time for bourbon, bash, and a bit of cursing
to get started...)


> - 2 servers in remote locations (different offices)
> - 1 of these has all the files stored (server A) at the main office
> - The other (server B - remote office) needs to "offer" all files
> from server A. When server B needs to supply a file, it needs to
> check if the local copy is still the "valid" version.
> If yes, supply the local copy, otherwise download
> from server A. When a file is changed, server A needs to be updated.
> While server B is sharing a file, the file needs to be locked on server A
> preventing simultaneous updates.

Ouch, file locking (precious tells me that is always tricky).
(Psst, systemd is causing fits for the clustering geniuses;
some are espousing a variety of cgroup gymnastics for phantom kills.)
Spark is fault tolerant, regardless of node/memory/drive failures,
above the fault tolerance that a file system configuration may support.
In fact, lost files can be 'regenerated', but it is computationally
expensive. You have to get your file system(s) set up, then install
mesos-0.20.0 and then spark. I have mesos mostly ready. I should
have spark in alpha-beta this weekend. I'm fairly clueless on the
DFS/LFS issue, so a DFS that needs no LFS might be a good first choice
for testing the (3) system cluster.


> I prefer not to supply the same amount of storage at server B as
> server A has. The remote location generally only needs access to 5% of
> the total amount of files stored on server A. But not always the same 5%.
> Does anyone know of a filesystem that can handle this?

So in clustering, from what I have read, there are all kinds of files
passed around between the nodes and the master(s). Many are critical
files not part of the application or scientific calculations.
So in time, I think in a clustering environment, all you seek is
very possible, but it's a hunch, a gut feeling, not fact. I'd put
raid mirrors underneath that system, if it makes sense, for now,
or just dd the stuff with a script or something kludgy (Alan is the
king of kludge....)

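For what it's worth, here is a minimal Python sketch of the
validate-then-serve idea in Joost's scenario. It is only an illustration
under stated assumptions, not something any of the filesystems above
provide: Origin and EdgeCache are hypothetical in-memory stand-ins for
server A and server B, the "valid" check is a SHA-256 comparison, and
eviction is plain least-recently-used so server B can stay at a fraction
of server A's size. Locking and writing changes back to server A are
deliberately left out.

```python
import hashlib
from collections import OrderedDict


def digest(data: bytes) -> str:
    """Checksum used to decide whether two copies are identical."""
    return hashlib.sha256(data).hexdigest()


class Origin:
    """Hypothetical stand-in for server A: holds every file in memory."""

    def __init__(self):
        self.files = {}

    def put(self, path, data):
        self.files[path] = data

    def checksum(self, path):
        return digest(self.files[path])

    def fetch(self, path):
        return self.files[path]


class EdgeCache:
    """Hypothetical stand-in for server B: keeps only recently used files."""

    def __init__(self, origin, max_bytes):
        self.origin = origin
        self.max_bytes = max_bytes
        self.store = OrderedDict()  # path -> bytes, least recently used first
        self.downloads = 0

    def read(self, path):
        cached = self.store.get(path)
        # The local copy is "valid" only if its checksum matches server A's.
        if cached is None or digest(cached) != self.origin.checksum(path):
            cached = self.origin.fetch(path)
            self.downloads += 1
            self.store[path] = cached
        self.store.move_to_end(path)  # mark as most recently used
        self._evict()
        return cached

    def _evict(self):
        # Drop least recently used files until the cache fits its budget,
        # but always keep the file that was just served.
        while len(self.store) > 1 and self._size() > self.max_bytes:
            self.store.popitem(last=False)

    def _size(self):
        return sum(len(data) for data in self.store.values())
```

Asking server A for a checksum on every read still costs a round trip,
so this only saves the bulk transfer, not the latency.
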
On gentoo planet one of the devs has "Consul" in his overlays. Read
up on that for ideas that may be relevant to what you need.


> Joost

James