"Poison BL." <poisonbl@×××××.com> writes:

> On Sat, Apr 29, 2017 at 9:11 PM, lee <lee@××××××××.de> wrote:
>>
>> "Poison BL." <poisonbl@×××××.com> writes:
>> > Half petabyte datasets aren't really something I'd personally *ever* trust
>> > ftp with in the first place.
>>
>> Why not? (12GB are nowhere close to half a petabyte ...)
>
> Ah... I completely misread that "or over 50k files in 12GB" as 50k files
> *at* 12GB each... which works out to 0.6 PB, incidentally.
>
>> The data would come in from suppliers. There isn't really anything
>> going on atm but fetching data once a month which can be like 100MB or
>> 12GB or more. That's because ppl don't use ftp ...
>
> Really, if you're pulling it in from third party suppliers, you tend to be
> tied to what they offer as a method of pulling it from them (or them
> pushing it out to you), unless you're in the unique position to dictate the
> decision for them.

They need to use ftp to deliver the data, and we need to use ftp to get
it. I don't want it any other way.

The problem is that the people who are supposed to deliver the data are
incompetent and don't want to use ftp because they find it too
complicated. So what's the better solution?


> [...]
>
>> > How often does it need moved in/out of your facility, and is there no way
>> > to break up the processing into smaller chunks than a 0.6PB mass of files?
>> > Distribute out the smaller pieces with rsync, scp, or the like, operate on
>> > them, and pull back in the results, rather than trying to shift around the
>> > entire set. There's a reason Amazon will send a physical truck to a site to
>> > import large datasets into glacier... ;)
>>
>> Amazon has trucks? Perhaps they do in other countries. Here, amazon is
>> just another web shop. They might have some delivery vans, but I've
>> never seen one, so I doubt it. And why would anyone give them their
>> data? There's no telling what they would do with it.
>
> Amazon's also one of the best known cloud computing suppliers on the planet
> (AWS = Amazon Web Services). They have everything from pure compute
> offerings to cloud storage geared towards *large* data archival. The latter
> offering is named "glacier", and they offer a service for the import of
> data into it (usually the "first pass", incremental changes are generally
> done over the wire) that consists of a shipping truck with a rather nifty
> storage system in the back of it that they hook right into your network.
> You fill it with data, and then they drive it back to one of their data
> centers to load it into place.

They might not have that here. And who would want to let their data
out of their hands?


--
"Didn't work" is an error.