Gentoo Archives: gentoo-user

From: Alec Ten Harmsel <alec@××××××××××××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Clusters on Gentoo ?
Date: Tue, 19 Aug 2014 11:01:06
Message-Id: 53F32C79.6030709@alectenharmsel.com
In Reply to: Re: [gentoo-user] Clusters on Gentoo ? by "J. Roeleveld"
On Tue 19 Aug 2014 05:34:40 AM EDT, J. Roeleveld wrote:
> On Monday, August 18, 2014 10:53:51 AM Alec Ten Harmsel wrote:
>> On Mon 18 Aug 2014 10:50:23 AM EDT, Rich Freeman wrote:
>>> Hadoop is a very specialized tool. It does what it does very well,
>>> but if you want to use it for something other than map/reduce then
>>> consider carefully whether it is the right tool for the job.
>>
>> Agreed; unless you have decent hardware and can comfortably measure
>> your data in TB, it'll be quicker to use something else once you factor
>> in the administration time and learning curve.
>
> The benefit of clustering technologies is that you don't need high-end
> hardware to start with. You can use the old hardware you found collecting dust
> in the basement.

Yes, but... if you are doing anything that *needs* to be fast (i.e. if
you're not a hobbyist), you don't need some super fancy database
machine, but you still need some decent hardware (gotta have enough RAM
for that JVM ;) ). If you'd like to take a look at our hardware, you
can check out http://caen.github.io/hadoop/hardware.html.

> The learning curve isn't as steep as it used to be. There are plenty of tools
> to make it easier to start using Hadoop.

There are plenty of great tools (Pig, Sqoop, Hive, RHadoop, etc.) that
you can use so you're not writing Java. This is all client-side; it
doesn't make the administration easier.

I agree that it's easy to start using it (it's possible to configure a
small cluster from scratch in half an hour), but it takes a lot more
time to tune your installation so it actually performs well. It's just
like any other piece of server software: serving a website with httpd is
easy, but serving it well and adding security takes a lot more time.

Rich Freeman wrote:
> As long as you're counting words and don't mind coding everything in Java. :)

We discourage researchers from writing in Java and encourage them to
use any of the tools I listed above instead, unless they really like Java.
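
For anyone who hasn't seen it, here is a rough sketch of the word-count
job Rich is joking about, written as a mapper and a reducer in plain
Python rather than Java. It runs locally with no Hadoop involved. The
function names (mapper, reducer, word_count) and the sample input are
made up for illustration, and the sort only stands in for the shuffle
that a real cluster does between the two phases.

```python
#!/usr/bin/env python3
"""Word count written as a mapper and a reducer, run locally without Hadoop."""
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Sum the counts per word; pairs must arrive sorted by word."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Map, sort (standing in for the shuffle step), then reduce."""
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    # Hypothetical sample input, just to show the shape of the output.
    sample = ["Hadoop does what it does", "it does it well"]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

Hadoop Streaming can run scripts shaped roughly like this by piping
records through stdin and stdout, but the sketch above skips that wiring
and is not tuned for anything.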

> I found that if you want to avoid using Java, then the
> available documentation plummets

Yeah, this is still a pretty big problem. Documentation is pretty
sparse.

Alec