On Mon, Aug 18, 2014 at 10:31 AM, J. Roeleveld <joost@××××××××.org> wrote:
>
> I wouldn't use Hadoop for storage of files. It's only useful if you have a lot
> (and I do mean a LOT) of data where a query only returns a very small amount.

Not to mention a lot of data in a small number of files. I think the
minimum allocation size for Hadoop is measured in megabytes. I tried
using it to process gentoo-x86, and the number of files just clobbered
the thing. Since in my job the files were really just static data and
not the actual subject of the map/reduce, I instead just replicated the
data to all the nodes and had them retrieve the data from the local
filesystem.
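
The replicate-locally trick above can be sketched roughly like this (the path and helper names are made up for illustration; in practice the static tree would be pushed to each node with rsync or the distributed cache, and the mapper would read its small input from stdin):

```python
import os

# Hypothetical local path where the static tree (e.g. a gentoo-x86
# checkout) has been replicated to every node ahead of time; none of
# this data lives in HDFS, so HDFS never sees the millions of files.
LOCAL_DATA = "/var/local/static-data"

def load_index(root):
    """Index the locally replicated tree once per mapper process."""
    index = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.add(os.path.join(dirpath, name))
    return index

def map_records(lines, index):
    """Toy map step: only the small streamed input travels through
    Hadoop; the heavy static data is a plain local-filesystem read."""
    for line in lines:
        key = line.strip()
        yield f"{key}\t{'hit' if key in index else 'miss'}"
```

In a Streaming job the mapper would call `map_records(sys.stdin, load_index(LOCAL_DATA))` and print each record; the point is simply that the bulky file tree never touches HDFS.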

Hadoop is a very specialized tool. It does what it does very well,
but if you want to use it for something other than map/reduce then
consider carefully whether it is the right tool for the job.

--
Rich