On Thu, Sep 9, 2010 at 3:46 PM, Daniel Troeder <daniel@×××××××××.com> wrote:

> On 09/09/2010 07:24 PM, Matt Neimeyer wrote:
>> My generic question is: when I'm using a pipelined series of commands,
>> do I use up more or less space than doing things in sequence?
>>
>> For example, I have a development Gentoo VM whose hard drive is too
>> small. I wanted to move a database off it onto another machine, but
>> when I tried the following I filled my partition and 'evil things'
>> happened...
>>
>> mysqldump blah...
>> gzip blah...
>>
>> In this specific case I added another virtual drive, mounted it, and
>> went on with life, but I'm curious whether I could have gotten away
>> with the pipeline instead. Will something like this still use "twice"
>> the space?
>>
>> mysqldump | gzip > file.sql.gz
>>
>> Or, going back to my generic question: if I build a pipeline like
>> "type | sort | uniq > output", does that use 1x or 3x the disk space?
>>
>> Thanks in advance!
>>
>> Matt
>>
>> P.S. If the answer is "it depends", how do I know what it depends on?
>>
> Everyone has already answered the disk-space question. I just want to
> add this: it also saves you a lot of I/O bandwidth, because only the
> compressed data gets written to disk. Since I/O is the most common
> bottleneck, it is often imperative to do as much as possible in a pipe.
> If you're lucky, it can also mean that multiple programs run at the
> same time, resulting in higher throughput. You're lucky when the
> consumer and producer (right and left of the pipe) can work
> simultaneously because the pipe buffer is big enough. You can see this
> every time you (un)pack a tar.gz.

And if you have a huge amount of data, where compression makes the CPU
the bottleneck, you can use something like pbzip2, which uses all
CPUs/cores in parallel to speed up [de]compression. :)
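For the dump itself, the shape of the command stays the same whichever compressor you pick. The commented lines below are placeholders (the database name mydb is invented); the live lines show the identical pattern with a generic producer standing in for mysqldump:

```shell
# Placeholder versions of the real commands (mydb is a made-up name):
#   mysqldump mydb | gzip   > mydb.sql.gz     # single-core compression
#   mysqldump mydb | pbzip2 > mydb.sql.bz2    # all cores, same pattern

# The same pattern with a generic producer: the uncompressed stream
# never lands on disk, only the compressed file does.
seq 1 100000 | gzip > numbers.gz

# Reading it back is the pipeline in reverse.
gunzip -c numbers.gz | tail -n 1   # prints the last line, 100000
```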