On 09/09/2010 07:24 PM, Matt Neimeyer wrote:
> My generic question is: when I'm using a pipelined series of commands,
> do I use up more or less space than doing things in sequence?
>
> For example, I have a development Gentoo VM whose hard drive is too
> small... I wanted to move a database off of it onto another machine,
> but when I tried the following I filled my partition and 'evil things'
> happened...
>
> mysqldump blah...
> gzip blah...
>
> In this specific case I added another virtual drive, mounted that, and
> went on with life, but I'm curious whether I could have gotten away
> with the pipeline instead. Will doing something like this still use
> "twice" the space?
>
> mysqldump | gzip > file.sql.gz
>
> Or, going back to my generic question: if I pipeline like
> "type | sort | uniq > output", does that use 1x or 3x the disk space?
>
> Thanks in advance!
>
> Matt
>
> P.S. If the answer is "it depends", how do I know what it depends on?
Everyone has already answered the disk space question. I just want to
add this: a pipe also saves you a lot of I/O bandwidth, because only the
compressed data gets written to disk. Since I/O is the most common
bottleneck, it often pays to do as much as possible in a pipe. With
luck, it also means that multiple programs run at the same time,
resulting in higher throughput: the producer and consumer (the left and
right sides of the pipe) can work simultaneously as long as the pipe
buffer keeps both busy. You can see this every time you (un)pack a
tar.gz.
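To make the comparison concrete, here is a minimal sketch. `seq` stands in for `mysqldump` (the real dump command isn't runnable outside the original VM), and the file names are hypothetical:

```shell
#!/bin/sh
# Sequential: the full uncompressed dump lands on disk before gzip runs,
# so peak disk usage is roughly uncompressed size + compressed size.
seq 1 100000 > dump.txt        # stand-in for: mysqldump mydb > dump.sql
gzip dump.txt                  # reads dump.txt, writes dump.txt.gz, then removes dump.txt

# Piped: only the compressed bytes ever touch the disk (~1x the compressed
# size), and producer and consumer run concurrently.
seq 1 100000 | gzip > piped.txt.gz   # stand-in for: mysqldump mydb | gzip > dump.sql.gz

ls -l dump.txt.gz piped.txt.gz       # both hold the same data
```

Both files decompress to identical output; the difference is only the peak disk usage while they are being produced.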
37 |
|
38 |
Bye, |
39 |
Daniel |
40 |
|
41 |
|
42 |
-- |
43 |
PGP key @ http://pgpkeys.pca.dfn.de/pks/lookup?search=0xBB9D4887&op=get |
44 |
# gpg --recv-keys --keyserver hkp://subkeys.pgp.net 0xBB9D4887 |