Gentoo Archives: gentoo-user

From: Paul Hartman <paul.hartman+gentoo@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Pipe Lines - A really basic question
Date: Fri, 10 Sep 2010 16:04:47
Message-Id: AANLkTik-J-gJe_H7Y9uogibmQLARZjAs1U1hx4_+TswO@mail.gmail.com
In Reply to: Re: [gentoo-user] Pipe Lines - A really basic question by Daniel Troeder
On Thu, Sep 9, 2010 at 3:46 PM, Daniel Troeder <daniel@×××××××××.com> wrote:
> On 09/09/2010 07:24 PM, Matt Neimeyer wrote:
>> My generic question is: When I'm using a pipe line series of commands
>> do I use up more/less space than doing things in sequence?
>>
>> For example, I have a development Gentoo VM that has a hard drive that
>> is too small... I wanted to move a database off of that onto another
>> machine but when I tried the following I filled my partition and 'evil
>> things' happened...
>>
>> mysqldump blah...
>> gzip blah...
>>
>> In this specific case I added another virtual drive, mounted that and
>> went on with life but I'm curious if I could have gotten away with the
>> pipe line instead. Will doing something like this still use "twice"
>> the space?
>>
>> mysqldump | gzip > file.sql.gz
>>
>> OR going back to my generic question if I pipe line like "type | sort
>> | unique > output" does that only use 1x or 3x the disk space?
>>
>> Thanks in advance!
>>
>> Matt
>>
>> P.S. If the answer is "it depends", how do I know what it depends on?
>>
> Everyone already answered the disk space question. I want to add just
> this: it also saves you lots of I/O bandwidth, because only the
> compressed data gets written to disk. Since I/O is the most common
> bottleneck, it often pays to do as much as possible in a pipe. With
> luck it also means that multiple programs run at the same time,
> resulting in higher throughput; you get lucky when the consumer and
> producer (the right and left sides of the pipe) can work simultaneously
> because the pipe buffer is big enough. You can see this every time you
> (un)pack a tar.gz.

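To make that concrete, here is a sketch of the two approaches from
Matt's example (the database and file names are just placeholders):

  # sequential: the full-size dump lands on disk first, and gzip then
  # writes a compressed copy next to it before removing the original,
  # so you briefly need room for both
  mysqldump mydb > mydb.sql
  gzip mydb.sql

  # piped: mysqldump's output goes straight into gzip through the pipe
  # buffer, so only the compressed file ever touches the disk
  mysqldump mydb | gzip > mydb.sql.gz
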
And if you have so much data that compression makes the CPU the
bottleneck, you can use something like pbzip2, which uses all
CPUs/cores in parallel to speed up [de]compression. :)
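A sketch of that, assuming pbzip2 is installed (app-arch/pbzip2 on
Gentoo) and using the same placeholder names as above:

  # pbzip2 splits the stream into blocks and compresses them on all
  # cores; -c forces output to stdout when reading from a pipe
  mysqldump mydb | pbzip2 -c > mydb.sql.bz2

  # decompressing a pbzip2-produced archive is parallel as well
  pbzip2 -dc mydb.sql.bz2 | mysql mydb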