On Mon, Apr 21, 2003 at 03:44:21AM -0400, Evan Powers wrote:
> Hmm.... Have you timed your script yet? What sorts of run times are you
> getting with that implementation?
Comparing the execution time of my implementation to yours...
Let's just say mine has some issues. It runs find on the entire
filesystem tree, reads the complete output of find into memory,
and then filters it with regexes after the fact, instead of passing
the exclusions to find as command-line parameters like you do.
To be honest, I'm ashamed that creating two sorted manifests and
comparing them didn't occur to me. Your solution is annoyingly compact
and simple :)
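For anyone following along, the manifest comparison boils down to
something like the following. This is only a rough Python sketch of the
idea, not the attached script: the function name unowned and the sample
paths are made up, and it assumes both lists are already sorted in plain
string order.

```python
def unowned(fs_paths, owned_paths):
    """Yield paths present in fs_paths but absent from owned_paths.

    Both inputs must be sorted in the same (plain string) order, so a
    single pass over each is enough.
    """
    owned = iter(owned_paths)
    current = next(owned, None)
    for path in fs_paths:
        # Advance the owned manifest until it catches up with this path.
        while current is not None and current < path:
            current = next(owned, None)
        if current != path:
            yield path


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real system.
    on_disk = sorted(["/etc/foo.conf", "/usr/bin/bar", "/usr/lib/old.so"])
    in_packages = sorted(["/etc/foo.conf", "/usr/bin/bar"])
    for path in unowned(on_disk, in_packages):
        print(path)
```

Because both manifests are walked once, in order, neither needs to be
loaded into a set or searched repeatedly.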
I tweaked your script to use inline Perl parsing instead of
qpkg -nc -l, and got some amazing results (roughly eight times faster
on this system).
The numbers...
Your original script:
$ time ./script-cruft.sh
./script-cruft.sh 14.56s user 0.92s system 100% cpu 15.473 total
My tweaked version:
$ time ./script-cruft-fast.sh > fast
./script-cruft-fast.sh > fast 1.33s user 0.46s system 99% cpu 1.792 total
My Python script (oh the shame):
$ time ./gtfilelint -C gtfilelint.conf -o output.list
./gtfilelint <...> 15.86s user 7.15s system 99% cpu 23.105 total
These times are after Linux caching has kicked in. I ran each script
several times and report only the final run. Before caching, my
script took about 160s, then 50s, then 40s, then 32s, and finally 23s.
I guess I'll be using the tweaked version of your script from now on
(attached) :)
Interestingly, delving into the innards of epm and qpkg revealed a
bug in their CONTENTS parsing code: they can't handle filenames with
spaces in them. They truncate the filename at the first space.
Other than that, the output generated by my tweaked version and your
original should be identical for the same set of paths to exclude.
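To illustrate the space problem, here is a rough sketch of parsing a
CONTENTS line from the right-hand side so that spaces in the path
survive. It is not the code from epm, qpkg, or my tweaked script; the
function name contents_path is hypothetical, and the dir/obj/sym line
layouts in the docstring are my assumption about the format.

```python
def contents_path(line):
    """Return the path recorded on one CONTENTS line, spaces intact.

    Assumed line layouts (an assumption, not checked against Portage):
        dir <path>
        obj <path> <md5> <mtime>
        sym <path> -> <target> <mtime>
    """
    kind, _, rest = line.rstrip("\n").partition(" ")
    if kind == "obj":
        # The last two fields are md5 and mtime; the rest is the path.
        return rest.rsplit(" ", 2)[0]
    if kind == "sym":
        # The path is everything before the " -> " separator.
        return rest.split(" -> ", 1)[0]
    # dir and any other entry type: the remainder of the line is the path.
    return rest


if __name__ == "__main__":
    # Hypothetical entry with a space in the filename.
    sample = "obj /usr/share/doc/My File.txt d41d8cd98f00b204e9800998ecf8427e 1050000000"
    print(contents_path(sample))
```

Splitting on whitespace and taking the second field is what loses
everything after the first space; counting fields from the end avoids
that for obj entries. A path that itself contains " -> " would still be
ambiguous for sym entries.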
Leon
--
in the beginning, was the code.