On Thu, 9 Feb 2006 20:09:41 +0100
Bernhard Auzinger <e0026053@×××××××××××××××××.at> wrote:

> I mean,
> how do you get the code of your forked bashes away from your cpu
> cache to have it free for kernel code?

Forked bashes all share the same image, so they don't stress the on-chip
instruction cache any more than one bash does. You do lose memory for
the data needed for each bash, of course - hence the loss of at least
one page per process. However, unless you actually start swapping to
disc (which won't happen if Duncan has no swap set up :) ) this isn't a
problem.
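
A minimal sketch of how one might check the shared-image claim on Linux
(Python purely for brevity). It assumes /proc/<pid>/maps is available,
and the helper names executable_mappings and child_shares_code are made
up for illustration. It compares the file-backed executable mappings of
a parent and its forked child. Matching inodes only show that both
processes map the same files; that the kernel backs such mappings with
one set of physical pages is the usual behaviour, not something this
script measures.

```python
import os


def executable_mappings(pid="self"):
    """Return the set of (inode, path) pairs for file-backed executable mappings."""
    found = set()
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            # Fields: address perms offset dev inode [path]
            fields = line.split(None, 5)
            if len(fields) == 6 and "x" in fields[1] and fields[4] != "0":
                found.add((fields[4], fields[5].strip()))
    return found


def child_shares_code():
    """Fork once and report whether the child's code mappings match the parent's."""
    parent = executable_mappings()
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compare its own mappings with the set inherited from the parent.
        try:
            os.close(read_end)
            same = executable_mappings() == parent
            os.write(write_end, b"1" if same else b"0")
        finally:
            os._exit(0)
    # Parent: read the child's one-byte answer, then reap the child.
    os.close(write_end)
    answer = os.read(read_end, 1)
    os.close(read_end)
    os.waitpid(pid, 0)
    return answer == b"1"
```

Calling child_shares_code() should return True on a typical Linux
system, since fork duplicates the address space without loading a second
copy of the executable.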

> A long time ago . . ., I was testing some CFLAGS on my own programs.
> I wrote a fast-fourier algorithm myself, only to see the "impressive"
> difference between Os, O3 and some other optimisation flags. I fed my
> fast-fourier algorithm with a large amount of input. But no matter
> how hard I tried to get it faster by changing the flags, it didn't
> work. The difference is marginal and not every flag brings
> improvement for every program. The only thing that changed a lot was
> the time gcc needs to perform those optimisations.

I'd guess that your FFT was spending most of its time in unavoidable
floating point ops, which may explain why the various options made
little difference.

You're right of course - in the end the only way to find out what's
actually fastest on a given machine is to try various flags and see
what happens. Multiple cache levels, long instruction pipelines,
paging architecture and a myriad of other things all go to make it
difficult to predict exactly what will and won't be faster.
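
Trying the flags by hand can be scripted. A rough sketch, assuming gcc
is on the PATH and the program under test needs no arguments; the names
best_time and compare_flags are invented here, not taken from any tool.
It times whole process runs, so start-up cost is included and very short
runs will mostly measure noise.

```python
import subprocess
import time


def best_time(cmd, repeats=5):
    """Run cmd `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare_flags(source, flag_sets, repeats=5):
    """Build `source` once per flag string with gcc and time each resulting binary."""
    results = {}
    for flags in flag_sets:
        # One binary per flag set, e.g. "-O3" -> ./bench_O3
        binary = "./bench" + flags.replace(" ", "").replace("-", "_")
        subprocess.run(
            ["gcc", *flags.split(), source, "-o", binary, "-lm"],
            check=True,
        )
        results[flags] = best_time([binary], repeats)
    return results
```

For example, compare_flags("fft.c", ["-Os", "-O2", "-O3"]) would return
a dict mapping each flag string to its best time in seconds, where fft.c
stands in for whatever source file is being tested.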

P.S. When replying to Duncan's extensive posts, could we trim parts
that aren't relevant to the reply?

-- 
Kevin F. Quinn