Richard Freeman <rich@××××××××××××××.net> posted
4509E998.7020503@××××××××××××××.net, excerpted below, on Thu, 14 Sep 2006
19:45:28 -0400:

> Duncan wrote: [snip]
>
> Hmm - no -ftree-vectorize? Care to comment on that? I hear that it can
> be buggy with a few packages, but I'm guessing it is worth having in
> there in general.

The gcc manpage is a bit sparse (understatement) on vectorize, but the next
entry has a bit more info.

<quote, reformatted for posting>

-ftree-vectorize
Perform loop vectorization on trees.

-ftree-vect-loop-version
Perform loop versioning when doing loop vectorization on trees. When a
loop appears to be vectorizable except that data alignment or data
dependence cannot be determined at compile time then vectorized and
non-vectorized versions of the loop are generated along with runtime
checks for alignment or dependence to control which version is
executed. This option is enabled by default except at level -Os where
it is disabled.

</quote>

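For illustration, the loop versioning that the quoted text describes can be
sketched by hand in C. This is only a rough picture under stated
assumptions, not gcc's actual output: the function names are hypothetical,
the 16-byte alignment and the four-floats-per-step width are assumed (as
for a 128-bit SIMD register), and the "vectorized" copy is written as plain
C where a compiler would emit SIMD instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-vectorized copy of the loop: one element per iteration. */
static void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Stand-in for the vectorized copy: four elements per iteration, the way
 * one 128-bit SIMD add would handle four floats (assumed width). */
static void add_by_fours(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i]     = a[i]     + b[i];
        dst[i + 1] = a[i + 1] + b[i + 1];
        dst[i + 2] = a[i + 2] + b[i + 2];
        dst[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; i++)              /* leftover elements */
        dst[i] = a[i] + b[i];
}

/* Runtime alignment check (16 bytes is an assumption). */
static int aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Runtime dependence check: do the two n-element ranges overlap? */
static int overlaps(const float *p, const float *q, size_t n)
{
    uintptr_t ps = (uintptr_t)p, qs = (uintptr_t)q;
    uintptr_t len = n * sizeof(float);
    return ps < qs + len && qs < ps + len;
}

/* Both copies exist in the binary; the checks pick one at run time.
 * Returns 1 if the four-at-a-time copy ran, 0 if the scalar copy ran. */
int add_versioned(float *dst, const float *a, const float *b, size_t n)
{
    if (aligned16(dst) && aligned16(a) && aligned16(b) &&
        !overlaps(dst, a, n) && !overlaps(dst, b, n)) {
        add_by_fours(dst, a, b, n);
        return 1;
    }
    add_scalar(dst, a, b, n);
    return 0;
}
```

Note that both copies of the loop, plus the checks, end up in the binary,
which fits the quoted remark that the option is disabled at -Os.
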
I'm unclear as to what "vectorization" means as used here. My
understanding of "vector" is as a synonym for "line", thus implying loop
unrolling of some form or another, which will increase size. As I
explained in the grandparent, I believe such optimizations to be
counterproductive on modern processors due to the extreme cost of cache
misses as opposed to slight cycle inefficiencies.

I am however aware that vectorization has a somewhat different meaning in
programming terms than the above, but am not sufficiently educated on the
topic to make an informed choice, so I've simply left gcc to go with its
default choice given my overall stated intention of -Os.

If you can explain the concept well enough that I feel comfortable
departing from the default (meaning I can explain why I chose it, and why
it either won't interfere with my overall strategy as outlined in the
grandparent or is worth it even if it does), I'd be very grateful! =8^)

BTW, I'm also looking for a good reference on LDFLAGS. I'm using one ATM
(LDFLAGS="-Wl,-z,now", which I naturally can explain if asked but will
skip for the moment), and have seen mention of a couple of others that
look interesting, but I haven't come across anything detailed enough on
them to justify further divergence from the default at this time. man gcc
just doesn't do it in this case. =8^(

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list