Gentoo Archives: gentoo-amd64

From: "John S. Yates"
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] vectorization (was: gcc4 CFLAGS)
Date: Tue, 19 Sep 2006 00:58:57
Message-Id: pofug2h18lrv12l3ua2r0s0gt1trkij79r@4ax.com
In Reply to: [gentoo-amd64] gcc4 CFLAGS Was: gcc 4.1 upgrade - bad desktop interactivity anyone? by Duncan <1i5t5.duncan@cox.net>
1 On Fri, 15 Sep 2006 16:47:14 +0000 (UTC), you wrote:
2
3 > I am however aware that vectorization has a somewhat different meaning in
4 > programming terms than the above, but am not sufficiently educated on the
5 > topic to make an informed choice, so I've simply left gcc to go with its
6 > default choice given my overall stated intention of -Os.
7
8 Older super-computers, especially those designed or inspired by Seymour
9 Cray, included "vector registers". These were multiple registers (typically
10 a small power of 2, say 32, 64, or 128) that could be manipulated as a unit.
11 This was an earlier form of SIMD. By issuing a single instruction -- such as
12 a vector load or vector add -- you repeated the same operation on a sequence
13 of operands. The crucial difference from today's SIMD was that these vector operations had some
14 start-up overhead and then ran autonomously delivering one result every clock
15 tick for the length of the vector register.
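
As a rough illustration of "manipulated as a unit", here is a small software model in C. It is only a sketch: the names VLEN, vreg, vload and vadd are made up for this example, the length of 64 is just one of the typical sizes mentioned above, and real hardware would pipeline the elements rather than run a loop.

```c
#include <stddef.h>

/* Hypothetical vector length; 32, 64, or 128 were typical sizes. */
#define VLEN 64

/* Software model of one vector register: VLEN elements handled as a unit. */
typedef struct {
    double elem[VLEN];
} vreg;

/* Model of a vector load: one "instruction" fills the whole register. */
static vreg vload(const double *src)
{
    vreg r;
    for (size_t i = 0; i < VLEN; i++)
        r.elem[i] = src[i];
    return r;
}

/* Model of a vector add: one "instruction" applies the same operation
   to every pair of corresponding elements. */
static vreg vadd(vreg a, vreg b)
{
    vreg r;
    for (size_t i = 0; i < VLEN; i++)
        r.elem[i] = a.elem[i] + b.elem[i];
    return r;
}
```

A caller would write something like vadd(vload(x), vload(y)) for two arrays x and y of at least VLEN doubles: two loads and one add stand in for 64 scalar loads and adds.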
16
17 While some compilers added proprietary language extensions to support vector
18 values as an actual data type, most numeric code was written in scalar form.
19 To make such super-computers useful it was crucial that compilers be able to
20 recognize when a scalar loop could be implemented using the machine's vector
21 facilities. Fundamentally this came down to figuring out when successive
22 loop iterations were independent and hence could execute in parallel. Since
23 the compiler was attempting to re-express scalar loops as loops using vector
24 primitives, the optimization became known as "vectorization".
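
To make the independence question concrete, here is a pair of scalar loops in C. This is a sketch with function names invented for the example (add_arrays, running_sum). The first loop is the kind a vectorizer can handle; the second carries a value from one iteration to the next and cannot simply be run side by side.

```c
#include <stddef.h>

/* Independent iterations: c[i] depends only on a[i] and b[i].
   The C99 'restrict' qualifiers promise that the arrays do not overlap,
   which a compiler needs to know before it may reorder the iterations. */
void add_arrays(size_t n, const float *restrict a,
                const float *restrict b, float *restrict c)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Loop-carried dependence: iteration i reads the value that
   iteration i-1 has just written, so the iterations must run in order. */
void running_sum(size_t n, float *a)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

With gcc 4.x the option that asks for this analysis is -ftree-vectorize. Whether a given loop actually gets vectorized depends on the target and on the other flags in use.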
25
26 In essence a multi-media SIMD mechanism is very similar. A 64-bit register
27 containing 4 16-bit operands is essentially a length 4 "vector register".
28 Finding opportunities to use such SIMD instructions in scalar code requires
29 exactly the same forms of analysis and optimization.
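
The "length 4 vector register" view can be imitated in portable C by packing four 16-bit lanes into one 64-bit integer. The sketch below is an illustration only: pack4, lane and add4x16 are names invented here, and the masking trick stands in for what a real packed-add instruction (MMX's PADDW, for instance) does in hardware.

```c
#include <stdint.h>

/* Pack four 16-bit lanes into one 64-bit word, lane 0 in the low bits. */
uint64_t pack4(uint16_t l0, uint16_t l1, uint16_t l2, uint16_t l3)
{
    return (uint64_t)l0 | ((uint64_t)l1 << 16) |
           ((uint64_t)l2 << 32) | ((uint64_t)l3 << 48);
}

/* Extract lane i (0..3) from a packed word. */
uint16_t lane(uint64_t v, unsigned i)
{
    return (uint16_t)(v >> (16 * i));
}

/* Add all four lanes at once, each wrapping modulo 2^16.
   A plain 64-bit add would let carries leak into the next lane,
   so the top bit of every lane is handled separately. */
uint64_t add4x16(uint64_t x, uint64_t y)
{
    const uint64_t H = 0x8000800080008000ULL; /* top bit of each lane */
    uint64_t low = (x & ~H) + (y & ~H);       /* add the low 15 bits per lane */
    return low ^ ((x ^ y) & H);               /* add the top bits, dropping carry-out */
}
```

For example, add4x16(pack4(1, 2, 3, 0xFFFF), pack4(10, 20, 30, 1)) yields lanes 11, 22, 33 and 0; the last lane wraps around without disturbing its neighbours.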
30
31 /john
32
33
34 --
35 gentoo-amd64@g.o mailing list

Replies

Subject Author
Re: [gentoo-amd64] vectorization (was: gcc4 CFLAGS) Peter Humphrey <prh@××××××××××.uk>