Gentoo Archives: gentoo-science

From: "M. Edward (Ed) Borasky" <znmeb@×××××××.net>
To: gentoo-science@l.g.o
Cc: "Adam Piątyszek" <ediap@×××××××××××××.PL>
Subject: Re: [gentoo-science] Re: [Fwd: Re: [atlas-devel] 1) ATLAS shared libraries; 2) "ASM" -> "ASM VOLATILE"]
Date: Fri, 25 Aug 2006 14:24:29
Message-Id: 44EF07C7.7040003@cesmail.net
In Reply to: [gentoo-science] Re: [Fwd: Re: [atlas-devel] 1) ATLAS shared libraries; 2) "ASM" -> "ASM VOLATILE"] by Markus Dittrich
Markus Dittrich wrote:
> On Fri, 25 Aug 2006, Adam Piątyszek wrote:
>
>>> Dear Markus, gentoo-science guys,
>>>
>>> Please find below the reply from Clint to my yesterday's email related to
>>> our work on ATLAS shared libraries in Gentoo.
>>>
>>> Markus, I think we can help with answering the questions (2) and (3). Of
>>> course, volunteers from gentoo-science are welcome as well.
>>>
>>> BR,
>>> /ediap
>>>
>>> (1) Is it true that the extra pointer may still be used if we restore
>>> it at the end of the assembly routine?
>>> (2) Does throwing in -fpic or other required compiler flags change
>>> the best cases (thus necessitating doubling the arch defaults)?
>>> (3) What is the overall performance effect when using .so?
>>>
>>> I've tried to answer (1) by looking at some docs, but never got convinced
>>> either way. I've been meaning to write a register stress-test to see if
>>> I can make gcc use the reserved register in a function w/o global data.
>>> Perhaps you know?
>>>
>>> You guys could help with (2) & (3) if you like. You could build
>>> out-of-box to .a on whatever machines you can, and then build it to .so
>>> using your gentoo harness, and post some head-to-head timings . . . If,
>>> as we suspect, the difference is essentially zero, that makes .so a lot
>>> more attractive . . .
>>>
> Hi Adam,
>
> Thanks for talking to upstream about this; Clint's response
> sounds encouraging. We could definitely help out with 2) and 3);
> it would be good to know anyway how well we do with our shared libs. In
> doing so we should also test the impact of using the 387 floating point
> unit versus the SSE instruction set. According to Clint, the former can
> give a significant performance gain on some CPUs. If that is the case it
> might be worth a note in the ebuild to make our users aware of it.
>
> We should get hold of a nice benchmark suite for this purpose; Clint
> has posted one on this gcc bug
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827
> which we might be able to use. I'll have a look at it.
>
> Best,
> Markus
>
>
> -- Markus Dittrich (markusle)
> Gentoo Linux Developer
> Scientific applications

If you have the time, you can turn off all of the preconceived notions
ATLAS has about your architecture and let it benchmark itself. In fact,
for the hard-core number crunchers, you might actually want to put a USE
flag in the ebuild to do a "brute-force" assume-nothing compile, warning
them that it takes a long time and that it should be run after an
"emerge -f" with Linux in single-user mode. My recollection is that it
used to take about 8 hours on a 1.3 GHz Athlon Thunderbird.
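
For the head-to-head timings Clint asks for in (2) and (3), something like
the sketch below could turn raw wall-clock times into comparable numbers.
It is only an illustration: the function names and the build labels
("static", "shared") are made up here and are not part of ATLAS or of any
ebuild, the 2*n^3 flop count assumes a square matrix multiply, and the
timings in the example are placeholders, not measurements.

```python
# Hypothetical helpers for comparing timings of different ATLAS builds.
# Nothing here comes from ATLAS itself; labels and names are assumptions.

def dgemm_mflops(n, seconds):
    """MFLOPS for an n x n x n matrix multiply, counted as 2*n**3 flops."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return 2.0 * n ** 3 / seconds / 1e6


def best_time(samples):
    """Take the fastest run; slower runs mostly reflect system noise."""
    if not samples:
        raise ValueError("need at least one timing sample")
    return min(samples)


def compare_builds(n, timings):
    """Return {label: (mflops, percent change relative to 'static')}.

    timings maps a build label to a list of wall-clock seconds and
    must contain a 'static' entry to serve as the baseline.
    """
    rates = {label: dgemm_mflops(n, best_time(t)) for label, t in timings.items()}
    base = rates["static"]
    return {label: (r, 100.0 * (r - base) / base) for label, r in rates.items()}


if __name__ == "__main__":
    # Placeholder timings in seconds, NOT real measurements.
    example = {"static": [1.0, 1.1], "shared": [1.0, 1.25]}
    for label, (rate, pct) in compare_builds(1000, example).items():
        print(f"{label:8s} {rate:10.1f} MFLOPS  {pct:+6.2f}% vs static")
```

The same table could carry extra labels for the 387 versus SSE builds
Markus mentions, as long as one entry stays the "static" baseline.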
--
gentoo-science@g.o mailing list