Dear Markus, gentoo-science guys,

Please find below Clint's reply to the email I sent yesterday about
our work on ATLAS shared libraries in Gentoo.

Markus, I think we can help answer questions (2) and (3). Of
course, volunteers from gentoo-science are welcome as well.

BR,
/ediap

-------- Original message --------
Subject: Re: [atlas-devel] 1) ATLAS shared libraries; 2) "ASM" -> "ASM VOLATILE"
Date: Thu, 24 Aug 2006 17:44:19 -0500
From: Clint Whaley <whaley@×××××××.edu>
Reply-To: List for developer discussion, NOT SUPPORT. <math-atlas-devel@×××××××××××××××××.net>
To: math-atlas-devel@×××××××××××××××××.net

Adam,

>1) In parallel to your great work on new ATLAS releases, one Gentoo
>developer (Markus) and I have been working on preparing an updated set of
>patches to build both static and shared libraries of ATLAS.

Great!

>I am aware that you recommended using ATLAS as a static library only,
>due to its better performance (I do not know the real difference, though).
>But in Gentoo, shared libraries are preferred.
>Could you please comment on the performance differences and on possibly
>extending the official ATLAS package with optional support for shared
>libraries? We are keen to support you with our patches.

OK, ATLAS still defaults to .a because it's what I use :) Back in the day,
ATLAS was mainly used in HPC, characterized by big applications that often
ran on parallel machines. Linking in an extra lib was the least of these
guys' worries.

I'm still an HPC guy at heart, so I always use .a when available. However,
I do not believe the performance difference should be noticeable to the average
user. Here's what I *think* the effects of shared libs are:
(1) An extra register is used to store the ptr to a table in memory for
    global memory (this is what -fpic does, I think)
    -- I don't think this hurts ATLAS, because ATLAS doesn't use global
       memory. I assume (I don't know) that ATLAS's assembly is still
       allowed to use that register, as long as it saves/restores it . . .
(2) The first time the routine is called, there is greater overhead in a
    .so, 'cause you have to load the shared object at that time
    -- not sure how much worse this is than .a, since you usually have
       to hit the disk on first load; .so is probably more likely to be
       on completely different pages, I guess . . .

These days, with ATLAS used for a whole lot of non-HPC things, as well
as being wrapped and plugged into high-level things like PSEs and Python,
my suspicion is that the *majority* of users would like shared libraries.

So, supporting an out-of-box build to .so is definitely in my plans,
I just haven't gotten around to doing the work yet. Because I have no real
experience with shared libs, I have questions that will need to be
investigated before I can do this:
(1) Is it true that the extra register may still be used if we restore it
    at the end of the assembly routine?
(2) Does adding -fpic or other required compiler flags change
    the best cases (thus necessitating doubling the arch defaults)?
(3) What is the overall performance effect of using .so?

I've tried to answer (1) by looking at some docs, but never got convinced
either way. I've been meaning to write a register stress-test to see if
I can make gcc use the reserved register in a function w/o global data.
Perhaps you know?
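
A register stress test along the lines described above might look like the
following. This is only a minimal sketch: the file name regstress.c and the
function stress() are hypothetical and not part of ATLAS. It keeps eight
accumulators live in one loop and touches no global data, so that on a
register-starved target the compiler has a reason to reach for every
register it is allowed to use.

```c
/* regstress.c: hypothetical register stress test (not part of ATLAS).
 *
 * Eight accumulators, the input pointer, and the loop counter are all
 * live inside the loop, which is more than 32-bit x86 has general-purpose
 * registers for.  The function deliberately uses no global data.
 * Unsigned arithmetic is used so that wraparound is well defined. */
unsigned long stress(const unsigned long *x, int n)
{
    unsigned long a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7, h = 8;
    int i;

    for (i = 0; i < n; i++) {
        /* Each statement depends on the previous one and on a value
         * updated later, so none of the eight can be dropped early. */
        a += x[i] ^ b;
        b += a ^ c;
        c += b ^ d;
        d += c ^ e;
        e += d ^ f;
        f += e ^ g;
        g += f ^ h;
        h += g ^ a;
    }
    return a + b + c + d + e + f + g + h;
}
```

One way to use it, assuming gcc on 32-bit x86 (where the PIC convention
keeps the GOT pointer in %ebx): compile with
"gcc -O2 -m32 -fpic -S regstress.c" and again without -fpic, then compare
how %ebx is used in the two regstress.s files. The outcome may differ
between gcc versions, so it is worth checking with the compiler actually
used for the build.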

You guys could help with (2) & (3) if you like. You could build out-of-box
to .a on whatever machines you can, and then build it to .so using your
gentoo harness, and post some head-to-head timings . . . If, as we suspect,
the difference is essentially zero, that makes .so a lot more attractive . . .
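
For the head-to-head timings, a small driver along these lines could be
built once against the static libraries and once against the shared ones.
It is a sketch under stated assumptions, not ATLAS's own timer: the names
gemm_args, naive_dgemm, time_kernel and gemm_mflops are made up for
illustration, and the naive kernel is only a stand-in so that the code
links without ATLAS. It reports the first call separately from the fastest
call, since any one-time cost of loading a .so would show up in the first
call only.

```c
#include <time.h>

/* Hypothetical argument bundle for a square n x n multiply C = A * B. */
struct gemm_args {
    int n;
    const double *A;
    const double *B;
    double *C;
};

/* Stand-in kernel (row-major, naive triple loop) so the sketch links
 * without ATLAS.  For real measurements, replace the body with a call
 * into the library under test, e.g. cblas_dgemm. */
void naive_dgemm(void *p)
{
    struct gemm_args *g = (struct gemm_args *)p;
    int i, j, k, n = g->n;

    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            double s = 0.0;
            for (k = 0; k < n; k++)
                s += g->A[i * n + k] * g->B[k * n + j];
            g->C[i * n + j] = s;
        }
    }
}

/* CPU seconds spent in the first call and in the fastest call. */
struct timing {
    double first;
    double best;
};

/* Call fn(arg) reps times and record first-call and best-call CPU time.
 * clock() measures CPU time, which is adequate for a serial kernel. */
struct timing time_kernel(void (*fn)(void *), void *arg, int reps)
{
    struct timing t = { 0.0, 0.0 };
    int r;

    for (r = 0; r < reps; r++) {
        clock_t t0 = clock();
        double dt;

        fn(arg);
        dt = (double)(clock() - t0) / CLOCKS_PER_SEC;
        if (r == 0)
            t.first = t.best = dt;
        else if (dt < t.best)
            t.best = dt;
    }
    return t;
}

/* An n x n matrix multiply performs 2*n^3 floating-point operations.
 * The caller must make sure seconds is nonzero. */
double gemm_mflops(int n, double seconds)
{
    return 2.0 * n * n * n / (seconds * 1e6);
}
```

Comparing gemm_mflops(n, t.best) between the .a and .so builds addresses
question (3), while the gap between t.first and t.best gives a rough idea
of the first-call overhead. The problem size should be large enough that
a single call takes much longer than the resolution of clock().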

I doubt I'll spend a large amount of time getting .so in before getting
a new stable out, but if it doesn't require a huge number of changes,
and someone can outline it to me so that I can see the tricks work
generally (i.e., not just on one version of Linux), I'd certainly welcome
help with this . . .

>To build shared ATLAS we replaced most of the compiler variables with
>special redefinitions that use "libtool". You can have a look at
>the patch in this bug report: https://bugs.gentoo.org/show_bug.cgi?id=144314

Regarding one of the comments there, you'll be happy to know I just added a
--with-netlib-lapack to config, which allows ATLAS to automatically build the
combined lapack library, assuming netlib lapack has been installed prior to
the ATLAS build . . .

>2) BTW, in the "include/contrib/camm_dpa.h" header file, we needed to change
>"ASM" into "ASM VOLATILE" to build shared libraries. I wonder if you
>could incorporate this change into the official ATLAS sources, provided, of
>course, that you are sure it won't break anything (I am not an assembler
>expert at all). ;)

This is a file written by Camm, who's the Debian ATLAS maintainer. He
also builds to .so, I think. So, Camm, is it OK to make this change?

Cheers,
Clint
_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@×××××××××××××××××.net
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel