(Apologies for the previous mail, which was cut off badly. I shouldn't write emails
before having a sufficient amount of caffeine in my veins ;))

Duncan wrote:
>>Athlon64 something (forgot what, but shouldn't matter anyway) with 1 MB
>>L2-cache is 4% faster than an Athlon64 of the same frequency but with only 512kB
>>L2-cache. The bigger the cache sizes you compare get, the smaller the
>>performance increase. Since you run a dual Opteron system with 1 MB L2
>>cache per CPU I tend to say that the actual performance increase you
>>experience is about 3%. But then I didn't take into account that -Os
>>leaves out a few optimizations which would be included by -O2, the
>>default optimization level, which actually makes the code a bit slower
>>when compared to -O2. So, the performance increase you really experience
>>shrinks to about 0-2%. I'd tend to proclaim that -O2 is even faster for
>>most of the code, but that's only my feeling.
>
>
> Interesting, indeed. I'd counter that it likely has to do with how many
> tasks are being juggled as well, plus the number of kernel/user context
> switches, of course. I wonder under what load, and with what task-type,
> the above 4% difference was measured.
>
> Of course, the definitive way to end the argument would be to do some
> profiling and get some hard numbers, but I don't think either you or I
> consider it an important enough factor in our lives to go to /that/ sort
> of trouble. <g>

Indeed, I'd rather say I have no clue than go and perform tests :D

>>You are referring a lot to the gcc manpage, but obviously you missed
>>this part:
>>
>>     -fomit-frame-pointer
>>         Don't keep the frame pointer in a register for functions that
>>         don't need one. This avoids the instructions to save, set up
>>         and restore frame pointers; it also makes an extra register
>>         available in many functions. It also makes debugging
>>         impossible on some machines.
>>
>>         On some machines, such as the VAX, this flag has no effect,
>>         because the standard calling sequence automatically handles
>>         the frame pointer and nothing is saved by pretending it
>>         doesn't exist. The machine-description macro
>>         "FRAME_POINTER_REQUIRED" controls whether a target machine
>>         supports this flag.
>>
>>         Enabled at levels -O, -O2, -O3, -Os.
>>
>>I have to say that I am a bit disappointed now. You seemed to be one of
>>those people who actually inform themselves before sticking new flags
>>into their CFLAGS.
>
>
> ??
>
> I'm not sure which way you mean this. It was in my CFLAGS list, but I
> didn't discuss it as it's fairly common (from my observation, nearly as
> common as -pipe) and seems fairly non-controversial on Gentoo. Did you
> miss it in my CFLAGS and are saying I should be using it, or did you see
> it and are saying it's unnecessary and redundant because it's enabled by
> the -Os?

I was referring to the latter, yes. I was reading this mail because I finally
found somebody who can actually *explain* why he sticks to certain flags, which
is pretty rare IMO, so I hoped you would also explain why -fomit-frame-pointer
isn't needed when -O? is already in CFLAGS. Don't get me wrong, there's no problem
with having both in CFLAGS, but I know that way too many people have strange
CFLAGS without having a clue what these flags actually do, and usually you can
spot them by grepping their CFLAGS for -fomit-frame-pointer.
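
To make that "grep" a bit more concrete, here is a minimal Python sketch. It
is only an illustration, not something portage does: the function name and the
set of levels are hypothetical, and it takes the manpage excerpt quoted above
at face value (-O1 is included because gcc treats it as a synonym for -O).

```python
import shlex

# Assumption: these levels already enable -fomit-frame-pointer, as stated in
# the manpage excerpt quoted in this mail. -O1 is the same level as -O.
LEVELS_WITH_OMIT_FP = {"-O", "-O1", "-O2", "-O3", "-Os"}


def redundant_omit_frame_pointer(cflags: str) -> bool:
    """Return True if -fomit-frame-pointer is listed next to an -O level
    that (per the quoted manpage) already implies it."""
    flags = shlex.split(cflags)
    has_flag = "-fomit-frame-pointer" in flags
    has_level = any(flag in LEVELS_WITH_OMIT_FP for flag in flags)
    return has_flag and has_level
```

For example, redundant_omit_frame_pointer("-march=k8 -Os -pipe -fomit-frame-pointer")
reports the flag as redundant, while the same string without -Os does not.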

> If the latter, yes, but as mentioned above in the context of glibc, -Os is
> sometimes stripped. In that case, the redundancy of having the basic
> -fomit-frame-pointer is useful, unless it's also stripped, but as I said,
> it seems much less controversial than some flags and is often
> specifically allowed where most are stripped.

AFAIK the toolchain eclass doesn't strip -Os; it replaces -O? with -O2, which
also enables -fomit-frame-pointer.
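
As a rough illustration of the replacement described above (a sketch only: the
real eclass is written in bash, and the helper name here is hypothetical), any
-O level in a CFLAGS string could be rewritten to -O2 like this:

```python
import re
import shlex


def force_o2(cflags: str) -> str:
    """Replace every -O<level> flag with -O2 and leave other flags untouched.

    Hypothetical helper; it mirrors the behaviour described in this mail,
    not the eclass's actual implementation.
    """
    flags = shlex.split(cflags)
    # "-O", "-O2", "-Os", "-O3", ... all match; "-fomit-frame-pointer" does not.
    replaced = ["-O2" if re.fullmatch(r"-O\w*", flag) else flag for flag in flags]
    return " ".join(replaced)
```

With this sketch, "-march=k8 -Os -pipe" becomes "-march=k8 -O2 -pipe", so the
optimization level is swapped rather than dropped.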

> Or, are you saying I should avoid it due to the debugging implications? I
> don't quite get it.

On amd64, frame pointers aren't needed for debugging (debuggers can unwind the
stack from the call-frame information the ABI provides), so the flag doesn't
have any impact there.

>>Didn't know about this. Have you filed a bug yet on the topic? Or is
>>there already one?
>
>
> There is one. I don't recall if I filed it or if it was already there,
> but both JH and the portage folks know about the issue. IIRC, the portage
> folks decided it was their side that needed changed, but that required
> changes to the distcc package, and I don't know how that has gone since I
> don't use distcc, except that I was slightly surprised to see the warning
> in portage 2.1 still.

Ah, very good :)

>>I really wonder how you would parallelize unpacking and configuring a
>>package.
>
>
> That's what was nice about configcache, which was supposed to be in the
> next portage, but I haven't seen or heard anything about it for awhile,
> and the next portage, 2.1, is what I'm using. configcache seriously
> shortened that stage of the build, leaving more of it parallelized, but...
>
> I was using it for awhile, patching successive versions of portage, but it
> broke about the time sandbox split, the dev said he wasn't maintaining the
> old version since it was going in the new portage, and I tried updating
> the patch but eventually ran into what I think were unrelated issues but
> decided to drop that in one of my troubleshooting steps and never picked
> it up again.
>
> I'd certainly like to have it back again, tho. If it's working in 2.1,
> I've not seen it documented or seen any hints in the emerge output, as
> were there before. You seen or heard anything?

Good news ;)

ferringb has been asking for testing for quite a while now, and recently he sent
a mail to the portage-dev mailing list, basically saying that unless somebody
steps up with a good reason not to, he will include the confcache patch in the
next release.

> BTW, what is your opinion on -ftracer? Several devs I've noticed use it,
> but the manpage says it's not that useful without active profiling, which
> means compiling, profiling, and recompiling, AFAIK. It's possible the
> devs running it do that, but I doubt it, and otherwise, I don't see that
> it should be that useful? I don't know if you run it, but since I've got
> your attention, I thought I'd ask what you think about it. Is there
> something of significance I'm missing, or are they, or are they actually
> doing that compile/profile/recompile thing? It just doesn't make sense to
> me. I've seen it in several user posted CFLAGS as well, but I'll bet a
> good portion of them are simply because they saw it in a dev's CFLAGS and
> decided it looked useful, not because they understand any implications
> stated in the manpage. (Not that I always do either, but... <g>)

I don't use it, but not for any particular reason, so I really can't comment on this.

--
Simon Stelling
Gentoo/AMD64 Operational Co-Lead
blubb@g.o
--
gentoo-amd64@g.o mailing list