On 11/04/2010 12:27, Mick wrote:
> On Sunday 11 April 2010 11:43:26 zeerak.w@×××××.com wrote:
>> On Sun, Apr 11, 2010 at 03:20:50AM +0100, Kerin Millar wrote:
>>> On 10/04/2010 23:06, luis jure wrote:
>>>> hello list,
>>>>
>>>> after many years without a hardware upgrade, i'll be receiving my new
>>>> computer next week: intel i7 920 cpu, 6 GB ram, asus p6t mobo.
>>>>
>>>> i'm pretty excited, i imagine that at first i'll be shocked at the
>>>> difference with the ancient machine i'm using now.
>>>>
>>>> now my question: searching a bit for the best compilation flags for
>>>> this processor, i found this at gentoo-wiki:
>>>>
>>>> CHOST="x86_64-pc-linux-gnu"
>>>> CFLAGS="-march=core2 -msse4 -mcx16 -msahf -O2 -pipe"
>>>> CXXFLAGS="${CFLAGS}"
>>>> (http://en.gentoo-wiki.com/wiki/Safe_Cflags/Intel)
>>>>
>>>> on the other hand, a thread at http://forums.gentoo.org says that the
>>>> wiki page is outdated, and that -march=native should do the job without
>>>> any further tweaks like -msse4 etc.
>>>
>>> That is correct; -march=native will indeed do the job. The CFLAGS
>>> example you cite is clearly an interpretation of the flags that the
>>> native target would result in anyway.
>>>
>>> With respect to my Intel Xeon E3113, -march=native appears to equate to:
>>>
>>> -march=core2 -mtune=core2 -msahf -msse4.1 --param l1-cache-size=32
>>> --param l1-cache-line-size=64
>>>
>>> In short, use "native" and let the compiler take care of the details.
>>>
>>> Cheers,
>>>
>>> --Kerin
>>
>> There's a thread in Installing Gentoo where a dev (can't remember which)
>> says native isn't the best option, and that the best option is indeed to
>> specify your arch. See these threads:
>> http://forums.gentoo.org/viewtopic-t-821639.html
>> http://forums.gentoo.org/viewtopic-t-821370.html
>
> OK, but:
>
> $ gcc -### -march=native -E /usr/include/stdlib.h 2>&1 |
>   grep "/usr/libexec/gcc/.*cc1"
> "/usr/libexec/gcc/x86_64-pc-linux-gnu/4.3.4/cc1" "-E" "-quiet"
> "/usr/include/stdlib.h" "-D_FORTIFY_SOURCE=2" "-march=core2" "-mcx16" "-msahf"
> "--param" "l1-cache-size=32" "--param" "l1-cache-line-size=64" "-mtune=core2"
>
> the above shows that it uses smaller cache sizes than what my cpu has
> according to lshw:

Hmm. Well, as far as I'm aware, Nehalem - like my Wolfdale-based
processor - has 64KB of L1 cache per core, with 32KB serving as an
instruction cache and 32KB serving as a data cache. My tentative
supposition would be that gcc is taking into account only one of the
two, most likely the data cache, since l1-cache-size feeds its data
prefetching heuristics. As for the cache line size, that's measured in
bytes, and is indeed 64B on the majority of (if not all) current x86
processors, Nehalem included.

However, the result you're getting from lshw does seem somewhat
contradictory. gcc uses the cpuid instruction to determine the appropriate
values, but you might also like to check using sysfs:

$ paste <(cat /sys/devices/system/cpu/cpu0/cache/index?/type) \
        <(cat /sys/devices/system/cpu/cpu0/cache/index?/size) | sed -re 's/\W+/: /'

On my system that results in the following:

Data: 32K
Instruction: 32K
Unified: 6144K
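
For a slightly fuller picture, a minimal sketch along the same lines could
also print each cache's level and line size. The attribute files level and
coherency_line_size sit alongside type and size in the same sysfs
directories; the function name cache_summary, its optional directory
argument and the output format are made up for illustration:

```shell
# cache_summary [DIR]
# Print level, type, size and line size for every cache index under DIR.
# DIR defaults to cpu0's sysfs cache directory; the optional argument is a
# hypothetical convenience for trying the function on sample data.
cache_summary() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cache}
    for idx in "$dir"/index*; do
        # Skip anything that does not look like a cache index directory.
        [ -r "$idx/type" ] || continue
        printf 'L%s %s: %s, %s-byte lines\n' \
            "$(cat "$idx/level")" \
            "$(cat "$idx/type")" \
            "$(cat "$idx/size")" \
            "$(cat "$idx/coherency_line_size")"
    done
}
```

Called with no argument it reads cpu0's entries, and the line sizes it
reports can be compared with the l1-cache-line-size value that gcc selects.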

> Perhaps I should stick with march=core2 and additionally be adding "--param"
> and the L0, L1, L2 cache sizes?

I would suggest leaving it alone, at least until you have raised it with
a gcc developer or someone with a formal understanding of CPU architecture.
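
If you want to see exactly what there would be to raise, one option is to
pull the --param values out of the cc1 line that gcc -### prints and compare
them by eye with the sysfs figures (l1-cache-size is expressed in kilobytes,
l1-cache-line-size in bytes). This is only a sketch: cc1_params is a
hypothetical helper name, and it assumes every argument is double-quoted as
in the gcc 4.3.4 output quoted above, which other gcc versions may not do:

```shell
# cc1_params: read a cc1 command line (as printed by `gcc -###`) on stdin
# and print each "--param" argument on its own line.
# Assumption: arguments are double-quoted, as in the gcc 4.3.4 output
# quoted in this thread; other versions may format the line differently.
cc1_params() {
    grep -o '"--param" "[^"]*"' | sed -e 's/^"--param" "//' -e 's/"$//'
}
```

It can be fed from the same command used earlier in the thread, for example:
gcc -### -march=native -E /usr/include/stdlib.h 2>&1 | cc1_params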

Cheers,

--Kerin