On Sunday 11 April 2010 11:43:26 zeerak.w@×××××.com wrote:
> On Sun, Apr 11, 2010 at 03:20:50AM +0100, Kerin Millar wrote:
> > On 10/04/2010 23:06, luis jure wrote:
> > > hello list,
> > >
> > > after many years without a hardware upgrade, i'll be receiving my new
> > > computer next week: intel i7 920 cpu, 6 GB ram, asus p6t mobo.
> > >
> > > i'm pretty excited, i imagine that at first i'll be shocked at the
> > > difference with the ancient machine i'm using now.
> > >
> > > now my question: searching a bit for the best compilation flags for
> > > this processor, i found this at gentoo-wiki:
> > >
> > > CHOST="x86_64-pc-linux-gnu"
> > > CFLAGS="-march=core2 -msse4 -mcx16 -msahf -O2 -pipe"
> > > CXXFLAGS="${CFLAGS}"
> > > (http://en.gentoo-wiki.com/wiki/Safe_Cflags/Intel)
> > >
> > > on the other hand, a thread at http://forums.gentoo.org says that the
> > > wiki page is outdated, and that -march=native should do the job without
> > > any further tweaks like -msse4 etc.
> >
> > That is correct; -march=native will indeed do the job. The CFLAGS
> > example you cite is clearly an interpretation of the flags that the
> > native target would result in anyway.
> >
> > With respect to my Intel Xeon E3113, -march=native appears to equate to:
> >
> > -march=core2 -mtune=core2 -msahf -msse4.1 --param l1-cache-size=32
> > --param l1-cache-line-size=64
> >
> > In short, use "native" and let the compiler take care of the details.
> >
> > Cheers,
> >
> > --Kerin
>
> There's a thread in Installing Gentoo where a dev (can't remember which),
> that says native isn't the best option, but the best option indeed is to
> specify your arch. See these threads:
> http://forums.gentoo.org/viewtopic-t-821639.html
> http://forums.gentoo.org/viewtopic-t-821370.html

OK, but:

$ gcc -### -march=native -E /usr/include/stdlib.h 2>&1 |
    grep "/usr/libexec/gcc/.*cc1"
"/usr/libexec/gcc/x86_64-pc-linux-gnu/4.3.4/cc1" "-E" "-quiet"
"/usr/include/stdlib.h" "-D_FORTIFY_SOURCE=2" "-march=core2" "-mcx16" "-msahf"
"--param" "l1-cache-size=32" "--param" "l1-cache-line-size=64" "-mtune=core2"

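To pull only the target-related options out of that cc1 line, a small filter along these lines might help. It is just a sketch: cc1_target_flags is a hypothetical helper name (not part of gcc), and it assumes the cc1 invocation is printed on a single line, as gcc does before a mail client wraps it.

```shell
#!/bin/sh
# cc1_target_flags: hypothetical helper, not part of gcc.
# Reads `gcc -###` output on stdin and prints one target option per line:
# every -m... flag, and every "--param" together with its value.
cc1_target_flags() {
    grep '/cc1' |
        tr ' ' '\n' |
        tr -d '"' |
        awk '
            $0 == "--param" { want = 1; next }   # the value is the next token
            want            { print "--param " $0; want = 0; next }
            /^-m/           { print }
        '
}
```

Assumed usage: gcc -### -march=native -E /usr/include/stdlib.h 2>&1 | cc1_target_flags
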
The gcc output above shows that it uses smaller cache sizes than my CPU has
according to lshw:
==========================================
  *-cpu
       description: CPU
       product: Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz
       vendor: Intel Corp.
       physical id: 5
       bus info: cpu@0
       version: CPU Version
       slot: U2E1
       size: 931MHz
       capacity: 4096MHz
       width: 64 bits
       clock: 133MHz
       capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8
apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht
tm pbe syscall nx rdtscp x86-64 constant_tsc arch_perfmon pebs bts rep_good
xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi
flexpriority ept vpid cpufreq
     *-cache:0
          description: L1 cache
          physical id: 6
          slot: L1 Cache
          size: 128KiB
          capacity: 128KiB
          capabilities: asynchronous internal write-through data
     *-cache:1
          description: L2 cache
          physical id: 7
          slot: L2 Cache
          size: 1MiB
          capacity: 1MiB
          capabilities: burst internal write-through unified
     *-cache:2
          description: L3 cache
          physical id: 8
          slot: L3 Cache
          size: 6MiB
          capacity: 8MiB
          capabilities: burst internal write-back
==========================================

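One caveat when comparing these numbers: the lshw L1 figure may well be a total over the four cores, while gcc's l1-cache-size parameter is a per-core value in kilobytes. If so, 128 KiB would simply be 4 x 32 KiB. A way to cross-check what the kernel reports for a single core is sketched below. list_cpu_caches is a hypothetical helper name, and the default path assumes a Linux kernel that exposes cache information under sysfs.

```shell
#!/bin/sh
# list_cpu_caches: hypothetical helper that prints the caches the kernel
# reports for one logical CPU. The directory argument is optional and
# defaults to the usual Linux sysfs location for cpu0 (an assumption
# about the running system).
list_cpu_caches() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cache}
    for idx in "$dir"/index*; do
        [ -r "$idx/size" ] || continue      # nothing matched, or no size file
        printf 'L%s %s: %s\n' \
            "$(cat "$idx/level")" "$(cat "$idx/type")" "$(cat "$idx/size")"
    done
}
```

Calling list_cpu_caches with no argument prints one line per cache level and type for cpu0, which can then be compared with the --param values gcc selected.
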
Now, in my current CFLAGS I have:

CFLAGS="-march=core2 -msse4 -mcx16 -msahf -O2 -pipe"

Perhaps I should stick with -march=core2 and additionally add "--param"
with the L1, L2 and L3 cache sizes?
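
For illustration only, a make.conf fragment with explicit cache parameters could look like the sketch below. The two --param values are copied from the -march=native expansion quoted earlier, not measured separately, and whether spelling them out changes anything compared with plain -march=native is an open question here.

```shell
# Sketch of /etc/make.conf settings (values are assumptions taken from the
# gcc -### output above; adjust to what your own compiler reports).
CFLAGS="-march=core2 -msse4 -mcx16 -msahf -O2 -pipe --param l1-cache-size=32 --param l1-cache-line-size=64"
CXXFLAGS="${CFLAGS}"
```
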
--
Regards,
Mick