Gentoo Archives: gentoo-amd64

From: Samir Mishra <sqmishra@×××.ae>
To: gentoo-amd64@l.g.o
Subject: Re: [gentoo-amd64] Re: gcc 4.1 + CFLAGS
Date: Sat, 10 Jun 2006 22:20:27
Message-Id: 448B34EC.9030009@eim.ae
In Reply to: [gentoo-amd64] Re: gcc 4.1 + CFLAGS by Duncan <1i5t5.duncan@cox.net>
Duncan wrote:
> A couple questions:
>
> -mcmodel=medium: Here's the gcc 4.1.1 manpage entry:
>
> -mcmodel=medium
>     Generate code for the medium model: The program is linked in the
>     lower 2 GB of the address space, but symbols can be located anywhere
>     in the address space. Programs can be statically or dynamically
>     linked, but building of shared libraries is not supported with the
>     medium model.
>
> What about that last part -- shared libraries not supported?
>
>
palladium ~ # file /bin/bash
/bin/bash: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for
GNU/Linux 2.6.11, dynamically linked (uses shared libs), stripped
palladium ~ # file /lib/libc-2.4.so
/lib/libc-2.4.so: ELF 64-bit LSB shared object, AMD x86-64, version 1
(SYSV), stripped
palladium ~ # file /lib/libgcc_s.so.1
/lib/libgcc_s.so.1: ELF 64-bit LSB shared object, AMD x86-64, version 1
(SYSV), stripped

I assume this is how to check whether an executable or library is
shared; I picked three common ones at random. I think I read somewhere
that the manpage's statement about lack of support for building shared
libraries is not accurate, and I assumed the same, since I get output
like the above for every such file on my machine. If I tested the
libraries incorrectly, could you indicate the right way to check whether
GCC is generating shared libraries or not? Memory usage seems reasonable
for all of my applications, but given how little I know, it's quite
possible I got it wrong.
> Also... from my understanding, the address-space references here are
> simply to the program's view of memory, from its perspective. More
> specifically, it's not to absolute address space (the >2 gig memory you
> mention), but to the program's virtual address space, the protected-mode
> virtual address space unique to it, and rather independent from its actual
> placement in physical memory or its placement in terms of the kernel's
> virtual address space.
>
> As I understand it, therefore, the only reason to use -mcmodel=medium
> would be if the program itself was expected to manipulate memory on the
> order of gigs at a time. While this can legitimately happen, say with a
> big database (think Oracle, which very likely uses such an option), the
> vast majority of applications don't work with anywhere near that size of
> data, at least not directly. While they might handle gigs of data over
> the lifetime of the program, it's megabytes not gigabytes at a time, and
> much of it may be handled indirectly, thru the kernel, so they have
> little need for a memory allocation model that can allocate structures
> of multiple gigs at a time or access anything outside a two-gigabyte
> model. Even on the big machines running apps where use of this model
> might be legitimately called for, it would be inadvisable to compile
> /everything/ on the machine with it, only the specific apps that need it.
>
>
I thought otherwise. I guess I'll have to do some research on this. In
case you do come across any interesting pointers or references, please
do let me know.
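For what it's worth, the classic case the manpage seems to describe is a
program whose *static* data (as opposed to malloc'd heap) exceeds 2 GB.
A minimal sketch, assuming an x86-64 host with gcc (the file names are
made up):

```shell
# A program with more than 2 GiB of static data: this is what the
# medium code model exists for.
cat > /tmp/big.c <<'EOF'
static char big[3UL * 1024 * 1024 * 1024];  /* 3 GiB of BSS */
int main(void) { big[0] = 1; return big[0]; }
EOF
if gcc -dumpmachine | grep -q '^x86_64'; then
    gcc -mcmodel=medium -o /tmp/big /tmp/big.c && echo "linked OK"
else
    echo "linked OK (skipped: -mcmodel=medium is x86-64-specific)"
fi
```

Without -mcmodel=medium the same link typically fails with a
"relocation truncated to fit" error, which is a fairly direct
demonstration of what the flag buys you.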
> This would also explain why it only applies to applications, not shared
> libraries, as only the linked application would need to reference the full
> multi-gig allocations -- any library functions would presumably be working
> with much smaller sections of that dataset.
>
> Note that I'm specifically /not/ saying you are incorrect, only that it
> disagrees with what I (think I) know, which itself may be incorrect. I'm
> running 8 gig of memory here, but it's a fairly new upgrade, so a new
> experience, and it's quite possible what I thought I knew is wrong. If
> so, I'd like to know it, the better to make use of the memory I have here,
> as well. So, if you have references as to why you are using that memory
> model, post 'em, as I'd certainly like to read them myself!
>
> -ftracer: Of interest here are a couple other manpage entries:
>
>
I thought -ftracer, like -fweb, automatically did the profiling and the
optimization. I was under the impression that -ftracer was not
recommended because it increases compile times (which I wasn't worried
about), but that it produces "better" executables. If the profiling has
to be done manually, prior to optimization, I definitely need to take
this one out of my CFLAGS.
> -fprofile-generate
>     Enable options usually used for instrumenting application to produce
>     profile useful for later recompilation with profile feedback based
>     optimization. [snip]
>
> -fprofile-use
>     Enable profile feedback directed optimizations, and optimizations
>     generally profitable only with profile feedback available.
>
>     The following options are enabled: "-fbranch-probabilities", "-fvpt",
>     "-funroll-loops", "-fpeel-loops", "-ftracer", "-fno-loop-optimize".
>
>
BTW, I've heard bad things about using -fbranch-probabilities and -fweb.
In fact, when I tried -fweb together with -ftracer I began experiencing
numerous compilation failures. Everything went back to normal when I
took -fweb out.
> I don't see either of these in your CFLAGS, and you don't mention
> compiling, then profiling the program, then recompiling. Without that,
> the -ftracer flag, which the manpage says "perform[s] tail duplication to
> enlarge superblock size", would appear to enlarge your binaries for little
> or no performance gain.
>
> That said, I've seen a couple developers that use the -ftracer flag as
> well, in various bug reports and the like, which has surprised me a bit
> since it's common for devs to complain about others using flags they don't
> understand the implications of. However, I've yet to find someone who can
> suitably explain /why/ they use it, given the above and the rather
> unlikely chance that they actually bother with that profiling and
> recompilation with every package they use the flag with, tho I don't of
> course go around posting the question to every bug where I see it used,
> and I've had limited chance to ask the question in a suitable context, so
> I've not asked that many about it.
>
> Again, if you have a good reason to choose that flag, or if you've
> actually done tests and /know/ it improves performance, please post it, so
> folks (like me! =8^) can be informed and perhaps optimize their system
> similarly. Until then, I can't reasonably defend or explain using the
> flag if I'm not going to go to the trouble of profiling and recompiling,
> and I don't think the performance gains would be enough to justify that
> (if I were compiling for use by an entire distribution, not just me, the
> effort would of course be justified) so I don't choose to use it.
>

In conclusion, I THOUGHT I had good reason, but I guess I need to do a
bit more research.

I have not done any speed tests. From reading others' speed tests, I
know they don't make much sense anyway, because in real use many other
factors matter. And in any case, I don't have the knowledge to do them
right :)


Thx.
--
gentoo-amd64@g.o mailing list

Replies

Subject Author
[gentoo-amd64] Re: Re: gcc 4.1 + CFLAGS Duncan <1i5t5.duncan@×××.net>