[gentoo-dev] Re: FYI: Rules for distro-friendly packages - gentoo-dev

From:	Nikos Chantziaras <realnc@×××××.de>
To:	gentoo-dev@l.g.o
Subject:	[gentoo-dev] Re: FYI: Rules for distro-friendly packages
Date:	Sun, 27 Jun 2010 14:47:46
Message-Id:	`i07o8g$ug0$1@dough.gmane.org`
In Reply to:	Re: [gentoo-dev] Re: FYI: Rules for distro-friendly packages by "Harald van Dĳk"

On 06/27/2010 03:23 PM, Harald van Dĳk wrote:
> On Sun, Jun 27, 2010 at 02:56:33PM +0300, Nikos Chantziaras wrote:
>> On 06/27/2010 01:47 PM, Enrico Weigelt wrote:
>>> * Nikos Chantziaras <realnc@×××××.de> wrote:
>>>
>>>> Did it actually occur to anyone that warnings are not errors?
>>>> You can have them for correct code.  A warning means you might
>>>> want to look at the code to check whether there's some real
>>>> error there.  It doesn't mean the code is broken.
>>>
>>> In my personal experience, most times a warning comes up, the
>>> code *is* broken (but *might* work in most situations).
>>
>> That's the key to it: most times.  Granted, without -Wall (or any
>> other option that tweaks the default warning level) we can be very
>> sure that the warning is the result of a mistake by the developer.
>> But with -Wall, many warnings are totally not interesting ("unused
>> parameter") and some even try to outsmart the programmer even
>> though he/she knows better ("taking address of variable declared
>> register").  In that last example, fixing it would even be wrong
>> when you consider the optimizer and the fuzzy meaning of "register"
>> which the compiler is totally free to ignore.
>
> The compiler is not totally free to ignore the register keyword.
> Both the C and the C++ standards require that the compiler complain
> when taking the address of a register variable. Other compilers will
> issue a hard error for it. Fixing the code to not declare the
> variable as register would be the correct thing to do.

No, it would not be the correct thing to do, because of the following.
(This is part of a discussion between me and someone much smarter than
me, who explained the issue in detail.)

The basic issue is that the code takes the address of the variable in
question in expressions passed as parameters to certain function calls.
These function calls all happen to be in-linable functions, and it
happens that in each function, the address operator is always canceled
out by a '*' dereference operator - in other words, we have '*&p', which
the compiler can turn into just plain 'p' when the calls are in-lined,
eliminating the need to actually take the address of 'p'.

A compiler is always free to ignore 'register' declarations *anyway*,
even if enregistration is possible.  Therefore a warning that it's not
possible to obey 'register' is unnecessary, because it's explicit in the
language definition that 'register' is not binding.  It simply is not
possible for an ignored 'register' attribute to cause unexpected
behavior.  Warnings really should only be generated for situations where
it is likely that the programmer expects different behavior than the
compiler will deliver; in the case of an ignored 'register' attribute,
the programmer is *required* to expect that the attribute might be
ignored, so a warning to this effect is superfluous.

Now, I understand why they generate the warning - it's because the
compiler believes that the program code itself makes enregistration
impossible, not because the compiler has chosen for optimization
purposes to ignore the 'register' request.  However, as we'll see
shortly, the program code doesn't truly make enregistration impossible;
it is merely impossible in some interpretations of the code.  Therefore
we really are back to the compiler choosing to ignore the 'register'
request due to its own optimization decisions; the 'register' request is
made impossible far downstream of the actual decisions that the compiler
makes (which have to do with in-line vs out-of-line calls), but it
really is compiler decisions that make it impossible, not the inherent
structure of the code.

When a function is in-lined, the compiler is not required to generate
the same code it would generate for the most general case of the same
function call, as long as the meaning is the same.

For example, suppose we have some code that contains a call to a
function like so:

    a = myFunc(a + 7, 3);

In the general out-of-line case, the compiler must generate some
machine-code instructions like this:

    push #3
    mov [a], d0
    add #7, d0
    push d0
    call #myFunc
    mov d0, [a]

The compiler doesn't have access to the inner workings of myFunc, so it
must generate the appropriate code for the generic interface to an
external function.

Now, suppose the function is defined like so:

   int myFunc(int a, int b) { return a - 6; }

and further suppose that the compiler decides to in-line this function.
In-lining means the compiler will generate the code that implements the
function directly in the caller; there will be no call to an external
linkage point.  This means the compiler can implement the linkage to the
function with a custom one-off interface for this particular invocation
- every in-line invocation can be customized to the exact context where
it appears.  So, for example, if we call myFunc right now and registers
d1 and d2 happen to be available, we can put the parameters in d1 and
d2, and the generated function will refer to those registers for the
parameters rather than having to look in the stack.  Later on, if we
generate a separate call to the same function, but registers d3 and d7
are the ones available, we can use those instead.  Each generated copy
of the function can fit its exact context.

Furthermore, looking at this function and at the arguments passed, we
can see that the formal parameter 'b' has no effect on the function's
results, and the actual parameter '3' passed for 'b' has no side
effects.  Therefore, the compiler is free to completely ignore this
parameter - there's no need to generate any code for it at all, since we
have sufficient knowledge to see that it has no effect on the meaning of
the code.

Further still, we can globally optimize the entire function.  So, we can
see that myFunc(a+7, 3) is going to turn into the expression (a+7-6).
We can fold constants to arrive at (a+1) as the result of the function.
We can therefore generate the entire code for the function's invocation
like so:

    inc [a]
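
As a rough illustration of the folding described above, here is a minimal
C++ sketch.  Only myFunc comes from the text; foldedCall,
foldingPreservesMeaning and checkFolding are hypothetical names made up for
this example.  It does not show what any particular compiler emits; it only
checks that the general call and the folded form mean the same thing.

```cpp
#include <cassert>

// Same definition as in the text; the parameter 'b' is deliberately unused.
inline int myFunc(int a, int /* b */) { return a - 6; }

// Hypothetical hand-written form of what the call site myFunc(a + 7, 3)
// can reduce to after in-lining and constant folding: (a + 7) - 6 == a + 1.
inline int foldedCall(int a) { return a + 1; }

// Hypothetical helper: returns true if the general call and the folded form
// agree for every value in a small sample range.
inline bool foldingPreservesMeaning() {
    for (int a = -1000; a <= 1000; ++a) {
        if (myFunc(a + 7, 3) != foldedCall(a)) {
            return false;
        }
    }
    return true;
}

// Hypothetical self-check that can be called from a test or from main().
inline void checkFolding() {
    assert(foldingPreservesMeaning());
}
```

Whether a given compiler actually performs this reduction depends on the
compiler and its optimization settings.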

Okay, now let's look at the &p case.  In the specific examples in
vmrun.cpp, we have a bunch of function invocations like this:

   register const char *p;
   int x = myfunc(&p);

In the most general case, we have to generate code like this:

   lea [p], d0        ; load effective address
   push d0
   call #myfunc
   mov d0, [x]

So, in the most general case of a call with external linkage, we need
'p' to have a main memory address so that we can push it on the stack as
the parameter to this call.  Registers don't have main memory addresses,
so 'p' can't go in a register.

However, we know what myfunc() looks like:

   char myfunc(const char **p)
   {
       char c = **p;
       *p += 1;
       return c;
   }

If the compiler chooses to in-line this function, it can globally
optimize its linkage and implementation as we saw earlier.  So, the
compiler can rewrite the code like so:

   register const char *p;
   int x = **(&p);
   *(&p) += 1;

which can be further rewritten to:

   register const char *p;
   int x = *p;
   p += 1;

Now we can generate the machine code for the final optimized form:

   mov [p], a0         ; get the *value* of p into index register 0
   mov.byte [a0+0], d0 ; get the value index register 0 points to
   mov.byte d0, [x]    ; store it in x
   inc [p]             ; inc the value of p

Nowhere in this code do we need a main memory address for p.  This means
the compiler can keep p in a register, say d5:

   mov d5, a0
   mov.byte [a0+0], d0
   mov.byte d0, [x]
   inc d5

And this is indeed exactly what the code that comes out of most
compilers looks like (changed from my abstract machine to 32-bit x86, of
course).
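
The source-level rewrite above can be sketched in C++ roughly as follows.
Only myfunc comes from the text; sumViaHelper, sumInlined and checkInlining
are hypothetical names for this example.  The 'register' keyword is left
out so that the sketch also builds under C++17 and later, where the
specifier was removed.  The sketch only checks that the call through '&p'
and the hand-in-lined form behave the same; it says nothing about where a
given compiler ends up placing 'p'.

```cpp
#include <cassert>

// The helper from the text: read the current character, then advance the
// caller's pointer through the pointer-to-pointer.
inline char myfunc(const char **p) {
    char c = **p;
    *p += 1;
    return c;
}

// Hypothetical caller that goes through the helper, so '&p' appears in
// the source exactly as in the text.
inline int sumViaHelper(const char *text) {
    const char *p = text;  // declared 'register' in the text; omitted here
    int sum = 0;
    while (*p != '\0') {
        int x = myfunc(&p);
        sum += x;
    }
    return sum;
}

// The same loop with the helper in-lined by hand: '*&p' has collapsed to
// 'p', so no address of 'p' is taken anywhere.
inline int sumInlined(const char *text) {
    const char *p = text;
    int sum = 0;
    while (*p != '\0') {
        int x = *p;
        p += 1;
        sum += x;
    }
    return sum;
}

// Hypothetical self-check: both forms must agree on a sample input.
inline void checkInlining() {
    assert(sumViaHelper("register") == sumInlined("register"));
}
```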

So: if the compiler chooses to in-line the functions that are called
with '&p' as a parameter, and the compiler performs the available
optimizations on those calls once they're in-lined, then a memory
address for 'p' is never needed.  Thus there is a valid interpretation
of the code where 'register p' can be obeyed.  If the compiler doesn't
choose to in-line the functions or make those optimizations, then the
compiler will be unable to satisfy the 'register p' request and will be
forced to put 'p' in addressable main memory.  But it really is entirely
up to the compiler whether to obey the 'register p' request; the
program's structure does not make the request impossible to satisfy.
Therefore there is no reason for the compiler to warn about this, any
more than there would be if the compiler chose not to obey the 'register
p' simply because it thought it could make more optimal use of the
available registers.  That GCC warns is understandable, in that a
superficial reading of the code would not reveal the optimization
opportunity; but the warning is nonetheless unnecessary, and the
'register' does provide useful optimization hinting.


OK, long read, but the conclusion is that "fixing the code to not
declare the variable as register" is *not* the correct thing to do.
The correct thing to do is to ignore the warning, which is not possible
if warnings are turned into errors.

You also mentioned that "other compilers will issue a hard error for
it."  That sounds rather strange, and I wonder which compilers that
might be; someone should file a bug report against them ;)
