On 06/27/2010 03:23 PM, Harald van Dijk wrote:
> On Sun, Jun 27, 2010 at 02:56:33PM +0300, Nikos Chantziaras wrote:
>> On 06/27/2010 01:47 PM, Enrico Weigelt wrote:
>>> * Nikos Chantziaras<realnc@×××××.de> schrieb:
>>>
>>>> Did it actually occur to anyone that warnings are not errors?
>>>> You can have them for correct code. A warning means you might
>>>> want to look at the code to check whether there's some real
>>>> error there. It doesn't mean the code is broken.
>>>
>>> In my personal experience, most times a warning comes it, the
>>> code *is* broken (but *might* work in most situations).
>>
>> That's the key to it: most times. Granted, without -Wall (or any
>> other options that tweaks the default warning level) we can be very
>> sure that the warning is the result of a mistake by the developer.
>> But with -Wall, many warnings are totally not interesting ("unused
>> parameter") and some even try to outsmart the programmer even
>> though he/she knows better ("taking address of variable declared
>> register"). In that last example, fixing it would even be wrong
>> when you consider the optimizer and the fuzzy meaning of "register"
>> which the compiler is totally free to ignore.
>
> The compiler is not totally free to ignore the register keyword.
> Both the C and the C++ standards require that the compiler complain
> when taking the address of a register variable. Other compilers will
> issue a hard error for it. Fixing the code to not declare the
> variable as register would be the correct thing to do.

No, it would not be the correct thing to do, because of the following.
(This is part of a discussion between me and someone much smarter than
me, who explained the issue in detail.)

The basic issue is that the code takes the address of the variable in
question in expressions passed as parameters to certain function calls.
These function calls all happen to be in-linable functions, and it
happens that in each function, the address operator is always canceled
out by a '*' dereference operator - in other words, we have '*&p', which
the compiler can turn into just plain 'p' when the calls are in-lined,
eliminating the need to actually take the address of 'p'.

A compiler is always free to ignore 'register' declarations *anyway*,
even if enregistration is possible. Therefore a warning that it's not
possible to obey 'register' is unnecessary, because it's explicit in the
language definition that 'register' is not binding. It simply is not
possible for an ignored 'register' attribute to cause unexpected
behavior. Warnings really should only be generated for situations where
it is likely that the programmer expects different behavior than the
compiler will deliver; in the case of an ignored 'register' attribute,
the programmer is *required* to expect that the attribute might be
ignored, so a warning to this effect is superfluous.

Now, I understand why they generate the warning - it's because the
compiler believes that the program code itself makes enregistration
impossible, not because the compiler has chosen for optimization
purposes to ignore the 'register' request. However, as we'll see
shortly, the program code doesn't truly make enregistration impossible;
it is merely impossible in some interpretations of the code. Therefore
we really are back to the compiler choosing to ignore the 'register'
request due to its own optimization decisions; the 'register' request is
made impossible far downstream of the actual decisions that the compiler
makes (which have to do with in-line vs out-of-line calls), but it
really is compiler decisions that make it impossible, not the inherent
structure of the code.

When a function is in-lined, the compiler is not required to generate
the same code it would generate for the most general case of the same
function call, as long as the meaning is the same.

For example, suppose we have some code that contains a call to a
function like so:

    a = myFunc(a + 7, 3);

In the general out-of-line case, the compiler must generate some
machine-code instructions like this:

    push #3
    mov [a], d0
    add #7, d0
    push d0
    call #myFunc
    mov d0, [a]

The compiler doesn't have access to the inner workings of myFunc, so it
must generate the appropriate code for the generic interface to an
external function.

Now, suppose the function is defined like so:

    int myFunc(int a, int b) { return a - 6; }

and further suppose that the compiler decides to in-line this function.
In-lining means the compiler will generate the code that implements the
function directly in the caller; there will be no call to an external
linkage point. This means the compiler can implement the linkage to the
function with a custom one-off interface for this particular invocation
- every in-line invocation can be customized to the exact context where
it appears. So, for example, if we call myFunc right now and registers
d1 and d2 happen to be available, we can put the parameters in d1 and
d2, and the generated function will refer to those registers for the
parameters rather than having to look in the stack. Later on, if we
generate a separate call to the same function, but registers d3 and d7
are the ones available, we can use those instead. Each generated copy
of the function can fit its exact context.

Furthermore, looking at this function and at the arguments passed, we
can see that the formal parameter 'b' has no effect on the function's
results, and the actual parameter '3' passed for 'b' has no side
effects. Therefore, the compiler is free to completely ignore this
parameter - there's no need to generate any code for it at all, since we
have sufficient knowledge to see that it has no effect on the meaning of
the code.

Further still, we can globally optimize the entire function. So, we can
see that myFunc(a+7, 3) is going to turn into the expression (a+7-6).
We can fold constants to arrive at (a+1) as the result of the function.
We can therefore generate the entire code for the function's invocation
like so:

    inc [a]
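The same folding can be written out in C as a rough sketch. It assumes a compiler that actually chooses to in-line myFunc; the 'static inline' qualifier and the helper names call_site, call_site_folded and check_folding are hypothetical additions for illustration, not part of the code under discussion. The second function is simply what the optimizer is allowed to treat the first one as.

```c
#include <assert.h>

/* myFunc as defined above. 'static inline' is an assumption added here
   so that the body is visible at the call site. */
static inline int myFunc(int a, int b)
{
    (void)b;            /* 'b' has no effect on the result */
    return a - 6;
}

/* Hypothetical caller: the call exactly as written in the source. */
static int call_site(int a)
{
    a = myFunc(a + 7, 3);
    return a;
}

/* Hypothetical hand-folded equivalent: (a + 7) - 6 == a + 1. */
static int call_site_folded(int a)
{
    return a + 1;
}

/* Both forms must agree for any input that does not overflow. */
static void check_folding(void)
{
    assert(call_site(0) == call_site_folded(0));
    assert(call_site(41) == call_site_folded(41));
}
```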

Okay, now let's look at the &p case. In the specific examples in
vmrun.cpp, we have a bunch of function invocations like this:

    register const char *p;
    int x = myfunc(&p);

In the most general case, we have to generate code like this:

    lea [p], d0    ; load effective address
    push d0
    call #myfunc
    mov d0, [x]

So, in the most general case of a call with external linkage, we need
'p' to have a main memory address so that we can push it on the stack as
the parameter to this call. Registers don't have main memory addresses,
so 'p' can't go in a register.

However, we know what myfunc() looks like:

    char myfunc(const char **p)
    {
        char c = **p;
        *p += 1;
        return c;
    }

If the compiler chooses to in-line this function, it can globally
optimize its linkage and implementation as we saw earlier. So, the
compiler can rewrite the code like so:

    register const char *p;
    int x = **(&p);
    *(&p) += 1;

which can be further rewritten to:

    register const char *p;
    int x = *p;
    p += 1;

Now we can generate the machine code for the final optimized form:

    mov [p], a0           ; get the *value* of p into index register 0
    mov.byte [a0+0], d0   ; get the value index register 0 points to
    mov.byte d0, [x]      ; store it in x
    inc [p]               ; inc the value of p

Nowhere in this code do we need a main memory address for p. This means
the compiler can keep p in a register, say d5:

    mov d5, a0
    mov.byte [a0+0], d0
    mov.byte d0, [x]
    inc d5

And this is indeed exactly what the code that comes out of most
compilers looks like (changed from my abstract machine to 32-bit x86, of
course).
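To make the equivalence concrete, here is a small C sketch that puts the out-of-line form and the hand-in-lined form side by side. The 'static inline' qualifier, the 'end' output parameter, and the names read_via_call, read_direct and check_equivalence are hypothetical, introduced only so the two forms can be compared. The 'register' keyword is left out on purpose so that the sketch builds as plain C; whether a given compiler really keeps 'p' in a register depends on its in-lining and optimization decisions.

```c
#include <assert.h>

/* myfunc as defined above: read one char through *p, then advance *p.
   'static inline' is an assumption so the body is visible to callers. */
static inline char myfunc(const char **p)
{
    char c = **p;
    *p += 1;
    return c;
}

/* Hypothetical caller that takes the address of its local pointer. */
static int read_via_call(const char *s, const char **end)
{
    const char *p = s;
    int x = myfunc(&p);
    *end = p;           /* report where p ended up */
    return x;
}

/* Hypothetical hand-in-lined form: '*&p' has been reduced to 'p',
   so the address of 'p' is never needed. */
static int read_direct(const char *s, const char **end)
{
    const char *p = s;
    int x = *p;
    p += 1;
    *end = p;
    return x;
}

/* Both forms return the same character and leave p one past it. */
static void check_equivalence(void)
{
    const char *text = "hi";
    const char *e1, *e2;
    assert(read_via_call(text, &e1) == read_direct(text, &e2));
    assert(e1 == e2 && e1 == text + 1);
}
```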

So: if the compiler chooses to in-line the functions that are called
with '&p' as a parameter, and the compiler performs the available
optimizations on those calls once they're in-lined, then a memory
address for 'p' is never needed. Thus there is a valid interpretation
of the code where 'register p' can be obeyed. If the compiler doesn't
choose to in-line the functions or make those optimizations, then the
compiler will be unable to satisfy the 'register p' request and will be
forced to put 'p' in addressable main memory. But it really is entirely
up to the compiler whether to obey the 'register p' request; the
program's structure does not make the request impossible to satisfy.
Therefore there is no reason for the compiler to warn about this, any
more than there would be if the compiler chose not to obey the 'register
p' simply because it thought it could make more optimal use of the
available registers. That GCC warns is understandable, in that a
superficial reading of the code would not reveal the optimization
opportunity; but the warning is nonetheless unnecessary, and the
'register' does provide useful optimization hinting.

OK, long read, but the conclusion is that "fixing the code to not
declare the variable as register would be the correct thing to do" is
*not* the correct thing to do. The correct thing to do is to ignore the
warning, which is not possible if warnings are turned into errors.

You also mentioned that "other compilers will issue a hard error for
it." That sounds rather strange, and I wonder which compilers that
might be; someone should file a bug report against them ;)