Gentoo Archives: gentoo-dev

From: Nikos Chantziaras <realnc@×××××.de>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] Re: FYI: Rules for distro-friendly packages
Date: Sun, 27 Jun 2010 14:47:46
Message-Id: i07o8g$ug0$1@dough.gmane.org
In Reply to: Re: [gentoo-dev] Re: FYI: Rules for distro-friendly packages by "Harald van Dijk"
1 On 06/27/2010 03:23 PM, Harald van Dijk wrote:
2 > On Sun, Jun 27, 2010 at 02:56:33PM +0300, Nikos Chantziaras wrote:
3 >> On 06/27/2010 01:47 PM, Enrico Weigelt wrote:
4 >>> * Nikos Chantziaras<realnc@×××××.de> schrieb:
5 >>>
6 >>>> Did it actually occur to anyone that warnings are not errors?
7 >>>> You can have them for correct code. A warning means you might
8 >>>> want to look at the code to check whether there's some real
9 >>>> error there. It doesn't mean the code is broken.
10 >>>
11 >>> In my personal experience, most times a warning comes up, the
12 >>> code *is* broken (but *might* work in most situations).
13 >>
14 >> That's the key to it: most times. Granted, without -Wall (or any
15 >> other option that tweaks the default warning level) we can be very
16 >> sure that the warning is the result of a mistake by the developer.
17 >> But with -Wall, many warnings are totally not interesting ("unused
18 >> parameter") and some even try to outsmart the programmer even
19 >> though he/she knows better ("taking address of variable declared
20 >> register"). In that last example, fixing it would even be wrong
21 >> when you consider the optimizer and the fuzzy meaning of "register"
22 >> which the compiler is totally free to ignore.
23 >
24 > The compiler is not totally free to ignore the register keyword.
25 > Both the C and the C++ standards require that the compiler complain
26 > when taking the address of a register variable. Other compilers will
27 > issue a hard error for it. Fixing the code to not declare the
28 > variable as register would be the correct thing to do.
29
30 No, it would not be the correct thing to do, because of the following.
31 (This is part of a discussion between me and someone considerably smarter than
32 me, who explained the issue in detail.)
33
34 The basic issue is that the code takes the address of the variable in
35 question in expressions passed as parameters to certain function calls.
36 These function calls all happen to be in-linable functions, and it
37 happens that in each function, the address operator is always canceled
38 out by a '*' dereference operator - in other words, we have '*&p', which
39 the compiler can turn into just plain 'p' when the calls are in-lined,
40 eliminating the need to actually take the address of 'p'.
41
42 A compiler is always free to ignore 'register' declarations *anyway*,
43 even if enregistration is possible. Therefore a warning that it's not
44 possible to obey 'register' is unnecessary, because it's explicit in the
45 language definition that 'register' is not binding. It simply is not
46 possible for an ignored 'register' attribute to cause unexpected
47 behavior. Warnings really should only be generated for situations where
48 it is likely that the programmer expects different behavior than the
49 compiler will deliver; in the case of an ignored 'register' attribute,
50 the programmer is *required* to expect that the attribute might be
51 ignored, so a warning to this effect is superfluous.
52
53 Now, I understand why they generate the warning - it's because the
54 compiler believes that the program code itself makes enregistration
55 impossible, not because the compiler has chosen for optimization
56 purposes to ignore the 'register' request. However, as we'll see
57 shortly, the program code doesn't truly make enregistration impossible;
58 it is merely impossible in some interpretations of the code. Therefore
59 we really are back to the compiler choosing to ignore the 'register'
60 request due to its own optimization decisions; the 'register' request is
61 made impossible far downstream of the actual decisions that the compiler
62 makes (which have to do with in-line vs out-of-line calls), but it
63 really is compiler decisions that make it impossible, not the inherent
64 structure of the code.
65
66 When a function is in-lined, the compiler is not required to generate
67 the same code it would generate for the most general case of the same
68 function call, as long as the meaning is the same.
69
70 For example, suppose we have some code that contains a call to a
71 function like so:
72
73 a = myFunc(a + 7, 3);
74
75 In the general out-of-line case, the compiler must generate some
76 machine-code instructions like this:
77
78 push #3
79 mov [a], d0
80 add #7, d0
81 push d0
82 call #myFunc
83 mov d0, [a]
84
85 The compiler doesn't have access to the inner workings of myFunc, so it
86 must generate the appropriate code for the generic interface to an
87 external function.
88
89 Now, suppose the function is defined like so:
90
91 int myFunc(int a, int b) { return a - 6; }
92
93 and further suppose that the compiler decides to in-line this function.
94 In-lining means the compiler will generate the code that implements the
95 function directly in the caller; there will be no call to an external
96 linkage point. This means the compiler can implement the linkage to the
97 function with a custom one-off interface for this particular invocation
98 - every in-line invocation can be customized to the exact context where
99 it appears. So, for example, if we call myFunc right now and registers
100 d1 and d2 happen to be available, we can put the parameters in d1 and
101 d2, and the generated function will refer to those registers for the
102 parameters rather than having to look in the stack. Later on, if we
103 generate a separate call to the same function, but registers d3 and d7
104 are the ones available, we can use those instead. Each generated copy
105 of the function can fit its exact context.
106
107 Furthermore, looking at this function and at the arguments passed, we
108 can see that the formal parameter 'b' has no effect on the function's
109 results, and the actual parameter '3' passed for 'b' has no side
110 effects. Therefore, the compiler is free to completely ignore this
111 parameter - there's no need to generate any code for it at all, since we
112 have sufficient knowledge to see that it has no effect on the meaning of
113 the code.
114
115 Further still, we can globally optimize the entire function. So, we can
116 see that myFunc(a+7, 3) is going to turn into the expression (a+7-6).
117 We can fold constants to arrive at (a+1) as the result of the function.
118 We can therefore generate the entire code for the function's invocation
119 like so:
120
121 inc [a]
122
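The constant-folding step above can be checked in a few lines of C++.
This is only a sketch: viaCall and folded are hypothetical helper names
that are not part of the original code, and the second parameter is left
unnamed simply because it is unused. Whether a given compiler actually
emits a single increment depends on its inlining decisions and
optimization level.

```cpp
#include <cassert>

// The function from the example above; the second parameter ('b' in the
// text) has no effect on the result, so it is left unnamed here.
inline int myFunc(int a, int /* b */) { return a - 6; }

// The call site as written in the source, before any inlining.
// (viaCall is a hypothetical wrapper used only for this illustration.)
inline int viaCall(int a) { return myFunc(a + 7, 3); }

// What an optimizer is allowed to reduce the call to:
// (a + 7) - 6 == a + 1, with the constant 3 dropped entirely.
// (folded is likewise a hypothetical name.)
inline int folded(int a) { return a + 1; }
```

For any a where a + 7 does not overflow, viaCall(a) and folded(a) return
the same value, which is what lets the compiler substitute one for the
other.
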
123 Okay, now let's look at the &p case. In the specific examples in
124 vmrun.cpp, we have a bunch of function invocations like this:
125
126 register const char *p;
127 int x = myfunc(&p);
128
129 In the most general case, we have to generate code like this:
130
131 lea [p], d0 ; load effective address
132 push d0
133 call #myfunc
134 mov d0, [x]
135
136 So, in the most general case of a call with external linkage, we need
137 'p' to have a main memory address so that we can push it on the stack as
138 the parameter to this call. Registers don't have main memory addresses,
139 so 'p' can't go in a register.
140
141 However, we know what myfunc() looks like:
142
143 char myfunc(const char **p)
144 {
145     char c = **p;
146     *p += 1;
147     return c;
148 }
149
150 If the compiler chooses to in-line this function, it can globally
151 optimize its linkage and implementation as we saw earlier. So, the
152 compiler can rewrite the code like so:
153
154 register const char *p;
155 int x = **(&p);
156 *(&p) += 1;
157
158 which can be further rewritten to:
159
160 register const char *p;
161 int x = *p;
162 p += 1;
163
164 Now we can generate the machine code for the final optimized form:
165
166 mov [p], a0 ; get the *value* of p into index register 0
167 mov.byte [a0+0], d0 ; get the value index register 0 points to
168 mov.byte d0, [x] ; store it in x
169 inc [p] ; inc the value of p
170
171 Nowhere in this code do we need a main memory address for p. This
172 means the compiler can keep p in a register, say d5:
173
174 mov d5, a0
175 mov.byte [a0+0], d0
176 mov.byte d0, [x]
177 inc d5
178
179 And this is indeed exactly what the code that comes out of most
180 compilers looks like (changed from my abstract machine to 32-bit x86, of
181 course).
182
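The same equivalence can be written out for the &p case. This is a
sketch under assumptions: Step, withAddressOf and handReduced are
hypothetical names, not code from vmrun.cpp, and the 'register' keyword
is left off the declarations so that the snippet also compiles under
C++17 and later, where that keyword is no longer accepted. It shows only
that the two source forms behave identically, not what any particular
compiler emits.

```cpp
#include <cassert>

// Same shape as myfunc() in the text: read one char through a
// pointer-to-pointer, then advance the caller's pointer by one.
inline char myfunc(const char **p)
{
    char c = **p;
    *p += 1;
    return c;
}

// Hypothetical result type: the value read and how far p moved.
struct Step { int x; long advanced; };

// Source form: the address of the local pointer p is taken.
inline Step withAddressOf(const char *s)
{
    const char *p = s;   // 'register' omitted, see the note above
    int x = myfunc(&p);
    return { x, static_cast<long>(p - s) };
}

// Hand-reduced form: what remains once the call is inlined and
// '*&p' is collapsed to 'p'. The address of p is never taken.
inline Step handReduced(const char *s)
{
    const char *p = s;
    int x = *p;          // was: **(&p)
    p += 1;              // was: *(&p) += 1
    return { x, static_cast<long>(p - s) };
}
```

Calling both helpers on the same string gives the same character and the
same one-byte advance, so the address-of in the first form is not
essential to the meaning of the code.
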
183 So: if the compiler chooses to in-line the functions that are called
184 with '&p' as a parameter, and the compiler performs the available
185 optimizations on those calls once they're in-lined, then a memory
186 address for 'p' is never needed. Thus there is a valid interpretation
187 of the code where 'register p' can be obeyed. If the compiler doesn't
188 choose to in-line the functions or make those optimizations, then the
189 compiler will be unable to satisfy the 'register p' request and will be
190 forced to put 'p' in addressable main memory. But it really is entirely
191 up to the compiler whether to obey the 'register p' request; the
192 program's structure does not make the request impossible to satisfy.
193 Therefore there is no reason for the compiler to warn about this, any
194 more than there would be if the compiler chose not to obey the 'register
195 p' simply because it thought it could make more optimal use of the
196 available registers. That GCC warns is understandable, in that a
197 superficial reading of the code would not reveal the optimization
198 opportunity; but the warning is nonetheless unnecessary, and the
199 'register' does provide useful optimization hinting.
200
201
202 OK, long read, but the conclusion is that "fixing the code to not
203 declare the variable as register would be the correct thing to do" is
204 *not* the correct thing to do. The correct thing to do is to ignore the
205 warning, which is not possible if warnings are turned into errors.
206
207 You also mentioned that "other compilers will issue a hard error for
208 it." That sounds rather strange, and I wonder which compilers that
209 might be; someone should file a bug report against them ;)
