Mark Knecht posted
<5bdc1c8b0510151118p447b72a9la959017a0de1dd08@××××××××××.com>, excerpted
below, on Sat, 15 Oct 2005 11:18:12 -0700:

> Is anyone else seeing these troubling messages from time to time?
>
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
>
> Is there something I should be doing to fix this?
|
You may well know more about this than I do, but on the off chance this
may be new to you and for others... (and because I'm googling and learning
a bit in the process myself... =8^)

MTRR = memory type range (or region) register. This is definitely kernel
domain we are talking about here, often video drivers (so xorg related as
well).

Here's a link to an old but decently "English" explanation of what MTRRs
are that doesn't get too technical, but gives you some idea of not only
what the theory does, but the effects on real-world performance. (It's
talking PentiumPro/PII CPUs and kernel 2.2.0 or later... I /said/ it was
old! =8^)

http://www.meduna.org/txt_mtrr_en.html

Paraphrasing from the above link...

Basically, the MTRRs determine the behavior of cache vs regular memory on
memory-write, for various memory regions/ranges.
|
* Write-thru means the write is to cache and main memory together, slower
than Write-back or Write-combining below but most reliable as any DMAs
from main memory are guaranteed to be up-to-date.

* Write-back is far faster, allowing the CPU to write to cache only then
go about its business, with the update to main memory happening as memory
bandwidth permits. The catch is that cache (play on words there! <g>) and
main memory can be out of sync momentarily, which could mean either longer
waits for the update to happen when it /has/ to happen (if everything is
working right and the out-of-date-data is caught and a wait forced until
it's updated to valid data when needed), or in the worst-case scenario, if
something's not quite working right, glitches, instability and crashes
because old and invalid data was used where updated data was expected.

* Write-combining is sort of in-between the two above choices. The data
is allowed to sit in cache without updating main memory only a
comparatively short period, in order to allow the possibility of
combining several smaller writes into a larger, single, more efficient
write. (More efficient because each write has a set amount of overhead in
setup and take-down time and data. Thus, combining 8 128-byte
transactions into a single 1KB transaction means 1/8 the overhead, thus
more effective payload bandwidth, at a cost of more latency, due to
waiting for several transactions to accumulate, provided of course they
come in before the expiry time forces what's there to be written even if
it's not yet a full size transfer.)

* There's also uncachable, which turns off caching for reads as well as
writes. This will be /very/ slow.
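
To put rough numbers on the write-combining bullet above, here's a
minimal sketch of the "1/8 the overhead" arithmetic. The 16-byte
per-transaction overhead is purely an assumed figure for illustration
(real bus overhead depends on the hardware), and bus_bytes is a made-up
helper name, not anything from the linked page.

```python
def bus_bytes(payload_sizes, overhead_per_write):
    """Total bytes moved: each write carries its payload plus a fixed overhead."""
    return sum(size + overhead_per_write for size in payload_sizes)


OVERHEAD = 16  # bytes per transaction; an assumed, illustrative figure
PAYLOAD = 8 * 128  # 1 KB of actual data either way

separate = bus_bytes([128] * 8, OVERHEAD)  # eight separate 128-byte writes
combined = bus_bytes([PAYLOAD], OVERHEAD)  # one combined 1 KB write

separate_overhead = separate - PAYLOAD
combined_overhead = combined - PAYLOAD

print(f"separate: {separate} bytes moved, {separate_overhead} of them overhead")
print(f"combined: {combined} bytes moved, {combined_overhead} of them overhead")
print(f"overhead ratio: {combined_overhead / separate_overhead}")
```

Whatever the real per-write overhead is, the ratio between the two cases
stays at 1/8 as long as that overhead is a fixed cost per transaction.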
|
Where graphics gets involved is that these MTRRs are often used to program
access to video memory over the AGP or whatever bus. The link above lists
some of the (then) xfree86 operations that MTRR settings affect and by how
much.

Another link with some interesting info on the kernel config option
(CONFIG_MTRR) and the userland interface to MTRRs (/proc/mtrr) once
enabled. The two paragraphs below are excerpted:
|
http://developer.osdl.org/dev/robustmutexes/src/fusyn.hg/Documentation/mtrr.txt

<quote>

The CONFIG_MTRR option creates a /proc/mtrr file which may be used to
manipulate your MTRRs. Typically the X server should use this. This should
have a reasonably generic interface so that similar control registers on
other processors can be easily supported.

There are two interfaces to /proc/mtrr: one is an ASCII interface which
allows you to read and write. The other is an ioctl() interface. The ASCII
interface is meant for administration. The ioctl() interface is meant for
C programs (i.e. the X server). The interfaces are described below, with
sample commands and C code.

</quote>
|
Finally, this, on the mtrr_add command from kernelnewbies:

http://kernelnewbies.org/documents/kdoc/kernel-api/r7666.html

<quote>

Memory type region registers control the caching on newer Intel and non
Intel processors. This function allows drivers to request an MTRR is
added. The details and hardware specifics of each processor's
implementation are hidden from the caller, but nevertheless the caller
should expect to need to provide a power of two size on an equivalent
power of two boundary.

If the region cannot be added either because all regions are in use or
the CPU cannot support it a negative value is returned. On success the
register number for this entry is returned, but should be treated as a
cookie only. On a multiprocessor machine the changes are made to all
processors. This is required on x86 by the Intel processors. The
available types are

MTRR_TYPE_UNCACHABLE - No caching

MTRR_TYPE_WRBACK - Write data back in bursts whenever

MTRR_TYPE_WRCOMB - Write data back soon but allow bursts

MTRR_TYPE_WRTHROUGH - Cache reads but not writes

</quote>
|
That last contains a mention of boundaries ("power of two size on an
equivalent power of two boundary") that appears to pertain to your
problem.
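
As a quick way to test that "power of two size on an equivalent power of
two boundary" rule against the numbers in the log, here's a minimal
sketch. It only models the rule as quoted, not whatever the kernel
actually checks internally, and both function names are my own.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def mtrr_request_ok(base, size):
    """True if size is a power of two and base is a multiple of size."""
    return is_power_of_two(size) and base % size == 0


# The request from the "not aligned" message: base 0xe8020000, size 0x400000.
print(mtrr_request_ok(0xE8020000, 0x400000))

# The range named in the "type mismatch" message: base 0xe8000000, size 0x8000000.
print(mtrr_request_ok(0xE8000000, 0x8000000))
```

The first request fails the check because 0xe8020000 is not a multiple of
0x400000, while the second range is properly aligned, which fits the two
different messages in the log.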
|
So... now we know a bit about what MTRRs actually do (control the
interaction between cache and a specified portion of memory for write
transactions), what they are most often adjusted for (to increase graphics
performance, by changing the way writes to graphics memory are cached),
and can make a bit of sense out of the messages (the size doesn't match
the required base address for the MTRR, something's trying to change the
caching method, but using the wrong address and the adjustment is
therefore getting a type mismatch).
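
If you want to pull the numbers out of those log lines rather than read
them by eye, something like the following sketch would do. The regular
expressions are written only against the two message shapes quoted at
the top of this post; other kernels may word them differently, and
parse_mtrr_message is a made-up helper name.

```python
import re

# Patterns written against the two message shapes quoted in this thread only.
ALIGN_RE = re.compile(
    r"mtrr: base\(0x(?P<base>[0-9a-f]+)\) is not aligned on a "
    r"size\(0x(?P<size>[0-9a-f]+)\) boundary"
)
MISMATCH_RE = re.compile(
    r"mtrr: type mismatch for (?P<base>[0-9a-f]+),(?P<size>[0-9a-f]+) "
    r"old: (?P<old>[\w-]+) new: (?P<new>[\w-]+)"
)


def parse_mtrr_message(line):
    """Return a dict describing the message, or None if it isn't recognized."""
    m = ALIGN_RE.search(line)
    if m:
        return {
            "kind": "misaligned",
            "base": int(m.group("base"), 16),
            "size": int(m.group("size"), 16),
        }
    m = MISMATCH_RE.search(line)
    if m:
        return {
            "kind": "type-mismatch",
            "base": int(m.group("base"), 16),
            "size": int(m.group("size"), 16),
            "old": m.group("old"),
            "new": m.group("new"),
        }
    return None


for line in (
    "mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary",
    "mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining",
):
    print(parse_mtrr_message(line))
```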
|
How does that translate into something you can do to fix it?

Well, first, you can actually go take a look at /proc/mtrr (don't try to
write anything to it, unless you are sure you know what you are doing, but
reading it should be fine), and see if you can figure out which entry it's
supposed to be changing, if there's one close to that address at all, or
if not, what needs to be created.
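
For reading /proc/mtrr programmatically, here's a minimal, read-only
sketch. It assumes the line layout shown in the kernel's mtrr.txt of this
era (type before count=); if your kernel prints the fields in a different
order the pattern will need adjusting. The SAMPLE text is made up for
illustration, and parse_proc_mtrr is my own name.

```python
import re

# Assumed layout: "reg00: base=0x00000000 (   0MB), size= 128MB: write-back, count=1"
LINE_RE = re.compile(
    r"reg(?P<reg>\d+): base=0x(?P<base>[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(?P<size>\d+)(?P<unit>[KM])B: (?P<type>[\w-]+), count=(?P<count>\d+)"
)


def parse_proc_mtrr(text):
    """Parse /proc/mtrr-style text into a list of dicts (sizes in bytes)."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that doesn't fit the assumed layout
        unit = 1 << 20 if m.group("unit") == "M" else 1 << 10
        entries.append(
            {
                "reg": int(m.group("reg")),
                "base": int(m.group("base"), 16),
                "size": int(m.group("size")) * unit,
                "type": m.group("type"),
                "count": int(m.group("count")),
            }
        )
    return entries


# Illustrative sample only, not output captured from a real machine.
SAMPLE = """\
reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1
"""

entries = parse_proc_mtrr(SAMPLE)
for e in entries:
    print(f"reg{e['reg']:02d}: {e['base']:#010x} +{e['size']:#x} {e['type']}")
```

On a live system you'd feed it open("/proc/mtrr").read() instead of
SAMPLE; that only reads the file, so it should be safe.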
|
Beyond that, it depends on what is actually using the MTRR. It's probably
your video card driver, but it could be something else. You don't say
which card you have, but from the googling I did to find the above, I see
that ATI's proprietary drivers at least, have posts about changing MTRRs
to increase performance. Whatever your video card driver is, that's
probably (but not for certain) what's causing the log messages.
Therefore, take a look at /var/log/Xorg.0.log, and see if you can match up
any possible MTRR messages listed there.

Next, take a look at the driver documentation and your xorg.conf file
and boot loader config, and see what sort of adjustments you might need to
make.

Of course, if it's /not/ video driver related, you'll likely have to
figure out what else is accessing the MTRRs and how to reconfigure it
correctly. Taking a wild guess, I'd say check anything that's likely to
be using DMA, thus, stuff like NICs or storage devices, and their drivers.

Oh, MTRRs are also used in connection with mapping around the PCI device
memory hole just below 4GB, if you have 4G or more of memory.

Hmm... Now I'm beginning to integrate what I just learned here, with some
other stuff I just read, with stuff I knew before, and a real-time look
at /proc/mtrr on my system... and things are beginning to "click". You
get to see my understanding developing as I write this! <g>
|
From an Opteron BIOS integrator's pdf @ amd... they recommend one of the
variable MTRRs (there are some fixed ones covering the memory space from
640k to 1 MB as well, that must be what those common BIOS settings are
for) be set to cover the entire physical memory range... And so I
see... I have a gig of memory and see a 1024 MB MTRR set @ base-address
0x 0000 0000 (0 MB), type write-back (so read/write cacheable). That makes
sense as it's telling the CPU(s, 2 in my case) that all of my main memory
is fully cacheable, no special restrictions needed!

Apparently, some CPUs only have two variable MTRRs, and if one is used to
cover all of physical main-memory, that leaves only one available, which
would be used by the video driver, so that's how Linux is normally set up.
Again apparently, modern x86 (and presumably x86_64 as well) CPUs from
both AMD and Intel have eight such MTRRs, so more ranges can be programmed
as needed.

Now this is just a guess based on the fixed MTRs (memory type ranges, I'm
using MTRR to reference the register, MTR to reference the range in
memory it controls, here referring to the previously mentioned fixed range
MTRRs/MTRs between 640k and 1M) being included in the main-mem variable
range, and variable MTRs within other variable MTRs might not work quite
the same (tho I expect they do), but going on that... smaller ones in
larger ones can act as exceptions.
|
That would explain the BIOS MTRR setting in many AMD64 systems for > 3.5
GB physical memory -- continuous or explicit hole for the 3.5 - 4 GB
(0x e000 0000 - 0x ffff ffff) top-of-32-bit-memory-space PCI hole. The
continuous option will map the entire X GB of memory as a single MTR,
presumably (presumably since I've never run > 4 GB myself, and I'm making
connections faster than I could look them up to verify them) with
additional MTRs such as the video memory one overlaid on top of it, in the
3.5-4.0 GB PCI device space. Non-contiguous or explicit-hole would map an
explicit hole to the 3.5-4 GB area. Note that this would map additional
memory up above the 4 GB 32-bit barrier, out of reach of most 32-bit
systems, explaining some comments on forums.amd.com I was reading. If
this isn't handled correctly, even 64-bit systems won't see all their
memory if it's more than 3.5 GB, since part of it will be hidden by the
PCI device overlay 3.5 - 4.0 GB.

Note that at least here (Tyan s2885 dual Opteron board), the BIOS actually
has two related settings controlling the way > 3.5 GB of physical memory
is mapped. One apparently controls the actual memory addresses, whether
they skip the 3.5-4 GB range or not, the other controls the MTRRs,
continuous or not. If the two don't match, it could mean all of memory is
seen but not all of it is marked as cacheable, slowing access to the
uncached memory range.
|
... Another leap of understanding... Remember those ranges need to be in
power-of-2 sizes and on matching power-of-2 boundaries? I happen to have
a gig of memory, an even power of two, so one MTR covers it exactly.
That wouldn't work for those with 3/4 gig or 1.5 or 3 gig or some such.
OTOH, I noticed a count=1 at the end of the two ranges I have mapped in my
/proc/mtrr, so it would appear that could be remedied by using say 3
half-gig MTRs (count=3) stacked end-to-end, or a 1 gig and a half gig
(thus two), to map that 1.5 gig area. I'm not sure if the count= would
mean it's using additional MTRRs or not, tho I'd expect it would indeed
mean that. Thus, non-power-of-two memory layouts will probably mean more
MTRRs used to map the full memory MTR.
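
Here's a minimal sketch of the "1 gig and a half gig" idea: greedily
splitting a memory range into the fewest power-of-two pieces, each
sitting on a boundary that is a multiple of its own size. This just
illustrates the arithmetic under the power-of-two rule quoted earlier; it
says nothing about what a BIOS actually programs or what count= means,
and split_into_pow2_ranges is a made-up name.

```python
MB = 1 << 20


def split_into_pow2_ranges(base, size):
    """Cover [base, base+size) with power-of-two chunks aligned to their size."""
    ranges = []
    while size > 0:
        # Largest power of two that still fits in what's left to cover.
        chunk = 1 << (size.bit_length() - 1)
        if base:
            # Also limited by the alignment of the current base address:
            # base & -base is the largest power of two dividing base.
            chunk = min(chunk, base & -base)
        ranges.append((base, chunk))
        base += chunk
        size -= chunk
    return ranges


for total_mb in (1024, 1536, 3072, 768):
    pieces = split_into_pow2_ranges(0, total_mb * MB)
    print(total_mb, "MB ->", [(b // MB, s // MB) for b, s in pieces])
```

By this reckoning an even gig needs one range, while 1.5 gig, 3 gig and
3/4 gig each need two.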
|
(If there's anyone with an odd amount of memory that could verify, it'd be
nice...)

My second MTRR is set to cover the 128 MB range dedicated to my video card
(it's got 256 MB but only 128 MB is being used, it would appear, but it's
no big deal since I don't do much 3D anyway because I'm running dual head
video @ 2048x1536 each). Here, it's the 128 MB beginning at exactly 3.5
GB base address (0x e000 0000), write combining.
|
OK... 0x e000 0000 is the 3.5 GB boundary, so your 0x e802 0000 is indeed
in the PCI device area... 0x 800 0000 is 128 MB, 0x 2 0000 is 128 KB, so
the base address it's attempting to use is 3.5 GB + 128 MB + 128 KB. The
requested size is 0x 40 0000 or 4 MB, and that base isn't a multiple of
4 MB, hence the "not aligned" message. The closest boundary below it
would be the 0x e800 0000 you see it trying for and getting the type
mismatch.
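
For anyone wanting to double-check the hex arithmetic, here's a small
sketch that breaks the address from the log down into its pieces and
rounds it down to the 4 MB boundary. The variable and function names are
just my own labels.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

base = 0xE8020000  # from "base(0xe8020000)"
size = 0x400000  # from "size(0x400000)"
pci_area_start = 0xE0000000  # the 3.5 GB boundary


def align_down(addr, boundary):
    """Round addr down to a multiple of boundary (a power of two)."""
    return addr & ~(boundary - 1)


offset = base - pci_area_start  # how far into the PCI device area we are
remainder = base % size  # nonzero means base is not size-aligned

print(f"PCI area starts at {pci_area_start / GB} GB")
print(f"offset into it: {offset // MB} MB + {(offset % MB) // KB} KB")
print(f"requested size: {size // MB} MB")
print(f"base modulo size: {remainder // KB} KB")
print(f"nearest boundary below: {align_down(base, size):#x}")
```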
|
Note also that the AMD docs say to disable caching (which would mean
flushing all pending writes) before making other MTRR changes.

What I'd guess is happening here, then, is that these errors are occurring
when you start X, and the graphics video driver tries to overlay a
write-combining MTRR over top of (part of) the previously mapped
write-back MTRR covering all of main-memory (which would appear to be 4GB
or better, thus overlapping the 3.5-4.0 GB PCI device area). They will
mean problems with the video rendering, either glitches or not as fast as
it could be. (Write-back being less strict than write-combining, I'm
thinking it could mean glitches, whenever the video card tries to draw
memory that's not current with that in cache. However, it may not appear
except under heavy 3D use, such as in games.)
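
To make that guess concrete, here's a minimal sketch of the overlap test
I have in mind: a new range "conflicts" if it overlaps an existing range
of a different type. This is only a model of my guess, not the kernel's
actual logic, and the 4 GB write-back entry is a hypothetical stand-in
for your system, since I haven't seen your /proc/mtrr.

```python
GB = 1 << 30


def overlaps(base_a, size_a, base_b, size_b):
    """True if [base_a, base_a+size_a) and [base_b, base_b+size_b) intersect."""
    return base_a < base_b + size_b and base_b < base_a + size_a


def conflicting_entries(existing, base, size, mem_type):
    """Existing (base, size, type) entries that overlap with a different type."""
    return [
        entry
        for entry in existing
        if entry[2] != mem_type and overlaps(entry[0], entry[1], base, size)
    ]


# Hypothetical: one write-back MTRR covering a full 4 GB of main memory.
guessed_layout = [(0, 4 * GB, "write-back")]
# My own layout as described above: 1 GB write-back starting at 0.
my_layout = [(0, 1 * GB, "write-back")]

# The range from the "type mismatch" message.
request = (0xE8000000, 0x8000000, "write-combining")

print(conflicting_entries(guessed_layout, *request))
print(conflicting_entries(my_layout, *request))
```

With the guessed 4 GB layout the request collides with the write-back
range; with a 1 GB layout like mine it doesn't, which would fit with my
not seeing the messages here.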
|
If that's the case, in addition to the MTRR remapping you may do either
from your boot scripts or by reconfiguring your graphics card driver in
xorg.conf, you may have to reset your BIOS' MTRR settings to specifically
map out the pci-area hole.

That's about all I have ATM... Hope this is as informative for others as
it just was for me! <g> (And, if anyone's an expert on this stuff and I
got it wrong, please inform me where! Better to correct any mistakes I
have now while it's new than after I build a bunch more suppositions on a
faulty foundation!)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html