Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: Memory usage; 32 bit vs 64 bit.
Date: Tue, 04 Jan 2011 02:05:06
Message-Id: pan.2011.01.04.01.27.32@cox.net
In Reply to: Re: [gentoo-amd64] Memory usage; 32 bit vs 64 bit. by Alex Alexander
1 Alex Alexander posted on Mon, 03 Jan 2011 21:37:08 +0200 as excerpted:
2
3 > On 3 Jan 2011, at 20:34, Dale <rdalek1967@×××××.com> wrote:
4 >
5 >> I recently built me a new 64 bit system. My old 32 bit system has 2Gbs
6 >> and my new system has 4Gbs. I was expecting it to use about the same
7 >> amount of memory but noticed it uses a good bit more on the new system
8 >> than the old one. With just the normal stuff open, I use about 1.5Gbs
9 >> of ram. My old system would use a little over half that. I have the
10 >> same settings on both.
11 >>
12 >> Is this difference because 64 bit programs use more memory, maybe they
13 >> are larger than 32 bit programs? Just curious. I notice that
14 >> Seamonkey uses more and KDE's plasma-desktop uses more. Those are
15 >> generally the biggest users.
16 >>
17 >> I'm not complaining about the usage, just curious as to why the
18 >> difference.
19 >>
20 > Are you sure you're checking your free ram correctly? run "free" and
21 > check the buffers/cache line :)
22
23 Linux memory usage is notoriously confusing for the uninitiated, and the
24 "real" per-app usage isn't entirely simple to explain or figure out even for
25 those who know /something/ about it.
26
27 First, to directly answer the question: 64-bit memory usage /will/ be
28 somewhat higher, yes, but shouldn't be double. The reason usage is higher
29 is because address pointers are now 64-bit, not 32-bit, so /they/ take
30 twice the space. However, according to the gcc manpage:
31
32 -m32
33 -m64
34     Generate code for a 32-bit or 64-bit environment.
35     The 32-bit environment sets int, long and pointer
36     to 32 bits and generates code that runs on any i386
37     system. The 64-bit environment sets int to 32 bits
38     and long and pointer to 64 bits and generates code
39     for AMD's x86-64 architecture.
40
41 So the common "utility integer" standard C/C++ int types remain 32-bit.
42 This is actually one of the bigger issues in porting sources from 32-bit to
43 64-bit, as for years, lazy 32-bit-only programmers were used to thinking
44 of int, long and (memory) pointer as the same size, 32-bits, and being
45 able to directly convert between them and use them nearly interchangeably,
46 but that's no longer possible on amd64, because pointers and ints are no
47 longer the same size.
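
To make the porting problem concrete, here's a minimal Python sketch using
the standard ctypes module. It reports the C type sizes for whatever platform
runs it, then mimics what happens when 32-bit-era code stashes a pointer in a
32-bit int. The helper names and the sample address are made up purely for
illustration.

```python
import ctypes


def c_type_sizes():
    """Sizes in bytes of the C types discussed above, on the running platform."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "pointer": ctypes.sizeof(ctypes.c_void_p),
    }


def stash_in_32bit_int(address):
    """Mimic storing a pointer in a 32-bit int: only the low 32 bits survive."""
    return address & 0xFFFFFFFF


if __name__ == "__main__":
    print(c_type_sizes())
    addr = 0x00007F3A12345678  # hypothetical 64-bit user-space address
    print(hex(addr), "->", hex(stash_in_32bit_int(addr)))
```

On amd64 Linux the sizes should come out as 4, 8 and 8 bytes, and on 32-bit
x86 as 4, 4 and 4. Any address above 4 GB comes back mangled from the
round-trip, which is exactly the bug such code hits when rebuilt for 64-bit.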
48
49 But the point (not pointer! =:^) we're interested in for purposes of this
50 discussion is that the very commonly used "utility integer" known simply
51 as "int" remains 32-bit. Because the 32-bit int is /so/ commonly used, to
52 the point that it's the "default" integer type even on 64-bit, with only
53 memory pointers and integers requiring 64-bit size getting full 64-bit,
54 memory usage doesn't normally double, only increasing by some smaller
55 factor, depending on the app and its particular mix of 32-bit int vs 64-
56 bit memory pointer and 64-bit long integers.
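
As a rough worked example of that "smaller factor", the sketch below sizes a
hypothetical record (two plain ints plus one pointer, say a linked-list node)
under both models using Python's struct module. The record layout is my own
invention, not anything from a real app, and fixed-size integer codes stand in
for the pointer since only the width matters here.

```python
import struct

# Hypothetical list node: two plain ints plus one pointer to the next node.
# "<" selects fixed standard sizes, so the result doesn't depend on the host.
NODE_32BIT = "<iiI"  # int, int, 4-byte pointer (stand-in: unsigned int)
NODE_64BIT = "<iiQ"  # int, int, 8-byte pointer (stand-in: unsigned long long)


def growth_factor(fmt_32, fmt_64):
    """How much bigger the 64-bit layout is than the 32-bit one."""
    return struct.calcsize(fmt_64) / struct.calcsize(fmt_32)


if __name__ == "__main__":
    print(struct.calcsize(NODE_32BIT), "bytes on 32-bit")
    print(struct.calcsize(NODE_64BIT), "bytes on 64-bit")
    print("growth: %.2fx" % growth_factor(NODE_32BIT, NODE_64BIT))
```

That's 12 bytes vs 16 bytes, a factor of about 1.33 rather than 2. A structure
that's nothing but pointers would double, while one that's all ints wouldn't
grow at all. Real compilers may also add alignment padding, which this sketch
ignores.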
57
58 This additional memory usage is one of the negatives of 64-bit, and the
59 reason that on archs other than x86, it's common to see 64-bit kernels for
60 the ability to address > 4GB at the system level, with a 32-bit user-land
61 since few individual apps (with noted exceptions) actually benefit from
62 being able to address > 1-4 GB of RAM in a single app. (Note the 1-4 GB
63 range. This is due to the common user-space/kernel-space split of the 4
64 GB address space on 32-bit systems, meaning individual apps may be limited
65 to only a gig of usable user-address-space, depending on whether the split
66 is 1:3/2:2/3:1 or separate 4GB spaces for user and kernel. Of course full
67 64-bit doesn't have to worry about this.)
68
69 x86 is somewhat different in this regard, however, because traditional 32-
70 bit x86 is known as a "register starved" architecture -- the number of
71 available full-CPU-speed registers on 32-bit x86 is comparatively limited,
72 forcing code to depend on slower L1 cache (tho that's still way faster
73 than L2/L3, which is way faster than main memory, which is way faster than
74 typical spinning-disk main storage) where other archs could be using their
75 relative abundance of CPU registers. When it was designing amd64, AMD
76 doubled the number of general-purpose registers (from 8 to 16) in
77 their 64-bit hardware spec as compared to 32-bit (where they kept the same
78 limited number of registers for compatibility reasons), with the result
79 being that on amd64/x86_64 the speed-boost from access to these additional
80 available registers often more than offsets the negative of the
81 comparative double-size memory pointers. The precise balance, whether the
82 cost of dealing with double-size memory pointers or the benefit of access
83 to all those additional registers wins, depends on the app in question,
84 but in general the benefit of the extra registers on amd64/x86_64 as
85 opposed to x86_32/ia32 is sufficient that it's far less common to see the
86 64-bit kernel, 32-bit userland that is often seen on other archs.
87
88 That takes care of the direct answer. Now to expand on what Alex referred
89 to and what I mentioned in my intro as well, the topic of measuring Linux
90 memory usage in general.
91
92 The uninitiated will often look at "free memory" (the value in the Mem:
93 line of the "free" command, run at the command line) on Linux, and wonder
94 why it's so small -- why Linux seems to use so much memory. But, as Alex
95 mentioned, that line is rather misleading, again, to the uninitiated.
96
97 Linux, like most OSs, considers "empty" memory "wasted" memory. If the
98 memory is available to use, therefore, Linux, as other OSs, will try to
99 use it for something, mainly for disk cache, with a bit used for
100 other "buffering" as well. When/if the system needs that memory for other
101 stuff (apps), the cache and buffers can be dumped.
102
103 The confusion comes not in this, but rather, in the number actually
104 exposed as "free" memory, which can be two very different values, either
105 the actual "free" (unused=wasted) memory, or the "free for use if
106 needed" (including memory used for cache and buffers) memory, depending on
107 how the OS chooses to present it. On Linux, the "free" memory as reported
108 by the "free" command on the Mem: line is the first (unused=wasted), while
109 that on the -/+ buffers/cache line is the second (free for use if needed).
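
For those who'd rather compute both numbers themselves, here's a minimal
Python sketch that reads them from /proc/meminfo. It assumes the usual
MemFree, Buffers and Cached fields are present, and the function names are
hypothetical. As far as I know this matches what the -/+ buffers/cache line
does, but treat it as an approximation rather than free's exact method.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into {field: number} (mostly kB values)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def free_views(meminfo):
    """Return (unused, free_if_needed) in kB."""
    unused = meminfo["MemFree"]
    reclaimable = meminfo["Buffers"] + meminfo["Cached"]
    return unused, unused + reclaimable


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        unused, if_needed = free_views(parse_meminfo(f.read()))
    print("unused: %d kB, free for use if needed: %d kB" % (unused, if_needed))
```

The first number corresponds to the Mem: line's "free" column, the second to
the "free" column of the -/+ buffers/cache line. Newer kernels also export a
MemAvailable field, which is probably a better estimate where it exists.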
110
111 Swap, of course, can be thrown in as another factor, since within context
112 that can be seen as the reverse of disk cache -- app memory swapped out to
113 disk as opposed to disk data cached in memory. Thus free's Swap: line.
114 It's worth noting here the existence of the Linux kernel's swappiness
115 parameter, exposed in the filesystem as /proc/sys/vm/swappiness . This
116 file contains a number 0-100 (attempting to set it > 100 results in an
117 error), 60 being the default, indicating the desired balance between
118 swapping apps out to retain disk cache and keeping apps in memory thus
119 having less room for disk cache. 0 means always prefer keeping apps in
120 memory, dumping cache when needed to do so, 100 means always prefer
121 dumping apps to swap, retaining cache if at all possible.
122
123 As mentioned, the kernel swappiness default is 60, slightly preferring
124 cache to apps. A common recommendation found on the net, however, is to
125 lower swappiness to something like 20, preferring with some strength
126 retention of apps in memory to retention of cache.
127
128 Here, OTOH, I run swappiness=100, because swap is striped across four
129 disks, while most of the filesystem is RAID-1 mirrored on the same four
130 disks, so swap I/O should be faster than rereading formerly cached data
131 back in off disk. And, at least with my current 6 gigs RAM, with
132 PORTAGE_TMPDIR on tmpfs (which is reported in free's cache value) and with
133 parallel merging parameters carefully controlled so that even with
134 swappiness=100 I only end up a few MB (perhaps a couple hundred) into
135 swap, swappiness=100 works very well for me. I don't notice the bit of
136 swapping, and typically when I'm done, I might have 16 or 32 MB swapped
137 out, which stays that way until I swapoff -a or reboot, indicating that I
138 don't really use that bit of swapped apps much anyway or it'd be swapped
139 back in when I did.
140
141 If you wish to experiment with swappiness, you can cat it to see the value
142 as a normal user, but of course you can only save/echo a new value to it as root.
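
If you'd rather script it than cat and echo by hand, a minimal Python sketch
might look like the following. The function names are my own, the 0-100 check
simply mirrors the range described above, and writing to the real file still
needs root.

```python
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness value as an int."""
    with open(path) as f:
        return int(f.read().strip())


def write_swappiness(value, path=SWAPPINESS_PATH):
    """Set swappiness (needs root for the real file); not kept across reboots."""
    if not 0 <= value <= 100:  # the range described above
        raise ValueError("swappiness must be between 0 and 100")
    with open(path, "w") as f:
        f.write("%d\n" % value)


if __name__ == "__main__":
    print("vm.swappiness =", read_swappiness())
```

A value written this way only lasts until the next reboot.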
143
144 When you're done experimenting, if you want to make a permanent change,
145 add a line ...
146
147 vm.swappiness = 100
148
149 ... to your /etc/sysctl.conf file. (Other /proc/sys/* settings can be
150 similarly set this way, or of course with a simple echo-redirect line in
151 /etc/conf.d/local or the like. You can google for info on most or all of
152 the other files under /proc/sys/, if interested.)
153
154 OK, back from the swappiness detour, to memory usage.
155
156 What sort of memory usage is reasonable? Of course that depends on what
157 you do with your computer. =:^) But, as you know, I'm a KDE user as
158 well, and of course a gentoo/amd64 user. Currently, I have an uptime of a
159 week, which was when I last synced and updated both Gentoo and the kernel
160 (thus the week uptime, since I rebooted into the new kernel then). So
161 I've not done a full update since I rebooted, tho I did emerge a few new
162 packages (phonon-vlc and dependencies, including vlc; I was running phonon-
163 xine and still have it installed, but decided to try vlc and phonon-vlc) a
164 couple days ago. Of course I'm in KDE (4.5.4) ATM. With that general
165 system state and keeping in mind that I have 6 gigs RAM (the -m tells free
166 to report in MB):
167
168 $ free -m
169              total       used       free     shared    buffers     cached
170 Mem:          5925       3334       2590          0        319       1571
171 -/+ buffers/cache:       1443       4481
172 Swap:        20479          0      20479
173
174 So ~ 2.5 gigs is entirely unused (empty, effectively wasted, ATM), with
175 the ~ 3.25 gigs of used memory split between ~ 1.4 gigs used for apps and
176 ~ 1.8 gigs of cached and buffer memory, currently used to store data that
177 can be dumped to make room for actual apps, if necessary.
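
The arithmetic behind that split, as a small Python sketch using the Mem: line
values from the output above (the function name is hypothetical):

```python
def split_used(used, buffers, cached):
    """Split the Mem: line's 'used' figure into (apps, cache_and_buffers)."""
    reclaimable = buffers + cached
    return used - reclaimable, reclaimable


if __name__ == "__main__":
    # Mem: line values from the free -m output above, in MB.
    apps, reclaimable = split_used(used=3334, buffers=319, cached=1571)
    print("apps: %d MB, cache+buffers: %d MB" % (apps, reclaimable))
    print("free if needed: %d MB" % (2590 + reclaimable))
```

This gives 1444 MB for apps, 1890 MB for cache and buffers, and 4480 MB free
if needed. The -/+ buffers/cache line shows 1443 and 4481, a 1 MB difference
that I'd guess comes from free rounding each column to whole MB separately.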
178
179 Tho in my experience, even the 1.4 gigs of app usage isn't entirely
180 required. It was a while ago now, but at one point I was running 1
181 gig of total RAM, with no swap. At that time, app-memory usage seemed to
182 run ~ half a gig. When I upgraded RAM to 8 gigs (I since lost a stick
183 that I've not replaced, thus the current 6 gigs), app memory usage
184 increased as well, to closer to a gig (IIRC it was about 1.2 gig after a
185 week's uptime, back then, to compare apples to apples as they say),
186 without changing what I was running or the settings. So given the memory
187 to use, the apps I run apparently use it, up to perhaps a gig and a half.
188 But if they're constrained to under a gig, they'll be content with less,
189 perhaps half a gig. I'm not sure of the mechanisms involved there except
190 that apps do have access to the memory info as well, and perhaps some of
191 them are more liberal with their own caching (in-memory web-page cache for
192 browsers, etc) and the like, given memory room to work with. But there's
193 clearly a point at which they have their fill, as at a gig of RAM, apps
194 were using half of it (half a gig), while when I upgraded to 8 gig, 8
195 times the RAM, app-memory usage only just over doubled. I suspect 4 gigs
196 and 8 gigs would have about the same usage, but below 4 gigs, the apps
197 start to be a bit more conservative with their own usage.
198
199 That covers overall system memory usage. But what about individual apps?
200
201 Individual app memory usage on Linux is unfortunately a rather complex
202 subject. Top is a useful app for reporting on and controlling (nicing,
203 killing, etc) other apps. Top's manpage has a nice description of the
204 various memory related stats and how they relate to each other, so I'll
205 refer you to that for some detail I'm omitting here. Meanwhile, on non-
206 swapping systems, resident memory (top's RES column) is about as accurate
207 a first-order approximation of app memory usage as you'll get, but it's
208 only reporting physical memory, so won't include anything swapped out.
209 Also, the memory one could expect to free by terminating that app will be
210 somewhat less than resident memory, due to libraries and data that may be
211 shared between multiple apps. Top has a SHR (shared) column to report
212 potentially shared memory, but doesn't tell you how many other apps (maybe
213 none) are actually sharing it. Some memory reporting apps won't count
214 shared memory as belonging to the app at all, others (like top, AFAIK)
215 report the full memory shared as belonging to each app, while still others
216 try to count how many apps are sharing what bits, and divide the shared
217 memory by the number of apps sharing it. Which way is "right" depends on
218 what information you're actually looking for. If you want the app totals
219 to match actual total memory usage, apportioned share reporting is the way
220 to go. If you want to know what quitting the app will actually free, only
221 count what's not shared by anything else. If you want to know how much
222 memory an app is actually using, regardless of other apps that may be
223 sharing it too, count all the memory it's using, shared or not.
224
225 Then there's swapping. Due to the way Linux works, the data available on
226 swapped out memory is limited. To get all the normal data would require
227 swapping all that data back in, rather defeating the purpose of swap, so
228 few if any memory usage reporting utils give you much detail about
229 anything that's swapped out. For people with memory enough to do so, a
230 swapoff (or simply running without swap at all) force-disables swap, thus
231 making full statistics available, but as mentioned above, to a point, many
232 apps will use more memory if it's available, conserve if it's not, so
233 running without swap on systems that routinely report non-zero swap usage
234 doesn't necessarily give a true picture of an app's memory usage with swap
235 enabled, either.
236
237 Conclusion: While the output of the free command (and by extension, other
238 references to free memory in Linux) may initially seem a bit unintuitive,
239 it's straightforward enough, once one understands what's there.
240 Unfortunately, the same can't be said about individual application memory
241 usage, which remains somewhat difficult to nail down and even more so to
242 properly describe, even after one understands the basics.
243
244 FWIW, however, I don't claim to be a programmer or to understand all that
245 much beyond the basics. Should someone believe I'm in error with the
246 above, or if they have anything to add or especially if they have a
247 reasonably accurate simpler way to describe things, please post! I love
248 to learn, and definitely do NOT believe I've reached my limit in learning in
249 this area!
250
251 --
252 Duncan - List replies preferred. No HTML msgs.
253 "Every nonfree program has a lord, a master --
254 and if you use the program, he is your master." Richard Stallman
