[gentoo-amd64] Re: Hanging after a few days - 2.6.15-r7 - gentoo-amd64

From:	Duncan <1i5t5.duncan@×××.net>
To:	gentoo-amd64@l.g.o
Subject:	[gentoo-amd64] Re: Hanging after a few days - 2.6.15-r7
Date:	Mon, 13 Mar 2006 23:30:08
Message-Id:	`pan.2006.03.13.23.25.04.89595@cox.net`
In Reply to:	Re: [gentoo-amd64] Hanging after a few days - 2.6.15-r7 by Brett Johnson

Brett Johnson posted <20060313142338.GA7383@××××.com>, excerpted below,
on Mon, 13 Mar 2006 08:23:39 -0600:

> I have noticed a similar problem that appears to be a kind of memory
> leak. I had upgraded both my desktop (amd64) and laptop (x86) from
> 2.6.14 to 2.6.15 a few weeks ago. After the upgrade, I started noticing
> that first thing in the morning or after long periods of the system being
> idle, the memory usage would climb to around 550MB. This seemed odd, as
> my normal memory usage is around 120MB (using conky as my system monitor).
>
> I tried to identify what process was consuming all that ram, but nothing
> looked out of the ordinary. As I started working on the machine, the
> memory usage would slowly go down over the course of a few hours, and
> level off around 200MB (which is still more than normal).

Some observations...

I haven't noticed the memory issue here.  I run mainline kernel.org
sources, using package.provided to tell portage I have gentoo-sources
2.6.999 so it won't bother me.  I ran 2.6.15 and then 2.6.15.4 until
2.6.16-rc6 came out, as there was a bug in the earlier rcs and in the
daily git patches beyond 2.6.15-git10 (my kernel bug is
http://bugzilla.kernel.org/show_bug.cgi?id=6130 ).

Generally, the first step in tracing such bugs is to see if it still
occurs with the mainline/vanilla kernel.  If not, the cause is a patch
in whatever portage kernel you are using, so a Gentoo bug should be
filed.  If it does occur with mainline, try at least to narrow it down
to the rc where the problem started, and file a kernel.org bug.
Preferably, narrow it down further to the daily git snapshot that did
it.  That isn't hard at all, since tarballs of the snapshots are made
available at kernel.org just as if each were a normal kernel.  The
Documentation/BUG-HUNTING text file in your kernel sources dir has
instructions for that, and for going further if you have the time:
narrowing it down to the file, and even to the procedure or line
changed between the two git snapshots.  Getting to the file isn't too
bad, but going beyond the file level takes a lot of time and patience.
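The narrowing-down is really just a binary search over the snapshots.
A minimal sketch of the idea follows, assuming the snapshots are listed
in release order, the first is known good, the last is known bad, and
the bug stays once it appears.  The function name, the snapshot list
and the is_bad test are all made up for illustration; in real life
is_bad means "build it, boot it, and see whether the problem shows up".

```python
from typing import Callable, Sequence


def first_bad(snapshots: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest snapshot for which is_bad() is true.

    Assumes release order, snapshots[0] known good, snapshots[-1]
    known bad, and that the bug persists once introduced.
    """
    lo, hi = 0, len(snapshots) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid   # bug already present, look earlier
        else:
            lo = mid   # still fine, look later
    return snapshots[hi]


# Hypothetical run: pretend the bug arrived with 2.6.15-git7.
names = ["2.6.15"] + [f"2.6.15-git{n}" for n in range(1, 11)]
culprit = first_bad(names, lambda name: names.index(name) >= 7)
print(culprit)
```

With eleven candidates that is three or four kernels to build and
test, rather than ten.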

You didn't mention this, so I'm not sure whether you know to make the
distinction -- what sort of memory usage was it?  Application usage
(which would indicate a leak), or simply cache or buffer usage?  Linux
will normally fill most of memory with cache and buffers, as that's
more efficient than leaving it unused.

Kernel 2.6 has a swappiness control ( /proc/sys/vm/swappiness ) that
determines the balance, once all memory is in use, between keeping
data cached and swapping out more applications.  It is set to 60 by
default.  If you want more applications kept in memory at the expense
of caching, set it lower; 0 means always prefer keeping applications
in memory even if they aren't being used, which was the 2.4 series
default.  Conversely, higher than 60 favors cache over applications,
so apps will tend to be swapped out sooner while cache is maintained.
I have a four-disk RAID setup here, with swap at the same priority on
all four disks, so my swap is pretty fast and I keep swappiness set to
100, favoring cache over application memory, some of which won't be
used very often anyway.  (I'm running RAID-6 for much of my system,
for 2X redundancy.  On a four-disk array, that makes it two-way
striped, while swap is four-way striped, so reading apps back in from
swap is faster than rereading data off the RAID-6 once it has been
flushed out of cache, and swappiness=100 is more efficient.  With a
normal non-RAID single disk, swapping in and rereading files off disk
would be roughly the same speed, so the best performance would be
closer to that default of 60.)
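For anyone who would rather script it than echo values by hand, here
is a rough sketch of reading and setting the control.  The helper
names are my own, and the 0-100 range check is an assumption that
simply covers the values discussed above.  Writing to the real file
needs root.

```python
from pathlib import Path

SWAPPINESS = Path("/proc/sys/vm/swappiness")


def read_swappiness(path: Path = SWAPPINESS) -> int:
    """Return the current swappiness value."""
    return int(path.read_text().strip())


def write_swappiness(value: int, path: Path = SWAPPINESS) -> None:
    """Set swappiness; the real /proc file is writable by root only."""
    # Assumed range: 0 (favor apps) through 100 (favor cache).
    if not 0 <= value <= 100:
        raise ValueError("swappiness is expected to be between 0 and 100")
    path.write_text(f"{value}\n")
```

The path is a parameter only so the helpers can be tried against a
scratch file first.  A value written this way does not survive a
reboot.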

One of the most common memory/swap scenarios overnight is that various
cron jobs run (logrotate, makewhatis, slocate's database update, etc.),
filling the cache with data not likely to be used again, while pushing
applications into swap.  Applications left running constantly then
take a bit to become usable the first time in the morning, as they
have to load back in from swap, replacing some of the cached data.
This is normal, but if you are just going off of the used memory
figure, which includes cache and buffers, it'll look like there was a
memory leak overnight, because it's all in cache in the morning.  Just
as you mentioned, working on the machine over the day will likely
bring that figure back down.

There are a few ways to deal with this.  First, here, I decided that I
didn't use slocate enough to be worth the trouble, particularly since
my schedule is messed up enough that there's no consistent time when
I'm /not/ at the computer, so I can't set up a cron job to run the
database update then.  I therefore don't have slocate on my system at
all.  If I need to search for files, I use grep and find, or the find
functionality in mc or KDE.

Second, set swappiness to something fairly low, or to 0, thus favoring
apps over cache.  Cache will still use whatever memory is available,
but it won't force apps into swap to do so.

Third, if you aren't already aware of the distinction, learn the
difference between the different types of "used" memory: cache and
buffers vs. application memory, in particular.  (FWIW, because it's
hard to find a definition for buffers: the difference between cache
and buffers is that buffers are basically dedicated cache, usable by
one application rather than the entire system, while cache is
systemwide.  That's not absolutely accurate, but it's a
practical/working definition.  The two are often considered together,
and should be, as their effect on memory is similar and cumulative.)
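To see the split on a running system, the numbers are all in
/proc/meminfo.  A small sketch of the arithmetic, using the standard
MemTotal, MemFree, Buffers and Cached fields; the function names are
mine, and the sample figures are invented for illustration, not
measurements from either of our machines.

```python
def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def application_kb(fields: dict) -> int:
    """Used memory with cache and buffers taken back out."""
    return (fields["MemTotal"] - fields["MemFree"]
            - fields["Buffers"] - fields["Cached"])


# Made-up sample in /proc/meminfo layout.
SAMPLE = """\
MemTotal:  1024000 kB
MemFree:    100000 kB
Buffers:     50000 kB
Cached:     700000 kB
"""

sample = parse_meminfo(SAMPLE)
print(sample["MemTotal"] - sample["MemFree"])  # "used", counting cache
print(application_kb(sample))                  # application memory only
```

On a real box, feed it open("/proc/meminfo").read() instead of the
sample.  If the first number is large and the second is not, it is
cache, not a leak.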

...

Finally, one thing I've noticed that /does/ leak memory here, under the
wrong circumstances: the composite extension to xorg.  I run KDE, and
its kompmgr, part of kwin, is the application at fault.  Every time I
recompile xorg-server (modular-X) or the like (libXcomposite), I need
to recompile kwin as well.  That seems to kill the memory leak.  If
they get out of sync, X will consume more and more (application)
memory, pushing virtually everything else into swap.  If I forget to
recompile kwin after xorg-server/libXcomposite, a combination of
setting ulimit appropriately for memory and swap usage, and killing
kompmgr when memory usage gets too high (the usage is in X, not
kompmgr, but killing kompmgr frees it), keeps it manageable on a 1 gig
memory system (soon to be 8 gig; memory ordered Friday, hopefully in
by Wed!).
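The "when memory usage gets too high" check can be automated by
reading the VmRSS line from /proc/<pid>/status for the X server.  A
rough sketch of just that check follows.  The function names, the
sample text and the threshold are placeholders I made up, and what to
do when the limit is crossed (killing kompmgr, as above) is left out.

```python
def rss_kb(status_text: str) -> int:
    """Return the VmRSS figure (kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def over_limit(status_text: str, limit_kb: int) -> bool:
    """True when resident memory exceeds the chosen limit."""
    return rss_kb(status_text) > limit_kb


# Invented sample in /proc/<pid>/status layout, not real X output.
SAMPLE_STATUS = """\
Name:   X
VmPeak:   600000 kB
VmRSS:    450000 kB
"""

print(over_limit(SAMPLE_STATUS, 400000))
```

Where the threshold belongs depends on how much RAM the box has and
what else needs to stay resident.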

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

--
gentoo-amd64@g.o mailing list

124	gentoo-amd64@g.o mailing list