Gentoo Archives: gentoo-user

From: Kai Peter <kp@×××××××××××.org>
To: Gentoo User <gentoo-user@l.g.o>
Subject: [gentoo-user] Machine hangs up with out of memory
Date: Wed, 28 Apr 2021 09:58:24
Message-Id: 5d385a9cbe4fd28949b4a1e5b4d2216e@lists.qware.org
Hi,

I have an issue with a machine and cannot find the real root cause. It
hangs completely. It seems to be running out of memory - but why?
Hopefully somebody can give me some insight. As far as I can see right
now, it hangs a few hours after an `emerge --update --newuse --deep
--with-bdeps=y @world`.

The machine is an Intel Atom with 8 GB RAM (physical, max) and 24 GB
swap (a file), so 32 GB of memory in total (RAM plus swap). It has a
250 GB SSD. It runs gentoo-sources-4.14.83 built with genkernel. Portage
uses the stable tree only. It basically provides the hardware for a qemu
VM which does the network management: primary ns, dhcp, apache ssl
proxy. This VM uses 4 GB RAM and has an 8 GB swap file. The VM works
smoothly. The Atom machine itself also acts as a basic NFS server for an
independent dedicated server - which does the (re)exports - and as
secondary ns. For this I'm convinced that 32 GB in total has to be
enough - correct me if I'm wrong!

The issue started around February (IIRC). The update to gcc-10.2
failed. I have /var/tmp/portage on tmpfs - I increased its size in
fstab from 8 to 16 GB. Afterwards gcc built successfully.

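Files left under a tmpfs mount occupy RAM or swap until they are
deleted, so it can be useful to look at how full the mount is after a
build. A minimal sketch of such a check, using only the Python standard
library; the helper name usage_gib is made up for illustration, and "/"
is used so the sketch runs anywhere (on this machine the path of
interest would be /var/tmp/portage):

```python
import shutil

GIB = 1024 ** 3


def usage_gib(path):
    """Return (total, used, free) in GiB for the filesystem holding `path`."""
    total, used, free = shutil.disk_usage(path)
    return total / GIB, used / GIB, free / GIB


if __name__ == "__main__":
    # Replace "/" with the tmpfs mount point, e.g. /var/tmp/portage.
    total, used, free = usage_gib("/")
    print(f"total={total:.1f} GiB used={used:.1f} GiB free={free:.1f} GiB")
```

A "used" value that stays high long after emerge has finished would
point at leftover build directories on the tmpfs.
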
After two hang-ups I increased the swap from 8 to 24 GB. It didn't
help. Here is a complete log from /var/log/messages:

Apr 28 05:35:57 Syrin kernel: [1454017.499919] isc-net-0000 invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null), order=0, oom_score_adj=0
Apr 28 05:35:57 Syrin kernel: [1454017.499925] isc-net-0000 cpuset=/ mems_allowed=0
Apr 28 05:35:57 Syrin kernel: [1454017.499933] CPU: 0 PID: 27685 Comm: isc-net-0000 Not tainted 4.14.83-gentoo #1
Apr 28 05:35:57 Syrin kernel: [1454017.499935] Hardware name: MSI MS-7877/J1900I, BIOS V1.2 03/25/2014
Apr 28 05:35:57 Syrin kernel: [1454017.499936] Call Trace:
Apr 28 05:35:57 Syrin kernel: [1454017.499948] dump_stack+0x67/0x98
Apr 28 05:35:57 Syrin kernel: [1454017.499954] dump_header+0x94/0x20c
Apr 28 05:35:57 Syrin kernel: [1454017.499958] oom_kill_process+0x24a/0x420
Apr 28 05:35:57 Syrin kernel: [1454017.499962] ? oom_badness.part.9+0xd3/0x150
Apr 28 05:35:57 Syrin kernel: [1454017.499965] out_of_memory+0xf9/0x290
Apr 28 05:35:57 Syrin kernel: [1454017.499968] __alloc_pages_nodemask+0xf48/0xff0
Apr 28 05:35:57 Syrin kernel: [1454017.499974] filemap_fault+0x294/0x4c0
Apr 28 05:35:57 Syrin kernel: [1454017.499979] ext4_filemap_fault+0x2c/0x40
Apr 28 05:35:57 Syrin kernel: [1454017.499983] __do_fault+0x1f/0xb0
Apr 28 05:35:57 Syrin kernel: [1454017.499986] __handle_mm_fault+0x3ed/0xad0
Apr 28 05:35:57 Syrin kernel: [1454017.499991] handle_mm_fault+0xaa/0x1f0
Apr 28 05:35:57 Syrin kernel: [1454017.499996] __do_page_fault+0x250/0x4f0
Apr 28 05:35:57 Syrin kernel: [1454017.500000] ? page_fault+0x2f/0x50
Apr 28 05:35:57 Syrin kernel: [1454017.500003] page_fault+0x45/0x50
Apr 28 05:35:57 Syrin kernel: [1454017.500005] RIP: 0000: (null)
Apr 28 05:35:57 Syrin kernel: [1454017.500007] RSP: 12f83750:0000000000000001 EFLAGS: 7ffa12f837a0
Apr 28 05:35:57 Syrin kernel: [1454017.500010] Mem-Info:
Apr 28 05:35:57 Syrin kernel: [1454017.500017] active_anon:1694713 inactive_anon:211859 isolated_anon:0
Apr 28 05:35:57 Syrin kernel: [1454017.500017] active_file:328 inactive_file:344 isolated_file:32
Apr 28 05:35:57 Syrin kernel: [1454017.500017] unevictable:1374 dirty:0 writeback:0 unstable:0
Apr 28 05:35:57 Syrin kernel: [1454017.500017] slab_reclaimable:4480 slab_unreclaimable:7449
Apr 28 05:35:57 Syrin kernel: [1454017.500017] mapped:1071 shmem:3 pagetables:16352 bounce:0
Apr 28 05:35:57 Syrin kernel: [1454017.500017] free:11655 free_pcp:534 free_cma:0
Apr 28 05:35:57 Syrin kernel: [1454017.500021] Node 0 active_anon:6778852kB inactive_anon:847436kB active_file:1312kB inactive_file:1376kB unevictable:5496kB isolated(anon):0kB isolated(file):128kB mapped:4284kB dirty:0kB writeback:0kB shmem:12kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
Apr 28 05:35:57 Syrin kernel: [1454017.500026] DMA free:15836kB min:20kB low:32kB high:44kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15920kB managed:15836kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500027] lowmem_reserve[]: 0 2664 7647 7647
Apr 28 05:35:57 Syrin kernel: [1454017.500036] DMA32 free:23732kB min:3892kB low:6620kB high:9348kB active_anon:2319992kB inactive_anon:363260kB active_file:0kB inactive_file:92kB unevictable:0kB writepending:0kB present:2825512kB managed:2734888kB mlocked:0kB kernel_stack:140kB pagetables:22572kB bounce:0kB free_pcp:1136kB local_pcp:476kB free_cma:0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500037] lowmem_reserve[]: 0 0 4982 4982
Apr 28 05:35:57 Syrin kernel: [1454017.500045] Normal free:7052kB min:7284kB low:12384kB high:17484kB active_anon:4458860kB inactive_anon:484176kB active_file:1368kB inactive_file:1568kB unevictable:5496kB writepending:0kB present:5242880kB managed:5102420kB mlocked:5496kB kernel_stack:2276kB pagetables:42836kB bounce:0kB free_pcp:1000kB local_pcp:724kB free_cma:0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500046] lowmem_reserve[]: 0 0 0 0
Apr 28 05:35:57 Syrin kernel: [1454017.500050] DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15836kB
Apr 28 05:35:57 Syrin kernel: [1454017.500070] DMA32: 268*4kB (UE) 1562*8kB (UE) 645*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23888kB
Apr 28 05:35:57 Syrin kernel: [1454017.500084] Normal: 293*4kB (UME) 490*8kB (UME) 152*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7524kB
Apr 28 05:35:57 Syrin kernel: [1454017.500100] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 28 05:35:57 Syrin kernel: [1454017.500101] 1818 total pagecache pages
Apr 28 05:35:57 Syrin kernel: [1454017.500110] 86 pages in swap cache
Apr 28 05:35:57 Syrin kernel: [1454017.500112] Swap cache stats: add 13659627, delete 13659513, find 1129210/1793572
Apr 28 05:35:57 Syrin kernel: [1454017.500113] Free swap = 0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500114] Total swap = 25165820kB
Apr 28 05:35:57 Syrin kernel: [1454017.500115] 2021078 pages RAM
Apr 28 05:35:57 Syrin kernel: [1454017.500116] 0 pages HighMem/MovableOnly
Apr 28 05:35:57 Syrin kernel: [1454017.500117] 57792 pages reserved
Apr 28 05:35:57 Syrin kernel: [1454017.500118] 0 pages hwpoisoned
Apr 28 05:35:57 Syrin kernel: [1454017.500119] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Apr 28 05:35:57 Syrin kernel: [1454017.500125] [ 4009]     0  4009    21207      320       7       3       20             0 apcupsd
Apr 28 05:35:57 Syrin kernel: [1454017.500128] [ 4043]     0  4043    54371       48      12       3      220             0 rsyslogd
Apr 28 05:35:57 Syrin kernel: [1454017.500131] [ 4084]     0  4084     1938      178       6       3       18             0 fcron
Apr 28 05:35:57 Syrin kernel: [1454017.500133] [ 4400]     0  4400    17733     1376       8       3        0             0 ntpd
Apr 28 05:35:57 Syrin kernel: [1454017.500136] [ 4429]     0  4429     2789        0       7       3      103             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500139] [ 4460]     0  4460     1689      241       6       3      112         -1000 sshd
Apr 28 05:35:57 Syrin kernel: [1454017.500142] [ 4693]     0  4693  1067352    80035    1546       7   664044             0 qemu-system-x86
Apr 28 05:35:57 Syrin kernel: [1454017.500145] [ 4863]     0  4863     1905      465       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500148] [ 4864]     0  4864     1905      433       7       3       44             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500151] [ 4865]     0  4865     1905      431       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500153] [ 4866]     0  4866     1905      443       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500156] [ 4867]     0  4867     1905      453       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500159] [ 4868]     0  4868     1905      433       8       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500162] [27439]     0 27439      675      295       5       3       60             0 rpcbind
Apr 28 05:35:57 Syrin kernel: [1454017.500164] [27509]     0 27509      750      419       5       3       80             0 rpc.idmapd
Apr 28 05:35:57 Syrin kernel: [1454017.500167] [27520] 65534 27520      693      421       5       3       53             0 rpc.statd
Apr 28 05:35:57 Syrin kernel: [1454017.500170] [27583]     0 27583      819      365       5       3      108             0 rpc.mountd
Apr 28 05:35:57 Syrin kernel: [1454017.500173] [27684]    40 27684  7570889  1823291   14584      32  5626031             0 named
Apr 28 05:35:57 Syrin kernel: [1454017.500176] [10479]     0 10479     3262      449       7       3      182             0 udevd
Apr 28 05:35:57 Syrin kernel: [1454017.500179] [ 2923]     0  2923     1938      318       6       3       10             0 fcron
Apr 28 05:35:57 Syrin kernel: [1454017.500182] [ 2924]     0  2924      981      406       5       3        0             0 backup.cron
Apr 28 05:35:57 Syrin kernel: [1454017.500184] [ 2930]     0  2930      981      462       5       3        0             0 rsync.sh
Apr 28 05:35:57 Syrin kernel: [1454017.500187] [ 2965]     0  2965    13185     1261      30       3        0             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500190] [ 2966]     0  2966      574      158       5       3        0             0 tee
Apr 28 05:35:57 Syrin kernel: [1454017.500192] [ 2968]     0  2968     2038      482       7       3        0             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500195] [ 2974]     0  2974    14142     1189      31       3        0             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500198] [ 2995]     0  2995     1938      271       6       3        7             0 fcron
Apr 28 05:35:57 Syrin kernel: [1454017.500201] [ 2996]     0  2996      981       68       6       3        0             0 sh
Apr 28 05:35:57 Syrin kernel: [1454017.500203] Out of memory: Kill process 27684 (named) score 904 or sacrifice child
Apr 28 05:35:57 Syrin kernel: [1454017.500234] Killed process 27684 (named) total-vm:30283556kB, anon-rss:7293164kB, file-rss:0kB, shmem-rss:0kB
Apr 28 05:36:00 Syrin kernel: [1454019.937636] oom_reaper: reaped process 27684 (named), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

All the processes after udevd were only started on that day (backup
job).

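The rss and swapents columns of the process table in the log are page
counts, which makes them hard to compare with the kB figures elsewhere
in the log. The following is a minimal sketch of how the table could be
converted and ranked, assuming 4 kB pages; the names PAGE_KB, ROW and
footprint_kb are made up for illustration, and only two rows from the
log are embedded as sample input:

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages (the usual x86-64 page size)

# Two rows copied from the process table in the log; the other rows
# have the same layout.
SAMPLE = """\
[ 4693] 0 4693 1067352 80035 1546 7 664044 0 qemu-system-x86
[27684] 40 27684 7570889 1823291 14584 32 5626031 0 named
"""

# Columns: pid uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<nr_pmds>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def footprint_kb(text):
    """Return (name, pid, rss_kb, swap_kb) per task, largest rss+swap first."""
    rows = []
    for m in ROW.finditer(text):
        rss_kb = int(m["rss"]) * PAGE_KB
        swap_kb = int(m["swapents"]) * PAGE_KB
        rows.append((m["name"], int(m["pid"]), rss_kb, swap_kb))
    return sorted(rows, key=lambda r: r[2] + r[3], reverse=True)


if __name__ == "__main__":
    for name, pid, rss_kb, swap_kb in footprint_kb(SAMPLE):
        print(f"{name:16} pid={pid:<6} rss={rss_kb} kB swap={swap_kb} kB")
```

For the two sample rows, named comes out at 7293164 kB resident (the
same value as anon-rss in the "Killed process" line) plus 22504124 kB in
swap, and qemu-system-x86 at 320140 kB resident plus 2656176 kB in swap,
against a total swap of 25165820 kB.
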
For comparison, the previous call trace was nearly the same:

Apr 15 18:26:37 Syrin kernel: [376620.710330] Call Trace:
Apr 15 18:26:37 Syrin kernel: [376620.710341] dump_stack+0x67/0x98
Apr 15 18:26:37 Syrin kernel: [376620.710347] dump_header+0x94/0x20c
Apr 15 18:26:37 Syrin kernel: [376620.710352] oom_kill_process+0x24a/0x420
Apr 15 18:26:37 Syrin kernel: [376620.710355] ? oom_badness.part.9+0xd3/0x150
Apr 15 18:26:37 Syrin kernel: [376620.710358] out_of_memory+0xf9/0x290
Apr 15 18:26:37 Syrin kernel: [376620.710361] __alloc_pages_nodemask+0xf48/0xff0
Apr 15 18:26:37 Syrin kernel: [376620.710367] filemap_fault+0x294/0x4c0
Apr 15 18:26:37 Syrin kernel: [376620.710372] ext4_filemap_fault+0x2c/0x40
Apr 15 18:26:37 Syrin kernel: [376620.710376] __do_fault+0x1f/0xb0
Apr 15 18:26:37 Syrin kernel: [376620.710380] __handle_mm_fault+0x3ed/0xad0
Apr 15 18:26:37 Syrin kernel: [376620.710385] handle_mm_fault+0xaa/0x1f0
Apr 15 18:26:37 Syrin kernel: [376620.710390] __do_page_fault+0x250/0x4f0
Apr 15 18:26:37 Syrin kernel: [376620.710394] ? page_fault+0x2f/0x50
Apr 15 18:26:37 Syrin kernel: [376620.710396] page_fault+0x45/0x50
Apr 15 18:26:37 Syrin kernel: [376620.710400] RIP: 21e50088:0x7f41aedee150
Apr 15 18:26:37 Syrin kernel: [376620.710401] RSP: 21e50070:0000000000000000 EFLAGS: 7f41aedee150
Apr 15 18:26:37 Syrin kernel: [376620.710405] Mem-Info:
Apr 15 18:26:37 Syrin kernel: [376620.710411] active_anon:1695970 inactive_anon:212020 isolated_anon:0
Apr 15 18:26:37 Syrin kernel: [376620.710411] active_file:339 inactive_file:320 isolated_file:0
Apr 15 18:26:37 Syrin kernel: [376620.710411] unevictable:1374 dirty:0 writeback:0 unstable:0
Apr 15 18:26:37 Syrin kernel: [376620.710411] slab_reclaimable:4311 slab_unreclaimable:7552
Apr 15 18:26:37 Syrin kernel: [376620.710411] mapped:1059 shmem:3 pagetables:16264 bounce:0
Apr 15 18:26:37 Syrin kernel: [376620.710411] free:11840 free_pcp:0 free_cma:0

Unfortunately I don't understand all the details. Any help is highly
appreciated.

I assume it has something to do with tmpfs memory that is not being
freed. That is just an assumption; I'm looking for clarity, not trial
and error.

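One way to check the tmpfs assumption against the log is to convert the
Mem-Info counters, which are page counts, into kB and look at shmem,
the counter under which resident tmpfs pages are accounted. A minimal
sketch, assuming 4 kB pages; the names PAGE_KB and counters_kb are made
up for illustration, and the input is a subset of the counters from the
Apr 28 log:

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages (the usual x86-64 page size)

# Counters copied from the Mem-Info block of the Apr 28 log (in pages).
MEM_INFO = (
    "active_anon:1694713 inactive_anon:211859 "
    "mapped:1071 shmem:3 pagetables:16352 free:11655"
)


def counters_kb(text):
    """Map each 'name:pages' counter to its size in kB."""
    return {name: int(pages) * PAGE_KB
            for name, pages in re.findall(r"(\w+):(\d+)", text)}


if __name__ == "__main__":
    for name, kb in counters_kb(MEM_INFO).items():
        print(f"{name:14} {kb:>10} kB")
```

With these inputs shmem comes to 12 kB and active_anon to 6778852 kB,
the same values as in the "Node 0" line of the log. Note that tmpfs
pages that have been swapped out are not included in the shmem counter,
so this only describes what was resident in RAM at that moment.
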
Thanks
Kai

Replies

Subject Author
Re: [gentoo-user] Machine hangs up with out of memory Adam Carter <adamcarter3@×××××.com>