Gentoo Archives: gentoo-hardened

From:	Brian Kroth <bpkroth@××××.edu>
To:	gentoo-hardened@l.g.o, pageexec@××××××××.hu
Subject:	Re: [gentoo-hardened] kernel upgrade problems: bad page state
Date:	Sun, 04 Nov 2007 00:06:30
Message-Id:	`472D0BF9.3030508@wisc.edu`
In Reply to:	Re: [gentoo-hardened] kernel upgrade problems: bad page state by pageexec@freemail.hu

1	pageexec@××××××××.hu wrote:
2	> On 31 Oct 2007 at 22:46, Brian Kroth wrote:
3	>
4	>> but not ever (yet) with this
5	>>
6	>> cactid --verbosity=5 -f 1 -l 10
7	>
8	> what does the -l switch do?
9
10	-f and -l limit the range of hostids to scan. Of the examples I
11	gave, the first scanned hostids 1-100; the one above scans hostids
12	1-10. I originally did this to see if I could pinpoint one particular
13	host check that was causing it, but it seems to have more to do with
14	large host scans. I think that might be because of the number of
15	forks and allocations.
16
17	>
18	>> The version of cactid in portage is slightly old. After updating from
19	>> 0.8.6i-r1 to 0.8.6j the problem seems to happen less frequently, but
20	>> still happens. With that in mind might this actually be a software
21	>> problem and not a kernel problem? Shouldn't PAX be preventing userland
22	>> software from screwing up the page table?
23	>
24	> i'm almost sure it's a bug somewhere in vma mirroring as that's the
25	> only thing i changed in .22 and on and it does play with page locking
26	> (the bad page state is triggered because a to-be-freed page is still
27	> locked, that means there's a missing unlock somewhere in the code,
28	> but i couldn't figure it out from the code yet).
29
30	Where's the code for this? I'm no kernel guru by any means, but I'd
31	still be interested to look at it and learn.
32
33	>
34	>> I can send more kernel output if anyone's interested. Any thoughts on
35	>> what else I should be doing to test this?
36	>
37	> i'll need your mm/memory.o from the failing kernel and if it occurred on
38	> multiple machines or kernels, indicate which of your reports corresponds
39	> to which .o (well, i can find it out from the disasm eventually, but it
40	> helps me if i don't have to ;-). then can you send me a /proc/pid/maps file
41	> from cactid and nagios (if you use grsec make sure that addresses are not
42	> hidden and preferably not randomized either)?
43	>
44
45	So far this happens only on that single machine, and only for nagios
46	and cacti. I rebuilt the kernel with the attached config. I've
47	basically turned on a few more debug settings in the kernel and turned
48	off the randomization features of pax (CONFIG_PAX_ASLR) and the "remove
49	addresses" feature of grsec (CONFIG_GRKERNSEC_PROC_MEMMAP) like you
50	asked. I also tweaked my sec script to copy the maps files before
51	killing the offending processes. Everything should be in the tar. Let
52	me know if you need anything else.
53
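For anyone wanting to reproduce the "copy the maps files before killing" step, here is a minimal sketch of such a helper. It is not the actual sec script from this report; the function name save_maps, the proc_root parameter, and the output file naming are all hypothetical and chosen only for illustration.

```python
import time
from pathlib import Path


def save_maps(pid, dest_dir, proc_root="/proc"):
    """Copy <proc_root>/<pid>/maps into dest_dir and return the new path.

    Hypothetical helper: the name, arguments, and file naming scheme are
    assumptions for this sketch, not part of any existing tool.
    """
    source = Path(proc_root) / str(pid) / "maps"
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp the copy so repeated incidents do not overwrite each other.
    target = dest_dir / f"maps.{pid}.{int(time.time())}"
    # Read the content explicitly: /proc files report a size of 0, so the
    # data has to be read to the end rather than copied by length.
    target.write_bytes(source.read_bytes())
    return target
```

A watcher would call something like save_maps(pid, "/var/tmp/badpage") while the process is still alive, and only then kill it, since the maps file disappears with the process.
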
54	Nov 3 18:27:38 tux-mc IPMI Watchdog: driver initialized
55	Nov 3 18:32:59 tux-mc Bad page state in process 'nagios'
56	Nov 3 18:32:59 tux-mc page:c129d620 flags:0x40000001 mapping:00000000
57	mapcount:0 count:0
58	Nov 3 18:32:59 tux-mc Trying to fix it up, but a reboot is needed
59	Nov 3 18:32:59 tux-mc Backtrace:
60	Nov 3 18:32:59 tux-mc [<c044c150>] bad_page+0x63/0x92
61	Nov 3 18:32:59 tux-mc [<c044cc18>] free_hot_cold_page+0x7c/0x194
62	Nov 3 18:32:59 tux-mc [<c0456110>] do_wp_page+0x22e/0x426
63	Nov 3 18:32:59 tux-mc [<c0457463>] __handle_mm_fault+0x2ad/0x305
64	Nov 3 18:32:59 tux-mc [<c0414576>] do_page_fault+0x1da/0x7d5
65	Nov 3 18:32:59 tux-mc [<c041c269>] do_fork+0x15d/0x217
66	Nov 3 18:32:59 tux-mc [<c041439c>] do_page_fault+0x0/0x7d5
67	Nov 3 18:32:59 tux-mc [<c06e9525>] error_code+0x75/0x80
68	Nov 3 18:32:59 tux-mc [<c06e0000>] svc_setup_socket+0x1aa/0x223
69	Nov 3 18:32:59 tux-mc =======================
70	Nov 3 18:35:07 tux-mc Bad page state in process 'cactid'
71	Nov 3 18:35:07 tux-mc page:c12efdc0 flags:0x40000001 mapping:00000000
72	mapcount:0 count:0
73	Nov 3 18:35:07 tux-mc Trying to fix it up, but a reboot is needed
74	Nov 3 18:35:07 tux-mc Backtrace:
75	Nov 3 18:35:07 tux-mc [<c044c150>] bad_page+0x63/0x92
76	Nov 3 18:35:07 tux-mc [<c044cc18>] free_hot_cold_page+0x7c/0x194
77	Nov 3 18:35:07 tux-mc [<c0456110>] do_wp_page+0x22e/0x426
78	Nov 3 18:35:07 tux-mc [<c0457463>] __handle_mm_fault+0x2ad/0x305
79	Nov 3 18:35:07 tux-mc [<c0414576>] do_page_fault+0x1da/0x7d5
80	Nov 3 18:35:07 tux-mc [<c04682d5>] sys_read+0x68/0x6a
81	Nov 3 18:35:07 tux-mc [<c041439c>] do_page_fault+0x0/0x7d5
82	Nov 3 18:35:07 tux-mc [<c06e9525>] error_code+0x75/0x80
83	Nov 3 18:35:07 tux-mc [<c06e0000>] svc_setup_socket+0x1aa/0x223
84	Nov 3 18:35:07 tux-mc =======================
85
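The flags value in both reports can be checked against the "to-be-freed page is still locked" explanation quoted earlier. The sketch below pulls the page and flags fields out of the log lines and tests the lock bit. It assumes PG_locked is bit 0 of page->flags, as I understand it to be in include/linux/page-flags.h for kernels of this era; verify that against the tree actually in use. The function name locked_pages is made up for this example.

```python
import re

# Assumption: PG_locked is bit 0 of page->flags (check page-flags.h in your tree).
PG_LOCKED_BIT = 0

# Matches the "page:<addr> flags:0x<hex>" part of a "Bad page state" report.
PAGE_LINE = re.compile(r"page:(?P<page>[0-9a-f]+)\s+flags:0x(?P<flags>[0-9a-f]+)")


def locked_pages(log_text):
    """Return (page address, flags) for every reported page with the lock bit set."""
    hits = []
    for match in PAGE_LINE.finditer(log_text):
        flags = int(match.group("flags"), 16)
        if flags & (1 << PG_LOCKED_BIT):
            hits.append((match.group("page"), flags))
    return hits


# The two page lines from the reports above.
sample = """\
Nov 3 18:32:59 tux-mc page:c129d620 flags:0x40000001 mapping:00000000
Nov 3 18:35:07 tux-mc page:c12efdc0 flags:0x40000001 mapping:00000000
"""

for page, flags in locked_pages(sample):
    print(f"page {page}: flags {flags:#010x}, lock bit set")
```

Under that assumption both reported pages have the lock bit set, which matches the missing-unlock theory.
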
86
87	PS - would you like me to take this off list?
88
89	Thanks again,
90	Brian

Attachments

File name	MIME type
2.6.22-hardened-r8_debug-info.tar.bz2	application/x-bzip
smime.p7s	application/x-pkcs7-signature

Replies

Subject	Author
Re: [gentoo-hardened] kernel upgrade problems: bad page state	pageexec@××××××××.hu
