Gentoo Archives: gentoo-hardened

From:	Brian Kroth <bpkroth@××××.edu>
To:	gentoo-hardened@l.g.o
Subject:	Re: [gentoo-hardened] kernel upgrade problems: bad page state
Date:	Thu, 01 Nov 2007 03:48:37
Message-Id:	`47294C17.9080901@wisc.edu`
In Reply to:	Re: [gentoo-hardened] kernel upgrade problems: bad page state by Brian Kroth

Brian Kroth wrote:
> I have no problems with 2.6.20-r10. I ran it for 4 hours last night and
> some weeks before this. 2.6.20-r6 before that, again no problems.
> 2.6.22-r8 and 2.6.23 both die as soon as cactid or nagios start running.
> I really don't think this is bad ram anymore. I'll see if I can get an
> exact test for others to try. Any other kernel debug tweaks I should try?
>
> Thanks for all your help,
> Brian

I haven't found a way of reproducing this on other machines yet because
it takes a lot of time to set up cacti. In playing around with cactid,
though, what I've found is that the error happens /nearly/ every time I
specify something like this:

cactid --verbosity=5 -f 1 -l 100

but never (so far) with this:

cactid --verbosity=5 -f 1 -l 10

With sec monitoring kern.log for "Bad page state in 'cactid'" and
killing cactid when that happens, I've noticed that the last line of
output from cactid is always something like this:

10/31/2007 10:22:32 PM - CACTID: Poller[0] Host[42] DEBUG: The POPEN returned the following File Descriptor 5

The kern.log shows this:

Oct 31 22:30:09 tux-mc Bad page state in process 'cactid'
Oct 31 22:30:09 tux-mc page:c14070c0 flags:0x40000001 mapping:00000000 mapcount:0 count:0
Oct 31 22:30:09 tux-mc Trying to fix it up, but a reboot is needed
Oct 31 22:30:09 tux-mc Backtrace:
Oct 31 22:30:09 tux-mc [<c044bf67>] bad_page+0x63/0x92
Oct 31 22:30:09 tux-mc [<c044c90c>] free_hot_cold_page+0x7c/0x17f
Oct 31 22:30:09 tux-mc [<c0455c24>] do_wp_page+0x223/0x3ed
Oct 31 22:30:09 tux-mc [<c0456f24>] __handle_mm_fault+0x2ad/0x305
Oct 31 22:30:09 tux-mc [<c0414616>] do_page_fault+0x1da/0x7d5
Oct 31 22:30:09 tux-mc [<c041c2d5>] do_fork+0x15d/0x217
Oct 31 22:30:09 tux-mc [<c041443c>] do_page_fault+0x0/0x7d5
Oct 31 22:30:09 tux-mc [<c06e8db5>] error_code+0x75/0x80
Oct 31 22:30:09 tux-mc [<c06e0000>] svc_defer+0xfa/0x139
Oct 31 22:30:09 tux-mc =======================

The version of cactid in portage is slightly old. After updating from
0.8.6i-r1 to 0.8.6j the problem seems to happen less frequently, but it
still happens. With that in mind, might this actually be a userland
software problem rather than a kernel problem? Shouldn't PaX be
preventing userland software from screwing up the page tables?

I can send more kernel output if anyone's interested. Any thoughts on
what else I should be doing to test this?

Thanks,
Brian
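
P.S. For anyone who wants to try the watch-and-kill step described above
without sec, here is a rough sketch in Python. It is not the sec rule
that was actually used. The names bad_page_process, follow and watch are
made up for illustration, the use of pkill is an assumption, and the
pattern follows the kern.log line quoted above ("Bad page state in
process 'cactid'").

```python
import re
import subprocess
import time

# Matches the kernel message quoted in the kern.log excerpt, e.g.
#   Oct 31 22:30:09 tux-mc Bad page state in process 'cactid'
BAD_PAGE = re.compile(r"Bad page state in process '([^']+)'")


def bad_page_process(line):
    """Return the process name from a 'Bad page state' line, else None."""
    match = BAD_PAGE.search(line)
    return match.group(1) if match else None


def follow(path, poll=0.5):
    """Yield complete lines appended to path, roughly like `tail -f`.

    Log rotation is not handled; this is only a sketch.
    """
    with open(path) as fh:
        fh.seek(0, 2)  # start at the current end of the file
        buf = ""
        while True:
            chunk = fh.readline()
            if not chunk:
                time.sleep(poll)
                continue
            buf += chunk
            if buf.endswith("\n"):  # only hand out finished lines
                yield buf
                buf = ""


def watch(lines, target="cactid", kill=None):
    """Call kill(target) on the first bad-page line naming target.

    Returns the matching line, or None if the input ends without a match.
    """
    if kill is None:
        # Assumption: pkill is available; -x matches the exact process name.
        def kill(name):
            subprocess.run(["pkill", "-x", name], check=False)
    for line in lines:
        if bad_page_process(line) == target:
            kill(target)
            return line
    return None
```

It would be run as something like watch(follow("/var/log/kern.log")),
where the log path is an assumption and depends on the syslog setup.
Passing kill as a parameter makes it possible to dry-run the matching
against a saved log without killing anything.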

Attachments

File name	MIME type
smime.p7s	application/x-pkcs7-signature

Replies

Subject	Author
Re: [gentoo-hardened] kernel upgrade problems: bad page state	pageexec@××××××××.hu
