Hi!

My server hangs every 3-14 days without storing a kernel oops message in
the logs (it is a dedicated server at a hosting provider, so I have no
physical access to the console). I've set up netconsole and caught the
kernel oops over the network on a second server (error message below).
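
The receiving side of netconsole is just a UDP listener. A minimal Python
sketch, for illustration only: the port (6666), the log file name and the
function names are assumptions and not necessarily what is running on the
second server.

```python
import socket


def format_datagram(data: bytes, sender: str) -> str:
    """Decode one netconsole datagram and prefix every line with the sender IP."""
    text = data.decode("ascii", errors="replace")
    return "".join(f"{sender} {line}\n" for line in text.splitlines())


def listen(port: int = 6666, path: str = "netconsole.log") -> None:
    """Append every received datagram to a log file (port and path are assumed)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    with open(path, "a") as log:
        while True:
            data, (host, _src_port) = sock.recvfrom(65535)
            log.write(format_datagram(data, host))
            log.flush()  # write through immediately; an oops can arrive at any time


if __name__ == "__main__":
    listen()
```

UDP does not guarantee delivery or ordering, so lines of a captured oops
may arrive out of order or be missing.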
8 |
|
9 |
These hangs happens on different kernel versions (current is 2.6.8-gentoo-r3). |
10 |
"SpiderAuto" process is my perl script which running using usual user account |
11 |
and 24x7 downloading websites (there number (3-7) of such scripts running |
12 |
doing parallel download of different websites). |
13 |
|
14 |
I suppose this is sort of "race condition" error related to huge number of |
15 |
simultaneous download requests... |
16 |
|
17 |
Any ideas how to fix/workaround this error? Maybe try another kernel source |
18 |
(I'm usually using gentoo-dev-sources)? |
19 |
|
20 |
|
Oops: 0000 [#1]
Modules linked in: netconsole iptable_nat e1000
CPU: 0
EIP: 0060:[<f89d48b1>] Not tainted
EFLAGS: 00010207 (2.6.8-gentoo-r3)
EIP is at find_appropriate_src+0x47/0xb9 [iptable_nat]
eax: f8a3d078 ebx: 08151138 ecx: f8a3d078 edx: 00800000
ds: 007b es: 007b ss: 0068
esi: 00000006 edi: eae0faec ebp: 00000078 esp: eae0fa4c
Stack:
Process SpiderAuto (pid: 1609, threadinfo=eae0e000 task=efbb7160)
00000002 eae0fdd0 f7fddc00 00000000 eae0faec eae0faec eae0fb1c f89d88e0
f89d4d74 eae0faec eae0fb58 00000206 00000286 00000039 eae0fdd0 00000001
c0384580 c02d1509 eae0faec e8c1eb2c c02d23d1 eae0faec eae0fafc e8c1eba0

Call Trace:
 [<f89d4d74>] get_unique_tuple+0x183/0x1f4 [iptable_nat]
 [<c02d1509>] invert_tuple+0x2b/0x2f
 [<c02d23d1>] invert_tuplepr+0x2b/0x33
 [<f89d4e59>] ip_nat_setup_info+0x74/0x354 [iptable_nat]
 [<f89d45cd>] ip_nat_rule_find+0xa9/0xb6 [iptable_nat]
 [<f89d4109>] ip_nat_fn+0x109/0x1ee [iptable_nat]
 [<c02d64ef>] ipt_route_hook+0x37/0x3b
 [<c02a75e2>] ip_finish_output2+0x0/0x186
 [<c0299218>] nf_iterate+0x71/0xa5
 [<c02a75e2>] ip_finish_output2+0x0/0x186
 [<c02a75e2>] ip_finish_output2+0x0/0x186
 [<c02994c9>] nf_hook_slow+0x6b/0xf8
 [<c02a75e2>] ip_finish_output2+0x0/0x186
 [<c02a75b9>] dst_output+0x0/0x29
 [<c02a5249>] ip_finish_output+0x1da/0x1df
 [<c02a75e2>] ip_finish_output2+0x0/0x186
 [<c02a75b9>] dst_output+0x0/0x29
 [<c02a75cd>] dst_output+0x14/0x29
 [<c0299522>] nf_hook_slow+0xc4/0xf8
 [<c02a75b9>] dst_output+0x0/0x29
 [<c02a5903>] ip_queue_xmit+0x4a6/0x5a9
 [<c02a75b9>] dst_output+0x0/0x29
 [<c013685a>] do_no_page+0x63/0x2ed
 [<c0136cb2>] handle_mm_fault+0xd2/0x138
 [<c010e775>] do_page_fault+0x16c/0x561
 [<c02ba7ac>] tcp_v4_send_check+0x54/0xf4
 [<c02b4ceb>] tcp_transmit_skb+0x405/0x693
 [<c02b7180>] tcp_connect+0x3b9/0x474
 [<c02b9bee>] tcp_v4_connect+0x3c6/0x660
 [<c02c93ff>] inet_stream_connect+0x90/0x19f
 [<c02894ac>] sys_connect+0x85/0xb1
 [<c01feba4>] copy_from_user+0x5a/0x86
 [<c0291e15>] dev_ioctl+0x3f/0x273
 [<c0288902>] sock_ioctl+0x0/0x297
 [<c01feba4>] copy_from_user+0x5a/0x86
 [<c0289eec>] sys_socketcall+0xa5/0x254
 [<c0103c43>] syscall_call+0x7/0xb

Code: 66 39 72 1e 74 37 8b 1b 8b 03 0f 18 00 90 a1 0c 8c 9d f8 01

<0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing


--
WBR, Alex.