I have a very severe problem after a recent disk replacement. After a few days
running, all new processes just hang. The kernel reports:

Jan 5 02:25:36 opal kernel: INFO: task mysqld:11387 blocked for more than 120 seconds.
Jan 5 02:25:36 opal kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 5 02:25:36 opal kernel: mysqld D 0000000000000000 0 11387 1 0x00000000
Jan 5 02:25:36 opal kernel: ffff880012caccc0 0000000000000082 0000000000011280 ffff88012f08c660
Jan 5 02:25:36 opal kernel: 0000000000011280 ffff88012920dfd8 0000000000011280 ffff88012920c010
Jan 5 02:25:36 opal kernel: ffff88012920dfd8 0000000000011280 ffff880012caccc0 0000000000011280
Jan 5 02:25:36 opal kernel: Call Trace:
Jan 5 02:25:36 opal kernel: [<ffffffff810b9caf>] ? find_get_pages_tag+0xef/0x1a0
Jan 5 02:25:36 opal kernel: [<ffffffff8102c455>] ? default_spin_lock_flags+0x5/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff8143401b>] ? _raw_spin_lock_irqsave+0x3b/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff81046ec3>] ? lock_timer_base+0x33/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8107a609>] ? debug_mutex_add_waiter+0x29/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff814312cf>] ? __mutex_lock_slowpath+0x22f/0x310
Jan 5 02:25:36 opal kernel: [<ffffffff8102c455>] ? default_spin_lock_flags+0x5/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff8143401b>] ? _raw_spin_lock_irqsave+0x3b/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff8118cd81>] ? queue_log_writer+0x91/0xe0
Jan 5 02:25:36 opal kernel: [<ffffffff81066a80>] ? try_to_wake_up+0x2b0/0x2b0
Jan 5 02:25:36 opal kernel: [<ffffffff81192e10>] ? reiserfs_commit_for_inode+0x140/0x230
Jan 5 02:25:36 opal kernel: [<ffffffff81179e87>] ? reiserfs_sync_file+0x97/0x120
Jan 5 02:25:36 opal kernel: [<ffffffff811290b1>] ? do_fsync+0x31/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff810ff76c>] ? sys_pwrite64+0x7c/0xb0
Jan 5 02:25:36 opal kernel: [<ffffffff8112911b>] ? sys_fsync+0xb/0x20
Jan 5 02:25:36 opal kernel: [<ffffffff81434a39>] ? system_call_fastpath+0x16/0x1b
Jan 5 02:25:36 opal kernel: INFO: task kworker/1:1:27685 blocked for more than 120 seconds.
Jan 5 02:25:36 opal kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 5 02:25:36 opal kernel: kworker/1:1 D ffff880005ee5980 0 27685 2 0x00000000
Jan 5 02:25:36 opal kernel: ffff880005ee5980 0000000000000046 0000000000000000 ffff880128354660
Jan 5 02:25:36 opal kernel: 0000000000011280 ffff8801018e3fd8 0000000000011280 ffff8801018e2010
Jan 5 02:25:36 opal kernel: ffff8801018e3fd8 0000000000011280 ffff880005ee5980 0000000000011280
Jan 5 02:25:36 opal kernel: Call Trace:
Jan 5 02:25:36 opal kernel: [<ffffffff81066a80>] ? try_to_wake_up+0x2b0/0x2b0
Jan 5 02:25:36 opal kernel: [<ffffffff8106ac92>] ? load_balance+0x102/0x790
Jan 5 02:25:36 opal kernel: [<ffffffff8105fbd0>] ? __wake_up_common+0x50/0x80
Jan 5 02:25:36 opal kernel: [<ffffffff8107a609>] ? debug_mutex_add_waiter+0x29/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff814312cf>] ? __mutex_lock_slowpath+0x22f/0x310
Jan 5 02:25:36 opal kernel: [<ffffffff8107a609>] ? debug_mutex_add_waiter+0x29/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff814312cf>] ? __mutex_lock_slowpath+0x22f/0x310
Jan 5 02:25:36 opal kernel: [<ffffffff8102c455>] ? default_spin_lock_flags+0x5/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff8143401b>] ? _raw_spin_lock_irqsave+0x3b/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff8118cd81>] ? queue_log_writer+0x91/0xe0
Jan 5 02:25:36 opal kernel: [<ffffffff81066a80>] ? try_to_wake_up+0x2b0/0x2b0
Jan 5 02:25:36 opal kernel: [<ffffffff8118f3f6>] ? do_journal_end+0x1d6/0xf00
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff20>] ? reiserfs_sync_fs+0x70/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff00>] ? reiserfs_sync_fs+0x50/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff5e>] ? flush_old_commits+0x3e/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff8105054c>] ? process_one_work+0x14c/0x450
Jan 5 02:25:36 opal kernel: [<ffffffff81050c8f>] ? worker_thread+0x13f/0x4d0
Jan 5 02:25:36 opal kernel: [<ffffffff81050b50>] ? manage_workers+0x300/0x300
Jan 5 02:25:36 opal kernel: [<ffffffff81050b50>] ? manage_workers+0x300/0x300
Jan 5 02:25:36 opal kernel: [<ffffffff810578de>] ? kthread+0x9e/0xb0
Jan 5 02:25:36 opal kernel: [<ffffffff81435ac4>] ? kernel_thread_helper+0x4/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff81057840>] ? kthread_freezable_should_stop+0x60/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff81435ac0>] ? gs_change+0x13/0x13

I think it only occurs when I am using the machine in graphical mode (NVIDIA
binary drivers), but I am not positive. Assuming some corruption after the disk
restore, I have rebuilt the system and built a new kernel, but it makes no
difference. The only sure thing is that this never happened before the new
disk. The tracebacks do seem to indicate it's trying to write (both blocked
tasks are in the reiserfs sync and journal code), but I did manage to write a
small file to a partition, so the file system seems OK. Once this happens the
system is toast: sync, reboot and umount commands just hang, and only
Alt-SysRq-B does anything. I would be grateful for any suggestions!

TIA
-Robin
--
----------------------------------------------------------------------
Robin Atwood.

"Ship me somewheres east of Suez, where the best is like the worst,
 Where there ain't no Ten Commandments an' a man can raise a thirst"
 from "Mandalay" by Rudyard Kipling
----------------------------------------------------------------------