Gentoo Archives: gentoo-user

From: Robin Atwood <robin.atwood@×××××××××.net>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Processes hang - system dies
Date: Sat, 05 Jan 2013 14:07:01
Message-Id: 201301052105.30959.robin.atwood@attglobal.net
I have a very severe problem after a recent disk replacement. After a few days
running, all new processes just hang. The kernel reports:

Jan 5 02:25:36 opal kernel: INFO: task mysqld:11387 blocked for more than 120 seconds.
Jan 5 02:25:36 opal kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 5 02:25:36 opal kernel: mysqld D 0000000000000000 0 11387 1 0x00000000
Jan 5 02:25:36 opal kernel: ffff880012caccc0 0000000000000082 0000000000011280 ffff88012f08c660
Jan 5 02:25:36 opal kernel: 0000000000011280 ffff88012920dfd8 0000000000011280 ffff88012920c010
Jan 5 02:25:36 opal kernel: ffff88012920dfd8 0000000000011280 ffff880012caccc0 0000000000011280
Jan 5 02:25:36 opal kernel: Call Trace:
Jan 5 02:25:36 opal kernel: [<ffffffff810b9caf>] ? find_get_pages_tag+0xef/0x1a0
Jan 5 02:25:36 opal kernel: [<ffffffff8102c455>] ? default_spin_lock_flags+0x5/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff8143401b>] ? _raw_spin_lock_irqsave+0x3b/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff81046ec3>] ? lock_timer_base+0x33/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8107a609>] ? debug_mutex_add_waiter+0x29/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff814312cf>] ? __mutex_lock_slowpath+0x22f/0x310
Jan 5 02:25:36 opal kernel: [<ffffffff8102c455>] ? default_spin_lock_flags+0x5/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff8143401b>] ? _raw_spin_lock_irqsave+0x3b/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff8118cd81>] ? queue_log_writer+0x91/0xe0
Jan 5 02:25:36 opal kernel: [<ffffffff81066a80>] ? try_to_wake_up+0x2b0/0x2b0
Jan 5 02:25:36 opal kernel: [<ffffffff81192e10>] ? reiserfs_commit_for_inode+0x140/0x230
Jan 5 02:25:36 opal kernel: [<ffffffff81179e87>] ? reiserfs_sync_file+0x97/0x120
Jan 5 02:25:36 opal kernel: [<ffffffff811290b1>] ? do_fsync+0x31/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff810ff76c>] ? sys_pwrite64+0x7c/0xb0
Jan 5 02:25:36 opal kernel: [<ffffffff8112911b>] ? sys_fsync+0xb/0x20
Jan 5 02:25:36 opal kernel: [<ffffffff81434a39>] ? system_call_fastpath+0x16/0x1b
Jan 5 02:25:36 opal kernel: INFO: task kworker/1:1:27685 blocked for more than 120 seconds.
Jan 5 02:25:36 opal kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 5 02:25:36 opal kernel: kworker/1:1 D ffff880005ee5980 0 27685 2 0x00000000
Jan 5 02:25:36 opal kernel: ffff880005ee5980 0000000000000046 0000000000000000 ffff880128354660
Jan 5 02:25:36 opal kernel: 0000000000011280 ffff8801018e3fd8 0000000000011280 ffff8801018e2010
Jan 5 02:25:36 opal kernel: ffff8801018e3fd8 0000000000011280 ffff880005ee5980 0000000000011280
Jan 5 02:25:36 opal kernel: Call Trace:
Jan 5 02:25:36 opal kernel: [<ffffffff81066a80>] ? try_to_wake_up+0x2b0/0x2b0
Jan 5 02:25:36 opal kernel: [<ffffffff8106ac92>] ? load_balance+0x102/0x790
Jan 5 02:25:36 opal kernel: [<ffffffff8105fbd0>] ? __wake_up_common+0x50/0x80
Jan 5 02:25:36 opal kernel: [<ffffffff8107a609>] ? debug_mutex_add_waiter+0x29/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff814312cf>] ? __mutex_lock_slowpath+0x22f/0x310
Jan 5 02:25:36 opal kernel: [<ffffffff8107a609>] ? debug_mutex_add_waiter+0x29/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff814312cf>] ? __mutex_lock_slowpath+0x22f/0x310
Jan 5 02:25:36 opal kernel: [<ffffffff8102c455>] ? default_spin_lock_flags+0x5/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff8143401b>] ? _raw_spin_lock_irqsave+0x3b/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff8118cd81>] ? queue_log_writer+0x91/0xe0
Jan 5 02:25:36 opal kernel: [<ffffffff81066a80>] ? try_to_wake_up+0x2b0/0x2b0
Jan 5 02:25:36 opal kernel: [<ffffffff8118f3f6>] ? do_journal_end+0x1d6/0xf00
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff20>] ? reiserfs_sync_fs+0x70/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff00>] ? reiserfs_sync_fs+0x50/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff5e>] ? flush_old_commits+0x3e/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff8105054c>] ? process_one_work+0x14c/0x450
Jan 5 02:25:36 opal kernel: [<ffffffff81050c8f>] ? worker_thread+0x13f/0x4d0
Jan 5 02:25:36 opal kernel: [<ffffffff81050b50>] ? manage_workers+0x300/0x300
Jan 5 02:25:36 opal kernel: [<ffffffff81050b50>] ? manage_workers+0x300/0x300
Jan 5 02:25:36 opal kernel: [<ffffffff810578de>] ? kthread+0x9e/0xb0
Jan 5 02:25:36 opal kernel: [<ffffffff81435ac4>] ? kernel_thread_helper+0x4/0x10
Jan 5 02:25:36 opal kernel: [<ffffffff81057840>] ? kthread_freezable_should_stop+0x60/0x60
Jan 5 02:25:36 opal kernel: [<ffffffff81435ac0>] ? gs_change+0x13/0x13
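
As an aside for anyone triaging a report like this one: below is a minimal
sketch, assuming the syslog lines have been unwrapped to one kernel message
per line, of how the hung tasks and the ReiserFS frames in their call traces
could be pulled out mechanically. The names summarise_hung_tasks and
filesystem_frames and the shortened SAMPLE excerpt are illustrative only and
are not part of the original post.

```python
import re

# "INFO: task <name>:<pid> blocked for more than <n> seconds."
# The greedy name group copes with names that contain colons (kworker/1:1).
TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# "[<address>] ? function+0xoffset/0xsize" (the "? " marker is optional)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarise_hung_tasks(log_text):
    """Return one dict per hung-task report, with its call-trace functions."""
    reports = []
    current = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = {
                "name": task["name"],
                "pid": int(task["pid"]),
                "timeout": int(task["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame["func"])
    return reports


def filesystem_frames(report, prefix="reiserfs_"):
    """Return the frames of one report whose function name starts with prefix."""
    return [fn for fn in report["frames"] if fn.startswith(prefix)]


# Shortened excerpt of the log above, one message per line.
SAMPLE = """\
Jan 5 02:25:36 opal kernel: INFO: task mysqld:11387 blocked for more than 120 seconds.
Jan 5 02:25:36 opal kernel: Call Trace:
Jan 5 02:25:36 opal kernel: [<ffffffff8118cd81>] ? queue_log_writer+0x91/0xe0
Jan 5 02:25:36 opal kernel: [<ffffffff81192e10>] ? reiserfs_commit_for_inode+0x140/0x230
Jan 5 02:25:36 opal kernel: [<ffffffff81179e87>] ? reiserfs_sync_file+0x97/0x120
Jan 5 02:25:36 opal kernel: [<ffffffff811290b1>] ? do_fsync+0x31/0x70
Jan 5 02:25:36 opal kernel: INFO: task kworker/1:1:27685 blocked for more than 120 seconds.
Jan 5 02:25:36 opal kernel: Call Trace:
Jan 5 02:25:36 opal kernel: [<ffffffff8118f3f6>] ? do_journal_end+0x1d6/0xf00
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff00>] ? reiserfs_sync_fs+0x50/0x70
Jan 5 02:25:36 opal kernel: [<ffffffff8117ff5e>] ? flush_old_commits+0x3e/0x60
"""

if __name__ == "__main__":
    for report in summarise_hung_tasks(SAMPLE):
        fs = ", ".join(filesystem_frames(report)) or "none"
        print(f"{report['name']} (pid {report['pid']}): {fs}")
```

Run against the excerpt, this lists both blocked tasks together with the
reiserfs_ functions that appear in their traces; it only summarises what the
log already says and does not diagnose the cause.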

I think it only occurs when I am using the machine in graphical mode (NVidia
binary drivers), but I am not positive. Assuming some corruption after the disk
restore, I have rebuilt the system and built a new kernel, but it makes no
difference. The only sure thing is that this never happened before the new
disk. The trace-backs do seem to indicate that it is trying to write, but I did
manage to write a small file to a partition, so the file-system seems OK. Once
this happens the system is toast: the sync, reboot and umount commands just
hang, and only Alt-SysRq-B does anything. I would be grateful for any
suggestions!

TIA
-Robin
--
----------------------------------------------------------------------
Robin Atwood.

"Ship me somewheres east of Suez, where the best is like the worst,
Where there ain't no Ten Commandments an' a man can raise a thirst"
from "Mandalay" by Rudyard Kipling
----------------------------------------------------------------------

Replies

Subject Author
Re: [gentoo-user] Processes hang - system dies Adam Carter <adamcarter3@×××××.com>