Hey all,

My desktop system has an NVidia graphics card that identifies as:

% lspci -v
# snip...
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GTX 650] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Gigabyte Technology Co., Ltd GK107 [GeForce GTX 650]
        Flags: bus master, fast devsel, latency 0, IRQ 29
        Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Memory at f0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at e000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] #19
        Kernel driver in use: nouveau

From time to time, maybe once or twice a week, my system will fail. The
symptoms are:

- Graphics freeze, no mouse movement, and they never start working no
  matter how long I wait
- Sound is working (spotify keeps playing)
- Network connectivity works (I can ssh in)

When this happens and I ssh in and check out dmesg, I always see an error
like the following:

[11741.905192] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[11741.905202] nouveau 0000:01:00.0: fifo: gr engine fault on channel 10, recovering...
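
To keep a record of these errors from one freeze to the next, the matching
lines can be filtered out of the kernel log over ssh. This is only a minimal
sketch: the helper names filter_nouveau_fifo and count_nouveau_fifo are made
up for illustration, and the pattern is guessed from the two log lines above,
so other nouveau messages may need a wider match.

```shell
# Hypothetical helpers (names are illustrative, not part of any tool).
# Both read kernel log text on stdin.

# Print only the nouveau fifo lines, e.g.
#   "[...] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]"
# "|| true" keeps "no match" (grep exit status 1) from counting as a failure.
filter_nouveau_fifo() {
    grep -E 'nouveau [0-9a-f:.]+: fifo: ' || true
}

# Print how many nouveau fifo lines were seen (prints 0 when there are none).
count_nouveau_fifo() {
    grep -cE 'nouveau [0-9a-f:.]+: fifo: ' || true
}

# Usage on the affected machine after a freeze:
#   dmesg | filter_nouveau_fifo
#   dmesg | count_nouveau_fifo
```

Reading dmesg may require root on systems that restrict access to the kernel
log.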

Sometimes I see a lot of those errors, sometimes just one. Whenever the
system is running normally those don't ever appear. I'm always able to ssh
in and reboot cleanly.

Does anyone have any idea where I can start digging in to find out what's
happening? Are these fifo errors happening in some logic that I can
disable with a kernel command line option?

Thanks,
Devrin