Gentoo Archives: gentoo-user

From: hw <hw@×××××.de>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Networking trouble
Date: Thu, 15 Oct 2015 13:30:19
Message-Id: 561FAA59.5080707@gc-24.de
Hi,

I have a Xen host with some HV guests which becomes unreachable via
the network after apparently random amounts of time. I have already
switched the network card to see if that would make a difference,
and with the card currently installed, it worked fine for over 20 days
until it became unreachable again. Before switching the network card,
it would run a week or two before becoming unreachable. The previous
card was the on-board BCM5764M which uses the tg3 driver.

There are messages like this in the log file:


Oct 14 20:58:02 moonflo kernel: ------------[ cut here ]------------
Oct 14 20:58:02 moonflo kernel: WARNING: CPU: 10 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x259/0x270()
Oct 14 20:58:02 moonflo kernel: NETDEV WATCHDOG: enp55s4 (r8169): transmit queue 0 timed out
Oct 14 20:58:02 moonflo kernel: Modules linked in: arc4 ecb md4 hmac nls_utf8 cifs fscache xt_physdev br_netfilter iptable_filter ip_tables xen_pciback xen_gntalloc xen_gntdev bridge stp llc zfs(PO) nouveau snd_hda_codec_realtek snd_hda_codec_generic zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate video backlight drm_kms_helper ttm snd_hda_intel snd_hda_controller snd_hda_codec snd_pcm snd_timer snd soundcore r8169 mii xts aesni_intel glue_helper lrw gf128mul ablk_helper cryptd aes_x86_64 sha256_generic hid_generic usbhid uhci_hcd usb_storage ehci_pci ehci_hcd usbcore usb_common
Oct 14 20:58:02 moonflo kernel: CPU: 10 PID: 0 Comm: swapper/10 Tainted: P O 4.0.5-gentoo #3
Oct 14 20:58:02 moonflo kernel: Hardware name: Hewlett-Packard HP Z800 Workstation/0AECh, BIOS 786G5 v03.57 07/15/2013
Oct 14 20:58:02 moonflo kernel: ffffffff8175a77d ffff880124d43d98 ffffffff814da8d8 0000000000000001
Oct 14 20:58:02 moonflo kernel: ffff880124d43de8 ffff880124d43dd8 ffffffff81088850 ffff880124d43dd8
Oct 14 20:58:02 moonflo kernel: 0000000000000000 ffff8800d45f2000 0000000000000001 ffff8800d5294880
Oct 14 20:58:02 moonflo kernel: Call Trace:
Oct 14 20:58:02 moonflo kernel: <IRQ> [<ffffffff814da8d8>] dump_stack+0x45/0x57
Oct 14 20:58:02 moonflo kernel: [<ffffffff81088850>] warn_slowpath_common+0x80/0xc0
Oct 14 20:58:02 moonflo kernel: [<ffffffff810888d1>] warn_slowpath_fmt+0x41/0x50
Oct 14 20:58:02 moonflo kernel: [<ffffffff812b31c5>] ? add_interrupt_randomness+0x35/0x1e0
Oct 14 20:58:02 moonflo kernel: [<ffffffff8145b819>] dev_watchdog+0x259/0x270
Oct 14 20:58:02 moonflo kernel: [<ffffffff8145b5c0>] ? dev_graft_qdisc+0x80/0x80
Oct 14 20:58:02 moonflo kernel: [<ffffffff8145b5c0>] ? dev_graft_qdisc+0x80/0x80
Oct 14 20:58:02 moonflo kernel: [<ffffffff810d4047>] call_timer_fn.isra.30+0x17/0x70
Oct 14 20:58:02 moonflo kernel: [<ffffffff810d42a6>] run_timer_softirq+0x176/0x2b0
Oct 14 20:58:02 moonflo kernel: [<ffffffff8108bd0a>] __do_softirq+0xda/0x1f0
Oct 14 20:58:02 moonflo kernel: [<ffffffff8108c04e>] irq_exit+0x7e/0xa0
Oct 14 20:58:02 moonflo kernel: [<ffffffff8130e075>] xen_evtchn_do_upcall+0x35/0x50
Oct 14 20:58:02 moonflo kernel: [<ffffffff814e1e8e>] xen_do_hypervisor_callback+0x1e/0x40
Oct 14 20:58:02 moonflo kernel: <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Oct 14 20:58:02 moonflo kernel: [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Oct 14 20:58:02 moonflo kernel: [<ffffffff810459e0>] ? xen_safe_halt+0x10/0x20
Oct 14 20:58:02 moonflo kernel: [<ffffffff81053979>] ? default_idle+0x9/0x10
Oct 14 20:58:02 moonflo kernel: [<ffffffff810542da>] ? arch_cpu_idle+0xa/0x10
Oct 14 20:58:02 moonflo kernel: [<ffffffff810bd170>] ? cpu_startup_entry+0x190/0x2f0
Oct 14 20:58:02 moonflo kernel: [<ffffffff81047cd5>] ? cpu_bringup_and_idle+0x25/0x40
Oct 14 20:58:02 moonflo kernel: ---[ end trace 98d961bae351244d ]---
Oct 14 20:58:02 moonflo kernel: r8169 0000:37:04.0 enp55s4: link up


After that, there are lots of messages about the link being up, one message
every 12 seconds. When you unplug the network cable, you get a message that
the link is down, and no message when you plug it in again.
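
To check whether the watchdog timeout and the repeating "link up" messages
follow the same pattern each time, the log can be summarized with a small
script. This is only a rough sketch: the function names, the default
interface name and the assumed year are illustrative choices, and the line
format is taken from the excerpt above (syslog timestamps carry no year).

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   "Oct 14 20:58:02 moonflo kernel: r8169 0000:37:04.0 enp55s4: link up"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: (.*)$")


def parse_time(stamp, year=2015):
    """Parse a syslog timestamp; the year is an assumption, not in the log."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def summarize(lines, iface="enp55s4"):
    """Return (watchdog timeout times, link-up times, gaps in seconds)."""
    timeouts = []
    link_ups = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue
        when = parse_time(match.group(1))
        message = match.group(2)
        if "NETDEV WATCHDOG" in message and iface in message:
            timeouts.append(when)
        elif message.endswith(f"{iface}: link up"):
            link_ups.append(when)
    # Seconds between consecutive "link up" messages.
    gaps = [(b - a).total_seconds() for a, b in zip(link_ups, link_ups[1:])]
    return timeouts, link_ups, gaps
```

Passing an open log file to `summarize()` gives the timeout timestamps and
the spacing of the "link up" messages, which makes it easy to compare one
outage with the next.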

I was hoping that switching the network card (to one that uses a different
driver) might solve the problem, but it did not. Now I can only guess that
the network card goes to sleep and sometimes cannot be woken up again.

I tried reducing the connection speed to 100Mbit and found that accessing the VMs
(via RDP) becomes too slow to use them. So I disabled the power management of the
network card (through sysfs) and will have to see if the problem persists.
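
For reference, a minimal sketch of one way such a sysfs change can be made.
It assumes the attribute in question is the device's runtime power
management file `power/control`, where the kernel accepts `on` (keep the
device powered) and `auto` (allow runtime suspend). Whether that is the
attribute that matters here is an assumption, and the function name and the
`sysfs_root` parameter are made up for illustration.

```python
from pathlib import Path


def disable_runtime_pm(iface, sysfs_root="/sys"):
    """Write "on" to the interface's power/control file.

    Returns the previous value so that it can be restored later.
    sysfs_root is a parameter only to allow testing against a fake tree.
    """
    control = (
        Path(sysfs_root) / "class" / "net" / iface / "device" / "power" / "control"
    )
    previous = control.read_text().strip()
    # "on" keeps the device powered; "auto" would allow runtime suspend.
    control.write_text("on\n")
    return previous
```

Called as `disable_runtime_pm("enp55s4")` with root privileges, this would
change the setting until the next reboot; it is not persistent.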

We'll be getting decent network cards in a couple of days, but since the problem
doesn't seem to be related to a particular card/model/manufacturer, that might
not fix it, either.

This problem seems to occur only on machines that operate as a Xen server.
Other machines, identical Z800s not running Xen, run just fine.

What would you suggest?

Replies

Subject Author
Re: [gentoo-user] Networking trouble "J. Roeleveld" <joost@××××××××.org>