Gentoo Archives: gentoo-user

From: "J. Roeleveld" <joost@××××××××.org>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Networking trouble
Date: Thu, 15 Oct 2015 13:54:18
Message-Id: 1637330.AMsFmt32R0@andromeda
In Reply to: [gentoo-user] Networking trouble by hw
1 On Thursday, October 15, 2015 03:30:01 PM hw wrote:
2 > Hi,
3 >
4 > I have a xen host with some HV guests which becomes unreachable via
5 > the network after an apparently random amount of time. I have already
6 > switched the network card to see if that would make a difference,
7 > and with the card currently installed, it worked fine for over 20 days
8 > until it became unreachable again. Before switching the network card,
9 > it would run a week or two before becoming unreachable. The previous
10 > card was the on-board BCM5764M which uses the tg3 driver.
11 >
12 > There are messages like this in the log file:
13 >
14 >
15 > Oct 14 20:58:02 moonflo kernel: ------------[ cut here ]------------
16 > Oct 14 20:58:02 moonflo kernel: WARNING: CPU: 10 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x259/0x270()
17 > Oct 14 20:58:02 moonflo kernel: NETDEV WATCHDOG: enp55s4 (r8169): transmit queue 0 timed out
18 > Oct 14 20:58:02 moonflo kernel: Modules linked in: arc4 ecb md4 hmac nls_utf8 cifs fscache xt_physdev br_netfilter iptable_filter ip_tables xen_pciback xen_gntalloc xen_gntdev bridge stp llc zfs(PO) nouveau snd_hda_codec_realtek snd_hda_codec_generic zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate video backlight drm_kms_helper ttm snd_hda_intel snd_hda_controller snd_hda_codec snd_pcm snd_timer snd soundcore r8169 mii xts aesni_intel glue_helper lrw gf128mul ablk_helper cryptd aes_x86_64 sha256_generic hid_generic usbhid uhci_hcd usb_storage ehci_pci ehci_hcd usbcore usb_common
19 > Oct 14 20:58:02 moonflo kernel: CPU: 10 PID: 0 Comm: swapper/10 Tainted: P O 4.0.5-gentoo #3
20 > Oct 14 20:58:02 moonflo kernel: Hardware name: Hewlett-Packard HP Z800 Workstation/0AECh, BIOS 786G5 v03.57 07/15/2013
21 > Oct 14 20:58:02 moonflo kernel: ffffffff8175a77d ffff880124d43d98 ffffffff814da8d8 0000000000000001
22 > Oct 14 20:58:02 moonflo kernel: ffff880124d43de8 ffff880124d43dd8 ffffffff81088850 ffff880124d43dd8
23 > Oct 14 20:58:02 moonflo kernel: 0000000000000000 ffff8800d45f2000 0000000000000001 ffff8800d5294880
24 > Oct 14 20:58:02 moonflo kernel: Call Trace:
25 > Oct 14 20:58:02 moonflo kernel: <IRQ> [<ffffffff814da8d8>] dump_stack+0x45/0x57
26 > Oct 14 20:58:02 moonflo kernel: [<ffffffff81088850>] warn_slowpath_common+0x80/0xc0
27 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810888d1>] warn_slowpath_fmt+0x41/0x50
28 > Oct 14 20:58:02 moonflo kernel: [<ffffffff812b31c5>] ? add_interrupt_randomness+0x35/0x1e0
29 > Oct 14 20:58:02 moonflo kernel: [<ffffffff8145b819>] dev_watchdog+0x259/0x270
30 > Oct 14 20:58:02 moonflo kernel: [<ffffffff8145b5c0>] ? dev_graft_qdisc+0x80/0x80
31 > Oct 14 20:58:02 moonflo kernel: [<ffffffff8145b5c0>] ? dev_graft_qdisc+0x80/0x80
32 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810d4047>] call_timer_fn.isra.30+0x17/0x70
33 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810d42a6>] run_timer_softirq+0x176/0x2b0
34 > Oct 14 20:58:02 moonflo kernel: [<ffffffff8108bd0a>] __do_softirq+0xda/0x1f0
35 > Oct 14 20:58:02 moonflo kernel: [<ffffffff8108c04e>] irq_exit+0x7e/0xa0
36 > Oct 14 20:58:02 moonflo kernel: [<ffffffff8130e075>] xen_evtchn_do_upcall+0x35/0x50
37 > Oct 14 20:58:02 moonflo kernel: [<ffffffff814e1e8e>] xen_do_hypervisor_callback+0x1e/0x40
38 > Oct 14 20:58:02 moonflo kernel: <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
39 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
40 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810459e0>] ? xen_safe_halt+0x10/0x20
41 > Oct 14 20:58:02 moonflo kernel: [<ffffffff81053979>] ? default_idle+0x9/0x10
42 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810542da>] ? arch_cpu_idle+0xa/0x10
43 > Oct 14 20:58:02 moonflo kernel: [<ffffffff810bd170>] ? cpu_startup_entry+0x190/0x2f0
44 > Oct 14 20:58:02 moonflo kernel: [<ffffffff81047cd5>] ? cpu_bringup_and_idle+0x25/0x40
45 > Oct 14 20:58:02 moonflo kernel: ---[ end trace 98d961bae351244d ]---
46 > Oct 14 20:58:02 moonflo kernel: r8169 0000:37:04.0 enp55s4: link up
63 >
64 >
65 > After that, there are lots of messages about the link being up, one message
66 > every 12 seconds. When you unplug the network cable, you get a message that
67 > the link is down, and no message when you plug it in again.
68 >
69 > I was hoping that switching the network card (to one that uses a different
70 > driver) might solve the problem, and it did not. Now I can only guess that
71 > the network card goes to sleep and sometimes cannot be woken up again.
72 >
73 > I tried to reduce the connection speed to 100Mbit and found that accessing
74 > the VMs (via RDP) becomes too slow to use them. So I disabled the power
75 > management of the network card (through sysfs) and will have to see if the
76 > problem persists.
77 >
78 > We'll be getting decent network cards in a couple days, but since the
79 > problem doesn't seem to be related to a particular card/model/manufacturer,
80 > that might not fix it, either.
81 >
82 > This problem seems to only occur on machines that operate as a xen server.
83 > Other machines, identical Z800s, not running xen, run just fine.
84 >
85 > What would you suggest?
86
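
To see how often the transmit timeout quoted above actually occurs, and on which interface and driver, a small log scan may help. This is only a sketch: it assumes the syslog file holds each kernel message on a single line (unlike the re-wrapped quote), and the names `find_watchdog_timeouts` and `count_by_interface` are made up for this example.

```python
import re
import sys
from collections import Counter

# Matches the kernel's transmit-timeout warning, e.g.
# "NETDEV WATCHDOG: enp55s4 (r8169): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_timeouts(lines):
    """Return an (iface, driver, queue) tuple for every timeout found in lines."""
    hits = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            hits.append(
                (match.group("iface"), match.group("driver"), int(match.group("queue")))
            )
    return hits


def count_by_interface(lines):
    """Count transmit timeouts per interface name."""
    return Counter(iface for iface, _driver, _queue in find_watchdog_timeouts(lines))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python watchdog_scan.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for iface, total in count_by_interface(log).items():
            print(f"{iface}: {total} transmit timeout(s)")
```

Running this over the logs of both the old tg3 card and the r8169 card would show whether the timeouts follow the driver or the host.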
87 More info required:
88
89 - Which version of Xen?
90 - Does this only occur with HVM guests?
91 - Which network driver are you using inside the guest?
92 - Can you connect to the "local" console of the guest?
93 - If yes, does it still have no connectivity?
94
95 I saw the same on my lab machine, where it was caused by:
96 - Not using the correct drivers inside HVM guests
97 - Switch hardware not keeping its MAC/IP/port tables long enough
98
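
For the driver question, and to double-check the power-management setting mentioned in the original mail, something like the following could be run on the host or inside a Linux guest. It is a rough sketch: `nic_info` is a hypothetical helper, and it assumes the interface exposes the usual `device/driver` symlink and `device/power/control` attribute under `/sys/class/net`, which may not be the attribute that was actually changed.

```python
import os


def nic_info(iface, sysfs_root="/sys/class/net"):
    """Return the bound driver name and runtime-PM setting for iface.

    Either value is None when the corresponding sysfs entry is missing,
    e.g. for purely virtual interfaces such as bridges.
    """
    device = os.path.join(sysfs_root, iface, "device")

    # The "driver" entry is a symlink to the driver's directory;
    # its last path component is the driver name (e.g. r8169).
    driver = None
    driver_link = os.path.join(device, "driver")
    if os.path.islink(driver_link):
        driver = os.path.basename(os.path.realpath(driver_link))

    # power/control holds "on" (runtime PM disabled) or "auto".
    control = None
    try:
        with open(os.path.join(device, "power", "control")) as attr:
            control = attr.read().strip()
    except OSError:
        pass

    return {"driver": driver, "power_control": control}


if __name__ == "__main__":
    # Report every interface the kernel knows about, if sysfs is available.
    if os.path.isdir("/sys/class/net"):
        for name in sorted(os.listdir("/sys/class/net")):
            print(name, nic_info(name))
```

A `power_control` value of `on` means runtime power management is disabled for that device, while `auto` leaves it to the kernel.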
99 --
100 Joost