Gentoo Archives: gentoo-user

From: Alan McKinnon <alan.mckinnon@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] strange TCP timeout errors
Date: Wed, 07 Oct 2015 22:26:13
Message-Id: 56159BD6.5000500@gmail.com
In Reply to: Re: [gentoo-user] strange TCP timeout errors by brettrsears@gmail.com
1 On 07/10/2015 21:42, brettrsears@×××××.com wrote:
2 > YyyyYYuIIIIIU
3 > Sent from my Verizon Wireless BlackBerry
4
5
6 Hmmmmmmmmmmmmmm, interesting reply. I'm wondering if it has something to
7 do with:
8
9 1. verizon
10 2. dodgy 3g
11 3. crapberry. oops, sorry: blackberry
12
13 Or maybe it's because y, u and i are in a row on the keyboard, shift and
14 enter are adjacent, and you have an over-friendly cat?
15
16 :-)
17
18 >
19 > -----Original Message-----
20 > From: Alan McKinnon <alan.mckinnon@×××××.com>
21 > Date: Wed, 7 Oct 2015 20:39:42
22 > To: <gentoo-user@l.g.o>
23 > Reply-to: gentoo-user@l.g.o
24 > Subject: Re: [gentoo-user] strange TCP timeout errors
25 >
26 > On 07/10/2015 17:55, Grant wrote:
27 >>>>>>> I've attached a PNG from Munin showing the TCP timeout errors on my
28 >>>>>>> Gentoo server over the past month. The data is expressed in timeouts
29 >>>>>>> per second and that rate is shown to be steadily increasing over the
30 >>>>>>> past month. That seems strange to me. Munin doesn't show any other
31 >>>>>>> data point increasing like this over the time period. Any ideas?
32 >>>>>>>
33 >>>>>>> - Grant
34 >>>>>>>
35 >>>>>>
36 >>>>>> weird - does it reset on an interface restart or reboot?
37 >>>>>
38 >>>>> this would be my test #1
39 >>>>
40 >>>>
41 >>>> I rebooted and the rate of errors has dropped off to almost nothing.
42 >>>>
43 >>>>
44 >>>>>> Can you verify it's not an artefact within munin (how?)
45 >>>>>
46 >>>>> In theory, a misconfigured graph can do this. Munin can draw many
47 >>>>> different types of graph, including cumulative values. Even for a data
48 >>>>> type like this which is X events per unit time, if you tell munin to add
49 >>>>> them all up, it will do so and graph it.
50 >>>>>
51 >>>>> Quick test is to look at the graph config.
52 >>>>
53 >>>>
54 >>>> This graph lives in the "network" section of the munin web interface.
55 >>>> There is no matching section in /etc/munin/plugin-conf.d/munin-node so
56 >>>> it should be using the default config.
57 >>>>
58 >>>> Any ideas based on this new info?
59 >>>
60 >>> A few :-)
61 >>>
62 >>>
63 >>> I can't find the plugin that delivers that graph though. Maybe I just
64 >>> don't have it, maybe it comes from contrib/
65 >>>
66 >>> What's your USE for munin?
67 >>
68 >>
69 >> USE="apache cgi http mysql ssl syslog -asterisk -dhcpd -doc -ipmi
70 >> -ipv6 -irc -java -memcached -minimal -postgres (-selinux) {-test}"
71 >>
72 >>
73 >>> What do you have in "ls -al /etc/munin/plugins/" ?
74 >
75 >
76 > It's as I thought - your data is accurate but rrd has been given a
77 > completely wrong method to derive the graphs.
78 >
79 > Munin graphs for section "Network" do not have to be in a file called
80 > "network" - it's just a category and the plugin defines what web-page
81 > section it must be in. In your case, the relevant plugin is
82 > netstat_multi which doesn't often get installed. Its data source is
83 > "netstat -s" so grep that output for "timeout" to see it.
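The "grep for timeout" step above can be sketched in Python as well. This is only an illustration: the function names timeout_lines and read_netstat_stats are made up for this example, and it assumes a netstat binary is on the PATH. The exact wording of the counters varies between systems, so the match is a plain case-insensitive substring test.

```python
import subprocess
from typing import List


def timeout_lines(netstat_output: str) -> List[str]:
    """Return the stripped lines of `netstat -s` output that mention "timeout"."""
    return [
        line.strip()
        for line in netstat_output.splitlines()
        if "timeout" in line.lower()
    ]


def read_netstat_stats() -> str:
    """Run `netstat -s` and return its text output (assumes netstat is installed)."""
    result = subprocess.run(
        ["netstat", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling timeout_lines(read_netstat_stats()) a few minutes apart should show the raw numbers only ever growing, which is what you would expect from cumulative counters.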
84 >
85 > Timeouts are cumulative counters; they do not decrease until they wrap
86 > around. So to scale them, the plugin gets the rrd file to subtract
87 > previous reading from current reading and divide by the time interval to
88 > get the timeouts/sec. This is all done inside rrd when the data files
89 > are updated (it's quite a lot of magic)
90 >
91 > That plugin sets the graph type to DERIVE
92 > (/etc/munin/plugins/netstat_multi, around line 190). I feel it should be
93 > GAUGE or COUNTER.
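To make the difference between the three data source types mentioned above concrete, here is a rough Python model of how each one turns two successive readings into a stored value. It is a simplified sketch based on the rrdcreate documentation, not rrdtool's actual code; the function names are invented for illustration, and it ignores heartbeats, consolidation and unknown-value handling.

```python
from typing import Optional

WRAP_32 = 2 ** 32
WRAP_64 = 2 ** 64


def gauge(prev: int, curr: int, dt: float) -> float:
    """GAUGE: store the current reading as-is, no rate calculation."""
    return float(curr)


def derive(prev: int, curr: int, dt: float,
           minimum: Optional[float] = None) -> Optional[float]:
    """DERIVE: (curr - prev) / dt, with no overflow check.

    A counter reset gives a negative rate; if a minimum is set and the
    rate falls below it, the sample is treated as unknown (None here).
    """
    rate = (curr - prev) / dt
    if minimum is not None and rate < minimum:
        return None
    return rate


def counter(prev: int, curr: int, dt: float) -> float:
    """COUNTER: like DERIVE, but a decrease is assumed to be a wrap-around.

    Tries a 32-bit wrap first, then a 64-bit wrap.
    """
    delta = curr - prev
    if delta < 0:
        delta += WRAP_32
        if delta < 0:
            delta += WRAP_64 - WRAP_32
    return delta / dt
```

With readings of 1000 and then 1300 taken 300 seconds apart, both derive and counter give 1.0 timeouts/sec, while gauge simply stores 1300. The types only diverge when the reading goes down, for example after a reboot: counter treats the drop as a wrap and reports a huge spike, whereas derive with a minimum of 0 discards the sample.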
94 >
95 > The proper reference on rrd is
96 > http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
97 > and the munin docs are
98 > https://munin.readthedocs.org/en/latest/index.html
99 >
100 > You must edit the plugin file and, IIRC, recreate the rrd; you will lose
101 > all past info (can't be helped).
102 >
103 >
104 > [snip ls output]
105 >
106 >
107 >> P.S. Any other good plugins you'd recommend?
108 >
109 > http://gallery.munin-monitoring.org/
110 >
111 > Monitoring is highly site-specific so recommendations aren't usually
112 > worth much, but that gallery has LOTS of contributed plugins.
113 >
114
115
116 --
117 Alan McKinnon
118 alan.mckinnon@×××××.com