Gentoo Archives: gentoo-user

From: brettrsears@×××××.com
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] strange TCP timeout errors
Date: Wed, 07 Oct 2015 19:42:28
Message-Id: 588564942-1444246926-cardhu_decombobulator_blackberry.rim.net-2145807130-@b25.c2.bise6.blackberry
In Reply to: Re: [gentoo-user] strange TCP timeout errors by Alan McKinnon
3
4 -----Original Message-----
5 From: Alan McKinnon <alan.mckinnon@×××××.com>
6 Date: Wed, 7 Oct 2015 20:39:42
7 To: <gentoo-user@l.g.o>
8 Reply-to: gentoo-user@l.g.o
9 Subject: Re: [gentoo-user] strange TCP timeout errors
10
11 On 07/10/2015 17:55, Grant wrote:
12 >>>>>> I've attached a PNG from Munin showing the TCP timeout errors on my
13 >>>>>> Gentoo server over the past month. The data is expressed in timeouts
14 >>>>>> per second and that rate is shown to be steadily increasing over the
15 >>>>>> past month. That seems strange to me. Munin doesn't show any other
16 >>>>>> data point increasing like this over the time period. Any ideas?
17 >>>>>>
18 >>>>>> - Grant
19 >>>>>>
20 >>>>>
21 >>>>> weird - does it reset on an interface restart or reboot?
22 >>>>
23 >>>> this would be my test #1
24 >>>
25 >>>
26 >>> I rebooted and the rate of errors has dropped off to almost nothing.
27 >>>
28 >>>
29 >>>>> Can you verify it's not an artefact within munin (how?)
30 >>>>
31 >>>> In theory, a misconfigured graph can do this. Munin can draw many
32 >>>> different types of graph, including cumulative values. Even for a data
33 >>>> type like this which is X events per unit time, if you tell munin to add
34 >>>> them all up, it will do so and graph it.
35 >>>>
36 >>>> A quick test is to look at the graph config.
37 >>>
38 >>>
39 >>> This graph lives in the "network" section of the munin web interface.
40 >>> There is no matching section in /etc/munin/plugin-conf.d/munin-node so
41 >>> it should be using the default config.
42 >>>
43 >>> Any ideas based on this new info?
44 >>
45 >> A few :-)
46 >>
47 >>
48 >> I can't find the plugin that delivers that graph though. Maybe I just
49 >> don't have it, maybe it comes from contrib/
50 >>
51 >> What's your USE for munin?
52 >
53 >
54 > USE="apache cgi http mysql ssl syslog -asterisk -dhcpd -doc -ipmi
55 > -ipv6 -irc -java -memcached -minimal -postgres (-selinux) {-test}"
56 >
57 >
58 >> What do you have in "ls -al /etc/munin/plugins/" ?
59
60
61 It's as I thought - your data is accurate but rrd has been given a
62 completely wrong method to derive the graphs.
63
64 Munin graphs for section "Network" do not have to be in a file called
65 "network" - it's just a category and the plugin defines what web-page
66 section it must be in. In your case, the relevant plugin is
67 netstat_multi, which doesn't often get installed. Its data source is
68 "netstat -s", so grep that output for "timeout" to see it.
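
As a rough illustration of the grep step above, here is a minimal Python sketch. It assumes `netstat` (from net-tools) is on the PATH; the helper names `timeout_lines` and `netstat_stats` are made up for this example and are not part of munin or the netstat_multi plugin.

```python
import subprocess


def timeout_lines(stats_text):
    """Return the lines of `netstat -s` style text that mention "timeout"."""
    return [
        line.strip()
        for line in stats_text.splitlines()
        if "timeout" in line.lower()  # case-insensitive, like `grep -i timeout`
    ]


def netstat_stats():
    """Run `netstat -s` and return its output (assumes netstat is installed)."""
    result = subprocess.run(
        ["netstat", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `timeout_lines(netstat_stats())` on the server should list the same counters that the plugin reads; running it twice a few minutes apart shows how fast they are growing.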
69
70 Timeouts are cumulative counters; they never decrease until they wrap
71 around. So to scale them, the plugin gets the rrd file to subtract the
72 previous reading from the current reading and divide by the time interval
73 to get the timeouts/sec. This is all done inside rrd when the data files
74 are updated (it's quite a lot of magic).
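
The subtract-and-divide step described above can be sketched in a few lines of Python. This only illustrates the arithmetic and is not rrdtool's actual code; the function name `counter_rate`, the `wrap` parameter, and the sample readings are invented for the example.

```python
def counter_rate(prev_value, curr_value, interval_seconds, wrap=2**32):
    """Turn two readings of a cumulative counter into events per second.

    `wrap` is the assumed counter size; if the current reading is lower
    than the previous one, the counter is assumed to have wrapped once.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        delta += wrap  # assume a single wrap-around
    return delta / interval_seconds


# Hypothetical readings taken 300 seconds apart.
print(counter_rate(1200, 1500, 300))  # 1.0
print(counter_rate(2**32 - 100, 200, 300))  # 1.0
```

In this sketch a counter reset (for example after a reboot) is indistinguishable from a wrap, so it would show up as one spuriously large rate.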
75
76 That plugin sets the graph type to DERIVE
77 (/etc/munin/plugins/netstat_multi, around line 190). I feel it should be
78 GAUGE or COUNTER.
79
80 The proper reference on rrd is
81 http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
82 and the munin docs are
83 https://munin.readthedocs.org/en/latest/index.html
84
85 You must edit the plugin file and, IIRC, recreate the rrd; you will lose
86 all past info (can't be helped).
87
88
89 [snip ls output]
90
91
92 > P.S. Any other good plugins you'd recommend?
93
94 http://gallery.munin-monitoring.org/
95
96 Monitoring is highly site-specific so recommendations aren't usually
97 worth much, but that gallery has LOTS of contributed plugins.
98
99 --
100 Alan McKinnon
101 alan.mckinnon@×××××.com

Replies

Subject Author
Re: [gentoo-user] strange TCP timeout errors Alan McKinnon <alan.mckinnon@×××××.com>