Gentoo Archives: gentoo-user

From: Mark Seger <mjseger@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] SAR vs collectl
Date: Sat, 01 Sep 2012 12:01:12
Message-Id: CAC2B=ZHVRC-BYq1Tou0XygMCUNrGsy8seDMFKA_TbdzfOOS0Yw@mail.gmail.com
I just discovered the conversation about collectl, saw it in a list
archive, and thought I'd jump in.
When I first wrote collectl over 10 years ago, we felt we needed a more
powerful/flexible tool than sar to work with our High Performance customers
at HP. For example, we needed to record a lot more types of information
than sar does, such as Infiniband and Lustre File System statistics. How
about IPMI data such as temperatures or fan speeds? Power consumption?
Anybody remember the Quadrics interconnect? Collectl does that too, but
there's a whole lot more to collectl than just the types of data it
collects.

Rather than repeating what's on the website,
http://collectl.sourceforge.net/, I'll let you read about the features
yourselves. Suffice it to say it runs on some of the world's largest
clusters, sampling hundreds of data points every 10 seconds while using
< 0.1% of a CPU.

On top of that, there are 2 utilities that make it even more useful:
http://collectl-utils.sourceforge.net/. colplot lets you produce high
resolution plots for dozens (or more) of nodes via a browser. colmux allows
you to monitor hundreds of nodes in real time from a single window, much
like top. But unlike top, which only shows top processes, colmux can do
that as well as show top-anything! At least anything collectl can report.
For example, if you had dozens of servers, each with dozens of disks, you
can use colmux to find the disks with the longest wait time. Or how about
the systems with the highest temps?
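
To make the top-anything idea concrete, here is a minimal Python sketch of
just the ranking step: pool samples from many nodes and sort on whichever
column you care about. This is not how colmux is implemented, and it does
not read collectl output. The sample tuples, the wait values, and the
top_n helper are all made up for illustration.

```python
import heapq

# Hypothetical per-disk samples as (host, disk, wait_ms).
# The values are invented; they are not real collectl output.
samples = [
    ("node01", "sda", 1.2),
    ("node01", "sdb", 14.7),
    ("node02", "sda", 0.4),
    ("node02", "sdc", 9.3),
    ("node03", "sdb", 22.1),
]


def top_n(rows, column, n=3):
    """Return the n rows with the largest value in the given column."""
    return heapq.nlargest(n, rows, key=lambda row: row[column])


# Rank every disk on every node by wait time (column 2), worst first.
for host, disk, wait_ms in top_n(samples, column=2):
    print(f"{host:8} {disk:4} {wait_ms:6.1f}")
```

Pointing column at a different field (say, a temperature reading) gives
"highest temps" from the same function, which is the whole point of
top-anything.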

Anyhow, check it out and see for yourself.

-mark