From: AIXpert Blog

 

A large UK customer is running a series of tests on AIX and POWER7 machines. They wanted to display the results on a webserver using rrdtool and needed help getting a working example running. So I stepped up to the challenge one night, rather than let it become a search for rrdtool skills inside IBM, when I could sort something out in a few hours. No one in IBM would claim to be an rrdtool expert until they knew the details of what was required. Many people know it well enough to get what they want done but would not put "rrdtool guru" on their CV.
 
rrdtool is a fantastically brilliant command to have in your toolbox. Up there with awk, grep, sed, Apache, ksh, and nmon (of course). It saves data in a fixed-size "database", consolidates older data in cascading summaries to keep the data volume down, can extract the data for any period, and can quickly generate impressive .gif graphs from the data, which are perfect for displaying on a webserver.
 
So the challenge: While the test runs for "some period" like an hour, save the vmstat data and then graph it.
 
 Part one - create a suitable rrdtool database for the vmstat output
 
As a reminder this is what default vmstat output looks like on a current AIX version
$ vmstat 1 4

System configuration: lcpu=16 mem=8192MB ent=2.00

kthr    memory              page              faults              cpu
----- ----------- ------------------------ ------------ -----------------------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
 1  0 709239  4276   0   0   0   0    0   0  31 14401 361 31  1 68  0  1.01  50.6
 1  0 709239  4276   0   0   0   0    0   0  42 15786 456 31  1 68  0  1.01  50.4
 1  0 709146  4368   0   0   0   0    0   0  16 8989 207 31  1 67  0  1.04  51.9
 1  0 709146  4368   0   0   0   0    0   0 117 16944 627 31  1 68  0  1.02  50.9
 I had never noticed before, but there are two "sy" columns, so here we use "sc" for the first "sy" (system calls) column and keep "sy" for the system utilisation column. We are going to use the same column names as the vmstat command to make this easier to understand, although they are very short. When we graph the stats we can spell out their full names.
 Rather than use rrdtool's power to summarise data, we are going to use a sledgehammer and simply save 100,000 samples at one sample per second, which is a little over 27 hours (100,000 / 3,600 is about 27.8).
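 As a quick check of that figure, the retention period is just the number of rows multiplied by the step. A minimal sketch; the variable names are purely illustrative:

```shell
# Retention covered by the archive = rows x step
ROWS=100000   # samples kept in the database
STEP=1        # seconds between samples

# awk does the floating point division that plain shell arithmetic cannot
HOURS=$(awk -v rows="$ROWS" -v step="$STEP" 'BEGIN { printf "%.1f", rows * step / 3600 }')
echo "$ROWS samples at $STEP second intervals cover about $HOURS hours"
```

 Changing STEP to 10 would stretch the same 100,000 rows to roughly 278 hours, which is the trade-off to think about for longer tests.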
 
Here is the "rrdtool create" command:
rrdtool create vmstat.rrd --step 1  \
DS:r:GAUGE:5:U:U \
DS:b:GAUGE:5:U:U \
DS:avm:GAUGE:5:U:U \
DS:fre:GAUGE:5:U:U \
DS:re:GAUGE:5:U:U \
DS:pi:GAUGE:5:U:U \
DS:po:GAUGE:5:U:U \
DS:fr:GAUGE:5:U:U \
DS:sr:GAUGE:5:U:U \
DS:cy:GAUGE:5:U:U \
DS:in:GAUGE:5:U:U \
DS:sc:GAUGE:5:U:U \
DS:cs:GAUGE:5:U:U \
DS:us:GAUGE:5:U:U \
DS:sy:GAUGE:5:U:U \
DS:id:GAUGE:5:U:U \
DS:wa:GAUGE:5:U:U \
DS:pc:GAUGE:5:U:U \
DS:ec:GAUGE:5:U:U \
RRA:AVERAGE:0.5:1:100000
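 You can look at the rrdtool manual pages for the details, but we are basically turning off all the fancy features. Each DS line defines one data source as name:type:heartbeat:min:max. GAUGE stores the value as given, 5 is the heartbeat (the longest gap in seconds between updates before the value is treated as unknown), and U:U means no minimum or maximum limit. The RRA line keeps 100,000 rows, each built from a single one-second step, so no summarising takes place.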
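 Since all nineteen DS lines follow the same pattern, the command can also be built from a list of column names. This is only a sketch of that idea: the COLS, DS_ARGS and CREATE_CMD names are my own, and the command is printed rather than run so it can be checked first.

```shell
# Column names in vmstat order; "sc" stands in for the first "sy" (system calls)
COLS="r b avm fre re pi po fr sr cy in sc cs us sy id wa pc ec"

DS_ARGS=""
for c in $COLS; do
    # name : type : heartbeat in seconds : min : max (U = no limit)
    DS_ARGS="$DS_ARGS DS:$c:GAUGE:5:U:U"
done

# Printed rather than executed so the result can be inspected first
CREATE_CMD="rrdtool create vmstat.rrd --step 1$DS_ARGS RRA:AVERAGE:0.5:1:100000"
echo "$CREATE_CMD"
```

 Once the printed command looks right, it can be pasted into the shell or the echo can be replaced by running the command directly.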
 
 Part two - saving vmstat output in rrdtool format
 
"rrdtool update" is used to put data into the database, and the command to add one row of vmstat data needs to look like this:
rrdtool update vmstat.rrd 1354235156:1:0:706069:34785:0:0:0:0:0:0:35:3600:566:0:0:99:0:0.03:1.7
The first number, 1354235156, is the number of seconds since the epoch, i.e. 1st Jan 1970. Obvious really!! The rest is a colon-separated list of the stats from vmstat. Fortunately, the UNIX date command can give you the current time in this seconds format using:
 $ date +%s
1355307792
So here is how you change vmstat data to rrdtool update commands for an hour (3600 seconds):
 TIME=`date +%s`
 vmstat 1 3600 | awk -v time=$TIME '/^.[0-9]/{ n++; print "rrdtool update vmstat.rrd "time+n":" $1 ":" $2 ":" $3 ":" $4 ":" $5 ":" $6 ":" $7 ":" $8 ":" $9 ":" $10 ":" $11 ":" $12 ":" $13 ":" $14 ":" $15 ":" $16 ":" $17 ":" $18 ":" $19 }' >vmstat.output
ENDTIME=`date +%s` 
 Awk is very good at this sort of thing:
  • We pass the Korn shell variable into the awk variable with -v time=$TIME
  • We keep only the data lines, whose second character is a digit, using /^.[0-9]/{ ... } which skips the header lines
  • We use a counter "n", via n++ and "time+n", so that each line of output has a timestamp one second later than the previous line
  • The rest is just formatting to colon separated.
Note: we save the start and end times (TIME and ENDTIME) for graphing, so that we can extract the right period of time from the database.
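 To try the awk filter without waiting for vmstat, it can be fed a few canned lines. The sketch below uses a fixed start time and two lines from the sample output earlier. It also joins the fields with a loop over NF instead of spelling out $1 to $19, which is my variation and assumes every data line carries exactly the nineteen columns.

```shell
# Canned vmstat lines (header plus two data lines) so no real vmstat run is needed
sample=' r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
 1  0 709239  4276   0   0   0   0    0   0  31 14401 361 31  1 68  0  1.01  50.6
 1  0 709239  4276   0   0   0   0    0   0  42 15786 456 31  1 68  0  1.01  50.4'

TIME=1354235156   # fixed start time so the result is repeatable

updates=$(printf '%s\n' "$sample" | awk -v time="$TIME" '
/^.[0-9]/ {                      # data lines only: second character is a digit
    n++
    line = sprintf("%d", time + n)           # timestamp, one second per sample
    for (i = 1; i <= NF; i++) line = line ":" $i
    print "rrdtool update vmstat.rrd " line
}')
printf '%s\n' "$updates"
```

 The header line is skipped and the two data lines come out as update commands stamped 1354235157 and 1354235158.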
 
 Part three - loading the data
We have the rrdtool commands in the vmstat.output file that we just created, so just run the file through a Korn shell:
ksh <./vmstat.output 
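 Before loading a long capture it can be worth checking the generated file, because rrdtool update rejects a row whose timestamp is not later than the previous one and expects one value per data source. The check_updates function below is a hypothetical helper of my own, not part of rrdtool, and it assumes the exact line layout produced in part two.

```shell
# Hypothetical helper: sanity-check a file of generated "rrdtool update" lines.
# Assumes each line looks like: rrdtool update vmstat.rrd <time>:<19 values>
check_updates() {
    awk '
    BEGIN { bad = 0 }
    {
        nf = split($4, f, ":")               # $4 is the time:value:value... part
        if (nf != 20) {
            printf "line %d: expected 20 colon-separated values, found %d\n", NR, nf
            bad = 1
        }
        if (NR > 1 && f[1] + 0 <= prev) {
            printf "line %d: timestamp does not increase\n", NR
            bad = 1
        }
        prev = f[1] + 0
    }
    END { exit bad }' "$1"
}

# Small demonstration with a two-line sample file
cat > sample.output <<'EOF'
rrdtool update vmstat.rrd 1354235157:1:0:709239:4276:0:0:0:0:0:0:31:14401:361:31:1:68:0:1.01:50.6
rrdtool update vmstat.rrd 1354235158:1:0:709239:4276:0:0:0:0:0:0:42:15786:456:31:1:68:0:1.01:50.4
EOF
if check_updates sample.output; then
    echo "sample.output looks loadable"
fi
```

 For the real run the same call would be check_updates vmstat.output, followed by the ksh load only if it succeeds.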
 Part four - generating the graph files
 This does get a little tricky as there are loads of options, but here are a few worked examples.
 First a simple graph of the Physical CPU consumed - this assumes we are talking about a Shared CPU virtual machine (logical partition):
rrdtool graph physical_consumed.gif \
--title "Physical CPU Consumed" \
--vertical-label "CPUs" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:pc=vmstat.rrd:pc:AVERAGE LINE2:pc#00FF00:"Physical Consumed" 
Notes:
  • The top line is the command and the name of the file to generate.
  • --title and --vertical-label, as you might guess, are the top and left labels on the graph.
  • --height is the height of the graph in pixels, chosen so that it displays well on a website.
  • Then we have the start and end time in seconds. In this case it is all the stats in the database, but you could change these to pick out more interesting periods from the available data.
  • The last line is complex ...
  • DEF:pc=vmstat.rrd:pc:AVERAGE specifies the column (pc) that we want from the vmstat.rrd database file
  • AVERAGE is how to deal with more data than we can graph. Alternatives are, for example, MIN and MAX, but this database only has an AVERAGE archive, so those would need matching RRA lines at create time.
  • LINE2 makes it a line graph and the 2 means thicker lines. Good for a simple one line graph. For multiple lines on one graph use thinner LINE1 lines.
  • The hex number is the colour (RRGGBB pairs), for example 00FF00 is green.
  • The title "Physical Consumed" is what is used as the key at the bottom of the graph (not really necessary on a one line graph with a good title).
See the graph below.
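
 Single-line graphs like this one differ only in file name, labels and column, so they can be wrapped in a small function. What follows is a sketch under my own naming: line_graph and the RRDTOOL override are hypothetical, and setting RRDTOOL=echo simply prints the command instead of drawing anything.

```shell
# RRDTOOL can be overridden; RRDTOOL=echo prints the command instead of running it
RRDTOOL=${RRDTOOL:-rrdtool}

# Hypothetical helper: line_graph <file> <title> <vertical label> <column> <legend>
# Uses TIME and ENDTIME, the start and end seconds saved during the capture.
line_graph() {
    $RRDTOOL graph "$1" \
        --title "$2" \
        --vertical-label "$3" \
        --height 300 \
        --start "$TIME" \
        --end "$ENDTIME" \
        "DEF:$4=vmstat.rrd:$4:AVERAGE" "LINE2:$4#00FF00:$5"
}

# Dry run with example times; remove the RRDTOOL=echo line to draw the graph
RRDTOOL=echo
TIME=1354235156
ENDTIME=1354238756
line_graph physical_consumed.gif "Physical CPU Consumed" "CPUs" pc "Physical Consumed"
```

 The same function would cover other single-column graphs, such as the run queue (column r), by changing the five arguments.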

Here is a more complex graph as it is a stacked area graph of the four utilisation numbers:
 rrdtool graph cpu_utilisation.gif \
--rigid --lower-limit 0 --upper-limit 100 \
--title "CPU Utilisation" \
--vertical-label "Percent Stacked" \
--start $TIME \
--end $ENDTIME \
--height 300 \
DEF:us=vmstat.rrd:us:AVERAGE AREA:us#00FF00:"User" \
DEF:sy=vmstat.rrd:sy:AVERAGE STACK:sy#0000FF:"System" \
DEF:wa=vmstat.rrd:wa:AVERAGE STACK:wa#FF0000:"Wait" \
DEF:id=vmstat.rrd:id:AVERAGE STACK:id#FFFFFF:"Idle"
 More notes:
  • The --rigid, --lower-limit and --upper-limit options are there because we don't want rrdtool to choose the scale: we know it is 0 to 100% and we want to visually compare graphs on a constant scale.
  • There are four DEF lines, one for each of the utilisation stats. The first is drawn as AREA and the rest as STACK, so they are placed one on top of the other.
  • Each stat is given a suitable colour in hex.
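
 The fixed 0 to 100 scale works because the four utilisation columns should add up to about 100 on every line, give or take a point of rounding. A quick way to see that, using two data lines copied from the sample vmstat output (columns 14 to 17 are us, sy, id and wa):

```shell
# Add up the four CPU utilisation percentages on each data line
SUMS=$(printf '%s\n' \
 ' 1  0 709239  4276   0   0   0   0    0   0  31 14401 361 31  1 68  0  1.01  50.6' \
 ' 1  0 709146  4368   0   0   0   0    0   0  16 8989 207 31  1 67  0  1.04  51.9' |
awk '/^.[0-9]/ { print "us+sy+id+wa =", $14 + $15 + $16 + $17 }')
echo "$SUMS"
```

 The first line totals 100 and the second 99, so the stacked areas fill the graph almost exactly to the top.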
  
 Part five - what you get is 
 image
 
 Note this is a 1 minute capture, which is why there is no scale along the bottom with the date and time.
 image
 
 Part six - want to give it a try
 
You need a copy of rrdtool - assuming you are using AIX
  • The home website is http://oss.oetiker.ch/rrdtool/index.en.html
  • The developer is Tobias Oetiker.
  • There is a version on this website for download.
  • A more up to date version is here on my favourite open source for AIX provider: http://www.perzl.org/aix/index.php?n=Main.Rrdtool 
  • Note: for either version there is a long list of dependent software you need to download, which can make it tricky to install. Try it on a spare VM first, before you update an important production machine!
  • On Linux it is a simple download.
 
You need a Webserver - if you want simple access to the graphs
  • You have to sort that one out yourself.
  • For these simple graphs you can use my nweb webserver.
  • For a full webserver I use Apache from  - http://www.perzl.org/aix/index.php?n=Main.Apache
  • You will need to create a very simple HTML file to make access straightforward
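
 As a starting point for that HTML file, a few lines of shell can write a page with one image per graph. This is a minimal sketch: make_index is a hypothetical helper name, index.html is my choice of file name, and the page is deliberately bare.

```shell
# Hypothetical helper: make_index <output file> <image file>...
# Writes a bare page with a heading and an <img> tag for each graph.
make_index() {
    out=$1
    shift
    {
        echo '<html><head><title>vmstat graphs</title></head><body>'
        for img in "$@"; do
            echo "<h2>$img</h2>"
            echo "<p><img src=\"$img\" alt=\"$img\"></p>"
        done
        echo '</body></html>'
    } > "$out"
}

make_index index.html cpu_utilisation.gif run_queue.gif physical_consumed.gif entitlement_consumed.gif
```

 Put index.html in the same directory as the .gif files that the webserver serves.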
 
 You need the shell script below
 
# SECONDS is a special variable in ksh and bash that keeps counting elapsed time,
# so the capture length is held in a differently named variable
CAPTURE_SECONDS=3600
echo this script captures for $CAPTURE_SECONDS seconds

echo remove the vmstat.rrd database in this directory
# -f avoids an error message when there is no old database
rm -f vmstat.rrd

echo create vmstat.rrd for 100000 seconds = over 27 hours max at 1 second captures
rrdtool create vmstat.rrd --step 1  \
DS:r:GAUGE:5:U:U \
DS:b:GAUGE:5:U:U \
DS:avm:GAUGE:5:U:U \
DS:fre:GAUGE:5:U:U \
DS:re:GAUGE:5:U:U \
DS:pi:GAUGE:5:U:U \
DS:po:GAUGE:5:U:U \
DS:fr:GAUGE:5:U:U \
DS:sr:GAUGE:5:U:U \
DS:cy:GAUGE:5:U:U \
DS:in:GAUGE:5:U:U \
DS:sc:GAUGE:5:U:U \
DS:cs:GAUGE:5:U:U \
DS:us:GAUGE:5:U:U \
DS:sy:GAUGE:5:U:U \
DS:id:GAUGE:5:U:U \
DS:wa:GAUGE:5:U:U \
DS:pc:GAUGE:5:U:U \
DS:ec:GAUGE:5:U:U \
RRA:AVERAGE:0.5:1:100000

echo Note the vmstat sy faults column is renamed sc so sy is system time

TIME=`date +%s`
echo startseconds $TIME

echo Capturing for $CAPTURE_SECONDS seconds
# A second vmstat runs in the background to keep the raw text output for reference
vmstat 1 $CAPTURE_SECONDS >vmstat.txt &
# Turn each data line into an rrdtool update command, one second apart
vmstat 1 $CAPTURE_SECONDS | awk -v time=$TIME '/^.[0-9]/{ n++; print "rrdtool update vmstat.rrd "time+n":" $1 ":" $2 ":" $3 ":" $4 ":" $5 ":" $6 ":" $7 ":" $8 ":" $9 ":" $10 ":" $11 ":" $12 ":" $13 ":" $14 ":" $15 ":" $16 ":" $17 ":" $18 ":" $19 }' >vmstat.output

ENDTIME=`date +%s`
echo endseconds $ENDTIME

echo load the vmstat data into the vmstat.rrd database
echo the file has `wc -l vmstat.output` lines
ksh <./vmstat.output

echo graph the data
rrdtool graph cpu_utilisation.gif \
--rigid --lower-limit 0 --upper-limit 100 \
--title "CPU Utilisation" \
--vertical-label "Percent Stacked" \
--start $TIME \
--end $ENDTIME \
--height 300 \
DEF:us=vmstat.rrd:us:AVERAGE AREA:us#00FF00:"User" \
DEF:sy=vmstat.rrd:sy:AVERAGE STACK:sy#0000FF:"System" \
DEF:wa=vmstat.rrd:wa:AVERAGE STACK:wa#FF0000:"Wait" \
DEF:id=vmstat.rrd:id:AVERAGE STACK:id#FFFFFF:"Idle" 

rrdtool graph run_queue.gif \
--title "Process Run Queue" \
--vertical-label "Processes" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:r=vmstat.rrd:r:AVERAGE LINE2:r#00FF00:"Run Queue"

rrdtool graph physical_consumed.gif \
--title "Physical CPU Consumed" \
--vertical-label "CPUs" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:pc=vmstat.rrd:pc:AVERAGE LINE2:pc#00FF00:"Physical Consumed"

rrdtool graph entitlement_consumed.gif \
--title "Entitlement CPU Consumed" \
--vertical-label "CPUs" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:ec=vmstat.rrd:ec:AVERAGE LINE2:ec#00FF00:"Entitlement Consumed"

echo images available
ls -l *.gif
 

 
