From: AIXpert Blog
A large UK customer is running a series of tests on AIX and POWER7 machines and wanted to display the results on a webserver using rrdtool, but needed help getting a working example running. So I stepped up to the challenge one night, rather than letting it become a search for rrdtool skills inside IBM, as I could sort something out in a few hours. No one in IBM would claim to be an rrdtool expert until they knew the details of what was required. Many people know it well enough to get what they want done but would not put "rrdtool guru" on their CV.
rrdtool is a fantastically brilliant command to have in your toolbox, up there with awk, grep, sed, Apache, ksh and nmon (of course). It saves data in a fixed size "database", does cascade summation of older data to keep the data volume down, can extract the data across any period and can quickly generate impressive .gif graphs from the data, which are perfect for displaying on a webserver.
So the challenge: While the test runs for "some period" like an hour, save the vmstat data and then graph it.
Part one - create a suitable rrdtool database for the vmstat output
As a reminder this is what default vmstat output looks like on a current AIX version
$ vmstat 1 4
System configuration: lcpu=16 mem=8192MB ent=2.00
kthr    memory              page              faults              cpu
----- ----------- ------------------------ ------------ -----------------------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec
 1  0 709239  4276   0   0   0   0    0   0  31 14401 361 31  1 68  0  1.01  50.6
 1  0 709239  4276   0   0   0   0    0   0  42 15786 456 31  1 68  0  1.01  50.4
 1  0 709146  4368   0   0   0   0    0   0  16  8989 207 31  1 67  0  1.04  51.9
 1  0 709146  4368   0   0   0   0    0   0 117 16944 627 31  1 68  0  1.02  50.9
I never noticed before but there are two "sy" columns, so here we use "sc" for the first "sy" (system calls) column and leave "sy" for the system utilisation column. We are going to use the same column names as the vmstat command to make this easier to understand, although they are very short. When we graph the stats we can spell out the full names.
Rather than using rrdtool's power to summarise data, we are going to use a sledgehammer and simply keep 100,000 samples at a rate of one per second, which is over 27 hours of data.
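As a quick check of that figure: the time an archive covers is the step in seconds, times the steps per row, times the number of rows. A minimal sketch of the arithmetic in shell follows; the function name rra_retention is made up for this illustration and is not part of rrdtool.

```shell
# rra_retention STEP STEPS_PER_ROW ROWS
# Prints how long an archive holds data as "<hours>h <minutes>m".
# (Hypothetical helper for illustration only, not an rrdtool command.)
rra_retention() {
    total=$(( $1 * $2 * $3 ))            # seconds covered by the archive
    echo "$(( total / 3600 ))h $(( (total % 3600) / 60 ))m"
}

# The sledgehammer archive: 1 second step, 1 step per row, 100000 rows.
rra_retention 1 1 100000
```

This prints 27h 46m. If you later let rrdtool summarise, for example 60 steps per row and 1440 rows, the same sum gives 24 hours of one minute averages in far fewer rows.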
Here is the "rrdtool create" command:
rrdtool create vmstat.rrd --step 1 \
DS:r:GAUGE:5:U:U \
DS:b:GAUGE:5:U:U \
DS:avm:GAUGE:5:U:U \
DS:fre:GAUGE:5:U:U \
DS:re:GAUGE:5:U:U \
DS:pi:GAUGE:5:U:U \
DS:po:GAUGE:5:U:U \
DS:fr:GAUGE:5:U:U \
DS:sr:GAUGE:5:U:U \
DS:cy:GAUGE:5:U:U \
DS:in:GAUGE:5:U:U \
DS:sc:GAUGE:5:U:U \
DS:cs:GAUGE:5:U:U \
DS:us:GAUGE:5:U:U \
DS:sy:GAUGE:5:U:U \
DS:id:GAUGE:5:U:U \
DS:wa:GAUGE:5:U:U \
DS:pc:GAUGE:5:U:U \
DS:ec:GAUGE:5:U:U \
RRA:AVERAGE:0.5:1:100000
You can look at the rrdtool manual pages for the details but we are basically turning off all the fancy features.
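Since every DS line is identical apart from the name, the command can also be generated from the list of column names. A minimal sketch, assuming the same 19 names, the 5 second heartbeat and the unlimited min/max (U:U) used above; the names VMSTAT_COLS and create_cmd are made up here. It only prints the command, so nothing is created until you pipe the output to ksh.

```shell
# Column names in vmstat order; "sc" is the first "sy" (system calls) column.
VMSTAT_COLS="r b avm fre re pi po fr sr cy in sc cs us sy id wa pc ec"

# create_cmd: print (not run) the rrdtool create command on one line.
create_cmd() {
    printf 'rrdtool create vmstat.rrd --step 1'
    for name in $VMSTAT_COLS
    do
        # DS:<name>:GAUGE:<heartbeat seconds>:<min>:<max>, U means no limit
        printf ' DS:%s:GAUGE:5:U:U' "$name"
    done
    # RRA:<consolidation function>:<xff>:<steps per row>:<rows>
    printf ' RRA:AVERAGE:0.5:1:100000\n'
}

create_cmd
```

To actually create the database you could run create_cmd | ksh, in the same spirit as the generated update commands later on.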
Part two - saving vmstat output in rrdtool format
"rrdtool update" is used to put data into the database, and adding one row of vmstat data needs to look like this:
rrdtool update vmstat.rrd 1354235156:1:0:706069:34785:0:0:0:0:0:0:35:3600:566:0:0:99:0:0.03:1.7
The first number (1354235156) is the number of seconds since the epoch, i.e. 1st Jan 1970. Obvious really!! The rest is a colon separated list of the stats from vmstat. Fortunately, the UNIX date command can get you the date in this seconds format using:
$ date +%s
1355307792
So here is how you change vmstat data to rrdtool update commands for an hour (3600 seconds):
TIME=`date +%s`
vmstat 1 3600 | awk -v time=$TIME '/^.[0-9]/{ n++; print "rrdtool update vmstat.rrd "time+n":" $1 ":" $2 ":" $3 ":" $4 ":" $5 ":" $6 ":" $7 ":" $8 ":" $9 ":" $10 ":" $11 ":" $12 ":" $13 ":" $14 ":" $15 ":" $16 ":" $17 ":" $18 ":" $19 }' >vmstat.output
ENDTIME=`date +%s`
Awk is very good at this sort of thing:
- We put the Korn shell variable into the awk variable with -v time=$TIME
- We ignore the header lines and keep only the data lines, whose second character is a digit, using /^.[0-9]/{ ... }
- We use a counter "n" so each line of output has a date in seconds one more than the previous line, with n++ and "time+n"
- The rest is just formatting to colon separated.
Note: we need the start and end times for graphing, so we extract the right period of time from the database.
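To see the transform without running vmstat for an hour, here is a minimal sketch that feeds a few of the sample lines shown earlier through the same awk idea. The function names sample_vmstat and to_updates and the fixed start time are assumptions for the demo, and the field list is built with a loop over NF rather than spelling out $1 to $19.

```shell
# Two header lines and two data lines copied from the vmstat output above.
sample_vmstat() {
    printf '%s\n' \
        '----- ----------- ------------------------ ------------ -----------------------' \
        ' r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa    pc    ec' \
        ' 1  0 709239  4276   0   0   0   0    0   0  31 14401 361 31  1 68  0  1.01  50.6' \
        ' 1  0 709239  4276   0   0   0   0    0   0  42 15786 456 31  1 68  0  1.01  50.4'
}

# to_updates START: turn vmstat data lines on stdin into rrdtool update commands.
to_updates() {
    awk -v time="$1" '/^.[0-9]/ {
        n++                                   # one second later than the previous line
        line = "rrdtool update vmstat.rrd " (time + n)
        for (i = 1; i <= NF; i++)             # append every column, colon separated
            line = line ":" $i
        print line
    }'
}

# Fixed start time for the demo; the real run uses TIME=`date +%s`.
sample_vmstat | to_updates 1355307792
```

The header lines are dropped and the first command comes out as rrdtool update vmstat.rrd 1355307793:1:0:709239:4276:0:0:0:0:0:0:31:14401:361:31:1:68:0:1.01:50.6.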
Part three - loading the data
We have the rrdtool commands in the vmstat.output file that we just created, so just run the file through a Korn shell:
ksh <./vmstat.output
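Before loading, it can be worth checking that every generated line really carries a timestamp plus 19 values, because a line with the wrong number of values will not match the 19 data sources in the database. A minimal sketch of such a check; the helper name check_updates is made up for this illustration.

```shell
# check_updates FILE
# Succeeds only if the last word of every line has 20 colon separated
# items (1 timestamp + 19 vmstat values). Hypothetical helper name.
check_updates() {
    awk '{
        count = split($NF, items, ":")
        if (count != 20) {
            printf "line %d: %d items instead of 20\n", NR, count
            bad = 1
        }
    } END { exit (bad + 0) }' "$1"
}

# Typical use (commented out here, as it needs the generated file):
# check_updates vmstat.output && ksh <./vmstat.output
```

Any short or long line is reported with its line number, and the load is skipped if the check fails.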
Part four - generating the graph files
This does get a little tricky as there are loads of options, but here are a few worked examples.
First a simple graph of the Physical CPU consumed - this assumes we are talking about a Shared CPU virtual machine (logical partition):
rrdtool graph physical_consumed.gif \
--title "Physical CPU Consumed" \
--vertical-label "CPUs" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:pc=vmstat.rrd:pc:AVERAGE LINE2:pc#00FF00:"Physical Consumed"
Notes:
- The top line is the command and the name of the file to generate.
- --title and --vertical-label, as you might guess, are the top and left labels on the graph.
- --height is the size of the graph in pixels so that it displays well on a website.
- Then we have the start and end time in seconds - in this case it is all the stats in the database but you could change these to pick out more interesting periods from the available data.
- The last line is complex ...
- DEF:pc=vmstat.rrd:pc specifies the column (pc) that we want from the vmstat.rrd database file
- AVERAGE is how to deal with more data than we can graph; alternatives are, for example, MIN and MAX
- LINE2 makes it a line graph and the 2 means thicker lines. Good for a simple one line graph. For multiple lines on one graph use thinner LINE1 lines.
- The Hex number is the colour (RGB pairs), so 00FF00 is green
- The title "Physical Consumed" is what is used as the key at the bottom of the graph (not really necessary on a one line graph with a good title).
See the graph below.
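The simple one line graphs all follow this pattern, so the command can be produced by a small function, and the start and end times can be shifted to pick out part of the capture. A minimal sketch follows: line_graph_cmd is a made-up helper, the TIME value is an example rather than a real capture, and the legend simply reuses the title. It prints the commands rather than running them, so you would pipe the output to ksh.

```shell
# line_graph_cmd DS FILE TITLE LABEL START END
# Prints (does not run) an rrdtool graph command for one vmstat column.
# Hypothetical helper; colour and height are copied from the example above.
line_graph_cmd() {
    printf 'rrdtool graph %s --title "%s" --vertical-label "%s" --height 300 --start %s --end %s DEF:%s=vmstat.rrd:%s:AVERAGE LINE2:%s#00FF00:"%s"\n' \
        "$2" "$3" "$4" "$5" "$6" "$1" "$1" "$1" "$3"
}

# Example values standing in for the TIME and ENDTIME of a one hour capture.
TIME=1355307792
ENDTIME=$(( TIME + 3600 ))

# The whole capture, then only its last ten minutes.
line_graph_cmd pc physical_consumed.gif "Physical CPU Consumed" "CPUs" "$TIME" "$ENDTIME"
line_graph_cmd pc physical_last10.gif "Physical CPU Consumed" "CPUs" "$(( ENDTIME - 600 ))" "$ENDTIME"
```

The second command starts at 1355310792, which is 600 seconds before the end of the example capture.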
Here is a more complex graph as it is a stacked area graph of the four utilisation numbers:
rrdtool graph cpu_utilisation.gif \
--rigid --lower-limit 0 --upper-limit 100 \
--title "CPU Utilisation" \
--vertical-label "Percent Stacked" \
--start $TIME \
--end $ENDTIME \
--height 300 \
DEF:us=vmstat.rrd:us:AVERAGE AREA:us#00FF00:"User" \
DEF:sy=vmstat.rrd:sy:AVERAGE STACK:sy#0000FF:"System" \
DEF:wa=vmstat.rrd:wa:AVERAGE STACK:wa#FF0000:"Wait" \
DEF:id=vmstat.rrd:id:AVERAGE STACK:id#FFFFFF:"Idle"
More notes:
- The --rigid etc. is because we don't want rrdtool to determine the scales, as we know it is 0 to 100% and we want to visually compare graphs on a constant scale.
- There are four DEF lines, one for each of the utilisation stats. The first is AREA and the rest are STACK type so they are placed one on top of the other.
- Each stat is given a suitable colour in Hex
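The fixed 0 to 100 scale works because the four utilisation columns add up to roughly 100 on every line. A minimal sketch that checks this on the four sample lines shown earlier (us, sy, id and wa are fields 14 to 17); the function names are made up for the demo.

```shell
# The four data lines from the vmstat output shown earlier.
sample_data() {
    printf '%s\n' \
        ' 1  0 709239  4276   0   0   0   0    0   0  31 14401 361 31  1 68  0  1.01  50.6' \
        ' 1  0 709239  4276   0   0   0   0    0   0  42 15786 456 31  1 68  0  1.01  50.4' \
        ' 1  0 709146  4368   0   0   0   0    0   0  16  8989 207 31  1 67  0  1.04  51.9' \
        ' 1  0 709146  4368   0   0   0   0    0   0 117 16944 627 31  1 68  0  1.02  50.9'
}

# cpu_totals: print us + sy + id + wa for each data line on stdin.
cpu_totals() {
    awk '/^.[0-9]/ { print $14 + $15 + $16 + $17 }'
}

sample_data | cpu_totals
```

This prints 100, 100, 99 and 100, so the stack can fall a point short of the top of the graph when vmstat rounds the numbers down.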
Part five - what you get is
Note this is a 1 minute capture, which is why there is no scale along the bottom with the date and time.
Part six - want to give it a try?
You need a copy of rrdtool - assuming you are using AIX
- The home website is http://oss.oetiker.ch/rrdtool/index.en.html
- The developer is Tobias Oetiker.
- There is a version on this website for download.
- A more up to date version is here on my favourite open source for AIX provider: http://www.perzl.org/aix/index.php?n=Main.Rrdtool
- Note: for either there is a long list of dependent software you need to download - this can make it tricky to install. Try it on a spare VM to start with and before you update an important production machine!
- On Linux it is a simple download.
You need a Webserver - if you want simple access to the graphs
- You have to sort that one out yourself.
- For these simple graphs you can use my nweb webserver.
- For a full webserver I use Apache from http://www.perzl.org/aix/index.php?n=Main.Apache
- You will need to create a very simple HTML file to make access straightforward
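For the very simple HTML file, something along these lines would do. It is a minimal sketch: the function name make_index and the page title are made up, and the four .gif names are the ones the script in this article generates.

```shell
# make_index: print a bare-bones HTML page that shows the four graphs.
# Hypothetical helper; adjust the file names if you add more graphs.
make_index() {
    echo '<html><head><title>vmstat graphs</title></head><body>'
    for gif in cpu_utilisation.gif run_queue.gif physical_consumed.gif entitlement_consumed.gif
    do
        echo "<p><img src=\"$gif\" alt=\"$gif\"></p>"
    done
    echo '</body></html>'
}

make_index
```

Redirect the output to index.html in the directory your webserver serves and copy the .gif files alongside it.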
You need the shell script below
# Not called SECONDS: ksh treats SECONDS as a timer that keeps counting up
export DURATION=3600
echo this script captures for $DURATION seconds
echo remove the vmstat.rrd database in this directory
rm -f vmstat.rrd
echo create vmstat.rrd for 100000 seconds = over 27 hours max at 1 second captures
rrdtool create vmstat.rrd --step 1 \
DS:r:GAUGE:5:U:U \
DS:b:GAUGE:5:U:U \
DS:avm:GAUGE:5:U:U \
DS:fre:GAUGE:5:U:U \
DS:re:GAUGE:5:U:U \
DS:pi:GAUGE:5:U:U \
DS:po:GAUGE:5:U:U \
DS:fr:GAUGE:5:U:U \
DS:sr:GAUGE:5:U:U \
DS:cy:GAUGE:5:U:U \
DS:in:GAUGE:5:U:U \
DS:sc:GAUGE:5:U:U \
DS:cs:GAUGE:5:U:U \
DS:us:GAUGE:5:U:U \
DS:sy:GAUGE:5:U:U \
DS:id:GAUGE:5:U:U \
DS:wa:GAUGE:5:U:U \
DS:pc:GAUGE:5:U:U \
DS:ec:GAUGE:5:U:U \
RRA:AVERAGE:0.5:1:100000
echo Note the vmstat sy faults column is renamed sc so sy is system time
TIME=`date +%s`
echo startseconds $TIME
echo Capturing for $DURATION seconds
# Keep a raw copy of the vmstat output as well (runs in the background)
vmstat 1 $DURATION >vmstat.txt &
vmstat 1 $DURATION | awk -v time=$TIME '/^.[0-9]/{ n++; print "rrdtool update vmstat.rrd "time+n":" $1 ":" $2 ":" $3 ":" $4 ":" $5 ":" $6 ":" $7 ":" $8 ":" $9 ":" $10 ":" $11 ":" $12 ":" $13 ":" $14 ":" $15 ":" $16 ":" $17 ":" $18 ":" $19 }' >vmstat.output
ENDTIME=`date +%s`
echo endseconds $ENDTIME
echo load the vmstat data into the vmstat.rrd database
echo the file has `wc -l vmstat.output` lines
ksh <./vmstat.output
echo graph the data
rrdtool graph cpu_utilisation.gif \
--rigid --lower-limit 0 --upper-limit 100 \
--title "CPU Utilisation" \
--vertical-label "Percent Stacked" \
--start $TIME \
--end $ENDTIME \
--height 300 \
DEF:us=vmstat.rrd:us:AVERAGE AREA:us#00FF00:"User" \
DEF:sy=vmstat.rrd:sy:AVERAGE STACK:sy#0000FF:"System" \
DEF:wa=vmstat.rrd:wa:AVERAGE STACK:wa#FF0000:"Wait" \
DEF:id=vmstat.rrd:id:AVERAGE STACK:id#FFFFFF:"Idle"
rrdtool graph run_queue.gif \
--title "Process Run Queue" \
--vertical-label "Processes" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:r=vmstat.rrd:r:AVERAGE LINE2:r#00FF00:"Run Queue"
rrdtool graph physical_consumed.gif \
--title "Physical CPU Consumed" \
--vertical-label "CPUs" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:pc=vmstat.rrd:pc:AVERAGE LINE2:pc#00FF00:"Physical Consumed"
rrdtool graph entitlement_consumed.gif \
--title "Entitlement CPU Consumed" \
--vertical-label "CPUs" \
--height 300 \
--start $TIME \
--end $ENDTIME \
DEF:ec=vmstat.rrd:ec:AVERAGE LINE2:ec#00FF00:"Entitlement Consumed"
echo images available
ls -l *.gif