[HP-UX] Realtime Performance Gathering with Glance

2014. 1. 29. 13:47

Written by Geoff Wild

Thursday, 07 June 2007 23:32

Purpose

The purpose of this article is to outline some of the options available as part of the GlancePlus (MeasureWare) package. Specifically, I intend to focus on Glance's ability to gather real-time statistics in the background and write the results to text files. This data can then be saved and used for historical analysis. Typically you would use PerfView for this type of reporting, but Glance allows a much smaller interval than PerfView's minimum of 5 minutes.
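As a rough illustration of the background capture described above, here is a minimal Python sketch that starts Glance in adviser-only mode and appends its output to a text file. The flags are the ones used in the examples later in this article; the helper names, the output path and the choice of Python as a wrapper are assumptions made for illustration, and this has not been verified against any particular Glance release.

```python
import subprocess


def glance_command(syntax_file, interval_secs, iterations=None):
    """Build the argument list for an adviser-only glance run."""
    cmd = ["glance", "-j", str(interval_secs),
           "-adviser_only", "-syntax", syntax_file]
    if iterations is not None:
        # Without -iterations the run is left open-ended.
        cmd += ["-iterations", str(iterations)]
    return cmd


def start_background_capture(syntax_file, out_path,
                             interval_secs=5, iterations=None):
    """Start glance with no terminal input, appending output to out_path.

    Hypothetical helper: returns the Popen object so the caller can
    wait for the process or terminate it later.
    """
    with open(out_path, "a") as out:
        return subprocess.Popen(
            glance_command(syntax_file, interval_secs, iterations),
            stdin=subprocess.DEVNULL,
            stdout=out,
            stderr=subprocess.STDOUT,
        )


# Show the command line that would be run (nothing is started here).
print(" ".join(glance_command("cpu.cfg", 5, 3)))
```

Calling start_background_capture("cpu.cfg", "/tmp/cpu.out", interval_secs=5) would then leave a growing text file that can be kept for later analysis.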

Another reason to do this is to feed MRTG. MRTG will graph practically anything that can be presented as a number. Using Glance scripts, it would be pretty easy to set up a number of MRTG graphs for various performance metrics. MRTG would then automatically handle the daily, weekly, monthly and yearly views. Of course, this would require an MRTG implementation on Dude, netcat for AIX on Dude, copies of the client scripts on every host reporting to MRTG, and someone to decide what is important, write the scripts and establish the MRTG reporting hierarchy. That is beyond the scope of this article.

First off - Documentation

The following links point to various Glance-specific documents:

Now for a few examples.

Simulating SAR for disks

This example shows how you can use Glance to simulate the output you would get with sar -d. There are some subtle differences in how the values are calculated that I will not try to explain here, mainly because I can't! :) Also, I personally would rather use sar than Glance for capturing disk data, but it is an excellent example of what you can do.

Script

# The following glance adviser disk loop shows disk activity comparable
# to sar -d data.

# Note that values will differ between sar and glance because of differing
# data sources, calculation methods, and collection intervals.

headersprinted = 0

# For each disk, if there was activity, print a summary:
disk loop {
  if BYDSK_PHYS_IO_RATE > 0 then {
    # print headers if this is the first active disk found this interval:
    if headersprinted == 0 then {
      print "--------   --------    device          %util   queue   r+w/s blks/s     secs-avserv"
      headersprinted = 1
    }
    # sar shows average service time in milliseconds, so despite the
    # "secs-avserv" column header the value printed is in milliseconds:
    avserv = ( BYDSK_UTIL / 100 ) / BYDSK_PHYS_IO_RATE * 1000
    # sar blks/s is 512-byte blocks per second (KB rate times 2):
    blks = BYDSK_PHYS_BYTE_RATE * 2
    print GBL_STATDATE, "   ", GBL_STATTIME, "   ",BYDSK_DEVNAME|15, BYDSK_UTIL|7|2,
          BYDSK_REQUEST_QUEUE|8|2, BYDSK_PHYS_IO_RATE|8|0,
          blks|8|0, avserv|16|2
  }
}

if headersprinted == 0 then
  print GBL_STATTIME, " (no disk activity this interval)"

Output

--------   --------    device          %util   queue   r+w/s blks/s     secs-avserv
08/02/02   14:50:02   0/0/2/0.6.0      71.01    0.00     110     740            6.46
08/02/02   14:50:02   0/0/2/1.6.0      56.52    0.00     102     681            5.52
08/02/02   14:50:02   0/4/0/0.5.0       9.42    0.00      10      66            9.42
08/02/02   14:50:02   0/12/0/0.5.0      7.97    0.00       9      63            8.66
08/02/02   14:50:02   0/...29.0.2.1.3   1.44    0.00      24     443            0.61
08/02/02   14:50:02   0/...29.0.2.1.4   0.72    0.00      23     443            0.31
08/02/02   14:50:02   0/...29.0.2.2.0   0.72    0.00       1      25           10.29
08/02/02   14:50:02   0/...29.0.2.2.1   0.72    0.00       3      43            2.40
08/02/02   14:50:02   0/...29.0.2.2.3   0.72    0.00       2      74            3.13
08/02/02   14:50:02   0/...29.0.2.2.4   2.17    0.00       5     123            4.09
08/02/02   14:50:02   0/...29.0.2.2.5   0.72    0.00      20     591            0.36
08/02/02   14:50:02   0/...29.0.2.0.0   0.00    0.00       2      28            0.00
08/02/02   14:50:02   0/...29.0.2.3.4   0.00    0.00       1      98            0.00
--------   --------    device          %util   queue   r+w/s blks/s     secs-avserv
08/02/02   14:50:03   0/0/2/0.6.0      23.61    0.00      34     151            6.90
08/02/02   14:50:03   0/0/2/1.6.0      18.05    0.00      31     126            5.75
08/02/02   14:50:03   0/4/0/0.5.0       2.77    0.00       4      17            6.60
08/02/02   14:50:03   0/12/0/0.5.0      2.77    0.00       4      17            6.60
08/02/02   14:50:03   0/...29.0.2.1.3   1.38    0.00      40     800            0.35
08/02/02   14:50:03   0/...29.0.2.1.4   1.38    0.00      40     800            0.35
08/02/02   14:50:03   0/...29.0.2.2.0   2.77    0.00       1      46           19.79
08/02/02   14:50:03   0/...29.0.2.2.3   0.00    0.00       1      46            0.00
08/02/02   14:50:03   0/...29.0.2.2.4   0.00    0.00       3      91            0.00
08/02/02   14:50:03   0/...29.0.2.2.5   1.38    0.00      11     366            1.21
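To make the arithmetic in the script concrete, the two derived columns can be recomputed by hand. The Python sketch below repeats the script's formulas for the first output row; the function name is made up for illustration, and the KB rate of 370 is an assumption worked backwards from the 740 blks/s shown. Because the printed rates are rounded, other rows may not reproduce exactly.

```python
def sar_style_disk_values(util_pct, io_rate, kb_rate):
    """Return (blks_per_sec, avserv_ms) using the adviser script's formulas."""
    # Busy fraction of the interval divided by I/Os per second gives
    # seconds per I/O; multiplying by 1000 converts to milliseconds.
    avserv_ms = (util_pct / 100.0) / io_rate * 1000.0
    # One kilobyte is two 512-byte blocks.
    blks_per_sec = kb_rate * 2
    return blks_per_sec, avserv_ms


# First output row: 71.01 %util at 110 r+w/s (KB rate assumed to be 370).
blks, avserv = sar_style_disk_values(util_pct=71.01, io_rate=110, kb_rate=370)
print(f"{blks:.0f} blks/s, {avserv:.2f} ms")  # prints "740 blks/s, 6.46 ms"
```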

CPU Utilization - Averaged over # of CPUs

The following script returns the average CPU utilization (averaged over the number of CPUs) for system, user and total utilization. It could easily be modified to feed MRTG for graphing purposes.

Script

#
# Sample glance script showing average CPU utilization across all CPUs
#

headersprinted = 0
total_total = 0
total_sys = 0
total_user = 0
count = 0

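# For each CPU
cpu loop {
    # print headers if this is the first row
    if headersprinted == 0 then {
      print "   Sys CPU   User CPU Total CPU"
      headersprinted = 1
    }
    # Note: the GBL_ metrics are system-wide values, so every pass through
    # the loop adds the same figures and the division below returns them
    # unchanged. To average genuinely per-CPU values, the BYCPU_ class
    # metrics would be used here instead.
    total_total = total_total + GBL_CPU_TOTAL_UTIL
    total_sys = total_sys + GBL_CPU_SYS_MODE_UTIL
    total_user = total_user + GBL_CPU_USER_MODE_UTIL
    count = count + 1
}
print total_sys/count, " ", total_user/count, " ", total_total/count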

Output

# glance -j 5 -adviser_only -syntax cpu.cfg -iterations 3

Welcome to GlancePlus

   Sys CPU   User CPU Total CPU
         8         30         38
   Sys CPU   User CPU Total CPU
         5         31         35
   Sys CPU   User CPU Total CPU
         4         24         28
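As a sketch of the MRTG feed mentioned above: an MRTG external script is expected to print four lines (first value, second value, an uptime string, and a target name). The Python below picks the latest sample out of the Glance output shown here and formats it that way. The function names, the "unknown" uptime and the "cpu" target name are placeholders chosen for illustration, not part of the original setup.

```python
SAMPLE = """Welcome to GlancePlus

   Sys CPU   User CPU Total CPU
         8         30         38
   Sys CPU   User CPU Total CPU
         5         31         35
   Sys CPU   User CPU Total CPU
         4         24         28
"""


def last_cpu_sample(glance_output):
    """Return (sys, user, total) from the last all-numeric three-column row."""
    sample = None
    for line in glance_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # header rows and blank lines
        try:
            sample = tuple(float(f) for f in fields)
        except ValueError:
            continue  # e.g. the "Welcome to GlancePlus" banner
    return sample


def mrtg_lines(sys_pct, user_pct, uptime="unknown", name="cpu"):
    """Format two values as the four lines an MRTG external script prints."""
    return [str(int(round(sys_pct))), str(int(round(user_pct))), uptime, name]


sys_pct, user_pct, total_pct = last_cpu_sample(SAMPLE)
print("\n".join(mrtg_lines(sys_pct, user_pct)))
```

Here system and user utilization are sent as the two MRTG values; total utilization could be substituted for either one.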

LAN Statistics

This example produces packet-level statistics (in, out, collisions, errors...) for every LAN interface in the server. I found it to be another useful example.

Script

# initialize variables:

netif_to_examine = "" # lan0 would only report on lan0, etc.
# Assigning the variable to itself leaves its value untouched between
# intervals (an unset variable starts at 0), which is why the output
# below shows the headers only once:
headers_printed = headers_printed

netif loop {
  # print information for the selected interface, or for all if null:
  IF (BYNETIF_NAME == netif_to_examine) or
     (netif_to_examine == "") THEN
  {
    # print headers the first time through the loop:
    IF headers_printed == 0 THEN
    {
      print "Date     Time    Interface  InPkts OutPkts  OutQ  Colls  Errs"
      print " "
      headers_printed = 1
    }

    # print one line per interface reported:
    print GBL_STATDATE, " ", GBL_STATTIME, " ", BYNETIF_NAME|8,
          BYNETIF_IN_PACKET, BYNETIF_OUT_PACKET,
          BYNETIF_QUEUE, BYNETIF_COLLISION, BYNETIF_ERROR
    # (note that some interface types do not report collisions or errors)
  }
}
print " "

Output

# glance -j 5 -adviser_only -syntax lan.cfg -iterations 3

Welcome to GlancePlus


Date     Time    Interface  InPkts OutPkts  OutQ  Colls  Errs
   
04/28/04 15:01:00 lan0          3      2     0      0      0
04/28/04 15:01:00 lan3         35     39     0      0      0
04/28/04 15:01:00 lan6         31     32     0      0      0
04/28/04 15:01:00 lan7         50     31     0      0      0
04/28/04 15:01:00 lan8          0      0     0      0      0
04/28/04 15:01:00 lan4          0      0     0      0      0
04/28/04 15:01:00 lan9          0      0     0      0      0
04/28/04 15:01:00 lan10         0      0     0      0      0
04/28/04 15:01:00 lan11         0      0     0      0      0
04/28/04 15:01:00 lan5          3      2     0      0      0
04/28/04 15:01:00 lo0          26     26     0     na      0
  
04/28/04 15:01:05 lan0         13      7     0      0      0
04/28/04 15:01:05 lan3        173    230     0      0      0
04/28/04 15:01:05 lan6        121    129     0      0      0
04/28/04 15:01:05 lan7        197    142     0      0      0
04/28/04 15:01:05 lan8          1      1     0      0      0
04/28/04 15:01:05 lan4          1      1     0      0      0
04/28/04 15:01:05 lan9          1      1     0      0      0
04/28/04 15:01:05 lan10         1      1     0      0      0
04/28/04 15:01:05 lan11         1      1     0      0      0
04/28/04 15:01:05 lan5         13      7     0      0      0
04/28/04 15:01:05 lo0           3      3     0     na      0
  
04/28/04 15:01:10 lan0         12      7     0      0      0
04/28/04 15:01:10 lan3        151    221     0      0      0
04/28/04 15:01:10 lan6         97    105     0      0      0
04/28/04 15:01:10 lan7        165    126     0      0      0
04/28/04 15:01:10 lan8          1      1     0      0      0
04/28/04 15:01:10 lan4          1      1     0      0      0
04/28/04 15:01:10 lan9          1      1     0      0      0
04/28/04 15:01:10 lan10         1      1     0      0      0
04/28/04 15:01:10 lan11         1      1     0      0      0
04/28/04 15:01:10 lan5         12      7     0      0      0
04/28/04 15:01:10 lo0           0      0     0     na      0
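Once output like this has been saved to a file, it is easy to summarize. The Python sketch below totals the InPkts and OutPkts columns per interface, assuming those columns are per-interval counts. The function name is hypothetical, and the sample is limited to the lan0 and lo0 rows from the output above.

```python
from collections import defaultdict

SAMPLE = """Date     Time    Interface  InPkts OutPkts  OutQ  Colls  Errs

04/28/04 15:01:00 lan0          3      2     0      0      0
04/28/04 15:01:00 lo0          26     26     0     na      0

04/28/04 15:01:05 lan0         13      7     0      0      0
04/28/04 15:01:05 lo0           3      3     0     na      0

04/28/04 15:01:10 lan0         12      7     0      0      0
04/28/04 15:01:10 lo0           0      0     0     na      0
"""


def packet_totals(report):
    """Sum inbound and outbound packets per interface over all intervals."""
    totals = defaultdict(lambda: [0, 0])
    for line in report.splitlines():
        fields = line.split()
        # Data rows: date, time, interface, in, out, queue, collisions, errors.
        # The header also has 8 fields, so check for a date in column one.
        if len(fields) != 8 or fields[0].count("/") != 2:
            continue
        # Only the packet columns are converted; collisions may be "na".
        totals[fields[2]][0] += int(fields[3])
        totals[fields[2]][1] += int(fields[4])
    return {name: tuple(pair) for name, pair in totals.items()}


for name, (pkts_in, pkts_out) in sorted(packet_totals(SAMPLE).items()):
    print(f"{name:8s} in={pkts_in} out={pkts_out}")
```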

Detailed Process Gathering

This script proved very valuable when trying to identify Mobila processes running amok in the early days of Mobila. PerfView (PV) has the 5-minute limitation, and that skewed things: if a process misbehaved for only a short period of time, it got lost. You must be careful with this example, as it will generate a lot of data very quickly, depending on the interval. For example, listing all processes every second produces a lot of lines in a short period of time.

It would be pretty easy to modify this script to simply count the processes if you wanted to report the number of processes back to MRTG.

Script

process loop {
    if ((proc_cpu_total_util > 0) or ( proc_stop_reason != "SLEEP" )) then {
        print gbl_statdate, "|", gbl_stattime, "|", proc_cpu_last_used, "|",
              proc_mem_virt, "|", proc_mem_res, "|",
              proc_cpu_total_util, "|",
              proc_stop_reason, "|", proc_disk_logl_io_rate, "|",
              proc_proc_id, "|", proc_parent_proc_id, "|",
              proc_user_name, "|", proc_proc_name, "|",
              proc_cache_wait_time, "|", proc_cdfs_wait_time, "|",
              proc_disk_subsystem_wait_time, "|", proc_disk_wait_time, "|",
              proc_graphics_wait_time, "|", proc_inode_wait_time, "|",
              proc_ipc_subsystem_wait_time, "|", proc_ipc_wait_time, "|",
              proc_jobctl_wait_time, "|", proc_lan_wait_time, "|",
              proc_mem_wait_time, "|", proc_msg_wait_time, "|",
              proc_nfs_wait_time, "|", proc_other_io_wait_time, "|",
              proc_other_wait_time, "|", proc_pipe_wait_time, "|",
              proc_pri_wait_time, "|", proc_rpc_wait_time, "|",
              proc_sem_wait_time, "|", proc_socket_wait_time, "|",
              proc_stream_wait_time, "|", proc_sys_wait_time, "|",
              proc_term_io_wait_time
    }
}

Output

gbl_statdate | gbl_stattime | proc_cpu_last_used | proc_mem_virt | proc_mem_res | proc_cpu_total_util | proc_stop_reason | proc_disk_logl_io_rate | proc_proc_id | proc_parent_proc_id | proc_user_name | proc_proc_name | proc_cache_wait_time | proc_cdfs_wait_time | proc_disk_subsystem_wait_time | proc_disk_wait_time | proc_graphics_wait_time | proc_inode_wait_time | proc_ipc_subsystem_wait_time | proc_ipc_wait_time | proc_jobctl_wait_time | proc_lan_wait_time | proc_mem_wait_time | proc_msg_wait_time | proc_nfs_wait_time | proc_other_io_wait_time | proc_other_wait_time | proc_pipe_wait_time | proc_pri_wait_time | proc_rpc_wait_time | proc_sem_wait_time | proc_socket_wait_time | proc_stream_wait_time | proc_sys_wait_time | proc_term_io_wait_time
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|OTHER |    0.0|     8|     0|root    |supsched        | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|OTHER |    0.0|     9|     0|root    |strmem          | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|OTHER |    0.0|    10|     0|root    |strweld         | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|OTHER |    0.0|    11|     0|root    |strfreebd       | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|OTHER |    0.0|    24|     0|root    |lvmschedd       | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 2|   32kb|   32kb|    0.0|STRMS |    0.0|    25|     0|root    |smpsched        | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|STRMS |    0.0|    26|     0|root    |smpsched        | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|STRMS |    0.0|    27|     0|root    |smpsched        | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00
08/02/02|14:50:00| 1|   32kb|   32kb|    0.0|STRMS |    0.0|    28|     0|root    |smpsched        | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00
08/02/02|14:50:00| 1|   32kb|   32kb|    0.0|OTHER |    0.0|    29|     0|root    |sblksched       | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 2|   32kb|   32kb|    0.0|OTHER |    0.0|    30|     0|root    |sblksched       | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00
08/02/02|14:50:00| 2| 1.8mb|   88kb|    0.0|OTHER |    0.0|   575|     1|root    |ptydaemon       | 0.00| 0.00|     0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.75| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00

Not so pretty - cut back on the info to report on and it cleans up nicely... :)
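For the process-count idea mentioned earlier, the counting can also be done after the fact on the saved output rather than in the adviser script. The Python sketch below counts rows per sample time and per stop reason from the pipe-delimited format above. The function name is hypothetical, and the sample rows are cut down to the first twelve columns of three rows from the output.

```python
from collections import Counter

SAMPLE = """gbl_statdate | gbl_stattime | proc_cpu_last_used | proc_mem_virt | proc_mem_res | proc_cpu_total_util | proc_stop_reason | proc_disk_logl_io_rate | proc_proc_id | proc_parent_proc_id | proc_user_name | proc_proc_name
08/02/02|14:50:00| 0|   32kb|   32kb|    0.0|OTHER |    0.0|     8|     0|root    |supsched
08/02/02|14:50:00| 2|   32kb|   32kb|    0.0|STRMS |    0.0|    25|     0|root    |smpsched
08/02/02|14:50:00| 2| 1.8mb|   88kb|    0.0|OTHER |    0.0|   575|     1|root    |ptydaemon
"""


def count_processes(report):
    """Return (rows per (date, time) sample, rows per stop reason)."""
    per_sample = Counter()
    per_reason = Counter()
    for line in report.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Skip blank lines and the column-name header row.
        if len(fields) < 7 or fields[0] == "gbl_statdate":
            continue
        per_sample[(fields[0], fields[1])] += 1
        per_reason[fields[6]] += 1  # seventh column is proc_stop_reason
    return per_sample, per_reason


per_sample, per_reason = count_processes(SAMPLE)
for (date, time), n in sorted(per_sample.items()):
    print(date, time, n)
```

The per-sample count is a single number per interval, which is the kind of value MRTG can graph.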
