Source: https://www.ibm.com/developerworks/wikis/display/WikiPtype/monitoring
vmstat
Virtual memory management statistics; also reports CPU usage and other useful figures
Syntax |
vmstat <seconds> <count> |
Options |
seconds |
Time between outputs |
|
count |
number of outputs |
Examples |
vmstat 10 20 |
20 lines output with 10 seconds between each |
Output |
Warning: |
ignore the first line (average since reboot) |
|
r |
number of processes on run queue |
|
b |
number of processes on blocked queue = awaiting resources or I/O |
|
avm |
active virtual memory pages in page space |
|
fre |
real memory pages on the free list |
|
re |
Page reclaims: pages freed but reclaimed before being reused |
|
pi |
paged in (per second) |
|
po |
paged out (per second) |
|
fr |
pages freed (page replacement) (per second) |
|
sr |
pages per second scanned for replacement |
|
cy |
complete scans of page table |
|
in |
device interrupts per second |
|
sy |
system calls per second |
|
cs |
CPU context switches per second |
|
us |
User CPU time percentage |
|
sys |
System CPU time percentage |
|
id |
CPU idle percentage (nothing to do) |
|
wa |
Percentage of CPU time idle while waiting for pending local disk I/O |
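As a sketch of putting the warning above to work, the shell function below averages the id column over a vmstat run and discards the first (since reboot) line. The name vmstat_avg_idle is made up for illustration, and the function assumes the column header row starts with r and contains id, as in the list above; check the layout on your AIX level.

```shell
# Hypothetical helper (not part of AIX): average the "id" column of
# vmstat output read from stdin, ignoring the first sample line.
vmstat_avg_idle() {
  awk '
    # Column header row: starts with "r"; remember where "id" sits.
    $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
    # Data rows start with a number (the run queue length).
    col && $1 ~ /^[0-9]+$/ {
      if (!skipped) { skipped = 1; next }   # first line = average since reboot
      sum += $col; n++
    }
    END { if (n) printf "%.1f\n", sum / n }
  '
}

# Usage: vmstat 10 20 | vmstat_avg_idle
```

The column is located by name rather than by position because the number of columns can differ between systems.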
iostat
Disk I/O statistics
Syntax |
iostat <seconds> <count> |
Options |
seconds |
Time between outputs |
|
count |
number of outputs |
Examples |
iostat 10 20 |
20 lines output with 10 seconds between each |
Output |
Warning: |
ignore the first line (average since reboot) |
|
%tm_act |
Percentage of time active |
|
Kbps |
K bytes per second transferred |
|
tps |
Transfers per second |
|
msps |
Millisecond per seek (if available) |
|
Kb_read |
Total K bytes read (likewise Kb_wrtn for writes) |
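A minimal sketch of filtering iostat output for busy disks while skipping the first (since reboot) report. The function name iostat_busy_disks is hypothetical; it assumes each report begins with a "Disks:" header line, that disk names start with hdisk, and that the busy percentage is the second column.

```shell
# Hypothetical helper: list disks whose %tm_act exceeds a threshold
# (default 50), skipping the first report (average since reboot).
# Assumes disk lines start with "hdisk" and %tm_act is column 2.
iostat_busy_disks() {
  awk -v limit="${1:-50}" '
    /^Disks:/ { report++; next }          # each report starts with a header
    report > 1 && $1 ~ /^hdisk/ && $2 + 0 > limit + 0 { print $1, $2 }
  '
}

# Usage: iostat 10 20 | iostat_busy_disks 80
```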
ps
Process State
Syntax |
ps -l -f -e -uuser -t ttyno -p pid -k -o xxx |
|
ps aux |
Options |
-l |
long listing |
|
-f |
full listing |
|
-u user |
list only user's processes (-u fred) |
|
-e |
every user's processes |
|
-t ttyno |
processes attached to tty (-t 03) |
|
-p pid |
list only the process with this PID (-p 23456) |
|
-k |
Include kernel processes (normally hidden) |
|
-o xxx |
Lets you choose the columns, for example: -o tid,pid,user,class,pcpu,pmem,args |
|
aux |
BSD flavour (note no -) |
Examples |
ps -f |
List your shell's (sub)processes in detail |
|
ps -fu oracle |
List all processes for user oracle |
|
ps -ef |
List all processes |
|
ps -el |
As above but other details |
|
ps -fp 23456 |
Just list process 23456 |
|
ps -o tid,pid,args |
List threadID, processID and arguments |
Output |
PID/PPID |
Process IDentity / Parent Process IDentity |
|
S |
State= Running Sleeping Waiting Zombie Terminating Kernel Intermediate X=growing |
|
UID/USER |
User IDentity/User name |
|
C |
CPU recent use value (part of priority) |
|
STIME |
Start time of process |
|
PRI |
Priority (higher means less priority) |
|
NI |
NIce value (part of priority) default 20 |
|
ADDR |
ADDRess of stack (segment number) |
|
SZ |
SiZe of process in 1K pages |
|
CMD |
COMmanD the user typed (-f for more) |
|
WCHAN |
Event awaited for (kernel address) |
|
TTY |
Terminal the process is connected to (- = none) |
|
TIME |
Minutes and Seconds of CPU time |
|
SSIZ |
Size of kernel stack |
|
PGIN |
number of pages paged in |
|
SIZE |
Virtual size of data section in 1K's |
|
RSS |
Real memory (resident set) size of process 1K's |
|
LIM |
Soft limit on memory (see setrlimit) xx=none |
|
TSIZ |
Size of text (shared text program) image |
|
TRS |
Size of resident set (real memory) of text |
|
%CPU |
Percentage of CPU used since started |
|
%MEM |
Percentage of real memory used |
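A short sketch of combining -o with standard tools to find the busiest processes. The function name ps_top_cpu is made up; it assumes the input comes from ps -e -o pid,pcpu,args, so that the second column is the CPU percentage and the first line is a header.

```shell
# Hypothetical helper: read "ps -e -o pid,pcpu,args" output on stdin and
# print the N busiest processes (default 5), highest %CPU first.
ps_top_cpu() {
  sed 1d | sort -k2,2 -n -r | head -n "${1:-5}"
}

# Usage: ps -e -o pid,pcpu,args | ps_top_cpu 10
```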
nfsstat
Network File Systems Stats
Syntax |
nfsstat -m -z |
Options |
-m |
Display NFS mount point stats |
|
-z |
Zeros NFS stats |
Examples |
nfsstat |
Display all NFS stats |
|
nfsstat -m |
Display stats about the mount points |
Output |
|
Too many columns to cover here but labels are helpful if you know NFS |
netstat
Network statistics
Syntax |
netstat -i -n -r -p protocol -m -D |
Examples |
netstat -in |
Interface stats |
|
netstat -rn |
Routing stats |
|
netstat -p tcp |
Protocol stats (also try ip, icmp, igmp, udp) |
|
netstat -m |
Memory buffer stats used for packets inside AIX |
|
netstat -D |
Packets received, transmitted and dropped stats |
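A hedged sketch of scanning netstat -in output for interfaces with errors. The function name netstat_if_errors is hypothetical. It assumes the last five columns are Ipkts, Ierrs, Opkts, Oerrs and Coll, and that each interface has one link-level row whose third column starts with link#; columns are counted from the right because the Address column can be empty.

```shell
# Hypothetical helper: read "netstat -in" output on stdin and print
# interfaces reporting input or output errors (link-level rows only).
# Assumes the last five columns are Ipkts Ierrs Opkts Oerrs Coll.
netstat_if_errors() {
  awk '
    $1 == "Name" { next }                       # header row
    $3 ~ /^link#/ && NF >= 8 && ($(NF-3) + 0 > 0 || $(NF-1) + 0 > 0) {
      print $1, "Ierrs=" $(NF-3), "Oerrs=" $(NF-1)
    }
  '
}

# Usage: netstat -in | netstat_if_errors
```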
wlmstat
Workload Manager Stats
Syntax |
wlmstat -c -m -b -S -v [seconds [count]] |
Options |
-b -c -m |
List only c=CPU, m=memory, b=disk I/O (yes b, not d) |
|
-S |
List Super Class level only |
|
-v |
Verbose output (more detailed) |
|
seconds |
Time between outputs |
|
count |
number of outputs |
Examples |
wlmstat 3 100 |
Basic stats every 3 seconds for 100 times |
|
wlmstat -v 60 |
Full details once a minute, forever |
|
wlmstat -Sv 9 |
As above but Superclass only and every 9 seconds |
Output |
Class |
Name of the Class |
|
CPU,MEM,DKIO |
Percentages |
|
tr |
Tier number of class |
|
i |
Inheritance 0=no 1=yes |
|
#pr |
number of processes in class |
|
sha |
Shares (- = -1) |
|
min |
Minimum Limit as a percentage |
|
smx |
Soft maximum limit as a percentage |
|
hmx |
Hard maximum limit as a percentage |
|
des |
Desired percentage calculated by WLM |
|
npg |
number of memory pages in class |
Hint: Try to have nothing in the Default Class.
ncheck
Inode check
Syntax |
ncheck [-a][-i inodenumber...] [-s] [filesystem] |
Options |
-a |
all including . and .. |
|
-i inode |
find the file(s) with these inode no. |
|
-s |
list special and set UID files |
Examples |
ncheck -a / |
List all files in / |
|
ncheck -i 2194 /tmp |
find name for inode 2194 in /tmp |
netpmon
Network (and lots more) Monitor - uses trace, so it is for the root user only and can hit performance.
Syntax |
netpmon -o file -Tn -P -v -Oreport-type |
Options |
-o outputfile |
put the output to file not stdout |
|
-T n |
Set output buffer size (default 64000) |
|
-P |
Force monitor process into pinned memory |
|
-v |
Verbose (default only top 20 processes) |
|
-O |
cpu, dd(device driver), so(socket), nfs, all |
Examples |
netpmon -O all -o net.out |
|
do network or general workload here ... |
|
finish with: trcstop |
Output |
Lots of information is gathered in one report. |
filemon
File I/O monitor - uses trace, so it is for the root user only and can hit performance.
Syntax |
filemon -i file -o file -d -Tn -P -v -O levels |
Examples |
filemon -O all -o file.out |
|
do disk I/O work load here... |
|
finish with: trcstop |
Output |
#MBs |
total number of Mbytes transferred during run |
|
#opns |
number of times the file was opened |
|
#rpgs |
number of 4K page reads |
|
#wpgs |
number of 4K pages written |
|
#wrs |
number of write calls |
|
persistent |
paged from file system |
|
working |
paged from paging space |
|
util |
percentage busy |
|
KB/s |
average data transfer rate |
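The start, workload, trcstop sequence from the example can be wrapped so the trace is always stopped, as in this sketch. The function name filemon_run is made up for illustration; it uses only the filemon -O all -o file and trcstop invocations shown above.

```shell
# Hypothetical wrapper: trace file I/O only while one command runs.
# Usage: filemon_run REPORT_FILE COMMAND [ARGS...]   (root only)
filemon_run() {
  report=$1
  shift
  filemon -O all -o "$report" || return 1   # start monitoring
  "$@"                                      # the workload to measure
  status=$?
  trcstop                                   # stop tracing so the report is produced
  return "$status"
}

# Usage: filemon_run file.out cp /tmp/big /tmp/big.copy
```

Keeping the traced window as short as the workload limits the performance hit mentioned above.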
svmon
System Virtual Memory Monitor - uses trace, so it is for the root user only and can hit performance.
Syntax |
svmon -G -P[nsa] pid... -S[nsa][upg][x] -S sid... -D sid... -i seconds count |
Options |
-G |
Global report |
|
-P[nsa] pid... |
Process report: n=non-system, s=system, a=both |
|
-S[nsa][upg][x] |
Segment report as above, plus u=real-mem p=pinned g=paging x=top x items |
|
-S sid... |
Segment report on particular segments |
|
-i secs count |
Repeat report every secs seconds, count times |
|
-D sid... |
Detailed report |
Examples |
svmon -G |
Global / General stats |
|
svmon -Pa 215 |
Process report for process 215 |
|
svmon -Ssu 10 |
Top ten system segments in real memory order |
|
svmon -D 340d |
Detailed report on a particular segment |
Output |
size |
in pages (4096 bytes each) |
|
inuse |
in-use |
|
free |
not in use (includes rmss pages) |
|
pin |
pinned (locked by app.) |
|
work |
pages in working segments |
|
pers |
pages in persistent segments |
|
clnt |
pages in client segments |
|
pg space |
paging space |
Note: pages can be in more than one process
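Since svmon reports sizes in 4096-byte pages, a small conversion helper can make the numbers easier to read. This is a sketch only; the function name pages_to_mb is made up, and the result is rounded down to whole Mbytes.

```shell
# Hypothetical helper: convert a count of 4096-byte pages to whole MB.
pages_to_mb() {
  echo $(( $1 * 4096 / 1048576 ))
}

# Usage: pages_to_mb 262144    (256 pages make 1 MB)
```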
ipcs
Interprocess Communication (shared memory, message queue and semaphore) stats
Syntax |
ipcs -a |
Examples |
ipcs |
Regular report |
|
ipcs -a |
Full report = more columns |
Output |
T |
Type m=memory, q=queue, s=semaphore |
|
ID, KEY |
What the programmer uses to access the IPC |
|
CPID, LPID |
Process that created/last attached |
|
CBYTES |
Bytes currently in message queue |
|
QBYTES |
Maximum number of bytes allowed in message queue |
|
QNUM |
number of messages held |
|
NATTCH |
Processes attached to this shared memory |
|
SEGSZ |
Size of shared memory (segment) |
|
NSEMS |
Number of Semaphores |
lvmstat
Logical Volume Stats
Syntax |
lvmstat -v vgname -l lvname -e -d [seconds [count]] |
Options |
-v vgname |
Volume group to track |
|
-l lvname |
Logical volume to track |
|
-e |
Enable |
|
-d |
Disable |
|
seconds |
Between output |
|
count |
Number of outputs |
Examples |
lvmstat -v rootvg -e |
Enable rootvg stats (use -d to disable later) |
|
lvmstat -v rootvg |
Monitor all of volume group |
|
lvmstat -l lv05 |
Monitor just one logical volume in more detail |
Output |
iocnt |
number of io |
|
Kb_read |
KBytes read (same for write) |
|
Kbps |
Kbytes per second |
|
mirror# |
Which copy of a mirror |
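The enable, monitor, disable sequence from the examples can be kept together so statistics collection is not left switched on. A minimal sketch, with the made-up function name lvmstat_sample, using only the -v, -e and -d options and the seconds/count arguments listed above:

```shell
# Hypothetical wrapper: enable stats for a volume group, sample it,
# then disable stats again.
# Usage: lvmstat_sample VGNAME [SECONDS [COUNT]]
lvmstat_sample() {
  vg=$1
  lvmstat -v "$vg" -e || return 1            # enable collection
  lvmstat -v "$vg" "${2:-10}" "${3:-6}"      # defaults: 6 outputs, 10 s apart
  status=$?
  lvmstat -v "$vg" -d                        # disable collection again
  return "$status"
}

# Usage: lvmstat_sample rootvg 5 12
```

The default interval and count are arbitrary choices for the sketch.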
fileplace
Placement of a file in the filesystem
Syntax |
fileplace -l -p -v filename |
Options |
-l |
Logical layout in filesystem |
|
-p |
Physical layout on disk(s) |
|
-v |
Verbose (good) |
Example |
fileplace -lv /tmp/xyz |
Logical layout |
Example |
fileplace -pv /db/data.idx |
Disk layout |
rmss
Reduced Memory System Simulator
Syntax |
rmss -p -c <MB> -r |
Options |
-p |
Print the current value |
|
-c MB |
Change the simulated memory size to MB (in Mbytes) |
|
-r |
Restore all memory to use |
Example |
rmss -p |
find out how much memory you have online |
Example |
rmss -c 32 |
Change available memory to 32 Mbytes |
Example |
rmss -r |
Undo the above |
Warning:
- rmss can damage performance very seriously
- Don't go below 25% of the machine's memory
- Never forget to finish with rmss -r
Using rmss to determine the real memory use
To test the pressure on memory
- Reduce memory by 5% with rmss -c MB
- Immediately run rmss -r to release the memory locked by rmss
- This memory goes on the free list and will be the next memory allocated on demand
- Watch free memory being used with vmstat or nmon
If the free memory is used up in
- seconds - the machine is probably short on memory
- minutes - memory is about right
- hours or days - there is spare memory; consider tuning to use more of it, for example by increasing RDBMS disk caches or WebSphere memory
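The arithmetic and the rule of thumb above can be sketched as two small shell helpers. Both names are hypothetical, and the 60-second and 3600-second boundaries are assumptions: the text only distinguishes "seconds", "minutes" and "hours or days".

```shell
# Hypothetical helpers for the procedure above.

# Memory size in MB after a 5% reduction, for use with rmss -c.
rmss_reduced_mb() {
  echo $(( $1 * 95 / 100 ))
}

# Classify how long (in seconds) the released memory took to be used up.
# The 60 s and 3600 s boundaries are assumptions, not from the source.
rmss_verdict() {
  if [ "$1" -lt 60 ]; then
    echo "probably short on memory"
  elif [ "$1" -lt 3600 ]; then
    echo "memory is about right"
  else
    echo "spare memory"
  fi
}

# Usage: rmss -c "$(rmss_reduced_mb 4096)"; rmss -r; then time the drain
```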
truss
Traces the system calls of a process (AIX 5 and later)
Syntax: |
simple |
truss mycmd |
Syntax: |
detailed |
truss -a -f -c -p pid -o file |
Options |
-a |
Display parameter strings |
|
-f |
Follow child processes |
|
-c |
Counts system calls - displays when process stops |
|
-p pid |
Track a running process with PID pid |
|
-o file |
Output the results to a file (allows interaction with the command) |
Examples |
truss -a -p 23456 |
Track process 23456 |
Output |
lots |
Each system call name and parameters |
sar
System activity reporter
Syntax |
Immediate: |
sar -A [-P ALL] interval number |
|
Collect: |
sar -A -o savefile interval number >/dev/null |
|
Report: |
sar -A -f savefile -i secs -s HH[:MM[:SS]] -e HH[:MM[:SS]] |
Options |
-A |
All stats to be collected/reported |
|
-o savefile |
Collect stats to binary file |
|
-f savefile |
Report stats from binary file |
|
-i secs |
Report at seconds interval from binary file |
|
-s and -e |
Report stats only between these times |
Examples |
sar 10 100 |
Report now at 10 second intervals |
|
sar -A -o fred 10 6 |
Collect data into fred |
|
sar -P ALL 1 30 |
Show individual CPUs |
|
sar -A -f fred |
Report on the data |
|
sar -A -f x -s 10:30 -e 10:45 |
Report on 15 minutes from 10:30 a.m. |
|
sar -A -f fred -i60 |
Report at 1 minute intervals, not the 10 seconds as collected |
Column |
output |
comments |
CPU |
%usr %sys |
Percent of time in user / kernel mode |
|
%wio %idle |
Percent of time waiting for disk io/idle |
Buffer Cache |
bread/s bwrit/s lread/s lwrit/s |
Block I/O per second; logical I/O per second (hopefully cached) |
|
pread/s pwrit/s |
Raw disk I/O (not buffer cached) |
|
%rcache %wcache |
Percentage hit on cache |
Kernel |
exec/s fork/s sread/s swrite/s r/wchar/s scall/s |
Calls per second of these system calls; sread/swrite are the read/write system calls (cache, raw, tty or network), rchar/wchar the characters they transfer, and scall is the total of all system calls |
|
msg/s sema/s |
IPC for messages and semaphores |
|
kexit/s ksched/s kproc-ov/s |
Process exits, process switches and process-overload (hit proc thresholds) |
|
runq-sz |
Average number of processes on the run queue |
|
%runocc |
Percentage of time with processes on the run queue |
|
swap-sz |
Average number of processes waiting for page-in |
|
%swap-occ |
Percentage of time with processes on the swap queue |
|
cycles/s |
number of page replacement searches of all pages (per second) |
|
faults/s |
number of page faults (might not need I/O) |
|
slots |
number of free pages on paging spaces |
|
odio/s |
number of non-paging disk I/O per second |
|
file-ov, proc-ov |
number of times these tables overflow per second |
|
file-sz inode-sz proc-sz |
Entries in the tables |
|
pswch/s |
Process switches per second |
|
canch/s outch/s rawch/s |
Characters per second on terminal lines |
|
rcvin/s xmtin/s |
Receive and transmit interrupts per second |
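As a sketch of pulling one figure out of the CPU report, the function below prints the %idle value from the Average row. The name sar_avg_idle is made up, and it assumes plain CPU output (for example from sar 10 100) with a header row containing %idle and a final row starting with Average.

```shell
# Hypothetical helper: print the Average %idle from sar CPU output on stdin.
# Assumes a header row containing "%idle" and a final row starting "Average".
sar_avg_idle() {
  awk '
    # Until found, look for the %idle column in each line (the header).
    !col { for (i = 1; i <= NF; i++) if ($i == "%idle") col = i }
    col && $1 == "Average" { print $col }
  '
}

# Usage: sar 10 100 | sar_avg_idle
```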