AIX 5L SysAdmin II: (Unit 11, Part 1) – Performance and Workload Management

Unit Objectives:
———————–
1. Provide basic performance concepts
2. Provide basic performance analysis
3. Manage the workload on a system
4. Work with the Performance Diagnostic Tool (PDT)

Performance Problems:
———————————-
Performance is often subjective!
The same system may seem fast to some users and slow to others, depending on what they are doing on it.

Understand the Workload:
—————————————
Analyze the hardware:
– Model, Memory, Disks, Network

Identify all the work performed by the system

Identify critical applications and processes:
– What is the system doing?
– What happens under the covers (for example, NFS mounts)?

Characterize the workload:
– Workstation
– Multiuser System
– Server
– Mixture of all above?

Critical Resources: The Four Bottlenecks:
————————————————————-
1. CPU
– Number of processes
– Process priorities
2. Memory
– Real memory
– Paging space
– Memory leaks
3. Disk I/O
– Disk balancing
– Types of disks
– LVM policies
4. Network
– NFS used to load applications
– Network type
– Network traffic

Identify CPU-Intensive Program: ps aux
———————————————————
# ps aux (identify processes using the most CPU time)

Note: Execute over several days at different times of the day to determine system usage. How does the system load change?
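
To make repeated sampling easier, the ps aux output can be sorted by CPU use. A minimal sketch, assuming %CPU is the third column of the ps aux output; the function name top_cpu and the sample rows are made up for illustration.

```shell
# top_cpu N: read "ps aux"-style output on stdin and print the header
# followed by the N rows with the highest %CPU (assumed to be column 3).
top_cpu() {
    n=${1:-5}
    input=$(cat)
    # The first line is the column header; pass it through unchanged.
    printf '%s\n' "$input" | head -n 1
    # Sort the remaining rows numerically on column 3, largest first.
    printf '%s\n' "$input" | tail -n +2 | LC_ALL=C sort -k3,3nr | head -n "$n"
}

# Demo with invented sample rows (not real measurements).
top_cpu 2 <<'EOF'
USER   PID %CPU %MEM COMMAND
root   516 12.5  1.0 kproc
alice 3860 48.2  3.0 ex11_cpu
bob   4670  0.3  0.5 ksh
EOF
```

On a live system the same function would be fed directly: ps aux | top_cpu 5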

Identify High-Priority Processes: ps -elf
———————————————————-
# ps -elf

The smaller the PRI value, the higher the priority of the process. The average process runs at a priority of around 60.
The NI value is used to adjust the process priority. The higher the nice value is, the lower the priority of the process.

Basic Performance Analysis:
——————————————-
# sar -u (monitor CPU load)
High CPU%?
– Yes: possible CPU constraint
– No: check memory
# vmstat (check memory)
High paging?
– Yes: possible memory constraint
– No: check disk
# iostat (check disk)
Disk balanced?
– Yes: possible disk/SCSI constraint
– No: balance disk
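
The flow above can be written as one small function. This is only a sketch: the function name next_step is hypothetical, and the three yes/no answers are supplied by hand after reading the sar, vmstat and iostat output.

```shell
# next_step HIGH_CPU HIGH_PAGING DISK_BALANCED
# Each argument is "yes" or "no", answered in the order the flow asks.
next_step() {
    if [ "$1" = yes ]; then
        echo "possible CPU constraint"
    elif [ "$2" = yes ]; then
        echo "possible memory constraint"
    elif [ "$3" = yes ]; then
        echo "possible disk/SCSI constraint"
    else
        echo "balance disk"
    fi
}

# CPU is fine, no heavy paging, disks already balanced:
next_step no no yes
```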

Monitoring CPU Usage: sar -u:
———————————————-
# sar -u 60 30 (sample every 60 seconds, 30 times)
AIX www 1 5 000400B24C00 06/06/01
08:24:10 %usr %sys %wio %idle
08:25:10 48 52 0 0
08:26:10 63 37 0 0
….
Average 57 43 0 0

A system is CPU bound, if: %usr + %sys > 80%
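
The 80% rule can be applied to the Average row automatically. A minimal sketch, assuming the column order shown above (%usr second, %sys third); the function name cpu_bound is made up.

```shell
# cpu_bound: read "sar -u" output on stdin, take the Average row and
# apply the rule %usr + %sys > 80.
cpu_bound() {
    awk '$1 == "Average" {
        busy = $2 + $3
        if (busy > 80) verdict = "CPU bound"; else verdict = "not CPU bound"
        printf "%d%% busy: %s\n", busy, verdict
    }'
}

# The sample from the notes: 57 + 43 = 100, so the rule fires.
cpu_bound <<'EOF'
08:24:10 %usr %sys %wio %idle
08:25:10 48 52 0 0
08:26:10 63 37 0 0
Average 57 43 0 0
EOF
```

On a live system: sar -u 60 30 | cpu_bound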

Monitoring Memory Usage: vmstat:
—————————————————
# vmstat 5
Summary report every 5 seconds

Paging space page ins and outs:
pi – page in
po – page out
If any paging-space I/O is taking place, the workload is approaching the system’s memory limit

wa: I/O wait percentage of CPU
If nonzero, a significant amount of time is being spent waiting on file I/O
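
Counting the samples that show paging-space I/O saves reading every row by eye. A rough sketch: the function name paging_samples is hypothetical, the sample numbers are invented, and the pi and po columns are located by their header labels rather than by a fixed position.

```shell
# paging_samples: read vmstat output on stdin and count the data rows
# where pi or po is nonzero.
paging_samples() {
    awk '
        # Skip lines until the header row naming the pi and po columns.
        !pi {
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi = i
                if ($i == "po") po = i
            }
            next
        }
        # Data rows start with a number (the r column).
        $1 ~ /^[0-9]+$/ {
            total++
            if ($pi + 0 > 0 || $po + 0 > 0) paging++
        }
        END { printf "%d of %d samples show paging-space I/O\n", paging, total }
    '
}

# Invented sample output for illustration only.
paging_samples <<'EOF'
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 22478  1677   0   0   0   0    0   0 188 1380 157 57 32  0 10
 2  1 22506  1609   0   2  19   0    0   0 345 1383 177 52 40  0  8
 1  0 22498  1582   0   0   0   0    0   0 280  937 150 51 46  0  3
EOF
```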

Monitoring Disk I/O: iostat
—————————————
# iostat 10 2
Disks – the first report shows cumulative activity since the last reboot

Note: A system is I/O bound, if: %iowait > 25%, %tm_act > 70%
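
The two thresholds can be tested together. A small sketch, assuming the comma in the note means both conditions must hold; the function name io_bound is made up, and the two figures are read off the iostat report by hand.

```shell
# io_bound IOWAIT TM_ACT
# Succeeds (exit status 0) when %iowait > 25 and %tm_act > 70.
# Treating the two conditions as "and" is an assumption.
io_bound() {
    awk -v w="$1" -v t="$2" 'BEGIN { exit !(w > 25 && t > 70) }'
}

if io_bound 30 85; then
    echo "I/O bound"
else
    echo "not I/O bound"
fi
```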

topas:
———-
All 3 previous commands combined into one command
topas

AIX Performance Tools:
———————————-
Identify causes of bottlenecks
1. CPU Bottlenecks – Processes using CPU time
# tprof -x sleep 60 (profiles the system for 60 seconds) (Let it finish before running the next command to view the data)
# more __prof.all (This file is created by the tprof command)
Note: Look at the Total column (100 ticks = 1 sec) (2 sec average)

2. Memory Bottlenecks – Processes using memory
# svmon -G (Global report) Sizes are in # of 4K frames
# svmon -Pt 3 (Top 3 users of memory)

3. I/O Bottlenecks – File systems, LVs, and files causing disk activity
# filemon -o fmout (Starts monitoring disk activity)
# trcstop (Stops monitoring and creates report)
# more fmout (View the results)

There Is Always a Next Bottleneck!
—————————————————-
iostat 10 60 – Our system is I/O bound. Let’s buy faster disks!
vmstat 5 – Our system is now memory bound! Let’s buy more memory!
sar -u 60 60 – Oh no! The CPU is completely overloaded!

Workload Management Techniques:
—————————————————-
Run programs at different times throughout the day
# crontab -e
format = min, hour, dayofmonth, month, weekday, command
0 3 * * 1-5 /usr/local/bin/report

# echo "/usr/local/bin/report" | at 0300
# echo "/usr/bin/cleanup" | at 1100 friday
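
To check a crontab entry against the six-field format, the line can be split into named fields. A minimal sketch; the function name explain_cron is hypothetical.

```shell
# explain_cron 'ENTRY': label the six fields of one crontab entry.
# Everything after the fifth field is treated as the command.
explain_cron() {
    printf '%s\n' "$1" | while read -r min hour dom mon dow cmd; do
        echo "minute=$min hour=$hour dayofmonth=$dom month=$mon weekday=$dow command=$cmd"
    done
}

# The entry from the notes: 03:00 on weekdays 1-5 (Monday to Friday).
explain_cron '0 3 * * 1-5 /usr/local/bin/report'
```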

Workload Management Techniques (2 of 3)
—————————————————————
Set up printer queues that run later in the day

Sequential execution of programs
# vi /etc/qconfig

ksh:
device = kshdev
discipline = fcfs

kshdev:
backend = /usr/bin/ksh

# qadm -D ksh (disable queue)
# qprt -P ksh report1
# qprt -P ksh report2

# qadm -U ksh (queue is up and jobs will be executed sequentially)

Workload Management Techniques (3 of 3)
—————————————————————
Run programs at a reduced priority
# nice -n 15 backup_all &
# ps -el

# renice -n -10 3860
# ps -el

Note: Must be root to increase the priority of processes.
Users can only decrease the priority of their processes.
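
The effect of the two commands on the NI value can be worked out with simple arithmetic. This is a sketch under assumptions not stated in these notes: a starting nice value of 20, increments that are added to the current value, and results clamped to the 0 to 39 range. The function name new_nice is made up, and the figures should be checked against ps -el on a real system.

```shell
# new_nice CURRENT INCREMENT: nice value after applying an increment,
# clamped to the assumed 0-39 range.
new_nice() {
    v=$(( $1 + $2 ))
    if [ "$v" -lt 0 ]; then v=0; fi
    if [ "$v" -gt 39 ]; then v=39; fi
    echo "$v"
}

new_nice 20 15     # after nice -n 15, starting from an assumed 20
new_nice 35 -10    # after renice -n -10 on that process
```

A larger result means a lower priority, which is why only root may apply a negative increment.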

FAQs:
———-
Q: What is the range for a nice value?
A: 0 – 39 (the NI column of ps -l; the default is 20)
Q: Why is the first row of the sar command significantly different than the other output?
A: The first row is always a summary of the system since last reboot.

Q: What fileset has to be installed in order to configure and use the Performance Diagnostics Tool?
A: PDT has been available since AIX 4 and the required fileset is: bos.perf.diag_tool

Lab:
———
Create an alias to show the most active commands
alias top="ps aux | tail -2 | sort -R 1.15,1.19mr"
(the exact tail and sort options are uncertain because the slide was difficult to read)

vi .kshrc (Add the alias to the .kshrc file)

vi .profile (set ENV=$HOME/.kshrc so that ksh reads .kshrc)
Note: Make the alias permanent with the previous commands.

top
/home/workshop/ex11_prog1& (priority level and nice value)
ps -elf | grep /home/workshop/ex11_prog1

jobs
[1] * Running /home/workshop/ex11_prog1&
kill %1
jobs
[1] * Terminated /home/workshop/ex11_prog1&

nice -n 15 /home/workshop/ex11_prog1 &
ps -elf | grep /home/workshop/ex11_prog1

renice -n -10 4670 (4670 is the PID)
ps -elf | grep /home/workshop/ex11_prog1

A lower nice value means the process runs at a higher priority.

/home/workshop/ex11_cpu &
sar -u 2 5 (every 2 seconds, 5 times)
ps -elf | grep /home/workshop/ex11_cpu
kill %2

/home/workshop/ex11_disk &
iostat 2 5 (every 2 seconds, 5 times)
kill %1

/home/workshop/ex11_memory &
vmstat 5 (every 5 seconds)
Ctrl-C (stop vmstat)

Control Printing Jobs:
——————————–
vi /etc/qconfig
Add the following lines
ksh:
device = kshdev
discipline = fcfs
kshdev:
backend = /usr/bin/ksh
:wq

qadm -D ksh
qprt -P ksh /home/workshop/ex11_job
lpstat
disable ksh (disable the printer queue)
qprt -P ksh /home/workshop/ex11_job
lpstat (the queue is down with job waiting to be printed)
enable ksh
