AIX 5L SysAdmin II: (Unit 03) – System Initialization – Part 1

Objectives:
—————–
1. Describe the boot process to loading the boot logical volume
2. Describe the contents of the boot logical volume
3. Interpret LED codes displayed during boot and at system halt
4. Re-create the boot logical volume on a system which is failing to boot

How Does an AIX System Boot?
————————————————
1. Check and initialize the hardware: POST (Power-On Self-Test)
2. Locate the BLV (Boot Logical Volume) using the boot list
3. Load the BLV and pass control
4. Configure devices (cfgmgr)
5. Start init and process /etc/inittab

Loading of a Boot Image:
————————————-
Firmware (BIOS) – No dependencies on the operating system
Boot devices (1-diskette, 2-CD-ROM, 3-internal disk)
Bootstrap code (512 bytes) on hdisk0 loads the Boot Logical Volume (hd5)

Content of Boot Logical Volume (hd5):
——————————————————–
1. AIX Kernel
2. rc.boot (phases rc.boot1, rc.boot2, rc.boot3)
3. “Reduced” ODM
4. Boot commands (cfgmgr, bootinfo -b, ipl_varyon, savebase, restbase)

How to Fix a Corrupted BLV:
——————————————
F5 – Boot from CD, tape or NIM
Maintenance Mode
1. Access a Root Volume Group
# bosboot -ad /dev/hdisk0 (very important command)
# shutdown -Fr

Working with Boot Lists:
————————————
Normal Mode
# bootlist -m normal hdisk0 hdisk1 (m=mode)
# bootlist -m normal -o (o=display the current boot list)
hdisk0
hdisk1

Service Mode
# bootlist -m service -o
fd0
cd0
hdisk0
tok0
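
Because the boot list is tried in order, the first entry matters most. The following is a minimal sketch, not an AIX tool: first_boot_device is a hypothetical helper that reads boot list text on standard input and prints the first device. The sample text is copied from the service-mode listing above. On a live system the input would instead come from bootlist -m service -o.

```shell
# Hypothetical helper (not an AIX command): print the first device from
# boot list text supplied on standard input, one device per line.
first_boot_device() {
    # Skip blank lines and print the first word of the first non-blank line.
    awk 'NF { print $1; exit }'
}

# Sample text copied from the service-mode listing in these notes.
sample_bootlist='fd0
cd0
hdisk0
tok0'

first=$(printf '%s\n' "$sample_bootlist" | first_boot_device)
echo "First boot device: $first"
```

Taking only the first word of each line is a deliberate choice, so that any extra text after the device name would be ignored.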

# diag
Task Selection List
SCSI Bus Analyzer
Download Microcode
Display or Change Bootlist ****
Periodic Diagnostics

Working with Boot Lists – SMS (System Management Services)
1. Reboot or power on the system
2. Press F1 when the tone is heard (press 1 on an ASCII terminal)
3. Select Boot (or Multiboot)

System Management Services
———————————————
List of Boot Devices
1. Diskette
2. SCSI CD-ROM
3. Hard disk
4. Ethernet

Service Processors and Boot Failures
——————————————————-
Boot failure!
LED codes: 553
Service Processor (BUMP)
Automatic transmittal of boot failure information
IBM Support Center can dial in to help

If you have a prompt:
bootlist -m normal -o (list the bootlist)
bootlist -m normal hdisk1 (Make boot from hdisk1 instead of hdisk0)

bosboot -ad /dev/hdisk0 – re-creates the boot logical volume on hdisk0
rc.boot – controls the boot sequence

Accessing a System That Will Not Boot
————————————————————
F5 – Maintenance Mode

Boot the system from the BOS CD-ROM, tape or network device (NIM)
Select maintenance mode
1. Access a Root Volume Group – Perform corrective actions
4. Install from a System Backup – Recover data

Must have good up-to-date backups to restore.

Booting Maintenance Mode:
—————————————–
Define the System Console
Language to Use
Welcome to Base Operating System
Installation and Maintenance
Select —>>> 3. Start Maintenance Mode for System Recovery
Maintenance
Volume Group Information
1. Access a Root Volume Group
Logical Volumes

Option 1: Access This Volume Group and Start a Shell
1. When a file system such as / (root) has run out of space
chfs -a size=+1 /
2. When there is a corrupted system file
export TERM=lft
use vi to edit the file
3. Fix a corrupted boot logical volume /dev/hd5
bosboot -ad /dev/hdisk0

Option 2: Access This Volume Group and Start a Shell Before Mounting the File System
1. Fix File System related problems
fsck -y -V jfs /dev/hd1
fsck -y -V jfs /dev/hd2
fsck -y -V jfs /dev/hd3
fsck -y -V jfs /dev/hd4
fsck -y -V jfs /dev/hd9var
fsck -y -V jfs /dev/hd10opt
2. JFS log problems (reformat the log)
logform /dev/hd8
3. Corrupted superblock: copy the backup superblock (4 KB block 31) over the primary (block 1)
dd count=1 bs=4k skip=31 seek=1 if=/dev/lv00 of=/dev/lv00
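
The six fsck lines above differ only in the logical volume name, so they can be generated by a loop. This is a minimal sketch in dry-run form: it only prints the commands and never runs fsck. The function name fsck_plan and the variable rootvg_lvs are made up for illustration, and the logical volume list is taken from these notes rather than from any particular system.

```shell
# Dry run: print the fsck commands instead of executing them.
# The logical volume names are the ones listed in the notes above.
rootvg_lvs="hd1 hd2 hd3 hd4 hd9var hd10opt"

fsck_plan() {
    for lv in $rootvg_lvs; do
        echo "fsck -y -V jfs /dev/$lv"
    done
}

fsck_plan
```

Printing the plan first makes it easy to review the device names before typing or piping the commands into a shell in maintenance mode.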

Working in Maintenance Mode
——————————————–

Boot Problem References
—————————————
AIX Message Guide and Reference
– Contains: AIX boot codes
AIX Problem Solving Guide and Reference
– Contains: Problem Solving Procedures
Problem Summary Form
RS/6000 Service Guides
– Contains: PCI firmware checkpoints
PCI error codes

Firmware Checkpoints and Error Codes:
———————————————————-
“SCSI Hardware Error” – 21A00001
“No memory found” – F22 (LED/LCD display)
Explained in RS/6000 Service Guide
Online available at: www-1.ibm.com/servers/eserver/pseries/library/hardware_docs

Flashing 888
———————
Reset
– 102 (Software)
Reset for crash code
Reset for dump code

– 103 or 105 (Hardware or Software)
Reset twice for SRN yyy-zzz
Reset once for FRU
Reset 8 times for location code
Optional Codes for Hardware Failure

Understanding the 103 Message:
————————————————
888 – Press Reset repeatedly to get the location code of the bad device
103 – Type of Read-out (103)
104 – SRN Identifying the FRU (104-101)
101 – SRN Identifying the FRU (104-101)
c01 – FRU sequence number (first defective part)

FRU – Field Replaceable Unit
SRN – Service Request Number

lsdev -C – Shows the location codes for each device to determine which one is causing the problem.
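
The 103 read-out above can be pieced together mechanically: the two values after 103 form the SRN, and the next value is the FRU sequence. The sketch below only covers the five values shown in these notes. decode_888_103 is a hypothetical helper, not an AIX command, and it does not handle the optional codes or the location code digits that may follow.

```shell
# Hypothetical helper: decode a flashing-888 read-out of type 103.
# Arguments: the values shown on the display, in order, for example
#   888 103 104 101 c01
decode_888_103() {
    if [ "$#" -lt 5 ] || [ "$1" != "888" ] || [ "$2" != "103" ]; then
        echo "not a 103 read-out" >&2
        return 1
    fi
    # SRN is the third and fourth values joined by a hyphen.
    echo "type=$2 srn=$3-$4 fru_seq=$5"
}

decode_888_103 888 103 104 101 c01
```

For the example sequence this prints type=103 srn=104-101 fru_seq=c01, which matches the SRN given in the notes.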

Location Codes: Model 150
—————————————–

SCSI Addressing:
————————–
SCSI Adapter 7
Physical Unit Numbers (PUNs)
0,1,4,2,6,8,15 Terminator
4 – Logical Unit Numbers (LUN)
Both ends, internal and external, of SCSI bus must be terminated

Problem Summary Form:
————————————–

Getting Firmware Updates from Internet:
———————————————————–
1. Get firmware update from IBM – http://www.rs6000.ibm.com/support/micro
2. Update firmware via System Management Services (4. Utilities)

FAQs:
———-
Q: What is the use of the service boot list?
A: The service boot list has several uses. Among them, it offers the ability to change the boot list when no operating system is installed. It also lets you change the boot list without accessing the SMS menus.
Q: Why does a boot logical volume become corrupted?
A: Among other reasons, a boot logical volume can become corrupted because it resides on a bad block on the disk, or because the steps for mirroring the boot logical volume were not followed properly.

Lab:
——–
bootlist -m normal -o (list the boot list)
bootlist -m service -o (check to see which devices are supported in the service boot list)
lsvg -p rootvg (determine which physical volumes belong to rootvg)
bootlist -m normal hdisk0 (sets server to boot from hdisk0)
lsvg rootvg (display rootvg info and VG Identifier)
lspv (display the physical volume PV Ids)
odmget -q "name=hdisk0 and attribute=pvid" CuAt | more
run a script to cause problems with the system
shutdown -Fr (-F=fast shutdown, -r=reboot)
Problem with continuous rebooting
To solve the problem we need a bootable device (such as a mksysb tape or CD)
Press F5 when the icons appear on the screen to boot from that device and enter Maintenance Mode
Press F1 – Select a language, Press 1 for English
Select option #3 – Start Maintenance Mode for System Recovery
Select option #1 – Access a Root Volume Group
Checks and displays drives
Select the drive that has the rootvg
Displays the Volume Group ID and the logical volumes on the rootvg
Select option #1 – Access this Volume Group and start a shell
Check to see if the file systems are mounting
Displays “Filesystems mounted for maintenance work.”
bosboot -ad /dev/hdisk0 (Recreate the boot logical volume on hdisk0)
0301-152 bosboot: not enough file space to create….
bootimage
/tmp has 12252 free KB.
bootimage needs 15208 KB.
Increase the size of /tmp by one partition to fix the lack of space.
chfs -a size=+1 /tmp
bosboot -ad /dev/hdisk0 (rerun bosboot)
message – bosboot: Boot image is 16604 512 byte blocks.
bootlist -m normal -o (check to see the bootlist)
cd0
bootlist -m normal hdisk0 (change to boot from the hard disk instead of cd0)
shutdown -Fr (Reboot the machine again)
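
The numbers in the bosboot messages from the lab can be checked with a little shell arithmetic. This is a small sketch using only the figures quoted above. The variable names are made up for illustration.

```shell
# Figures quoted from the bosboot error message in the lab.
free_kb=12252      # "/tmp has 12252 free KB."
needed_kb=15208    # "bootimage needs 15208 KB."

# How much more space /tmp needs before bosboot can build the image.
shortfall_kb=$((needed_kb - free_kb))
echo "/tmp is short by $shortfall_kb KB"

# Size of the finished boot image, reported in 512-byte blocks.
image_blocks=16604
image_kb=$((image_blocks * 512 / 1024))
echo "Boot image is $image_kb KB"
```

The shortfall works out to 2956 KB, so adding one partition to /tmp is enough as long as the physical partition size of rootvg is larger than that. That is an assumption about the lab system, and lsvg rootvg shows the actual PP size.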
