AIX Ver. 4 SysAdmin IV: Storage Management (Unit 4) – RAID

Unit Objectives:
————————
List the most commonly used RAID levels
Choose a level suitable to your performance and availability needs
Implement an SSA RAID array

What Is RAID?
———————-
– Redundant Array of Independent Disks
– An architecture designed to improve data availability by using arrays of disks together with data-striping methodologies

System Admin View – Many Disks
Kernel View – Single Disk

– Collection of disks seen by the system as a single disk
– Managed by special electronics that control the individual disks
– Multiple disks combined in an array to obtain performance, capacity, and availability that exceed those of a single large drive

History of RAID
————————-
IBM funded a project at the University of California at Berkeley in the mid-1980s to study the performance and cost of using small disk drives grouped into arrays.

The Berkeley research team published a paper entitled “A Case for Redundant Arrays of Inexpensive Disks”, which outlined five disk array models. The models were labeled RAID levels one through five, with no hierarchical relationship implied.

1992: RAID Advisory Board (RAB) is founded:
– Precise definition of the RAID levels
– Standardized and reliable benchmarking
– Certification program for subsystem vendors

Data Redundancy Mechanism
——————————————–
Add a parity bit to n bits of information:
– This allows the detection of single-bit errors
– The parity bit is computed as an exclusive OR (XOR) of the data bits
Check for drive errors when accessing data:
– If you know which drive has failed, the data can be reconstructed
– Two or more simultaneous failures mean loss of data

Example with 3+1 drives:

1 2 3 Parity
——————–
0 0 0 0
1 0 0 1
0 1 0 1
1 1 0 0
0 0 1 1
1 0 1 0
0 1 1 0
1 1 1 1
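This parity arithmetic can be tried directly in the AIX shell. The following is a minimal sketch (not part of the course material), using ksh integer arithmetic on one of the rows above:

   d1=1; d2=0; d3=1                     # the data bits of one stripe (row "1 0 1" above)
   parity=$(( d1 ^ d2 ^ d3 ))           # XOR of the data bits -> 0
   # If drive 2 fails, its bit is the XOR of the surviving bits and the parity:
   d2_rebuilt=$(( d1 ^ d3 ^ parity ))   # -> 0, the original value of d2
   echo "parity=$parity d2=$d2_rebuilt"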

Failure Situations
————————–
(X marks the bit on a failed drive)

1 2 3 Parity
——————–
0 X 0 0
0 X 0 1
1 X 0 0
1 X 0 1
0 0 1 X
X X 1 0

Reconstruct Data:
—————————
(a single lost bit is recovered as the XOR of the surviving bits and the parity; in the last row two data drives have failed, so the bits cannot be reconstructed)

1 2 3
———-
0 0 0
0 1 0
1 1 0
1 0 0
0 0 1
? ? 1

RAID Levels
——————-
Different ways of organizing the array are known as RAID levels

Level   Description
——-   ——————-
0       Striping (performance enhancement)
1       Mirroring (availability)
0+1     Striping with mirroring (both performance and availability)
3       Data striping plus parity
5       Data and parity striping
6       Two parity disks

Levels not mentioned here are not often implemented because of inherent disadvantages.

RAID – 0: Striping
————————–
Data is distributed across the disks in the array
Requires a minimum of two disks

RAID – 1: Mirroring
————————–
Always keeps a second copy of the data on another set of disks

RAID – 3: Data Striping Plus Parity
————————————————–
Each data block of 1024 bytes is subdivided and distributed across all data disks.
Parity information is recorded on a separate disk.

Note: RAID 3 can be used for a raw logical volume, but it cannot be used for Journaled File Systems, which use 512-byte block sizes

RAID – 5: Data and Parity Striping (Most Commonly Used)
————————————————————————————-
Parity information is distributed across all drives
Considered to provide the best all-around RAID solution
Requires a minimum of three drives

RAID – 0+1: Striping and Mirroring
————————————————-
Combination of RAID levels 0 and 1
Requires at least four disks (two for striping and two for mirroring)

Choosing a RAID Level
———————————–
RAID 0:
– For increased sequential performance only (for example, intermediate results in image processing or temporary spool files)
– No redundancy

RAID 1:
– No performance penalty for small writes (for example, database redo log)
– No improvement for large sequential I/O
– Cost is similar to LVM mirroring

RAID 0+1:
– Sequential performance of RAID 0, costs of RAID 1
– RAID 0 applications with increased availability requirements

RAID 5:
– Quite good sequential performance and a moderate penalty for small writes
– For highly available large data storage (for example, tablespaces)

Hot Spare Drives
————————–
Situation:
– Single disk failures may go undetected for a long time
– Reconstruction after disk replacement may last for hours
– A second disk failure means data loss in most cases

Solution:
– Leave at least one drive unused (“hot spare”)
– Reconstruction of data to the hot spare drive starts immediately after the first failure
– Time without protection is minimized
– Drive may need to be moved to the slot of the failed disk during replacement for performance reasons.

When a RAID Failure Occurs
——————————————-
1. A disk fails
2. The RAID management software notes the failure
3. A hot spare is selected and configured as a replacement
4. The rebuild begins
5. The full level of RAID protection is restored

RAID System Administration
——————————————–
Typical administration techniques:
– Menu-driven setup via front panel, serial port, or telnet
– From AIX via special AIX and SCSI commands

RAID limitations and vendor-specific extensions:
– Cache size has to be large for efficient RAID 5 operation
– Increasing the size of an array is difficult:
  Many implementations have static sizes
  Some techniques work with multiples of the original size only
  Bundling LUNs in a VG works around this problem (see the sketch below)
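As a hedged illustration of that workaround (volume group and disk names are examples, not from the course): two RAID LUNs that AIX sees as hdisks can be combined into one volume group, and the VG can later be grown by adding another LUN:

   mkvg -y datavg hdisk9 hdisk10    # build one VG from two RAID LUNs
   extendvg datavg hdisk11          # later: grow the VG with a third LUN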

Some subsystems can create “sub-arrays”:
– Several LUNs with the same RAID level on the same disks
– For temporary storage needs (for example: testing), not for performance

RAID Support in AIX
——————————
LVM capabilities:
– Mirroring (3.1)
– Striping (4.1)
– Mirroring and Striping (4.3.3), as sketched below
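A minimal sketch of the striping and mirroring capabilities, assuming a volume group named datavg with free disks hdisk2 through hdisk5 (all names illustrative):

   mklv -y stripelv -S 64K datavg 10 hdisk2 hdisk3   # striped LV, 64 KB stripe size
   mklv -y mirrlv -c 2 datavg 10 hdisk4 hdisk5       # mirrored LV with two copies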

RAID adapters:
– IBM SCSI-2 Fast/Wide PCI RAID Adapter, feature #2493
– Several SSA adapters with RAID capability

External RAID systems:
– 7137, 7135: popular in the mid-1990s; withdrawn from sale in 09/1998
– Versatile Storage Server (VSS, 2105-B09): predecessor of the ESS
– FibreChannel RAID Storage Server (2102-F10)
– Enterprise Storage Server “Shark” (ESS; 2105-E10 or E20):
  Powered by RS/6000 processors and SSA 160 RAID adapters (internally)
  Attaches to Ultra-SCSI, Fibre Channel, ESCON, and FICON

RAID and SSA
———————–
SSA RAID adapters support RAID 5
RAID 5 performs two reads and two writes to member disks to update both data and parity information
This results in lower write performance
SSA RAID adapters reduce the impact with:
– Read cache
– Background parity update
– Disk buffer

Planning an SSA RAID Installation
—————————————————–
Number of disks required
Number of adapters required
Size and arrangement of SSA loops
Array sizes
Use of hot spares

Creating an SSA RAID
———————————-
Steps involved:
– Verify that the cabling is consistent with the loop rules:
  The number of initiators (that is, host adapters) is limited (1 or 2)
  Disks have to be on the same loop (6219, 6215, 6225, 6230 adapters)
  Violation of these rules renders the loop totally unusable
– Identify the pdisks you want to use for the array
– Make sure these disks are not in use (sketched after this list):
  ssaxlate -l pdisk#
  Check the disk and VG configuration on both nodes
– Change the use of the SSA pdisks to:
  Array candidate
  Hot spare
– Create the array at the desired RAID level
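A hedged sketch of the "not in use" check (pdisk/hdisk numbers are illustrative):

   lsdev -Cc pdisk       # list the SSA physical disks
   ssaxlate -l pdisk3    # translate pdisk3 to its hdisk name, if one exists
   lspv                  # verify that the corresponding hdisk belongs to no VG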

SSA RAID Main Menu (in SMIT)
———————————————–
Change Use of Multiple SSA Physical Disks
Add an SSA RAID Array
List All Defined SSA RAID Arrays
Remove a Disk From an SSA RAID Array
Add a Disk to an SSA RAID Array
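These menus can be reached directly with a SMIT fast path (assuming the standard fast path name on a system with SSA support installed):

   smit ssaraid    # jumps straight to the SSA RAID arrays menu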

RAID Performance Considerations
————————————————–
Primary factors that affect RAID performance:
– Transaction performance
– Sequential performance
– I/O response times
– Array size
– Array build time

Array Build and Rebuild Times
——————————————–
It takes time to build a RAID array, and even longer to rebuild one, especially when rebuilding onto a hot spare.

RAID Performance Recommendations
——————————————————–
Basic guidelines:
– Consider mirroring with AIX
– Choose a suitable array size
– Use multiple arrays
– Check adapter performance

Note: Check the microcode level of the adapters; one way to check is sketched below.
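A hedged sketch of these checks from the AIX command line (adapter and disk names are illustrative):

   lscfg -vl ssa0         # the VPD output includes the adapter microcode (ROS) level
   iostat -d hdisk9 5 3   # sample per-disk throughput: three reports at 5-second intervals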

LAB – RAID
——————
Set up a RAID array:

1. Check the size of a candidate disk:
   lsattr -El hdisk11 | grep size

2. Inspect the SSA loop with the diagnostics tool:
   diag -> Task Selection -> SSA Service Aids

3. In SMIT (SSA RAID Array menu), choose Change Use of Multiple SSA Physical Disks, select the candidate disks on the adapter with F7, and set them to: Array Candidate Disks

4. Choose Change Use of Multiple SSA Physical Disks again, select the disk that will serve as the hot spare (F7), and set it to: Hot Spare Disk

5. Choose Add an SSA RAID Array, select a raid_5 array, and select each member disk with F7 under Member Disks

6. Create a volume group on the new array, which appears to AIX as a single hdisk:
   lspv
   smit mkvg        (VG name: raidvg; physical volume: hdisk9)
   lspv
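A quick follow-up check (hedged; the device names are the ones used in this lab):

   lsdev -Cc disk | grep hdisk9   # the array should appear as an Available hdisk
   lspv hdisk9                    # should now report membership in raidvg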

CheckPoint:
——————
1. What is the minimum number of disks needed for RAID 5? Three (the capacity of two disks for data, one for parity)
2. Which RAID level offers increased availability and the best performance for small writes? RAID 1 – mirroring
3. Which RAID level offers the best performance for large sequential reads? RAID 0 – data striping (availability is a problem)
4. Can you place an AIX jfs on a RAID 3 LUN? No – the jfs block size is 512 bytes, while RAID 3 uses a 1024-byte block size
5. What is the purpose of a hot spare drive? It is an extra drive; if one drive goes down, the data is reconstructed on the hot spare
6. What will happen if there is an HACMP failover during reconstruction of an SSA RAID? The failover succeeds and the reconstruction continues
