Zero downtime storage migration operations
January 27th, 2011 by TheBonsai
For some systems, downtime is hard to schedule or simply unacceptable. We have such a system: a RAC database.
In the past, every storage migration on it ran into two problems that led to downtime:
- migrating the CRS voting disks was possible, at best, only with CRS offline
- removing the old devices from the operating system's device management
In December I migrated the system to something more modern:
- SLES11
- Oracle Grid Infrastructure (CRS/ASM) 11.2
- Oracle Database 10.2.0.5 (yes, still quite old, but it has to be a 10.2 for now)
Using a modern Linux kernel (especially a modern SCSI stack) and the 11gR2 infrastructure solved both of these problems.
Votedisk
With 11gR2 Grid Infrastructure, the clusterware can keep its vital files (voting disks, OCR) in ASM. Migrating the ASM diskgroups that hold these files is now as easy as migrating a normal diskgroup. ASM takes care of moving the voting disks and the OCR, and coordinates with CRS while doing so.
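As a rough illustration of what "as easy as a normal diskgroup" can look like, here is a minimal sketch that only builds the ASM statement for swapping one disk for another in a single rebalance. The diskgroup name, device path, ASM disk name and the helper asm_swap_sql are made up for this example; this is not necessarily the exact procedure I used.

```shell
# Build an ASM statement that adds the new disk and drops the old one
# in one step, so the data is rebalanced only once.
# Arguments: diskgroup, path of the new device, ASM name of the old disk,
# optional rebalance power (defaults to 4 here, an arbitrary choice).
asm_swap_sql() {
    dg=$1
    new_path=$2
    old_disk=$3
    power=${4:-4}
    printf "ALTER DISKGROUP %s ADD DISK '%s' DROP DISK %s REBALANCE POWER %s;\n" \
        "$dg" "$new_path" "$old_disk" "$power"
}

# Print the statement for review (all names are hypothetical examples).
asm_swap_sql CRSDG /dev/mapper/newlun CRSDG_0000

# On a real system the statement would be run in the ASM instance,
# for example via: sqlplus -S "/ as sysasm"
# The clusterware's view can be checked afterwards with:
#   crsctl query css votedisk
#   ocrcheck
```

The function only prints text, so nothing is changed until the statement is run deliberately in the ASM instance.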
Linux devices
The newer Linux kernels, with a finally sane SCSI stack and native multipathing, help with the second problem. Removing old device references from the stack is no longer a problem:
- deconfigure the devices from multipathd (not needed technically, since multipathd itself holds no device references)
- remove the device references from the Linux device mapper
    # see /dev/mapper/* for the name
    dmsetup remove mpathX
- remove the LUNs from the SCSI stack
    # X:X:X:X is the SCSI address (host:channel:target:LUN) of the device
    echo 1 > /sys/bus/scsi/devices/X:X:X:X/delete
After that, you can safely edit the FC zone – no errors should occur in any logs.
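The removal steps above could be scripted roughly like this. It is only a sketch: SYSFS_ROOT and DRY_RUN are hypothetical knobs added so the logic can be tried without touching real devices, and the dm node (e.g. dm-3) is assumed to be known already, for instance from ls -l /dev/mapper or dmsetup info. It reaches the delete attribute through /sys/block/sdX/device, which points at the same SCSI device as the X:X:X:X path.

```shell
#!/bin/sh
# Sketch: remove a multipath map and then its underlying SCSI paths.
# SYSFS_ROOT and DRY_RUN are assumptions of this example, not part of any tool.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print what would be done

# List the path devices (sdX) that sit below a device-mapper node.
list_paths() {
    for slave in "$SYSFS_ROOT/block/$1/slaves"/*; do
        if [ -e "$slave" ]; then
            basename "$slave"
        fi
    done
}

# remove_map <map name> <dm node>, e.g. remove_map mpathX dm-3
remove_map() {
    map=$1
    dm=$2
    # Collect the paths first: the slaves directory vanishes with the map.
    paths=$(list_paths "$dm")

    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: dmsetup remove $map"
    else
        dmsetup remove "$map" || return 1
    fi

    for p in $paths; do
        # Writing 1 to the delete attribute detaches the device from the SCSI stack.
        if [ "$DRY_RUN" = 1 ]; then
            echo "would run: echo 1 > $SYSFS_ROOT/block/$p/device/delete"
        else
            echo 1 > "$SYSFS_ROOT/block/$p/device/delete"
        fi
    done
}
```

With DRY_RUN left at 1, calling remove_map mpathX dm-3 only prints the commands, which is a cheap way to double-check that the right paths would be removed before doing it for real.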