Exadata half RACK Image Upgrade-Non-Rolling-Compute Node

- November 15, 2022

As a DMA, patching the Exadata machine is a regular activity. There are two ways of patching an Exadata box: rolling and non-rolling. This post is part 3 of the Exadata (Half Rack) Image Upgrade (Non-Rolling) series and covers the compute nodes.

Precheck: Exadata Image Upgrade

Oracle recommends clearing all stateful alerts on every cell node before patching. List the critical ones with:
[root@abcxyzadm01 ~]# dcli -g cell_group -l root "cellcli -e list alerthistory attributes name,beginTime,alertShortName,alertDescription,severity where alerttype=stateful and severity=critical"
Review the Exachk report for any hardware failures; these must be fixed before you proceed with patching.
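
The alert query can be turned into a simple go/no-go check by counting the lines it returns. This is a minimal sketch, assuming the query prints one line per open alert and nothing when there are none; the function name count_open_alerts is hypothetical.

```shell
# Count non-blank lines on stdin; each line is assumed to be one alert
# reported by dcli in "cellname: attributes" form.
count_open_alerts() {
    grep -c '[^[:space:]]' || true
}

# Hypothetical usage: pipe the alerthistory query shown above into the function.
#   n=$(dcli -g cell_group -l root "cellcli -e list alerthistory where alerttype=stateful and severity=critical" | count_open_alerts)
#   [ "$n" -eq 0 ] && echo "No critical stateful alerts" || echo "$n alert(s) to review"
```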

Compute Node / DB Node / YUM Patch Plan (Non-Rolling)

Check image version

dcli -l root -g dbs_group imageinfo -version
dcli -l root -g dbs_group imageinfo -status
dcli -l root -g dbs_group uname -r
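
Before patching it helps to confirm that every node reports the same image version and kernel. The sketch below summarizes dcli output, assuming the usual "host: value" line format; summarize_dcli_values is a hypothetical helper name.

```shell
# Read "host: value" lines (the format dcli prints) and list each distinct
# value together with the number of hosts reporting it.
summarize_dcli_values() {
    awk -F': *' 'NF >= 2 { sub(/[[:space:]]+$/, "", $2); count[$2]++ }
                 END { for (v in count) print count[v], v }' | sort -k2
}

# Hypothetical usage:
#   dcli -l root -g dbs_group imageinfo -version | summarize_dcli_values
# A single output line means all nodes agree.
```
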
Verify dbnodeupdate script version

Download the latest version of the dbnodeupdate script from patch 21634633.
The patch delivers dbserver.patch.zip as p21634633_122110_Linux-x86-64.zip, which contains dbnodeupdate.zip and patchmgr for dbnodeupdate orchestration.

cd /u01/exa_img_upg/YUM
unzip -o p21634633_122110_Linux-x86-64.zip
The version should be at least 5.151022:
./dbnodeupdate.sh -V
ver=$(./dbnodeupdate.sh -V | awk '{print $3}'); if (( $(echo "$ver < 5.151022" | bc -l) )); then echo -e "\nFAIL: dbnodeupdate version too low. Update before proceeding.\n"; elif (( $(echo "$ver > 5.151022" | bc -l) )); then echo -e "\nPASS: dbnodeupdate version OK\n"; else echo -e "\nWARN: dbnodeupdate minimum version ($ver) detected. Check if there is a newer version before proceeding.\n"; fi

The dbnodeupdate script is updated frequently (sometimes daily). If your copy is not current, download the updated version.
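
The version one-liner above is easier to read and reuse as a small function. This sketch performs the same three-way comparison, treating both versions as decimal numbers as the bc test does; version_status is a hypothetical name.

```shell
# Classify a dbnodeupdate version against a minimum, comparing both as
# decimal numbers: below = FAIL, above = PASS, equal = WARN.
version_status() {
    awk -v ver="$1" -v min="$2" 'BEGIN {
        if (ver + 0 < min + 0)      print "FAIL"
        else if (ver + 0 > min + 0) print "PASS"
        else                        print "WARN"
    }'
}

# Hypothetical usage, reusing the version parsing from the one-liner:
#   version_status "$(./dbnodeupdate.sh -V | awk '{print $3}')" 5.151022
```
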
Check databases running before stopping CRS

/u01/app/19c/grid/bin/crsctl status resource -t -w "TYPE = ora.database.type"
ps -ef | grep pmon_ | grep -v grep
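
Recording which instances are up before stopping CRS makes it easier to confirm afterwards that everything came back. A minimal sketch, assuming the usual process naming where the instance name follows "_pmon_"; the function name and output path are hypothetical.

```shell
# Extract instance names from "ps -ef" output by taking whatever follows
# "_pmon_" in each process name; one name per line, sorted, no duplicates.
list_pmon_instances() {
    grep -o '[a-z]*_pmon_[^[:space:]]*' | sed 's/^[a-z]*_pmon_//' | sort -u
}

# Hypothetical usage:
#   ps -ef | list_pmon_instances > /tmp/instances_before.txt
```
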
Stop the CRS (Non-Rolling)

Execute on one node
dcli -l root -g dbs_group /u01/app/19c/grid/bin/crsctl disable crs
/u01/app/19c/grid/bin/crsctl stop cluster -all
dcli -l root -g dbs_group /u01/app/19c/grid/bin/crsctl stop crs
dcli -l root -g dbs_group '/u01/app/19c/grid/bin/crsctl check crs | grep online | wc -l | while read retval; do if [[ $retval -eq 0 ]]; then echo CRS Stopped; elif [[ $retval -eq 4 ]]; then echo CRS Running; else echo CRS Not Ready; fi; done;'
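
The status one-liner maps the number of "online" lines from crsctl check crs to a state. The same logic as a readable function, offered as a sketch; crs_state is a hypothetical name.

```shell
# Map "crsctl check crs" output (read from stdin) to the same three states
# as the one-liner: 0 online lines = stopped, 4 = running, else not ready.
crs_state() {
    local online
    online=$(grep -c 'online' || true)
    case "$online" in
        0) echo "CRS Stopped" ;;
        4) echo "CRS Running" ;;
        *) echo "CRS Not Ready" ;;
    esac
}

# Hypothetical usage on one node:
#   /u01/app/19c/grid/bin/crsctl check crs | crs_state
```
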
Reboot servers and reset ILOM

dcli -l root -g dbs_group uptime
If uptime is more than 7 days, reboot the servers:
dcli -l root -g dbs_group reboot
Reset the ILOMs:
dcli -l root -g dbs_group 'ipmitool bmc reset cold'
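
The 7-day rule can be checked numerically instead of reading the uptime text. A small sketch, assuming a Linux node where /proc/uptime gives the uptime in seconds; needs_reboot is a hypothetical helper.

```shell
# Return success (0) when the given uptime in seconds exceeds the limit in
# days (default 7), i.e. when a reboot before patching is advisable.
needs_reboot() {
    local uptime_seconds=${1%%.*}   # drop the fractional part from /proc/uptime
    local max_days=${2:-7}
    [ "$uptime_seconds" -gt $(( max_days * 86400 )) ]
}

# Hypothetical usage on a single node:
#   read -r up _ < /proc/uptime
#   needs_reboot "$up" && echo "reboot recommended"
```
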
Unmount NFS partitions

dcli -l root -g dbs_group 'umount -a -t nfs -f -l'
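
After the unmount it is worth confirming that no NFS file systems remain mounted. A minimal sketch that reads the mounts table, assuming the standard /proc/mounts layout (device, mount point, type); list_nfs_mounts is a hypothetical name.

```shell
# Print mount points of NFS file systems listed in a mounts table
# (default /proc/mounts); the third field is the file system type.
list_nfs_mounts() {
    awk '$3 ~ /^nfs4?$/ { print $2 }' "${1:-/proc/mounts}"
}

# Hypothetical usage: empty output means nothing is left to unmount.
#   dcli -l root -g dbs_group "awk '\$3 ~ /^nfs4?\$/ { print \$2 }' /proc/mounts"
```
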
Run precheck

cd /u01/exa_img_upg/YUM
./dbnodeupdate.sh -u -l /u01/exa_img_upg/YUM/pXXXXXXXX_Linux-x86-64.zip -t XXXXX -g -v
Perform backup and upgrade

Review the known issues for the target release before executing dbnodeupdate.sh.
./dbnodeupdate.sh -u -l /u01/exa_img_upg/YUM/pXXXXX_Linux-x86-64.zip -t XXXX -q
Monitor the reboot
Monitor the reboot of each node by logging in to the ILOM console.
After reboot completes

Before running the completion step, run the CheckHWnFWProfile script and make sure it passes. If it does not, shut the system down and power cycle it from the ILOM (stop /SYS, wait 5 minutes, start /SYS).
/opt/oracle.cellos/CheckHWnFWProfile
cd /u01/exa_img_upg/YUM
umount -a -t nfs -f -l
./dbnodeupdate.sh -t XXXXXX -c -g
mount -a
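
The completion step can be gated on the profile check so it never runs against a failing node. This is only a sketch: it assumes a passing check prints a line containing "[SUCCESS]", which should be confirmed against the actual output on your image version. The name hw_profile_ok is hypothetical.

```shell
# Return success when the profile check output on stdin contains a
# "[SUCCESS]" marker. The marker text is an assumption, not a documented
# contract; verify it before relying on this check.
hw_profile_ok() {
    grep -q '\[SUCCESS\]'
}

# Hypothetical usage:
#   if /opt/oracle.cellos/CheckHWnFWProfile | hw_profile_ok; then
#       echo "profile check passed, safe to run the completion step"
#   else
#       echo "profile check failed, power cycle from the ILOM first"
#   fi
```
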
Verify fuse RPMs are installed
yum list installed | grep fuse
There should be 3 fuse RPMs. If not, see the note "Fuse packages removed as part of dbnodeupdate prereq check (Doc ID 2066488.1)".
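
The count of 3 can be checked automatically. The sketch below counts packages whose name starts with "fuse", which is slightly stricter than the plain grep because it ignores packages that merely contain the word; count_fuse_packages is a hypothetical name.

```shell
# Count installed packages whose name starts with "fuse" in
# "yum list installed" output read from stdin.
count_fuse_packages() {
    awk '$1 ~ /^fuse/ { n++ } END { print n + 0 }'
}

# Hypothetical usage:
#   n=$(yum list installed | count_fuse_packages)
#   [ "$n" -eq 3 ] || echo "expected 3 fuse RPMs, found $n"
```
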
Check version and status

dcli -l root -g dbs_group imageinfo -version
dcli -l root -g dbs_group imageinfo -status
dcli -l root -g dbs_group uname -r
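
To confirm that every node landed on the intended release, the imageinfo output can be compared against the target version. A minimal sketch, assuming dcli's "host: value" output format; hosts_not_at_version is a hypothetical helper and the target is whatever version you passed to dbnodeupdate.sh.

```shell
# From "host: version" lines on stdin, print the hosts whose version
# differs from the expected target passed as the first argument.
hosts_not_at_version() {
    awk -F': *' -v target="$1" 'NF >= 2 {
        sub(/[[:space:]]+$/, "", $2)
        if ($2 != target) print $1
    }'
}

# Hypothetical usage: empty output means all nodes are on the target image.
#   dcli -l root -g dbs_group imageinfo -version | hosts_not_at_version "$TARGET_VERSION"
```
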
Enable CRS

/u01/app/19c/grid/bin/crsctl enable crs
/u01/app/19c/grid/bin/crsctl check crs
dcli -l root -g dbs_group '/u01/app/19c/grid/bin/crsctl check crs | grep online | wc -l | while read retval; do if [[ $retval -eq 0 ]]; then echo CRS Stopped; elif [[ $retval -eq 4 ]]; then echo CRS Running; else echo CRS Not Ready; fi; done;'
Post checks

/u01/app/19c/grid/bin/crsctl status resource -t -w "TYPE = ora.database.type"
The following checks if APM is disabled across all nodes
dcli -l root -g dbs_group 'cat /sys/module/ib_sdp/parameters/sdp_apm_enable'
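
Scanning the APM output by eye gets tedious on larger racks. The sketch below lists only the nodes that do not report 0, assuming that a value of 0 means APM is disabled and that dcli prints "host: value" lines; hosts_apm_not_disabled is a hypothetical name.

```shell
# From "host: value" lines on stdin, print hosts whose value is not exactly 0.
# Treating 0 as "disabled" is an assumption about sdp_apm_enable; hosts that
# return an error message instead of a value are listed as well.
hosts_apm_not_disabled() {
    awk -F': *' 'NF >= 2 {
        sub(/[[:space:]]+$/, "", $2)
        if ($2 != "0") print $1
    }'
}

# Hypothetical usage: empty output means every node reports 0.
#   dcli -l root -g dbs_group 'cat /sys/module/ib_sdp/parameters/sdp_apm_enable' | hosts_apm_not_disabled
```
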
Additional checks (if there were problems)

ssh <database-node>
cd /var/log/cellos/
cat dbnodeupdate.log
cat dbserver_backup.sh.log
cat CheckHWnFWProfile.log
cat exadata.computenode.post.log
cat cellFirstboot.log
cat exachkcfg.log
cat vldrun.each_boot.log
cat validations.log
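
Rather than reading each log in full, a quick first pass can pull out only the suspicious lines. This is a sketch that searches the logs listed above for "error" or "fail" (case-insensitive); the function name scan_patch_logs and the search pattern are assumptions, and a clean result does not replace reading the logs when something went wrong.

```shell
# Print matching lines (with file name and line number) for "error" or
# "fail" in the post-patch logs under the given directory.
scan_patch_logs() {
    local dir=${1:-/var/log/cellos}
    local f
    for f in dbnodeupdate.log dbserver_backup.sh.log CheckHWnFWProfile.log \
             exadata.computenode.post.log cellFirstboot.log exachkcfg.log \
             vldrun.each_boot.log validations.log; do
        [ -r "$dir/$f" ] || { echo "missing: $dir/$f"; continue; }
        grep -HinE 'error|fail' "$dir/$f" || true
    done
}

# Hypothetical usage on a database node:
#   scan_patch_logs /var/log/cellos
```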

Rollback Steps

Roll back the update with the dbnodeupdate.sh utility:
./dbnodeupdate.sh -r
Reboot the server using the reboot command:
# reboot
Run the dbnodeupdate.sh utility in 'completion mode' to finish the post-patching steps.
As with regular updates or one-time updates, switching OS binaries under the same Oracle Home means the database kernel should be relinked, so the 'post completion' step needs to be performed.

./dbnodeupdate.sh -c

The switch firmware upgrade and the cell node image upgrade are covered in separate posts.

You can learn about Exadata in detail from the book Expert Oracle Exadata.
