Exadata: Cell Node Reboot Steps Without Affecting ASM

For an Exadata DMA (Database Machine Administrator), restarting a cell node to resolve an issue or to perform maintenance is a routine activity. The steps below can be followed to reboot a cell node without affecting ASM.

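Step 1: Check that disk_repair_time is set to 8.5 hours for all mounted disk groups in the Oracle ASM instance; if it is not, set it before continuing. This attribute controls how long ASM waits before dropping a disk that has gone offline, so it must be longer than the time the cell will be down. Note: if you followed the earlier steps, it has already been set to 8.5 hours.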

sqlplus / as sysasm

select dg.name,a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and a.name='disk_repair_time';
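
If some disk groups come back with a different value, the small helper below may save some typing. It is only a sketch under a few assumptions: the function name print_repair_time_fixes is made up for this post, the query output has been reduced to plain "name value" pairs (for example with headings and feedback switched off in sqlplus), and the value is shown in the form 8.5h. An equivalent value written in minutes would be flagged as well, so review the generated statements before running them as sysasm.

```shell
# Hypothetical helper (not part of any Oracle tooling).
# stdin : "<diskgroup> <disk_repair_time>" pairs, one per line, e.g. "DATA 3.6h"
# stdout: one ALTER DISKGROUP statement per disk group not already at 8.5h
print_repair_time_fixes() {
    awk -v q="'" 'NF >= 2 && tolower($2) != "8.5h" {
        printf "ALTER DISKGROUP %s SET ATTRIBUTE %sdisk_repair_time%s = %s8.5h%s;\n", $1, q, q, q, q
    }'
}

# Usage sketch: paste the query rows into a file and generate the statements.
#   print_repair_time_fixes < repair_time_rows.txt
```

The function only prints SQL; nothing is changed in ASM until you run the statements yourself.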

Step 2: Next, check whether ASM can tolerate the grid disks on this cell going OFFLINE. The following command should return asmdeactivationoutcome 'Yes' for all grid disks.

ssh <Cell_Node>

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

Note: Double-check that you are on the cell you intend to reboot.

Step 3: Proceed to the next step only if all grid disks return asmdeactivationoutcome='Yes'.
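
With three dozen grid disks on a cell it is easy to miss one line, so the check in Steps 2 and 3 can be scripted. The following is a minimal sketch, assuming the cellcli output has the three columns requested above (name, asmmodestatus, asmdeactivationoutcome) separated by whitespace. The function name safe_to_deactivate is made up for this post.

```shell
# Hypothetical helper: reads "name asmmodestatus asmdeactivationoutcome" rows
# on stdin, prints the name of every grid disk whose third column is not
# exactly "Yes", and fails if any such disk exists or no rows were read.
safe_to_deactivate() {
    awk 'NF >= 3 { seen++; if ($3 != "Yes") { print $1; bad++ } }
         END { if (bad > 0 || seen == 0) exit 1 }'
}

# Usage sketch (on the cell):
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#       | safe_to_deactivate && echo "OK to proceed"
```

An empty input is treated as a failure on purpose, so a cellcli error does not read as "safe".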

Step 4: Run the cellcli command to inactivate all grid disks on the cell you wish to power down or reboot:

cellcli -e list griddisk

cellcli -e alter griddisk all inactive

Note: This action can take 10 minutes or longer, depending on activity. It is very important to make sure that all the disks were taken offline successfully before shutting down the cell services. Inactivating the grid disks automatically OFFLINEs the disks in the ASM instance.

Step 5: Confirm that the griddisks are now offline by performing the following actions:

(a) Execute the command below and the output should show either asmmodestatus=OFFLINE or asmmodestatus=UNUSED and asmdeactivationoutcome=Yes for all griddisks once the disks are offline in ASM. Only then is it safe to proceed with shutting down or restarting the cell:

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

(There have also been reported cases of asmmodestatus=OFFLINE, which means Oracle ASM has taken the grid disk offline. This status is also fine, and you can proceed with the remaining instructions.)
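
Since inactivating the disks can take a while, the check in (a) usually has to be repeated. The sketch below shows one way to poll for it, assuming the same three whitespace-separated columns as the command above. The function name griddisks_offline and the 30-second interval are arbitrary choices for this post.

```shell
# Hypothetical helper: succeeds only when every row on stdin has
# asmmodestatus OFFLINE or UNUSED and asmdeactivationoutcome Yes.
griddisks_offline() {
    awk 'NF >= 3 { seen++
             if (($2 != "OFFLINE" && $2 != "UNUSED") || $3 != "Yes") pending++ }
         END { if (seen == 0 || pending > 0) exit 1 }'
}

# Usage sketch (on the cell): wait until it is safe to shut down.
#   until cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#           | griddisks_offline; do
#       sleep 30
#   done
```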

(b) List the griddisks to confirm all now show inactive:

cellcli -e list griddisk

Step 6: We can now reboot the cell. Double-check that the steps above have been completed and that you are on the correct cell:

#hostname

#reboot


or

#shutdown -F -r now

Step 7: Once the cell comes back online, reactivate the grid disks:

cellcli -e alter griddisk all active

Step 8: Issue the command below; all disks should show 'active':

cellcli -e list griddisk

Step 9: Verify grid disk status:

(a) Verify all grid disks have been successfully put online using the following command:

cellcli -e list griddisk attributes name, asmmodestatus

(b) Wait until asmmodestatus is ONLINE for all grid disks. Each disk will go to a 'SYNCING' state first then 'ONLINE'. The following is an example of the output:

DATA_CD_00_xyzcel01 ONLINE    <<=========

DATA_CD_01_xyzcel01 SYNCING  <<========

DATA_CD_02_xyzcel01 OFFLINE   <<========

DATA_CD_03_xyzcel01 OFFLINE

DATA_CD_04_xyzcel01 OFFLINE

DATA_CD_05_xyzcel01 OFFLINE

DATA_CD_06_xyzcel01 OFFLINE

DATA_CD_07_xyzcel01 OFFLINE

DATA_CD_08_xyzcel01 OFFLINE

DATA_CD_09_xyzcel01 OFFLINE

DATA_CD_10_xyzcel01 OFFLINE

DATA_CD_11_xyzcel01 OFFLINE

(c) Oracle ASM synchronization is only complete when all grid disks show asmmodestatus=ONLINE.

(Please note: this operation uses Fast Mirror Resync, which does not trigger an ASM rebalance. The resync restores only the extents that would have been written while the disk was offline.)

Once synchronization is complete, the output looks like this:

[root@xyzcel07 ~]# cellcli -e list griddisk attributes name, asmmodestatus

         DATA_DG_CD_00_xyzcel07      ONLINE

         DATA_DG_CD_01_xyzcel07      ONLINE

         DATA_DG_CD_02_xyzcel07      ONLINE

         DATA_DG_CD_03_xyzcel07      ONLINE

         DATA_DG_CD_04_xyzcel07      ONLINE

         DATA_DG_CD_05_xyzcel07      ONLINE

         DATA_DG_CD_06_xyzcel07      ONLINE

         DATA_DG_CD_07_xyzcel07      ONLINE

         DATA_DG_CD_08_xyzcel07      ONLINE

         DATA_DG_CD_09_xyzcel07      ONLINE

         DATA_DG_CD_10_xyzcel07      ONLINE

         DATA_DG_CD_11_xyzcel07      ONLINE

         DBFS_DG_CD_02_xyzcel07         ONLINE

         DBFS_DG_CD_03_xyzcel07         ONLINE

         DBFS_DG_CD_04_xyzcel07         ONLINE

         DBFS_DG_CD_05_xyzcel07         ONLINE

         DBFS_DG_CD_06_xyzcel07         ONLINE

         DBFS_DG_CD_07_xyzcel07         ONLINE

         DBFS_DG_CD_08_xyzcel07         ONLINE

         DBFS_DG_CD_09_xyzcel07         ONLINE

         DBFS_DG_CD_10_xyzcel07         ONLINE

         DBFS_DG_CD_11_xyzcel07         ONLINE

         RECO_DG_CD_00_xyzcel07      ONLINE

         RECO_DG_CD_01_xyzcel07      ONLINE

         RECO_DG_CD_02_xyzcel07      ONLINE

         RECO_DG_CD_03_xyzcel07      ONLINE

         RECO_DG_CD_04_xyzcel07      ONLINE

         RECO_DG_CD_05_xyzcel07      ONLINE

         RECO_DG_CD_06_xyzcel07      ONLINE

         RECO_DG_CD_07_xyzcel07      ONLINE

         RECO_DG_CD_08_xyzcel07      ONLINE

         RECO_DG_CD_09_xyzcel07      ONLINE

         RECO_DG_CD_10_xyzcel07      ONLINE

         RECO_DG_CD_11_xyzcel07      ONLINE

[root@xyzcel07 ~]#
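
Rather than reading the full listing each time during the resync, the output can be summarised per status. The two helpers below are a sketch only, assuming the two-column output (name, asmmodestatus) shown above; the function names summarize_asmmodestatus and all_online are made up for this post.

```shell
# Hypothetical helper: prints "<status> <count>" for each asmmodestatus seen.
summarize_asmmodestatus() {
    awk 'NF >= 2 { count[$2]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Hypothetical helper: succeeds only when every grid disk is ONLINE.
all_online() {
    awk 'NF >= 2 { seen++; if ($2 != "ONLINE") pending++ }
         END { if (seen == 0 || pending > 0) exit 1 }'
}

# Usage sketch (on the cell): show progress until the resync has finished.
#   until cellcli -e list griddisk attributes name,asmmodestatus | all_online; do
#       cellcli -e list griddisk attributes name,asmmodestatus | summarize_asmmodestatus
#       sleep 60
#   done
```

The 60-second interval is arbitrary; the resync time depends on how much was written while the cell was down.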

Step 10: The cell node reboot without affecting ASM is now complete.

You can learn more about Exadata from the book Expert Oracle Exadata.
