Wednesday, November 7, 2012

Applying DB PSU 11.2.0.1.6 (12419378) Patch

To install the PSU 11.2.0.1.6 patch, the Oracle home must have the 11.2.0.1.0 Database installed. Subsequent PSU patches can be installed on top of Oracle Database 11.2.0.1.0 or any PSU with a lower fifth-numeral version than the one being installed.
The patch itself needs to be downloaded from MOS; the one I used was for Linux -
Patch Name - p12419378_112010_LINUX_11.2.0.1.6 (PSU)

You must use the Opatch utility version 11.2.0.1.3 or later to apply this patch. Oracle recommends that you use the latest released OPatch 11.2, which is available for download from My Oracle Support patch 6880880 by selecting the 11.2.0.0.0 release.
To update OPatch to the latest version, please check the following note -
<http://handsonoracle.blogspot.in/2012/05/apply-psu-5-for-11gr2-grid-and-rac-11.html>
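As a quick sanity check before applying, the installed OPatch version can be compared against the required minimum with a version-aware sort. A minimal sketch; the `opatch version` parsing shown in the comment is an assumption about its output format:

```shell
# Sketch: check that the installed OPatch meets the PSU's minimum requirement.
# opatch_ok MIN CURRENT -> "ok" if CURRENT >= MIN, otherwise an upgrade hint.
opatch_ok() {
  min="$1"; cur="$2"
  # sort -V orders version strings numerically; the minimum must sort first (or equal).
  lowest=$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n1)
  if [ "$lowest" = "$min" ]; then echo ok; else echo "upgrade OPatch (patch 6880880)"; fi
}

# With a live home you would feed it the real version, e.g.:
#   opatch_ok 11.2.0.1.3 "$($ORACLE_HOME/OPatch/opatch version | awk '{print $3; exit}')"
opatch_ok 11.2.0.1.3 11.2.0.3.0   # -> ok
```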

If you are installing the PSU to an environment that has a Grid Infrastructure home, note the following:
-- Database PSU 11.2.0.1.6 should only be applied to the Database home using the instructions contained in the patch readme file
-- 11.2.0.1.2 Grid Infrastructure PSU <<Patch 9655006>> should be applied to the Grid Infrastructure home using the readme instructions provided with that patch.

To apply GI PSU 9655006 check the following note -
<http://handsonoracle.blogspot.in/2012/11/applying-gi-psu-11.html>

Once you have checked all the prerequisites, you can install this patch in rolling mode. I'm implementing this on a single-node cluster for demonstration purposes; however, the same method can be used for multiple nodes as well.
For a rolling upgrade, leave the CRS stack running on the node to be patched - OPatch will automatically shut that stack down during patching and restart it afterwards.
Since this PSU is meant for the RAC binaries, you have to shut down all instances and services running out of this home.
Detailed instructions are as follows:

[root@appsractest oracle]# sudo su - oracle
[oracle@appsractest ~]$ export PATH=$PATH:/u01/app/oracle/product/11.0/db_1/OPatch
[oracle@appsractest ~]$ which opatch
/u01/app/oracle/product/11.0/db_1/OPatch/opatch
[oracle@appsractest ~]$ cd /u01/app/grid/11.0/12419378/
[oracle@appsractest 12419378]$ ll
total 60
drwxr-xr-x  3 oracle oinstall  4096 Jul  8  2011 custom
drwxr-xr-x  4 oracle oinstall  4096 Jul  8  2011 etc
drwxr-xr-x 12 oracle oinstall  4096 Jul  8  2011 files
-rwxr-xr-x  1 oracle oinstall  2871 Jul  8  2011 patchmd.xml
-rw-rw-r--  1 oracle oinstall 40790 Jul 18  2011 README.html
-rw-r--r--  1 oracle oinstall    21 Jul  8  2011 README.txt

[oracle@appsractest 12419378]$ opatch apply
Oracle Interim Patch Installer version 11.2.0.3.0
Copyright (c) 2012, Oracle Corporation.  All rights reserved.
Oracle Home       : /u01/app/oracle/product/11.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.0
OUI version       : 11.2.0.1.0
Log file location : /u01/app/oracle/product/11.0/db_1/cfgtoollogs/opatch/12419378_Nov_07_2012_10_28_01/apply2012-11-07_10-28-00AM_1.log
Applying interim patch '12419378' to OH '/u01/app/oracle/product/11.0/db_1'
Verifying environment and performing prerequisite checks...
Patch 12419378: Optional component(s) missing : [ oracle.client, 11.2.0.1.0 ]
Interim patch 12419378 is a superset of the patch(es) [  9654983 ] in the Oracle Home
OPatch will roll back the subset patches and apply the given patch.
All checks passed.
Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/u01/app/oracle/product/11.0/db_1')
Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Backing up files...
Rolling back interim patch '9654983' from OH '/u01/app/oracle/product/11.0/db_1'

Patching component oracle.rdbms.rsf, 11.2.0.1.0...
Patching component oracle.rdbms.dbscripts, 11.2.0.1.0...
Patching component oracle.rdbms, 11.2.0.1.0...
Patching component oracle.oraolap, 11.2.0.1.0...
Patching component oracle.rdbms.deconfig, 11.2.0.1.0...
Patching component oracle.javavm.server, 11.2.0.1.0...
Patching component oracle.precomp.common, 11.2.0.1.0...
Patching component oracle.network.rsf, 11.2.0.1.0...
Patching component oracle.network.listener, 11.2.0.1.0...
RollbackSession removing interim patch '9654983' from inventory
OPatch back to application of the patch '12419378' after auto-rollback.
Patching component oracle.rdbms.rsf, 11.2.0.1.0...
Patching component oracle.rdbms.dbscripts, 11.2.0.1.0...
Patching component oracle.rdbms, 11.2.0.1.0...
Patching component oracle.oraolap, 11.2.0.1.0...
Patching component oracle.rdbms.deconfig, 11.2.0.1.0...
Patching component oracle.javavm.server, 11.2.0.1.0...
Patching component oracle.precomp.common, 11.2.0.1.0...
Patching component oracle.network.rsf, 11.2.0.1.0...
Patching component oracle.network.listener, 11.2.0.1.0...
Patching component oracle.rdbms.dv.oc4j, 11.2.0.1.0...
Patching component oracle.sdo.locator, 11.2.0.1.0...
Patching component oracle.sysman.console.db, 11.2.0.1.0...
Patching component oracle.sysman.oms.core, 10.2.0.4.2...
Patching component oracle.rdbms.dv, 11.2.0.1.0...
Patching component oracle.rdbms.dv, 11.2.0.1.0...
Patching component oracle.xdk.rsf, 11.2.0.1.0...
Patching component oracle.ldap.rsf.ic, 11.2.0.1.0...
Patching component oracle.ldap.rsf, 11.2.0.1.0...
Patching component oracle.sysman.plugin.db.main.repository, 11.2.0.1.0...
Verifying the update...
Patch 12419378 successfully applied
Log file location: /u01/app/oracle/product/11.0/db_1/cfgtoollogs/opatch/12419378_Nov_07_2012_10_28_01/apply2012-11-07_10-28-00AM_1.log
OPatch succeeded.

-- Now if you check OPatch for the applied PSUs, you will see something like the following.

9352237    12419378  Wed Nov 07 10:46:19 IST 2012   DATABASE PSU 11.2.0.1.1
9654983    12419378  Wed Nov 07 10:46:19 IST 2012   DATABASE PSU 11.2.0.1.2 (INCLUDES CPUJUL2010)
9952216    12419378  Wed Nov 07 10:46:19 IST 2012   DATABASE PSU 11.2.0.1.3 (INCLUDES CPUOCT2010)
12419378   12419378  Wed Nov 07 10:46:19 IST 2012   DATABASE PSU 11.2.0.1.6 (INCLUDES CPUJUL2011)
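A listing like the one above can be produced from the inventory with `opatch lsinventory -bugs_fixed`, filtering for the PSU lines. A sketch of the filter, run here against a captured two-line sample rather than a live home:

```shell
# Sketch: filter an lsinventory-style listing down to the Database PSU lines.
# In a live home: $ORACLE_HOME/OPatch/opatch lsinventory -bugs_fixed | grep "DATABASE PSU"
sample='9352237    12419378  Wed Nov 07 10:46:19 IST 2012   DATABASE PSU 11.2.0.1.1
12419378   12419378  Wed Nov 07 10:46:19 IST 2012   DATABASE PSU 11.2.0.1.6 (INCLUDES CPUJUL2011)'
printf '%s\n' "$sample" | grep -c "DATABASE PSU"   # -> 2
```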

Now you just need to load the modified SQL into the DB the way you normally do during your CPU apply process, and you are done. You have successfully applied the DB PSU to your database.
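For reference, on 11.2.0.1 the PSU post-install SQL is loaded by running catbundle.sql with the `psu apply` arguments. A sketch, not the readme verbatim; the default home path below is this post's, so adjust it to yours:

```shell
# Sketch: locate and run the PSU post-install SQL (catbundle.sql) for a given home.
run_catbundle() {
  home="$1"
  catbundle="$home/rdbms/admin/catbundle.sql"
  if [ -f "$catbundle" ]; then
    # Run once per database served by this home, as SYSDBA.
    echo "@$catbundle psu apply" | sqlplus -s "/ as sysdba"
  else
    echo "catbundle.sql not found under $home"
  fi
}
run_catbundle "${ORACLE_HOME:-/u01/app/oracle/product/11.0/db_1}"
```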

Applying GI PSU 11.2.0.1.2 (9655006) on 11.2.0.1 fails with 
"The opatch Component check failed"


Application of the GI PSU 11.2.0.1.2 failed with the following error, even though the OPatch version prerequisite was clearly satisfied after upgrading OPatch to 11.2.0.3 using patch p6880880 as per the MOS note.

Error - 
[root@appsractest 11.0]# opatch auto /u01/app/grid/11.0/stage  -oh /u01/app/grid/11.0
Executing /usr/bin/perl /u01/app/grid/11.0/OPatch/crs/patch112.pl -patchdir /u01/app/grid/11.0 -patchn stage -oh /u01/app/grid/11.0 -paramfile /u01/app/grid/11.0/crs/install/crsconfig_params
opatch auto log file location is /u01/app/grid/11.0/OPatch/crs/../../cfgtoollogs/opatchauto2012-11-07_06-24-58.log
Detected Oracle Clusterware install
Using configuration parameter file: /u01/app/grid/11.0/crs/install/crsconfig_params
OPatch  is bundled with OCM, Enter the absolute OCM response file path:
/home/oracle/ocm.rsp
The opatch minimum version  check for patch /u01/app/grid/11.0/stage/9655006 failed  for /u01/app/grid/11.0
The opatch minimum version  check for patch /u01/app/grid/11.0/stage/9654983 failed  for /u01/app/grid/11.0
Opatch version check failed for oracle home  /u01/app/grid/11.0
Opatch version  check failed

Fix - 
There are two main possible reasons for this:

1. Multiple patches in the directory.
Make sure you have only one patch unzipped in the directory specified as the patch top. If there are multiple patches, OPatch will fail.
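A quick way to review what is sitting under the patch top before running opatch auto; a sketch that only lists directories, leaving the decision of what belongs there to you:

```shell
# Sketch: list the top-level patch directories under a patch-top location,
# so leftovers from earlier attempts are easy to spot before opatch auto runs.
list_patch_tops() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; | sort
}
# Example against a throwaway directory standing in for /u01/app/grid/11.0/stage:
top=$(mktemp -d)
mkdir "$top/9655006" "$top/9654983"
list_patch_tops "$top"
```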

2. Permission issues when trying to write to the hidden .patch_storage directory under GRID_HOME. When OPatch is run, it tries to update the information in the .patch_storage directory so that rollback of the patch is possible.
When opatch auto is run (even as root), it internally executes the opatch utility with napply as the oracle user. So if the oracle user doesn't have permission to write to .patch_storage, the opatch auto application will fail.
[root@appsractest 11.0]# cd /u01/app/grid/11.0
[root@appsractest 11.0]# ll .patch_storage/
total 4
-rw-r--r-- 1 root root 796 Nov  5 10:15 LsInventory__11-05-2012_10-15-08.log
[root@appsractest 11.0]# chown  oracle:oinstall .patch_storage/ -R
[root@appsractest 11.0]# ll .patch_storage/
total 4
-rw-r--r-- 1 oracle oinstall 796 Nov  5 10:15 LsInventory__11-05-2012_10-15-08.log
[root@appsractest 11.0]# chmod 777 .patch_storage/ -R
[root@appsractest 11.0]# ll .patch_storage/
total 4
-rwxrwxrwx 1 oracle oinstall 796 Nov  5 10:15 LsInventory__11-05-2012_10-15-08.log

Once the above two things are done, run opatch again and it will go through fine.
-- To update the GI home
[root@appsractest 11.0]# opatch auto /u01/app/grid/11.0/stage  -oh /u01/app/grid/11.0
Executing /usr/bin/perl /u01/app/grid/11.0/OPatch/crs/patch112.pl -patchdir /u01/app/grid/11.0 -patchn stage -oh /u01/app/grid/11.0 -paramfile /u01/app/grid/11.0/crs/install/crsconfig_params
opatch auto log file location is /u01/app/grid/11.0/OPatch/crs/../../cfgtoollogs/opatchauto2012-11-07_06-27-26.log
Detected Oracle Clusterware install
Using configuration parameter file: /u01/app/grid/11.0/crs/install/crsconfig_params
OPatch  is bundled with OCM, Enter the absolute OCM response file path:
/home/oracle/ocm.rsp
Successfully unlock /u01/app/grid/11.0
patch /u01/app/grid/11.0/stage/9655006  apply successful for home  /u01/app/grid/11.0
patch /u01/app/grid/11.0/stage/9654983  apply successful for home  /u01/app/grid/11.0
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.
CRS-4123: Oracle High Availability Services has been started.

-- To Update RAC home
[root@appsractest ~]# export PATH=$PATH:/u01/app/oracle/product/11.0/db_1/OPatch/
[root@appsractest ~]# which opatch
/u01/app/oracle/product/11.0/db_1/OPatch/opatch
[root@appsractest ~]# opatch auto /u01/app/grid/11.0/stage -oh /u01/app/oracle/product/11.0/db_1
Executing /usr/bin/perl /u01/app/oracle/product/11.0/db_1/OPatch/crs/patch112.pl -patchdir /u01/app/grid/11.0 -patchn stage -oh /u01/app/oracle/product/11.0/db_1 -paramfile /u01/app/grid/11.0/crs/install/crsconfig_params
opatch auto log file location is /u01/app/oracle/product/11.0/db_1/OPatch/crs/../../cfgtoollogs/opatchauto2012-11-07_07-03-02.log
Detected Oracle Clusterware install
Using configuration parameter file: /u01/app/grid/11.0/crs/install/crsconfig_params
OPatch  is bundled with OCM, Enter the absolute OCM response file path:
/home/oracle/ocm.rsp
patch /u01/app/grid/11.0/stage/9655006/custom/server/9655006  apply successful for home  /u01/app/oracle/product/11.0/db_1
patch /u01/app/grid/11.0/stage/9654983  apply successful for home  /u01/app/oracle/product/11.0/db_1

-- Check the status of the resources for the RAC home and you will find that they were started automatically after application of the patch
[oracle@appsractest ~]$ srvctl  status database -d iedge
Instance iedge1 is running on node appsractest

-- Now when you check the applied patch history from OPatch, you will see that the patch is applied...

9655006    9655006   Wed Nov 07 07:06:11 IST 2012   GI PSU 11.2.0.1.2 (INCLUDES DATABASE PSU 11.2.0.1.2)




Monday, November 5, 2012

How To Restore From Backed-up Grid Binaries After Failed Upgrade

During one of our patching exercises (from 11.2.0.1 to 11.2.0.2) for an 11g R2 DB, the upgrade of the Grid binaries failed for an unknown reason. Since we were running out of time, we didn't have much choice but to fall back to our backed-up binaries. The good thing, I guess, is that it failed on the first node, so the other nodes were untouched and we had to back out only the first node. It was also a chance to get our restore process tested.

Error - 
CRS-2675: Stop of 'ora.oc4j' on 'appsractest' failed
CRS-2679: Attempting to clean 'ora.oc4j' on 'appsractest'
CRS-2678: 'ora.oc4j' on 'appsractest' has experienced an unrecoverable failure
CRS-2677: Stop of 'ora.iedge.db' on 'appsractest' succeeded
CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'appsractest'
ORA-15154: cluster rolling upgrade incomplete
ORA-15154: cluster rolling upgrade incomplete
CRS-2675: Stop of 'ora.DATA1.dg' on 'appsractest' failed
CRS-2679: Attempting to clean 'ora.DATA1.dg' on 'appsractest'
ORA-15154: cluster rolling upgrade incomplete
ORA-15154: cluster rolling upgrade incomplete
CRS-2678: 'ora.DATA1.dg' on 'appsractest' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-2673: Attempting to stop 'ora.asm' on 'appsractest'
CRS-2677: Stop of 'ora.asm' on 'appsractest' succeeded
CRS-2794: Shutdown of Cluster Ready Services-managed resources on 'appsractest' has failed
CRS-2675: Stop of 'ora.crsd' on 'appsractest' failed
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'appsractest' has failed
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
################################################################
#You must kill processes or reboot the system to properly      #
#cleanup the processes started by Oracle Grid Infrastructure   #
################################################################
Failed to stop old Grid Infrastructure stack at /u01/app/11.2.0.3/grid/crs/install/crsconfig_lib.pm line 14109.
/u01/app/11.2.0.3/grid/perl/bin/perl -I/u01/app/11.2.0.3/grid/perl/lib -I/u01/app/11.2.0.3/grid/crs/install /u01/app/11.2.0.3/grid/crs/install/rootcrs.pl execution failed


-- Before upgrade following pieces were backed up.
1. Grid Home - /u01/app/11.2.0/grid
2. RAC Home - /u01/app/oracle/product/11.2.0/db_1
3. System Files - /etc/oracle, /etc/oratab & /etc/inittab
4. Init Script - /etc/init.d/init*
5. OCR Backup - /u01/app/11.2.0/grid/cdata/appsrac/OCR_Prior_backup

Backup Commands - 
1. Grid Home - 
[root@appsractest oracle]# tar -cvpf node1_grid.tar /u01/app/11.2.0/grid
2. RAC Home -  
[root@appsractest oracle]# tar -cvpf node1_rac.tar /u01/app/oracle/product/11.2.0/db_1
3. System Files  - 
[root@appsractest oracle]#  tar -cvpf etc_oracle.tar /etc/oracle/*; tar -cvpf etc_inittab.tar /etc/inittab; tar -cvpf etc_oratab.tar /etc/oratab
4. Init Script  - 
[root@appsractest oracle]#  tar -cvpf etc_initd.tar /etc/init.d/init*
5. OCR Backup  - 
[root@appsractest oracle]# ocrconfig -manualbackup
-- This takes the backup in the $GRID_HOME/cdata/<cluster-name> directory; rename it to your chosen name.
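The rename step can be scripted. A sketch that copies the newest backup_*.ocr file to the chosen name; the file-naming pattern is assumed from the `ocrconfig -showbackup` output shown later in this blog:

```shell
# Sketch: copy the most recent manual OCR backup to a recognisable name.
# keep_latest_backup SRC_DIR DEST -> copies the newest backup_*.ocr and prints its path.
keep_latest_backup() {
  latest=$(ls -t "$1"/backup_*.ocr 2>/dev/null | head -n1)
  [ -n "$latest" ] && cp "$latest" "$2" && echo "$latest"
}
# Example against throwaway files rather than a live $GRID_HOME/cdata directory:
d=$(mktemp -d)
touch "$d/backup_20121030_064931.ocr"
keep_latest_backup "$d" "$d/OCR_Prior_backup"
```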

Since your upgrade has failed, you need to stop any running processes from the clusterware stack. Find the processes and kill them if you can't stop them cleanly.
Once all the processes have exited, it is time to restore the binaries. The commands to restore them are as follows.

1. Grid Home - The following command will untar the directory structure into your current working directory. Once completed, move the files to the correct folder. Don't forget to back up the failed Grid home directory first.
[root@appsractest oracle]# tar -xvpf node1_grid.tar

2. RAC Home - The following command will untar the directory structure into your current working directory. Once completed, move the files to the correct folder. Don't forget to back up the failed RAC home directory first.
[root@appsractest oracle]# tar -xvpf node1_rac.tar

3. System Files  - 
[root@appsractest oracle]#  tar -xvpf etc_oracle.tar ; tar -xvpf etc_inittab.tar ; tar -xvpf etc_oratab.tar 
4. Init Script  - 
[root@appsractest oracle]#  tar -xvpf etc_initd.tar
5. OCR Backup - If you also need to restore the OCR backup, first perform all four steps above and make sure your binaries are restored properly. Once done, check the integrity of the OCR; if a restore turns out to be needed, you can review the OCR restore process at the following link.
<http://handsonoracle.blogspot.in/2012/11/how-to-restore-asm-based-ocr-after-loss.html>

Now you can either reboot or restart the HAS stack as follows.
 [root@appsractest oracle]# crsctl start has 
OR 
[root@appsractest oracle]# reboot

Now if the restore was done correctly, your stack will come up. Once it is up, it's time to check whether the active version and software version are reported correctly. Both of the following commands should reflect the same binaries version.
[root@appsractest oracle]# crsctl query crs activeversion
[root@appsractest oracle]# crsctl query crs softwareversion
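The two outputs can also be compared mechanically. A sketch that pulls the bracketed version token out of crsctl-style output; the message formats below are assumptions modelled on 11.2 output, not captures from this system:

```shell
# Sketch: extract the last [bracketed] token from a crsctl version message.
crs_version() {
  sed -n 's/.*\[\([^][]*\)\].*/\1/p'
}
# Sample messages (formats assumed):
active=$(echo "Oracle Clusterware active version on the cluster is [11.2.0.1.0]" | crs_version)
software=$(echo "Oracle Clusterware version on node [appsractest] is [11.2.0.1.0]" | crs_version)
[ "$active" = "$software" ] && echo "versions match"   # -> versions match
```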


Sunday, November 4, 2012


How to Restore ASM based OCR After Loss

In this post I will describe how to restore the OCR and voting disk if they are lost due to a hardware issue or manual error.
I am using a single-node cluster for this demo; however, the process remains almost the same even on a multi-node cluster.

[oracle@appsractest ~]$ cluvfy stage -post crsinst -n appsractest -verbose

Performing post-checks for cluster services setup
Checking node reachability...

Check: Node reachability from node "appsractest"

  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  appsractest                           yes
Result: Node reachability check passed from node "appsractest"
Checking user equivalence...

Check: User equivalence for user "oracle"

  Node Name                             Comment
  ------------------------------------  ------------------------
  appsractest                           passed
Result: User equivalence check passed for user "oracle"

ERROR:

PRVF-4037 : CRS is not installed on any of the nodes
Verification cannot proceed
Post-check for cluster services setup was unsuccessful on all the nodes.

[oracle@appsractest ~]$ crsctl check cluster -all

**************************************************************
appsractest:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

It is not possible to directly restore a manual or automatic OCR backup if the OCR is located in an ASM disk group, because the command 'ocrconfig -restore' requires ASM to be up and running in order to restore an OCR backup to an ASM disk group. However, for ASM to be available, the CSS stack must have been successfully started.

On the other hand, for the restore to succeed, the OCR must not be in use (r/w), i.e. no CRS daemon must be running while the OCR is being restored.

A description of the general procedure to restore the OCR can be found in the documentation; this post explains how to recover from a complete loss of the ASM disk group that held the OCR and Voting files in an 11gR2 Grid environment.


When using an ASM disk group for CRS, there are typically three different types of files located in the disk group that potentially need to be restored/recreated:

- the Oracle Cluster Registry file (OCR)
- the Voting file(s)
- the shared SPFILE for the ASM instances

The following example assumes that the OCR was located in a single disk group used exclusively for CRS. The disk group has just one disk using external redundancy.

Note - This document assumes that the name of the OCR diskgroup remains unchanged; however, there may be a need to use a different diskgroup name, in which case the name of the OCR diskgroup must be modified in /etc/oracle/ocr.loc on all nodes prior to executing the following steps.
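That edit can be sketched as follows. The `ocrconfig_loc=+DG` line format in ocr.loc is an assumption, so verify it against your own file before using anything like this:

```shell
# Sketch: retarget the OCR diskgroup name in an ocr.loc-style file.
# Hypothetical helper - run against a copy first, and on every node.
retarget_ocr_loc() {
  file="$1"; newdg="$2"
  sed -i "s/^ocrconfig_loc=.*/ocrconfig_loc=${newdg}/" "$file"
}
# Example against a throwaway copy rather than the live /etc/oracle/ocr.loc:
f=$(mktemp)
echo 'ocrconfig_loc=+DATA1' > "$f"
retarget_ocr_loc "$f" +OCR_DISK
cat "$f"   # -> ocrconfig_loc=+OCR_DISK
```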

--Locate the latest automatic OCR backup

When using a non-shared CRS home, automatic OCR backups can be located on any node of the cluster, consequently all nodes need to be checked for the most recent backup:
[root@appsractest appsractest]# ocrconfig -showbackup manual
appsractest     2012/10/30 06:49:31     /u01/app/grid/11.0/cdata/ractest/backup_20121030_064931.ocr

-- If you try to remove the only remaining configured OCR copy, you get:
[root@appsractest appsractest]# ocrconfig -delete +DATA1
PROT-28: Cannot delete or replace the only configured Oracle Cluster Registry location

-- Make sure the Grid Infrastructure is shut down on all nodes.
-- If the OCR diskgroup is missing, the GI stack will not be functional on any node; however, there may still be various daemon processes running. On each node, shut down the GI stack using the force (-f) option:


[root@appsractest grid]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'appsractest'
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'appsractest' has completed
CRS-4133: Oracle High Availability Services has been stopped.


[root@appsractest bin]# ps -ef | grep d.bin
root     13781  3768  0 07:12 pts/1    00:00:00 grep d.bin

Note - if you try to restore before shutting down CRS, you will get the following error:
[root@appsractest grid]# ocrconfig -restore /u01/app/11.2.0.2/grid/cdata/appsractest-cluster/backup_20121022_102735.ocr
PROT-19: Cannot proceed while the Cluster Ready Service is running

-- Start the CRS stack in exclusive mode
-- On the node that has the most recent OCR backup, log on as root and start CRS in exclusive mode. This mode will allow ASM to start and stay up without the presence of a Voting disk and without the CRS daemon process (crsd.bin) running.

Please note:

This document assumes that the CRS diskgroup was completely lost, in which case the CRS daemon (resource ora.crsd) will terminate again due to the inaccessibility of the OCR - even if the startup message indicates that the start succeeded.
If this is not the case - i.e. if the CRS diskgroup is still present (but corrupt or incorrect) - the CRS daemon needs to be shut down manually using:
11.2.0.1:
# $CRS_HOME/bin/crsctl stop res ora.crsd -init
otherwise the subsequent OCR restore will fail.

11.2.0.2:

# $CRS_HOME/bin/crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
...
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'auw2k3'
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode1'
CRS-2676: Start of 'ora.drivers.acfs' on 'racnode1' succeeded
CRS-2676: Start of 'ora.ctssd' on 'racnode1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded

IMPORTANT:
A new option '-nocrs' has been introduced with  11.2.0.2, which prevents the start of the ora.crsd resource. It is vital that this option is specified, otherwise the failure to start the ora.crsd resource will tear down ora.cluster_interconnect.haip, which in turn will cause ASM to crash.

-- Since I'm using 11.2.0.1, I have to use the following command (without -nocrs):
[root@appsractest bin]# crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'appsractest'
CRS-2672: Attempting to start 'ora.mdnsd' on 'appsractest'
CRS-2676: Start of 'ora.gipcd' on 'appsractest' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'appsractest'
CRS-2676: Start of 'ora.gpnpd' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'appsractest'
CRS-2676: Start of 'ora.cssdmonitor' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'appsractest'
CRS-2679: Attempting to clean 'ora.diskmon' on 'appsractest'
CRS-2681: Clean of 'ora.diskmon' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'appsractest'
CRS-2676: Start of 'ora.diskmon' on 'appsractest' succeeded
CRS-2676: Start of 'ora.cssd' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'appsractest'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'appsractest'
CRS-2676: Start of 'ora.ctssd' on 'appsractest' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'appsractest'
CRS-2676: Start of 'ora.asm' on 'appsractest' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'appsractest'
CRS-2676: Start of 'ora.crsd' on 'appsractest' succeeded
[root@appsractest bin]# crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.

[root@appsractest bin]# crsctl check cluster
CRS-4692: Cluster Ready Services is online in exclusive mode
CRS-4529: Cluster Synchronization Services is online
-- Stop CRS 
[root@appsractest dbs]# crsctl stop resource ora.crsd -init
CRS-2673: Attempting to stop 'ora.crsd' on 'appsractest'
CRS-2677: Stop of 'ora.crsd' on 'appsractest' succeeded
-- Restore the latest OCR backup, must be done as the root user:
[root@appsractest dbs]# ocrconfig -restore /u01/app/grid/11.0/cdata/ractest/backup_20121030_064931.ocr
[root@appsractest dbs]#
[root@appsractest dbs]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2568
         Available space (kbytes) :     259552
         ID                       :  219028771
         Device/File Name         :  +OCR_DISK
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
-- Once restored, stop the cluster:
[root@appsractest dbs]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'appsractest'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'appsractest'
CRS-2673: Attempting to stop 'ora.ctssd' on 'appsractest'
CRS-2673: Attempting to stop 'ora.asm' on 'appsractest'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'appsractest'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'appsractest'
CRS-2677: Stop of 'ora.cssdmonitor' on 'appsractest' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'appsractest' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'appsractest' succeeded
CRS-2677: Stop of 'ora.asm' on 'appsractest' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'appsractest' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'appsractest'
CRS-2677: Stop of 'ora.cssd' on 'appsractest' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'appsractest'
CRS-2673: Attempting to stop 'ora.diskmon' on 'appsractest'
CRS-2677: Stop of 'ora.gpnpd' on 'appsractest' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'appsractest'
CRS-2677: Stop of 'ora.gipcd' on 'appsractest' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'appsractest' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'appsractest' has completed
CRS-4133: Oracle High Availability Services has been stopped.

-- If you need to recreate the Voting file, it must be initialized in the OCR_DISK disk group:
[root@appsractest dbs]# $CRS_HOME/bin/crsctl replace votedisk +OCR_DISK
Successful addition of voting disk 00caa5b9c0f54f3abf5bd2a2609f09a9.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced

-- Once done, you can now safely start the HAS stack and verify whether the cluster comes back up cleanly.
[root@appsractest dbs]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

If you check again after some time, the whole Grid stack will be up and running.

Note - If your voting disk is also corrupted due to a maintenance command, you need to stop all running clusterware processes, then delete and recreate the OCR_DISK disk.
[root@appsractest dbs]# ps -ef | grep d.bin
root     24899  3768  0 08:18 pts/1    00:00:00 grep d.bin
[root@appsractest dbs]# ps -ef | grep crs
root     24902  3768  0 08:19 pts/1    00:00:00 grep crs
[root@appsractest dbs]# oracleasm createdisk OCR_DISK /dev/hdd1
Writing disk header: done
Instantiating disk: done
[root@appsractest dbs]# oracleasm listdisks
DATA
OCR_DISK

-- The shared ASM SPFILE, if also lost, needs to be recreated as follows...
[root@appsractest dbs]# sqlplus  sys/oracle  as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Tue Oct 30 09:05:41 2012
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> sho parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA1/ractest/asmparameterfil
                                                 e/registry.253.798018457
SQL> create spfile='+DATA1/iedge/asmparameterfile/registry.253.798018457' from pfile;

Saturday, November 3, 2012

PRCR-1079 : Failed to start resource ora.oc4j
 OR 
CRS-2501: Resource 'ora.oc4j' is disabled

After the installation of the 11gR2 Grid home, when I tried to check the status of the components, I found that two of them, GSD and OC4J, were not started. GSD is mainly needed for backward compatibility with 10g and is not mandatory to have running with an 11g DB; hence it is disabled by default, and one can enable it as and when needed.


[oracle@appsractest bin]$ ./crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    appsractest 
ora....ER.lsnr ora....er.type ONLINE    ONLINE    appsractest 
ora....N1.lsnr ora....er.type ONLINE    ONLINE    appsractest 
ora....SM1.asm application    ONLINE    ONLINE    appsractest 
ora....ST.lsnr application    ONLINE    ONLINE    appsractest 
ora....est.gsd application    OFFLINE   OFFLINE               
ora....est.ons application    ONLINE    ONLINE    appsractest 
ora....est.vip ora....t1.type ONLINE    ONLINE    appsractest 
ora.asm        ora.asm.type   ONLINE    ONLINE    appsractest 
ora.eons       ora.eons.type  ONLINE    ONLINE    appsractest 
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE               
ora....network ora....rk.type ONLINE    ONLINE    appsractest 
ora.oc4j       ora.oc4j.type  OFFLINE   OFFLINE               
ora.ons        ora.ons.type   ONLINE    ONLINE    appsractest 
ora....ry.acfs ora....fs.type ONLINE    ONLINE    appsractest 
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    appsractest 


-- When I tried to start it manually, it said:
[oracle@appsractest bin]$ ./srvctl start oc4j -v
OC4J could not be started
PRCR-1079 : Failed to start resource ora.oc4j
CRS-2501: Resource 'ora.oc4j' is disabled

-- So I tried to enable it and start it manually as follows
[oracle@appsractest bin]$ ./srvctl enable oc4j
[oracle@appsractest bin]$ ./srvctl start oc4j -v
OC4J has been started

-- After that I tried to fix the issue with GSD as follows
[oracle@appsractest bin]$ ./srvctl enable nodeapps -v
GSD is enabled successfully on node(s): appsractest
PRKO-2415 : VIP is already enabled on node(s): appsractest
PRKO-2416 : Network resource is already enabled.
PRKO-2417 : ONS is already enabled on node(s): appsractest
PRKO-2418 : eONS is already enabled on node(s): appsractest

[oracle@appsractest bin]$ ./srvctl start nodeapps
PRKO-2421 : Network resource is already started on node(s): appsractest
PRKO-2420 : VIP is already started on node(s): appsractest
PRKO-2422 : ONS is already started on node(s): appsractest
PRKO-2423 : eONS is already started on node(s): appsractest

[oracle@appsractest bin]$ ./crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    appsractest 
ora....ER.lsnr ora....er.type ONLINE    ONLINE    appsractest 
ora....N1.lsnr ora....er.type ONLINE    ONLINE    appsractest 
ora....SM1.asm application    ONLINE    ONLINE    appsractest 
ora....ST.lsnr application    ONLINE    ONLINE    appsractest 
ora....est.gsd application    ONLINE    ONLINE    appsractest 
ora....est.ons application    ONLINE    ONLINE    appsractest 
ora....est.vip ora....t1.type ONLINE    ONLINE    appsractest 
ora.asm        ora.asm.type   ONLINE    ONLINE    appsractest 
ora.eons       ora.eons.type  ONLINE    ONLINE    appsractest 
ora.gsd        ora.gsd.type   ONLINE    ONLINE    appsractest 
ora....network ora....rk.type ONLINE    ONLINE    appsractest 
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    appsractest 
ora.ons        ora.ons.type   ONLINE    ONLINE    appsractest 
ora....ry.acfs ora....fs.type ONLINE    ONLINE    appsractest 
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    appsractest 

As you can see, both of them are now started successfully and you can go ahead with the rest of your setup.