TheBonsai's Blog

About the days and nights of TheBonsai

10gR2 (10.2.0.5) on top of existing 11gR2 (11.2.0.3) GI

February 5th, 2012 by TheBonsai

Hello Oracle fans and victims out there,

I tried to create a 10.2.0.5 database on top of an existing 11.2.0.3 GI on a 2-node RAC, where an 11.2.0.3 RDBMS was already running fine as a second database.

I installed 10.2.0.1, then the 10.2.0.5 patch set and the 10.2.0.5.6 PSU… worked like a charm.

After pinning the nodes (remember, you have to do that for a pre-11gR2 database on 11gR2 Clusterware!), I created the RAC database with DBCA… worked like a charm.

I started up the database with srvctl – crash. Both instances were killed by their LMON because of a KGXGN polling error. It looks like this in the alert logs:

Instance #1:

Sat Feb 04 12:44:20 CET 2012
lmon registered with NM – instance id 1 (internal mem no 0)
Sat Feb 04 12:46:50 CET 2012
oracle@svdbslx060 (LMON) (ospid: 28370) detects hung instances during IMR reconfiguration
oracle@svdbslx060 (LMON) (ospid: 28370) tries to kill the instance 2.
Please check instance 2’s alert log and LMON trace file for more details.
Sat Feb 04 12:48:05 CET 2012
Remote instance kill is issued with system inc 0 and reason 0x20000000
Remote instance kill map (size 1) : 2
Sat Feb 04 12:49:20 CET 2012
Error: KGXGN polling error (15)
Sat Feb 04 12:49:20 CET 2012
Errors in file /opt/oracle/base/admin/REDSYS/bdump/redsys1_lmon_28370.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
Sat Feb 04 12:49:20 CET 2012
System state dump is made for local instance
Sat Feb 04 12:49:20 CET 2012
Errors in file /opt/oracle/base/admin/REDSYS/bdump/redsys1_diag_28366.trc:
ORA-29702: error occurred in Cluster Group Service operation
Sat Feb 04 12:49:20 CET 2012
Trace dumping is performing id=[cdmp_20120204124920]
Sat Feb 04 12:49:20 CET 2012
Instance terminated by LMON, pid = 28370

Instance #2:

Sat Feb 04 12:44:20 CET 2012
lmon registered with NM – instance id 2 (internal mem no 1)
Sat Feb 04 12:49:20 CET 2012
Error: KGXGN polling error (15)
Sat Feb 04 12:49:20 CET 2012
Errors in file /opt/oracle/base/admin/REDSYS/bdump/redsys2_lmon_1695.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
Sat Feb 04 12:49:20 CET 2012
Trace dumping is performing id=[cdmp_20120204124920]
System state dump is made for local instance
Sat Feb 04 12:49:21 CET 2012
Errors in file /opt/oracle/base/admin/REDSYS/bdump/redsys2_diag_1691.trc:
ORA-29702: error occurred in Cluster Group Service operation
Sat Feb 04 12:49:21 CET 2012
Trace dumping is performing id=[cdmp_20120204124921]
Sat Feb 04 12:49:21 CET 2012
Instance terminated by LMON, pid = 1695
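
Both excerpts end in the same sequence: the KGXGN polling error, ORA-29702, then LMON terminating the instance. If you have to dig through longer alert logs for that pattern, a small script can pull out the relevant lines together with the timestamp that precedes them. This is only a sketch: the function name and the list of "interesting" patterns are my own choices for illustration, and the timestamp format is assumed to be the 10gR2 style shown above.

```python
import re

# Matches 10gR2-style alert log timestamps such as "Sat Feb 04 12:49:20 CET 2012".
TIMESTAMP = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \S+ \d{4}$"
)

# Lines worth flagging (an assumed, non-exhaustive list): ORA- errors,
# the KGXGN polling error, and instance terminations.
INTERESTING = re.compile(
    r"ORA-\d{5}|KGXGN polling error|terminating instance|Instance terminated"
)


def scan_alert_log(text):
    """Return (timestamp, line) pairs for every flagged line in the log text."""
    hits = []
    current_ts = None  # most recent timestamp line seen so far
    for raw in text.splitlines():
        line = raw.strip()
        if TIMESTAMP.match(line):
            current_ts = line
        elif INTERESTING.search(line):
            hits.append((current_ts, line))
    return hits


# Shortened sample based on the instance #2 excerpt. The scan keys on the
# ORA- number only, so the language of the message text does not matter.
SAMPLE = """\
Sat Feb 04 12:44:20 CET 2012
lmon registered with NM - instance id 2 (internal mem no 1)
Sat Feb 04 12:49:20 CET 2012
Error: KGXGN polling error (15)
Sat Feb 04 12:49:20 CET 2012
Errors in file /opt/oracle/base/admin/REDSYS/bdump/redsys2_lmon_1695.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
"""

if __name__ == "__main__":
    for ts, line in scan_alert_log(SAMPLE):
        print(ts, "|", line)
```

Running the same function over the alert logs of both instances makes it easy to compare when each one hit the polling error.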

The LMON trace files revealed nothing beyond what the alert logs already showed. I installed the 10.2.0.5.2 CRS Bundle into this ORACLE_HOME – no change. The internet, including MOS, gave hints in many directions, but nothing really seemed to match.

Finally, after a day off (you sometimes need distance!) I got it:

The error messages strongly point to an interconnect problem. The fact that a second (11gR2) database works fine at the same time rules out physical faults and other problems at the base of the stack. The servers and the 11gR2 world have two interconnect interfaces, which the 10gR2 instances apparently could not cope with. Solution: I just burned the IP of one specific interconnect interface into the init.ora parameters of the 10gR2 instances, and we had liftoff. It works like a charm.
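
The post does not name the parameter. The usual way to pin a RAC instance to one interconnect address is CLUSTER_INTERCONNECTS, so the sketch below assumes that is what was set. It prints per-instance init.ora entries in the `<sid>.parameter=value` form. The instance names are inferred from the trace file names above, and the IP addresses are hypothetical placeholders, not the values used on this cluster.

```python
import ipaddress


def interconnect_entries(instance_ips):
    """Build per-instance init.ora lines: <sid>.cluster_interconnects='<ip>'."""
    lines = []
    for sid, ip in sorted(instance_ips.items()):
        # Validate the address; raises ValueError if it is malformed.
        ipaddress.ip_address(ip)
        lines.append(f"{sid}.cluster_interconnects='{ip}'")
    return lines


# Hypothetical private addresses, one per instance. Replace each with the IP
# of the interconnect interface chosen on the node that runs that instance.
EXAMPLE = {"redsys1": "192.168.10.1", "redsys2": "192.168.10.2"}

if __name__ == "__main__":
    print("\n".join(interconnect_entries(EXAMPLE)))
```

Each instance gets the address of the interface on its own node, which is why the entries are prefixed with the SID rather than set once for the whole database.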

This entry was posted on Sunday, February 5th, 2012 at 22:20 and is filed under Oracle, Work.

2 responses about “10gR2 (10.2.0.5) on top of existing 11gR2 (11.2.0.3) GI”

  1. Usn said:

    Was it RAC or Single Instance?

  2. TheBonsai said:

    RAC :-)
