TheBonsai's Blog

About the days and nights of TheBonsai

Archive for the 'Work' Category

SQLNET.RECV_TIMEOUT/SEND_TIMEOUT and RMAN

November 2nd, 2011 by TheBonsai

Hi there,

I was analyzing an unexpected RMAN termination, an RMAN-10038/RMAN-03009 combo:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of REFAF command on c13 channel at 10/31/2011 08:21:50
RMAN-10038: database session for channel c13 terminated unexpectedly

Nothing, not even RMAN tracing, revealed more hints. The trace just said the same thing in other words: no underlying ORA/TNS error or anything similar.

It finally turned out that the cause was some parameters I had recently added to sqlnet.ora: I had set SQLNET.RECV_TIMEOUT and SQLNET.SEND_TIMEOUT, and the beast silently dropped RMAN channels that had been idle for a while. The next command on such a channel blew up the whole RUN block of the backup script.

I removed the parameters and it works again.
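
To check quickly whether a given sqlnet.ora sets either of these parameters, something like the following sketch might help. The function name is made up for illustration, and the default location (`$TNS_ADMIN`, falling back to `$ORACLE_HOME/network/admin`) is an assumption about your environment:

```shell
#!/bin/sh
# Hypothetical helper: print any active SQLNET.RECV_TIMEOUT / SQLNET.SEND_TIMEOUT
# settings found in a sqlnet.ora file.
# Assumption: the file lives in $TNS_ADMIN or $ORACLE_HOME/network/admin
# unless a path is given as the first argument.
find_sqlnet_timeouts() {
    file="${1:-${TNS_ADMIN:-$ORACLE_HOME/network/admin}/sqlnet.ora}"
    # -i: match the parameter names regardless of case.
    # The pattern is anchored at line start, so lines commented out with '#' are skipped.
    grep -Ei '^[[:space:]]*SQLNET\.(RECV|SEND)_TIMEOUT[[:space:]]*=' "$file"
}
```

Calling `find_sqlnet_timeouts /path/to/sqlnet.ora` prints the matching lines; no output means neither parameter is set in that file.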

Be careful with your sqlnet.ora :-)

Category: english, Oracle, Work

Escaping special characters in SQL*Plus logon strings

March 24th, 2011 by TheBonsai

SQL*Plus connect strings/logon strings have a couple of special characters, notably these two:

  • / (slash) to separate username and password
  • @ (at) to separate the TNS descriptor string

If you need to use those characters literally in the logon string, you need to enclose the affected part in literal double quotes (literal means the quotes must be passed through to SQL*Plus; I'm not talking about UNIX shell quoting):

  • Less readable:
    $ sqlplus USER/\"PASS/WORD\"
  • More readable:
    $ sqlplus USER/'"PASS/WORD"'
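
To see what actually arrives at the program after the shell has done its quoting, a small sketch like this can help. `show_arg` is a made-up stand-in for SQL*Plus that only prints its first argument; USER and PASS/WORD are the placeholders from above:

```shell
#!/bin/sh
# Hypothetical stand-in for sqlplus: print the first argument exactly as received.
show_arg() { printf '%s\n' "$1"; }

show_arg USER/\"PASS/WORD\"      # backslash-escaped quotes reach the program
show_arg USER/'"PASS/WORD"'      # single quotes protect the double quotes
show_arg USER/"PASS/WORD"        # unprotected: the shell removes the quotes
```

The first two calls print USER/"PASS/WORD", the third prints USER/PASS/WORD, which is why the quotes need protection from the shell.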

Category: english, Oracle

Zero downtime storage migration operations

January 27th, 2011 by TheBonsai

For some systems, downtime is hard to organize or simply unacceptable. We have such a system: a RAC database.

In the past, all storage migration operations there had two problems that led to downtime:

  • CRS votedisk migration was, at best, only possible with CRS offline
  • removing old devices from the operating system's device management

In December I migrated the system to something more modern:

  • SLES11
  • Oracle Grid Infrastructure (CRS/ASM) 11.2
  • Oracle Database 10.2.0.5 (yes, still quite old, but it has to be a 10.2 for now)

The use of a modern Linux kernel (especially a modern SCSI stack) and 11gR2 infrastructure fixed all my trouble from the past.

Votedisk

With 11gR2 Grid Infrastructure, the clusterware manages its vital files (votedisks and OCR) using an ASM instance. Migrating the ASM diskgroups that hold these files is now as easy as migrating a normal diskgroup. ASM takes care of moving the votedisks and the OCR and collaborates with CRS here.
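
As a rough illustration of what "a migration of a normal diskgroup" can look like, the sketch below only prints an ASM statement that adds a new disk and drops an old one in a single rebalance. It does not execute anything. The function name, the diskgroup name OCRVOTE, the device path, the ASM disk name, and the rebalance power of 4 are all made-up placeholders, not values from my system:

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) an ASM statement that swaps
# an old disk for a new one within one diskgroup.
# All names and the REBALANCE POWER value are illustrative assumptions.
asm_swap_disk_sql() {
    dg="$1"; new_path="$2"; old_disk="$3"
    printf "ALTER DISKGROUP %s ADD DISK '%s' DROP DISK %s REBALANCE POWER 4;\n" \
        "$dg" "$new_path" "$old_disk"
}

asm_swap_disk_sql OCRVOTE /dev/mapper/mpathNEW OCRVOTE_0000
```

The printed statement would be run in the ASM instance after review. Afterwards, `crsctl query css votedisk` and `ocrcheck` should show where the votedisks and the OCR ended up.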

Linux devices

The newer Linux kernels, with a finally sane SCSI stack and native multipathing, help with the second problem. Removing old device references from the stack is no longer a problem:

  • deconfigure the devices from multipathd (not needed technically, since multipathd itself holds no device references)
  • remove the device references from the Linux device mapper
    # see /dev/mapper/* for the name
    dmsetup remove mpathX
  • remove the LUNs from the SCSI stack
    # X:X:X:X is the SCSI address (host:channel:target:LUN)
    echo 1 > /sys/bus/scsi/devices/X:X:X:X/delete

After that, you can safely edit the FC zone – no errors should occur in any logs.
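
The removal steps above can be sketched as a small helper that only prints the commands for one multipath map and its SCSI paths, so they can be reviewed before anything is run as root. The function name and the SCSI addresses are made-up examples, and the sketch assumes the per-device `delete` files live under `/sys/bus/scsi/devices/`:

```shell
#!/bin/sh
# Hypothetical helper: print the commands that remove one device-mapper map
# and then delete its underlying SCSI devices. Nothing is executed here.
# $1      = map name as shown in /dev/mapper
# $2...   = SCSI addresses (host:channel:target:LUN) of the paths behind the map
removal_plan() {
    map="$1"; shift
    # First drop the device-mapper reference ...
    printf 'dmsetup remove %s\n' "$map"
    # ... then remove each path from the SCSI stack.
    for hctl in "$@"; do
        printf 'echo 1 > /sys/bus/scsi/devices/%s/delete\n' "$hctl"
    done
}

removal_plan mpathX 1:0:0:5 2:0:0:5
```

Printing instead of executing is deliberate: a typo in a SCSI address would otherwise remove the wrong LUN from a running system.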

Category: english, Oracle, Work