Tuesday, February 17, 2009

Manually switch over to the standby database

Manually switch over of dataguard configuration without dataguard broker

1. Verify the Physical Standby Database Is Performing Properly
Step 1 Identify the existing archived redo log files on strandby

SQL>SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

Step 2 Force a log switch to archive the current online redo log file on primary
SQL>ALTER SYSTEM SWITCH LOGFILE;

Step 3 Verify the new redo data was archived on the standby database.
SQL>SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

Step 4 Verify that received redo has been applied on standby
SQL>SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

The value of the APPLIED column for the most recently received log file will be
either IN-MEMORY or YES if that log file has been applied.

2. Pre-Switchover Checks
Verify Managed Recovery is Running (non-broker) on the standby
SQL>select process from v$managed_standby where process like 'MRP%';

Cancel apply delay for the target standby using SQL
SQL>select delay_mins from v$managed_standby where process = 'MRP0';

if delay_mins >0 then
SQL>RECOVER MANAGED STANDBY DATABASE CANCEL
SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY USING CURRENT LOGFILE DISCONNECT FROM SESSION;

Verify there are no large GAPS.
On primary
SQL>SELECT THREAD#, SEQUENCE# FROM V$THREAD;
On the standby the following query should be no more than 1-2 less than the primary query result
SQL>SELECT THREAD#, MAX(SEQUENCE#)
FROM V$ARCHIVED_LOG val, V$DATABASE vdb
WHERE APPLIED = 'YES' AND val.RESETLOGS_CHANGE#=vdb.RESETLOGS_CHANGE#GROUP BY THREAD#;
If there is a gap .. fix the gap

Verify Primary and Standby TEMP Files Match
SQL>select tmp.name FileName, bytes, ts.name Tablespace from v$tempfile tmp, v$tablespace ts where tmp.ts#=ts.ts#;
If there is no match then you will correct the temp tbs problem after switchover

Verify that there is no issue with V$LOG_HISTORY on the Standby (bug 6010833)
1. Get the Primary RESETLOGS_CHANGE#. On the primary execute:
SQL>select RESETLOGS_CHANGE# from V$DATABASE;

2. On the standby get the maximum sequence numbers for each thread from V$LOG_HISTORY:
SQL>select thread#, max(sequence#) from V$LOG_HISTORY where resetlogs_change#=1 group by thread#;

3. Get current sequence numbers on standby:
SQL>select thread#, max(sequence#) from v$archived_log alog, v$database db where alog.resetlogs_change#=db.resetlogs_change# group by thread#;

4. The last sequence# for each thread# from V$LOG_HISTORY should be close (the difference in log sequences < 3) to the last sequence# for each thread# from V$ARCHIVED_LOG.

5. If there is an issue with V$LOG_HISTORY then recreate the standby controlfile

Verify no old partial Standby Redo Logs on the Standby (bug 7159505)
1. Identify any active standby redo logs (SRL’s)
SQL>SELECT GROUP#, THREAD#, SEQUENCE# FROM V$STANDBY_LOG WHERE STATUS = 'ACTIVE' ORDER BY THREAD#,SEQUENCE#;

2.Identify maximum applied sequence number(s).
SQL> select thread#, max(sequence#)
from V$LOG_HISTORY
where resetlogs_change#=<>
group by thread#;

3.If there are any active SRL's that have a thread#/sequence# less than the thread#/sequence# returned from the V$LOG_HISTORY (meaning the recovery has progressed beyond the active SRL) query then clear them.
SQL>RECOVER MANAGED STANDBY DATABASE CANCEL
SQL>ALTER DATABASE CLEAR LOGFILE GROUP ;

3 Switchover
Clear Potential Blocking Parameters & Jobs
1.Capture current job state on the primary
SQL>select * from dba_jobs_running;
SQL>select owner,job_name, start_date, end_date, enabled from dba_scheduler_jobs where enabled='TRUE' and owner <> 'SYS';
SQL>show parameter job_queue_processes -- and capture the value

Block further job submission
SQL>alter system set job_queue_processes=0 scope=both;
SQL>execute dbms_scheduler.disable(job_name);

Disable any cron jobs that may interfere , rman backups
Shutdown all mid-tiers (e.g db console)
$ emctl stop dbconsole

Monitor Switchover
Turn on Data Guard tracing on primary and standby
Tracing is turned on to have diagnostic information available in case any issues arise.
SQL>show parameter log_archive_trace -- capture the value
Set Data Guard trace level to 8191
SQL>alter system set log_archive_trace=8191;
Trace output will appear under the destination pointed to by the database parameter BACKGROUND_DUMP_DEST with “mrp” in the file name.

Monitor alert log for both primary and standby

Finally Switchover
Verify that the primary database can be switched to the standby role
SQL>SELECT SWITCHOVER_STATUS FROM V$DATABASE;
A value of TO STANDBY or SESSIONS ACTIVE (requires the WITH SESSION SHUTDOWN clause on the switchover command) indicates that the primary database can be switched to the standby role

Switchover the primary to a standby database
SQL>ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
in primary alert log you would see

Switchover: Complete - Database shutdown required (sfs_stby1)
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN

Verify that the standby database can be switched to the primary role
SQL>SELECT SWITCHOVER_STATUS FROM V$DATABASE;

Check if the standby has ever been open read-only
SQL> SELECT VALUE FROM V$DATAGUARD_STATS WHERE NAME='standby has been open';

If the standby was open read-only then restart the standby
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT

Switchover the standby database to a primary
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

Open the new primary database:
SQL> ALTER DATABASE OPEN;
Correct any tempfile mismatch

Restart the new standby
On the the new standby database (old production database), bring it to the mount state and start managed recovery.
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
SQL> RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

Post-Switchover Steps
For each database:
SQL>alter system set log_archive_trace=;
SQL>alter system set job_queue_processes= scope=both;
SQL>execute dbms_scheduler.enable();
Enable any cron jobs that were diabled in 3.1

Check that they are syncronized
Perfom log switch on new primary
Create cron job to backup new primary !!!


Reference
Metalink Note id : 751600.1

Friday, February 13, 2009

Build Your Own Oracle RAC Cluster on Oracle Enterprise Linux and iSCSI

A very nice article form OTN. Can be used as a guide to build RAC on Linux using OCFS2 and ASM.

http://www.oracle.com/technology/pub/articles/hunter_rac10gr2_iscsi.html

Configuring VLANs Under Fedora/RHEL/CentOS

A nice link to help configure vlans on RHEL/CentOs. Especially usefull when you want to install and configure Oracle 10gR2 CRS using vlans for public, and cluster interconnect.

http://www.tummy.com/journals/entries/jafo_20061130_063931

Tuesday, February 10, 2009

Installing RAC 10gR2 on RHEL5.2 POWER LINUX

On both nodes
more /etc/oraInst.loc
on node1
groupadd oinstall
groupadd dba
useradd -u 200 -g oinstall -G dba oracle
passwd oracle
id nobody (user must exist)
id oracle (record the uid,gid)

on node2
groupadd -g 500 oinstall
groupadd -g 501 dba
useradd -u 200 -g oinstall -G dba oracle

Configure ssh equivelance
as root
vi /etc/hosts
127.0.0.1 localhost.localdomain localhost
172.18.253.32 billing2
172.18.253.30 billing1

which scp
if not in /usr/local/bin create link :
cd /usr/local/bin ; ln -s /usr/bin/scp scp

which ssh
if not in /usr/local/bin create link :
cd /usr/local/bin ; ln -s /usr/bin/ssh ssh

On both nodes login as oracle
mkdir .ssh
chmod 700 .ssh
ssh-keygen -t rsa
ssh-keygen -t dsa

on billing1:
cd .ssh
ssh billing1 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh billing1 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
ssh billing2 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh billing2 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
scp authorized_keys billing2:/home/oracle/.ssh/

on billing2:
cd .ssh
ssh billing1 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh billing1 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
ssh billing2 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh billing2 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
scp authorized_keys billing1:/home/oracle/.ssh/

on both nodes
chmod 600 ~/.ssh/authorized_keysexec
/usr/bin/ssh-agent $SHELL
/usr/bin/ssh-add
vi /home/oracle/.ssh/config ...
add
Host *
ForwardX11 no

vi .bashrc ... add
if [ -t 0 ]; then
stty intr ^C
fi
vi .bash_profile
umask 022
export DISPLAY=172.18.41.140:0
. ./.bash_profile

as root
vi /etc/hosts
127.0.0.1 localhost.localdomain localhost
172.18.253.32 billing2
172.18.253.33 billing2-v
10.254.254.32 billing2-i
172.18.253.30 billing1
172.18.253.31 billing1-v
10.254.254.30 billing1-i

vi /etc/hosts.equiv
billing1 oracle
billing2 oracle
billing1-i oracle
billing2-i oracle

Install required packages
yum install libaio-devel
yum install libXp

Download and install ibm xlc
http://www-1.ibm.com/support/docview.wss?rs=2030&context=SSJT9L&context=SSENT9&context=SSEP5D&dc=D400&dc=D410&dc=D420&dc=D430&q1=Run-time+Environment+Component&uid=swg24007906&loc=en_US&cs=utf-8&lang=en
rpm -ihv update/xlc/apr2007/xlsmp.msg.rte-1.5.1-3.ppc64pseries.rpm
rpm -ihv update/xlc/apr2007/xlsmp.rte-1.5.1-3.ppc64pseries.rpm
rpm -ihv update/xlc/apr2007/vacpp.rte-7.0.1-3.ppc64pseries.rpm
rpm -ivh vacpp.rte.lnk-7.0.1-0.ppc64pseries.rpm
rpm -ihv vac.lib-7.0.1-0.ppc64pseries.rpm

add entries to /etc/sysctl.conf
# Oracle RAC
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000

sysctl -p

vi /etc/security/limits.conf
# Oracle Limits
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
# As per Release Notes
oracle soft memlock 3145728
oracle hard memlock 3145728

vi /etc/pam.d/login
# Oracle RAC
session required pam_limits.so

vi /etc/profile
if [ $USER = "oracle" ]; then
if [ $SHELL = "/bin/ksh" ]; then
ulimit -p 16384
ulimit -n 65536
else
ulimit -u 16384 -n 65536
fi
fi
mkdir -p /orabin
chown -R oracle:oinstall /orabin
chmod -R 775 /orabin
mkdir -p /orabin/crs
chown -R root:oinstall /orabin/crs
chmod -R 775 /orabin/crs
as root on both nodes
lsmod grep hang
if not hangcheck_timer configured then
insmod /lib/modules/2.6.18-92.el5/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=30 hangcheck_margin=180
add the previous comand in /etc/rc.d/rc.local

Configuring raw devices for ocr and css using mpath (Metalink Note:564580.1)
raw /dev/raw/raw1 /dev/mapper/mpath1
raw /dev/raw/raw2 /dev/mapper/mpath2
raw -qals -l /dev/raw/
dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=100
su - oracle
dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=100 --> Permission Denied Expected
dd if=/dev/zero of=/dev/raw/raw2 bs=1024 count=100
su - oracle
dd if=/dev/zero of=/dev/raw/raw2 bs=1024 count=100 --> Permission Denied Expected
raw /dev/raw/raw1 0 0
raw /dev/raw/raw2 0 0
vi /etc/rc.d/rc.local and add
##### Oracle Cluster Registry (OCR) devices#####
chown root:oinstall /dev/mapper/mpath1
chmod 660 /dev/mapper/mpath1
raw /dev/raw/raw1 /dev/mapper/mpath1
chown root:oinstall /dev/raw/raw1
chmod 660 /dev/raw/raw1
##### Oracle Voting disks#####
chown oracle:oinstall /dev/mapper/mpath2
chmod 660 /dev/mapper/mpath2
raw /dev/raw/raw2 /dev/mapper/mpath2
chown oracle:oinstall /dev/raw/raw2
chmod 660 /dev/raw/raw2
/etc/rc.local
ll /dev/mapper
brw-rw---- 1 root oinstall 253, 5 May 30 14:40 mpath1
brw-rw---- 1 oracle oinstall 253, 6 May 30 14:40 mpath2
ls -l /dev/raw/
crw-rw---- 1 root oinstall 162, 1 May 30 15:41 raw1
crw-rw---- 1 oracle oinstall 162, 2 May 30 15:41 raw2
Disable sendmail Service and reboot
ll /dev/mapper/ & ls -l /dev/raw same results as above

Open oracle session on first node
export ORACLE_BASE=/orabin
export ORACLE_HOME=/orabin/crs

cd /orasoft
Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n billing1,billing2
Ignore vipca Error
runInstaller -ignoreSysPrereqs (as per Metalink Note: 414163.1)
Ignore remote attach home error
as root first on node1 and then on node2 do:
/orabin/oraInventory/orainstRoot.sh
Before Running root.sh on each node apply fix for Bug 4679769
FAILED TO FORMAT OCR DISK USING CLSFMTas per Metalink Note:564580.1
as root first on node1 and then on node2 do:
cd /orasoft
unzip p4679769_10201_IBMPower.zip
cd 5682732/
ls -l /orabin/crs/bin/clsfmt.bin
-rwxr-xr-x 1 oracle oinstall 784750 Nov 11 2005 /orabin/crs/bin/clsfmt.bin
cp /orabin/crs/bin/clsfmt.bin /orabin/crs/bin/clsfmt.bin.bak
cp clsfmt.bin /orabin/crs/bin/clsfmt.bin
chmod 755 /orabin/crs/bin/clsfmt.bin
ls -l /orabin/crs/bin/clsfmt.bin-rwxr-xr-x 1 oracle oinstall 778072 May 30 16:12 /orabin/crs/bin/clsfmt.bin
Apply workaround #2 for vipca errors as per Metalink Note:414163.1 (maybe not applying on ppc)
as root first on node1 and then on node2 do:
vi /orabin/crs/bin/vipca
Find lines : if [ "$arch" = "i686" -o "$arch" = "ia64" -o "$arch" = "x86_64" ]
then
LD_ASSUME_KERNEL=2.4.19
export LD_ASSUME_KERNEL
fi
unset LD_ASSUME_KERNEL <<== Line to be added
as root first on node1 and then on node2 do:
/orabin/crs/root.sh
On node2 you may see :
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public.
Public interfaces should be used to configure virtual IPs.
Ignore vipca error and run it from master node (First test from failling node failed, According to netapp document run from master node)
on node2:
export DISPLAY=172.18.41.140:0
/orabin/crs/bin/vipca
define interface for the virtual ips
and check crs after vipca finishes and installation reports finished
on both nodes
/orabin/crs/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....ng1.gsd application ONLINE ONLINE billing1
ora....ng1.ons application ONLINE ONLINE billing1
ora....ng1.vip application ONLINE ONLINE billing1
ora....ng2.gsd application ONLINE ONLINE billing2
ora....ng2.ons application ONLINE ONLINE billing2
ora....ng2.vip application ONLINE ONLINE billing2

Reboot just to check if all is ok !!!

FINISHED !!!!!!

Wednesday, February 4, 2009

ORA-10631 When Trying To Shrink A Table That Has A Function-based Index

The shrink_clause is subject to the following restrictions:

-You cannot specify this clause for a cluster, a clustered table, or any object with a LONG column.
-Segment shrink is not supported for tables with function-based indexes or bitmap join indexes.
-This clause does not shrink mapping tables of index-organized tables, even if you specify CASCADE.
-You cannot specify this clause for a compressed table.
-You cannot shrink a table that is the master table of an ON COMMIT materialized view.
-Rowid materialized views must be rebuilt after the shrink operation.

In order to shrink segments that have function based indexes , get index ddl , drop the index , shrink the table and recreate the index.