Data Guard Physical Standby Database Best Practices
Introduction
Oracle Data Guard is
one of the software solutions provided by Oracle Corporation to maximize high
availability of Oracle databases. Oracle Data Guard maintains one or many
secondary databases as alternatives to the primary production database.
Data Guard
Architecture
Oracle Data Guard
supports both physical standby and logical standby sites.
Physical Standby: When the primary database transactions generate redo
entries, a redo apply process keeps up the secondary databases with the exact
block copies of the primary database.
Logical Standby: SQL apply processes read the redo and convert it to SQL
transactions. These are then applied to the secondary database.
Data Guard Modes
Oracle Data Guard can
operate in 3 different modes:
o Maximum
Protection: Transactions are not allowed to commit
until all redo data are written to the online redo logs and propagated to at
least one synchronized secondary database. If for any reason, the primary database
cannot propagate its redo stream to one secondary database, the primary will
shutdown to ensure maximum protection mode.
o Maximum
Performance: Transactions are not allowed to commit
as soon as the redo are written to the online redo logs. The redo stream is
asynchronously propagated to the secondary databases to ensure maximum
performance mode.
o Maximum
Availability: Transactions are not allowed to commit
until all redo data are written to the online redo logs and propagated to at
least one synchronized secondary database. If for any reason, the primary
database cannot propagate its redo stream to one secondary database, the
primary will NOT shutdown and operates as it it were in maximum performance
mode until issues are fixed.
Glossary
o Role transition: Changing the role of each database
component from primary database to the secondary database or from secondary
database to the primary database.
o Switchover: Planned role transition for testing.
Manual intervention.
o Fail Over: Unplanned failure. Manual or Automatic
intervention. Automatic role transition is the recommended.
o Primary database: Where the users are connected to access
to the database.
o Standby database: Exists in the the disaster recovery (DR)
site. Where the users are connected in the case of planned role transition
(Switchover) or in the case of unplanned failure (Fail Over).
o Data Guard Observer: Process monitors both primary and
standby databases and performs an automatic fail over when necessary.
o The Broker: The management framework for Oracle Data
Guard. It comes integrated into the oracle database enterprise edition.
o Fast Start Fail Over (FSFO): Automatic fail over to the standby database
occurs in case of failure. FSFO requires the broker.
The Environment
o I have 3 VM with version of Oracle Enterprise
Linux 5 installed.
o I will use the Oracle Database 11g Enterprise
Edition Release 2; 11.2.0.1.0 on the primary database and the standby database.
o Oracle software is installed and a database is
already created on the primary site.
o Oracle software is installed and the database
will be created during this demonstration for the standby site.
o On the third host, I have installed a version
of Oracle database 11g Client with administrator option; 11.2.0.1.0. This is
the Data Guard Observer host.
o The Oracle Home is on identical path on both
nodes: primary database and the standby database.
o Primary database server name is dg1 with a
database name dg1.
o Standby database server name is dg2 with a
database name dg2.
o The Data Guard Observer server name is dg3.
For the rest of the
document, I am going to implement a physical Data Guard environment in maximum
performance mode. The whole document will be split in 2 parts:
1.
Prepare the Data Guard environment.
2.
Create the standby database and verity the physical standby
database. I will then Configure the Data Guard Broker, enable Fast Start Fail
Over (FSFO) and run a Data Guard Observer.
Implementation notes
These are the steps to
follow:
1.
Enable forced logging
2.
Create a password file
3.
Configure a standby redo log
4.
Enable archiving
5.
Set up the primary database initialization parameters
6.
Configure the listener and tnsnames to support the database on
both nodes
Note: It is
recommended to apply all patches on the primary and the standby before setting
up the Data Guard environment.
Prepare the primary
and the physical Standby Environments
Primary Server: Step 1
Oracle Data Guard
requires the primary database to be run in FORCE LOGGING mode. This means that
statements using the NOLOGGING option will still generate redo information to
maintain the Data Guard standby databases.
[oracle@dg1 ~]$ sqlplus / as sysdba
SQL> select force_logging from v$database;
FOR
---
NO
SQL> alter database force logging;
Database altered.
SQL>
The standby log files
are required to store redo received from the primary database. Standby redo
logs must be equal or larger to the largest redo log file in the primary
database. The standby redo log must have at least one more redo log group than
the redo log on the primary database. It is recommended that the standby redo
log is created on the primary database and propagated to the standby for any
switch over or fail over operations.
SQL> select * from v$log;
GROUP# THREAD#
SEQUENCE# BYTES BLOCKSIZE
MEMBERS ARC STATUS FIRST_CHANGE#
FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ----------
---------- --- ---------------- ------------- --------- ------------ ---------
1 1 7
52428800 512 2 YES INACTIVE 828952 15-JAN-12 849105 16-JAN-12
2 1 8
52428800 512 2 NO
CURRENT 849105
16-JAN-12 2.8147E+14
3 1 6
52428800 512 2 YES INACTIVE 822304 15-JAN-12 828952 15-JAN-12
SQL> alter database add standby logfile size 50M;
Database altered.
SQL>
SQL> alter database add standby logfile size 50M;
Database altered.
SQL> alter database add standby logfile size 50M;
Database altered.
SQL> alter database add standby logfile size 50M;
Database altered.
SQL>
SQL> select * from v$logfile;
Next, set the
LOG_ARCHIVE_CONFIG parameter. The LOG_ARCHIVE_CONFIG parameter enables or
disables the sending of redo streams to the standby sites. The DB_UNIQUE_NAME
of the primary database is dg1 and the DB_UNIQUE_NAME of the standby database
is dg2. The primary database is configured to ship redo log stream to the
standby database. In this example, the standby database service is dg2.
Next,
STANDBY_FILE_MANAGEMENT is set to AUTO so that when Oracle files are added or
dropped from the primary database, these changes are made to the standby
databases automatically. The STANDBY_FILE_MANAGEMENT is only applicable to the
physical standby databases.
Setting the
STANDBY_FILE_MANAGEMENT parameter to AUTO is is recommended when using Oracle
Managed Files (OMF) on the primary database. Next, the primary database must be
running in ARCHIVELOG mode.
SQL> show parameter db_name
NAME TYPE VALUE
------------------------------------ ----------
------------------------------
db_name string dg1
SQL> show parameter db_unique_name
NAME TYPE VALUE
------------------------------------ ----------
------------------------------
db_unique_name string dg1
SQL> alter system set
log_archive_config='dg_config=(dg1,dg2)';
System altered.
SQL> alter system set log_archive_dest_2=
2 'service=dg2 async
valid_for=(online_logfile,primary_role) db_unique_name=dg2';
System altered.
SQL> alter system set standby_file_management=AUTO;
System altered.
SQL>
SQL> archive log list;
Database log mode
Archive Mode
Automatic archival
Enabled
Archive destination
USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence
7
Next log sequence to archive
9
Current log sequence
9
SQL> exit;
Ensure the Oracle Net
Services aliases for both the primary database and standby database are added
to the tnsnames.ora file.
[oracle@dg1 ~]$ cat $ORACLE_HOME/network/admin/tnsnames.ora
dg1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL =
TCP)(HOST = dg1.localdomain)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg1)
)
)
dg2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL =
TCP)(HOST = dg2.localdomain)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg2)
)
)
[oracle@dg1 ~]$
Copy the updated
tnsnames.ora file to the standby site (host).
Standby Server: Step 2
The Oracle database
binaries have already been installed at this location ($ORACLE_HOME). The new
standby database will have dg2 as the SID.
The listener on the
standby site must be configured with a static service entry for the standby
database. Restart the listener after modification.
[oracle@dg2 admin]$ cat listener.ora
LISTENER =
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=dg2.localdomain)(PORT=1521))
(ADDRESS=(PROTOCOL=ipc)(KEY=PNPKEY)))
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=dg2)
(SID_NAME=dg2)
(ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1)
)
)
[oracle@dg2 admin]$ lsnrctl start
Create audit directory
files under $ORACLE_BASE/admin/$ORACLE_SID/adump. In my example, I will also
create the Fast recover area and the oradata directories.
[oracle@dg2 ~]$ mkdir -p $ORACLE_BASE/admin/dg2/adump
[root@dg2 ~]# mkdir -p /opt/oradata
[root@dg2 ~]# mkdir -p /opt/fast_recovery_area
Next, create a simple
parameter file on the standby hist with the DB_NAME value.
[oracle@dg2 dbs]$ echo DB_NAME=dg2 > initdg2.ora
[oracle@dg2 dbs]$ cat initdg2.ora
DB_NAME=dg2
[oracle@dg2 dbs]$
The primary database
password file must be copied to the standby system for redo authentication.
[oracle@dg2 dbs]$ scp dg1:$ORACLE_HOME/dbs/orapwdg1 orapwdg2
Standby Database Creation
Start the standby database instance in NOMOUNT start up mode:
SQL> startup nomount pfile=initdg2.ora;
Now that the configuration of the standby server is complete,
let’s perform the duplication from the primary site.
Primary Server
Use the Recovery Manager (RMAN) to duplicate the primary
database to the standby database.
Invoke RMAN; connect to the primary database as the sys user. Make an auxiliary connection to the standby instance:
Invoke RMAN; connect to the primary database as the sys user. Make an auxiliary connection to the standby instance:
RMAN> connect target sys
target database Password:
connected to target database: DG1 (DBID=1753913301)
RMAN> connect auxiliary sys@dg2
auxiliary database Password:
connected to auxiliary database: DG2 (not mounted)
RMAN> run{
allocate channel prmy1 type disk;
allocate channel prmy2 type disk;
allocate auxiliary channel stby type disk;
duplicate target database for standby from active database
spfile
parameter_value_convert 'dg1','dg2'
set db_unique_name='dg2'
set db_file_name_convert='/dg1/','/dg2/'
set log_file_name_convert='/dg1/','/dg2/'
set control_files='/opt/oradata/dg2/dg2.ctl'
set log_archive_max_processes='5'
set fal_client='dg2'
set fal_server='dg1'
set standby_file_management='AUTO'
set log_archive_config='dg_config=(dg1,dg2)'
set log_archive_dest_2='service=dg1 ASYNC
valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=dg1'
;
}
The RMAN duplication process is finished without errors. After
the database duplication, we will force the log switch on the primary database
to archive the current redo log group.
SQL> alter system switch logfile;
Standby Server
On the standby database, run the ALTER DATABASE RECOVER MANAGED
STANDBY DATABASE command to start redo apply.The USING CURRENT LOGFILE means
the redo is applied as soon as it is received on the standby.The DISCONNECT
option means the redo apply will run in background session.
SQL> alter database recover managed standby database
using current logfile
disconnect;
Note that the current log sequence number on the standby is 10.
SQL> select sequence#, first_time, applied
from v$archived_log
order by sequence#;
SEQUENCE# FIRST_TIM
APPLIED
---------- --------- ---------
9 16-JAN-12 YES
10 16-JAN-12
IN-MEMORY
Primary Server
Let’s perform 3 additional log switches on the primary database.
Then we will query the standby database to verify that the log files are
applied to the standby.
SQL> alter system switch logfile;
Standby Server
We query the standby database. The logs were successfully
transported and applied.
SQL> select sequence#, first_time, applied
from v$archived_log
order by sequence#;
SEQUENCE# FIRST_TIM
APPLIED
---------- --------- ---------
9 16-JAN-12 YES
10 16-JAN-12 YES
11 16-JAN-12 YES
12 16-JAN-12 YES
13 16-JAN-12
IN-MEMORY
Broker Configuration
Configuring the broker is recommended because it simplifies data
guard operations.
The DG_BROKER_START parameter must be set to TRUE.
Standby Server
SQL> alter system set dg_broker_start=TRUE;
The Oracle Net listener must be also configured with an
additional static service identifier. The value of the GLOBAL_DBNAME attribute
must be set to a concatenation of _DGMGRL.
[oracle@dg2 dbs]$ cat $ORACLE_HOME/network/admin/listener.ora
LISTENER =
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=dg2.localdomain)(PORT=1521))
(ADDRESS=(PROTOCOL=ipc)(KEY=PNPKEY)))
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=dg2)
(SID_NAME=dg2)
(ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1)
)
(SID_DESC=
(GLOBAL_DBNAME=dg2_DGMGRL)
(SID_NAME=dg2)
(ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1)
)
)
[oracle@dg2 dbs]$ lsnrctl status
(Entry truncated)
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg2.localdomain)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=PNPKEY)))
Services Summary...
Service "dg2" has 1 instance(s).
Instance
"dg2", status UNKNOWN, has 1 handler(s) for this service...
Service "dg2_DGMGRL" has 1 instance(s).
Instance
"dg2", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
Primary Server
SQL> alter system set dg_broker_start=TRUE;
The Oracle Net listener on the primary database should have
static service definitions:
[oracle@dg1 ~]$ cat $ORACLE_HOME/network/admin/listener.ora
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL
= TCP)(HOST = dg1.localdomain)(PORT = 1521))
)
)
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=dg1)
(SID_NAME=dg1)
(ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1)
)
(SID_DESC=
(GLOBAL_DBNAME=dg1_DGMGRL)
(SID_NAME=dg1)
(ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1)
)
)
ADR_BASE_LISTENER = /u01/app/oracle
[oracle@dg1 ~]$ lsnrctl start
(Entry truncated)
Services Summary...
Service "dg1" has 1 instance(s).
Instance
"dg1", status UNKNOWN, has 1 handler(s) for this service...
Service "dg1_DGMGRL" has 1 instance(s).
Instance
"dg1", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
Let’s create a broker configuration and identify the primary
database.
[oracle@dg1 ~]$ dgmgrl
DGMGRL> connect sys
Password:
Connected.
DGMGRL> create configuration 'DGConfig1' as primary database
is 'dg1'
> connect identifier is dg1;
Configuration "DGConfig1" created with primary
database "dg1"
DGMGRL> add database 'dg2' as connect identifier is dg2;
Database "dg2" added
DGMGRL> enable configuration
Enabled.
DGMGRL> show configuration
Configuration - DGConfig1
Protection Mode:
MaxPerformance
Databases:
dg1 - Primary database
dg2 - Physical standby
database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
Perform a switchover test:
DGMGRL> switchover to dg2;
Performing switchover NOW, please wait...
New primary database "dg2" is opening...
Operation requires shutdown of instance "dg1" on
database "dg1"
Shutting down instance "dg1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "dg1" on
database "dg1"
Starting instance "dg1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "dg2"
DGMGRL> show configuration;
Configuration - DGConfig1
Protection Mode:
MaxPerformance
Databases:
dg2 - Primary database
dg1 - Physical standby
database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
Standby Server
Confirm of the role switch:
SQL> select database_role from v$database;
DATABASE_ROLE
----------------
PRIMARY
Primary Server
The former primary database is now the new physical standby
database:
SQL> select database_role from v$database;
DATABASE_ROLE
----------------
PHYSICAL STANDBY
Switch over to the former primary database:
DGMGRL> switchover to dg1;
Performing switchover NOW, please wait...
New primary database "dg1" is opening...
Operation requires shutdown of instance "dg2" on
database "dg2"
Shutting down instance "dg2"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "dg2" on
database "dg2"
Starting instance "dg2"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "dg1"
DGMGRL> show configuration
Configuration - DGConfig1
Protection Mode:
MaxPerformance
Databases:
dg1 - Primary database
dg2 - Physical standby
database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> show database dg2
Database - dg2
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 5 minutes 2 seconds
Apply Lag: 5 minutes 51 seconds
Real Time Query: OFF
Instance(s):
dg2
Database Status:
SUCCESS
Enable Fast Start Fail Over
The actual configuration is running in Max Performance mode and
Fast Start Fail Over is currently disabled.
Primary Server
To configure FSFO, you must first enable
flashback database on both the primary and standby databases. (Further
reading: Introduction
to Oracle Flashback Technology)
SQL> alter database flashback on;
Redo apply must be stopped to enable flashback database on the
standby database:
DGMGRL> connect sys
Password:
Connected.
DGMGRL> edit database 'dg2' set state='apply-off';
Succeeded.
Standby Server
SQL> alter database flashback on;
Database altered.
SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
Primary Server
Restart the Redo apply:
DGMGRL> edit database
'dg2' set state='apply-on';
Succeeded.
The Observer
The server dg3 will act as the observer in the Fast-Start Fail
Over configuration. The Oracle client binaries have been installed with
administrator option. Confirm the connectivity with both the primary and the
standby databases:
[oracle@dg3 ~]$ tnsping dg1
(Entry truncated)
Used EZCONNECT adapter to resolve the alias
Attempting to contact
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.101)(PORT=1521)))
OK (140 msec)
[oracle@dg3 ~]$ tnsping dg2
(Entry truncated)
Attempting to contact
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.102)(PORT=1521)))
OK (50 msec)
[oracle@dg3 ~]$
The prerequisites for FSFO have been met. So FSFO can be
configured, enabled and started. The FSFO observer process will be started
using the DGMGRL session and will be logged to a file named observer.log:
[oracle@dg3 admin]$ dgmgrl -logfile ./observer.log
DGMGRL for Linux: Version 11.2.0.1.0 - Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@dg1
Password:
Connected.
DGMGRL> show fast_start failover
Fast-Start Failover: DISABLED
Threshold: 30 seconds
Target: (none)
Observer: (none)
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted
Controlfile YES
Corrupted
Dictionary YES
Inaccessible
Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
DGMGRL> edit configuration set property
FastStartFailoverLagLimit=60;
Property "faststartfailoverlaglimit" updated
DGMGRL> enable fast_start failover;
Enabled.
DGMGRL> start observer;
Primary Server
DGMGRL> show configuration verbose
Configuration - DGConfig1
Protection Mode:
MaxPerformance
Databases:
dg1 - Primary database
dg2 - (*) Physical
standby database
(*) Fast-Start Failover
target
Fast-Start Failover: ENABLED
Threshold: 30 seconds
Target: dg2
Observer: dg3.localdomain
Lag Limit: 60 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configuration Status:
SUCCESS
Create a shell script and run the shell script in the
background; create observer.sh (MOS ID: 1084681.1)
#!/bin/ksh
dgmgrl -echo -logfile /home/oracle/dgmgrl.log << EOF
connect sys/wissem@dg3
start observer
EOF
chmod +x observer.sh
./observer.sh &
Where do you place the Observer?
That all depends on your requirements, the observer can be
placed in a third site separated from primary and standby sites with a HA
observer; this means 2 hosts; one observer process running on the first host
and the other on standby (not running) on the second host. Only one observer
can run at any one time.
No comments:
Post a Comment