RMAN with NETBACKUP

 

Installation

The NetBackup client software must first be installed as the root user. (For versions 6.x and earlier, you also need to install the Oracle RMAN NBU client extension add-on.)

The NBU client and Oracle extension should both be at the same version, and no higher than the version of the media server they back up to.

A VIP must be set up for each database being backed up by this method. The naming convention is:

vorc<dbname>bkup.domain.com

The VIP can easily be moved to the active node (or the node being used for backups) in a cluster, or to a new server. This ensures that the NetBackup client name stays consistent. Otherwise it can be difficult to locate the backup images needed for restores, and reconfiguring the backup jobs will require additional work.

For more details on NetBackup client software configuration see Backup Network Client Configuration Requirements. The section about the bp.conf file may be of particular interest.

Some test backups must be run to verify that the OS backups are functioning properly before attempting Oracle-specific backups.
Even if everything is set up correctly, because of our use of VIPs, you will see issues backing up and restoring to the VIP if the OS backups have not been done, because NetBackup may still try to validate the server’s primary name as a valid client. You’ll also want to do one-time backups of any other interfaces on the server that can communicate with the Master Server.

API Library Linking

After the NetBackup client software has been installed and set up correctly, you must also create links to the necessary library files so that RMAN and NetBackup can communicate with each other (Oracle uses this library when it needs to write to, or read from, devices). These libraries are part of an API used to integrate Oracle with various backup utilities.

See page 50 of the NetBackup documentation for more details.

Oracle should be shut down before making the link, or restarted after the link is created. Some more recent versions of Oracle may recognize the library without requiring a restart.

There is an automatic linking script that can be used to create the link.

/usr/openv/netbackup/bin/oracle_link

This script determines the Oracle version level and then links Oracle to NetBackup. This script must be run as the oracle user, and the $ORACLE_HOME variable must be set prior to running the script.

If you have multiple database instances, you must repeat running the script for each separate SID (each $ORACLE_HOME needs to have the library linked to it).

If for some reason this script doesn’t work, or you want to do it yourself, you can manually link the files by doing the following:

Root Link

These steps to create the "Root link" are not strictly necessary, but they can make things easier for the DBAs who create the "Oracle link". They also simplify the "Oracle link" so that it is the same on all platforms and OS versions/types.

While logged in as root on the client, make sure that the Media Management API Library is linked for NBU (NetBackup) before proceeding.

Change directory to where the library is located:

# cd /usr/openv/netbackup/bin/

Look for the link, which in this example does not exist.

# ls -l libobk*

-r-xr-xr-x 1 root bin 112780 May 23 04:44 libobk.so.1

-r-xr-xr-x 1 root bin 183928 May 23 04:44 libobk.so64.1

Add the link if necessary. In this example the server is a 64-bit SPARC server, so we will use the "so64" version of the library. If you are not sure which one to use, verify with the System Administrator of the server.

# ln -s libobk.so64.1 libobk.so

Then verify the link again

# ls -l libobk*

lrwxrwxrwx 1 root root 13 Aug 27 10:37 libobk.so -> libobk.so64.1

-r-xr-xr-x 1 root bin 112780 May 23 04:44 libobk.so.1

-r-xr-xr-x 1 root bin 183928 May 23 04:44 libobk.so64.1

Oracle Link

Switch User to the oracle account:

# su - oracle

It is important to set the environment variables to those of the instance you are configuring.

$ . ~oracle/.ora_<dbname>

Environment variables set for <dbname>

Change directories to the library where the link is to be located.

$ cd $ORACLE_HOME/lib    # or $ORACLE_HOME/lib64, depending on the platform

Verify the link is there:

$ ls -l libobk*

libobk*: No such file or directory

If the link is not there then you will need to add it (if it exists already move it and create a new link):

$ ln -s /usr/openv/netbackup/bin/libobk.so libobk.so

$ ls -l libobk*

lrwxrwxrwx 1 oracle oinstall 34 Aug 27 10:28 libobk.so -> /usr/openv/netbackup/bin/libobk.so

If you have multiple home directories for different oracle instances, you must create links in each home directory.
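When there are several Oracle homes, checking each link by hand gets tedious. The following is a minimal sketch, not part of the NetBackup or Oracle tooling, of a helper that reports whether a lib directory contains a libobk.so symlink pointing at the expected target. The function name check_libobk_link and its output strings are hypothetical; the paths mirror the manual steps above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with NetBackup or Oracle).
# Reports whether <lib_dir>/libobk.so is a symlink to <expected>.
# Prints "ok", "missing", or "wrong: <actual target>"; returns 0 only for "ok".
check_libobk_link() {
    lib_dir=$1     # e.g. "$ORACLE_HOME/lib"
    expected=$2    # e.g. /usr/openv/netbackup/bin/libobk.so
    link="$lib_dir/libobk.so"

    # Anything that is not a symbolic link is reported as missing.
    if [ ! -L "$link" ]; then
        echo "missing"
        return 1
    fi

    # Read the link target from "ls -l" output (everything after " -> ").
    actual=$(ls -ld "$link" | sed 's/.* -> //')
    if [ "$actual" = "$expected" ]; then
        echo "ok"
        return 0
    fi
    echo "wrong: $actual"
    return 1
}
```

After setting the environment for an instance, it could be called as: check_libobk_link "$ORACLE_HOME/lib" /usr/openv/netbackup/bin/libobk.so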


At this point the client installation and configuration is complete. To get backup jobs running and working properly, additional configuration within NetBackup and RMAN is needed.

Configuration

RMAN configuration

Update Catalog

The RMAN catalog must first be updated to recognize the new database that requires backups:

rman target / nocatalog

RMAN> show all;

RMAN configuration parameters are:

CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default

CONFIGURE BACKUP OPTIMIZATION OFF; # default

CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default

CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default

CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE MAXSETSIZE TO UNLIMITED; # default

CONFIGURE ENCRYPTION FOR DATABASE OFF; # default

CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default

CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/usr/oracle/product/10.2.0/dbs/snapcf_<dbname>.f'; # default

For Controlfile Autobackup to work in DDC (it needs to be set to ‘ON’), additional configuration must be done within NetBackup because of the additional security requirements in that environment.
However, this additional configuration may cause slowness because of the way RMAN issues a "bplist" command following a control file backup, and the way NetBackup handles such requests in clustered environments (or those using VIPs).
Using certain settings in a bp.conf file that lives in Oracle’s home directory may improve performance in certain situations.

Create Backup Directories

The RMAN shell script requires that certain directories exist for logging and temporary files:

ls -lart /XXXXX/dba/xxxxx/backups

mkdir /XXXXX/dba/xxxxx/backups/<SID/DB-name>

mkdir /XXXXXX/dba/xxxxx/backups/<SID/DB-name>/log
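The two mkdir calls above can be folded into one idempotent step. This is only a sketch: make_backup_dirs is a hypothetical name, and the base directory is passed in as an argument because the real path is site specific.

```shell
#!/bin/sh
# Hypothetical helper: create the per-database backup and log directories.
#   $1 = the site-specific backups directory
#   $2 = the database SID or name
make_backup_dirs() {
    base=$1
    sid=$2
    # -p creates <base>/<sid> and <base>/<sid>/log (plus any missing
    # parents) in one step and does not fail if they already exist.
    mkdir -p "$base/$sid/log"
}
```

Because of the -p option, the helper can be re-run safely when a database is added to a server that already hosts others.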

Add DB Channels Entry

This file defines how many channels RMAN will allocate for a backup. Each channel becomes a separate child job in NetBackup.

vi /XXXXX/dba/xxxxx/db_channels

The syntax of this file is simply the Database Name (or SID) followed by a tab, and then a number.

dbname1 3

dbname12 12

dbname3 1

testdb1 8

See section on Maximum Jobs for possible impacts of these settings.
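How the backup script actually reads this file is not shown here, so the following is only a sketch of one way to look up the channel count for a database. The function name lookup_channels is hypothetical. It matches the first field exactly, which matters because a plain grep for "dbname1" would also match "dbname12".

```shell
#!/bin/sh
# Hypothetical helper: print the channel count for a database listed in a
# db_channels-style file (whitespace-separated: name, then a number).
# Prints nothing and returns 1 if the database is not listed.
lookup_channels() {
    file=$1
    db=$2
    awk -v db="$db" '
        $1 == db { print $2; found = 1; exit }   # exact match on the name
        END { exit !found }                      # status 1 when not found
    ' "$file"
}
```

With the sample file above, lookup_channels /XXXXX/dba/xxxxx/db_channels dbname12 would print 12.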

Client add Request

Create a ticket to set up the NetBackup policies and Control-M job for the backup. The Oracle DBAs should ensure everything is ready before submitting a client add ticket to the NetBackup team, and coordinate with the NetBackup team if a specific start time is necessary for backups.


NBU Configuration

General

For generic and universal NBU configurations see:

You will also want to follow the steps on setting up a clustered environment and ensure that basic file system backups and restores of the VIP work properly.
In particular, be sure to set up the altnames and run at least a one-time, long-term-retention backup for any other interfaces that can communicate with the master server.
This is especially important if Controlfile Autobackup is enabled.

NBU Oracle Policies

You will need to create two Oracle-specific policies, one for "Hot" backups and one for "Arch" backups. Their names must contain the strings "_Hot" and "_Arch" respectively, so that the Oracle/RMAN script knows which type of backup it is to perform.

  1. Create a NetBackup Policy with a type of "Oracle" and follow our Policy Naming and Policy Configuration Standards.
    It’s likely the necessary policies already exist, so be sure to determine if a new policy is needed before creating a new one.
  2. The "client" should be the VIP or "Backup Alias" that was set up previously, which is specific to the database being backed up and is associated with a backup NIC on the Oracle server.
  3. For the schedules, you’ll need a minimum of two for every retention/frequency of backup desired.
    You’ll need an additional one for each additional database.
    See page 81 of the NetBackup documentation for more details on the different schedule types.
    1. An "Application Backup" schedule is needed to allow Oracle to send data to NetBackup
      The Application Backup MUST have a "Start Window" or Oracle will not be allowed to send data to it.
      This start window must cover all possible times when backups could be sent to it.
    2. An "Automatic Backup" schedule is needed to allow NetBackup to run the Oracle/RMAN shell script.
      The "Automatic Backup" schedule should have an appendage on it that specifies the database SID name.
      This is also where you specify the type of backup, but this merely changes certain variables sent to the Oracle/RMAN script and can be overridden by the script if desired.
      The retention setting for this schedule has no impact on the actual retention of the data being backed up.
      (It would serve other uses for the NetBackup scheduler if we were using it instead of Control-m.)
    3. Be sure to also follow Schedules Standards for NetBackup
  4. The Backup Selection should only list one file. This file is to be the Oracle/RMAN script used to run the rman backup.
    Optionally you can specify a template which resides on the NetBackup Master Server; however these are not used in our environment.
    NetBackup runs the templates or scripts in the order in which they appear in the backup selections list (you can list more than one, but this is not necessary for our environment).
    For our environment this will be:

/admin/dba/scripts/netbackup/netbackup_rman.ksh

You will also need to create a third policy used for control file backups, which contains User Directed Schedules using the same "Application Backup" schedule names used in the Oracle policies. This third policy is used by the Oracle/RMAN script to perform user directed backups of certain key configuration files; often called "Control" files.

Again, follow Policy Naming and Policy Configuration Standards for NetBackup.
It’s likely the necessary policies already exist, so be sure to determine if a new policy is needed before creating a new one.

ora-SID_TABLE

Both the NetBackup wrapper script and the Oracle RMAN shell script use a configuration file to determine which NetBackup policies can be used, and to associate the Oracle SID (DB-name) with its related client name as used within NetBackup.

To ensure accessibility from all UNIX systems, this file currently lives at:

/xxxxx/sys/xxxxx/ora-SID_TABLE

An example of this file’s formatting can be seen here:

#SID{space} |Host/Client Name |Hot Policy |Cold Policy |Logs Policy |Arch/User Policy

<dbname> |vorc<dbname>bkup.domain.com |LOC#D_Ora_Hot |LOC#D_Ora_Hot |LOC#D_Ora_Arch |LOC#D_Ora_Ctl

This location is used because it is shared via NFS among all Unix servers, which guarantees that all Oracle and NetBackup servers can access it.

The location is used for all NetBackup related scripts and configurations that must be accessible from the client; however, this could be changed as needed.
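The scripts that consume this table are not reproduced here, so the following is only a sketch of how a pipe-delimited table in the format shown above could be queried. The function name sid_table_field is hypothetical. It assumes comment lines begin with "#" in the first column and that fields may be padded with spaces or tabs.

```shell
#!/bin/sh
# Hypothetical helper: print one field for a SID from an
# ora-SID_TABLE-style file.
#   $1 = table file, $2 = SID, $3 = field number
#   fields: 2=client, 3=hot, 4=cold, 5=logs, 6=arch/user policy
# Returns 1 if the SID is not listed.
sid_table_field() {
    file=$1
    sid=$2
    field=$3
    awk -F'|' -v sid="$sid" -v n="$field" '
        /^#/ { next }                          # skip header/comment lines
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the SID
            if (key == sid) {
                val = $n
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
                found = 1
                exit
            }
        }
        END { exit !found }
    ' "$file"
}
```

For example, sid_table_field /xxxxx/sys/xxxxx/ora-SID_TABLE <dbname> 2 would print the NetBackup client name for that database.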

Note: Remember to also add the client name using the Procedures For Client Altnames Configuration.

Test Backup

At this point backups should work, and a test run should be performed.

This backup can be done either:

  1. Via the RMAN shell script
    Full backup test:

/XXXXX/dba/xxxxx/netbackup_rman.ksh <db-name> hot full <nb_Server> <nb_client>

Archive log backups test:

/XXXXX/dba/xxxxx/netbackup_rman.ksh <db-name> arch full <nb_Server> <nb_client>

  2. From the Control-m wrapper script links created in the next step.
  3. From the NetBackup Policy Manager.
    Be sure to select the correct client, and the appropriate Automatic Backup Schedule.

Control-m Scheduling

Once the backup policy, schedules, and Oracle RMAN configurations are in place, the NetBackup admin submits a ticket to the Control-m team to have new scheduling jobs created.

Control-m is used to schedule all NetBackup jobs, so that the Operations staff can monitor them for us, and to enable automated ITSM ticket creation for all job failures.

The key to making this all work is a Control-m wrapper script for NetBackup.

Be sure to specify the NetBackup schedule name that contains the Oracle SID. Do NOT use schedules without an Oracle SID appended, as those schedules will not run the backup.

You must also specify the client name at the end of the control-m command to be used, so that the backup will still work properly if for some reason the NetBackup server cannot access the ora-SID_TABLE file.

Control-m Jobs

These jobs can be viewed from within the Control-m Enterprise Manager GUI for the current day’s load, or via the Control-m Desktop for the overall jobs configuration.

The timing of these jobs is critical, so after adding a new set of backup jobs it’s a good idea to verify that they run when expected.

Other Considerations

Maximum Jobs

Maximum Jobs per client setting within NetBackup can limit the number of active DBchannel/streams.

There’s also a setting on each NetBackup Policy that limits the number of Active jobs for that given policy at any given time.

NetBackup also has a finite number of resources configured and available for use.

Any parallel/simultaneous backups coming from Oracle that exceed these settings or the available resources will be put into a Queued state.

It’s important that the DB-channels are set high enough to allow for good backup performance, but not so high that they consume all available NetBackup resources or cause heavy loads on the Oracle server.

Strategies

Oracle backups are done in two stages:

  1. Hot Database backup
    Backup of the Database files while the database is in a Hot-Backup mode.
    “Daily” hot backups can be "Full" or "Incremental", but "Weekly" and "Monthly" backups should be a "Full" backup.
    This is determined by setting the “Automatic” schedule type within the oracle policy’s schedules that is preceded by “_SID”.
    They can be “Automatic Full Backup” or “Automatic Incremental Backup”.
    All Hot backups should be immediately followed by an Archive log backup with the same retention length. This is accomplished through dependencies within the control-m jobs scheduling.
  2. Archive log backup
    Backup of the logs created while the Hot backup was running with the same retention as the associated Hot backup
    Also performed every 4 – 6 hours as a method for cleaning up disk space used by the Archive logs.
    These Schedules are to be set as “Automatic Full Backup” regardless of retention.
    Hourly backups are only done for Archive log backups at a maximum frequency of every 4 hours.

For control-m scheduling setup see the Oracle Jobs spreadsheet: <location>.

Troubleshooting

The status for an RMAN operation is stored in the RMAN catalog or in the database control file. This is the same status that is indicated by the output of the RMAN command used to run the backup or restore. This is the only status that a database administrator must check to verify that a backup or restore has been successful.

NetBackup also logs status, but only for its own part of the operation. You cannot use the NetBackup status to definitively determine whether rman was successful. Errors can occur in rman that do not affect NetBackup and are not recorded in NetBackup’s logs. For NetBackup to report a failure, the RMAN shell script has to determine that there was a failure and report this back to NetBackup by exiting with a non-zero status.
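The RMAN shell script itself is not reproduced here, so the following is only a sketch of how such a wrapper might decide the status it hands back to NetBackup. The function name backup_status is hypothetical. It assumes the wrapper has captured rman's exit code and written a log for the run, and it treats the RMAN-00569 "ERROR MESSAGE STACK FOLLOWS" banner as the sign of a failure.

```shell
#!/bin/sh
# Hypothetical helper: decide the exit status a wrapper script should
# hand back to NetBackup.
#   $1 = exit code returned by the rman command
#   $2 = path to the RMAN log written for this run
# Returns 0 only when rman exited cleanly and the log has no error stack.
backup_status() {
    rman_rc=$1
    log=$2

    if [ "$rman_rc" -ne 0 ]; then
        return 1
    fi
    # RMAN-00569 is the "ERROR MESSAGE STACK FOLLOWS" banner line.
    if [ -f "$log" ] && grep 'RMAN-00569' "$log" >/dev/null; then
        return 1
    fi
    return 0
}
```

A wrapper would call this after rman finishes, for example backup_status "$rc" "$log" || exit 1, so that NetBackup sees a non-zero status whenever RMAN reported an error.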

Backup Logs

Logs commonly used for troubleshooting Oracle backups include:

  • RMAN shell script log

To identify the correct RMAN shell script log file:

1. Log in to the NBU master server, or the Oracle server, for the environment where the failure occurred.

2. Run “cd /XXXXX/dba/xxxxxx/backups/<DATABASE-SID-NAME>/log” with the correct database name in the command.

3. Run “ls -lrt”; the most recently updated log files are listed last.

4. Identify the latest log file for the backup type you are searching for: HOT_full, hot_incr, arch_hourly, and so on.
These log files should not only have a last-updated date close to when the backup finished, but will also contain in their names the date and time when the backup started.

5. Run a “more <logfilename>” for the correct backup log.
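Steps 2 to 4 can be sketched as a small helper. This is an illustration only: latest_backup_log is a hypothetical name, and it assumes the backup type string (for example hot_incr) appears somewhere in the log file name, as the list above suggests.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified log in a directory
# whose file name contains the given backup type.
#   $1 = log directory, $2 = backup type string (e.g. hot_incr)
# Prints nothing if no log matches.
latest_backup_log() {
    log_dir=$1
    backup_type=$2
    # ls -t sorts newest first; head keeps only the first entry.
    ls -t "$log_dir"/*"$backup_type"* 2>/dev/null | head -n 1
}
```

The result can be passed straight to more, for example: more "$(latest_backup_log /XXXXX/dba/xxxxxx/backups/<DATABASE-SID-NAME>/log hot_incr)"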

If an appropriate RMAN log does not exist from the time of the failure, either NetBackup was not able to run the RMAN script or the RMAN script failed to create a log. The Oracle DBAs may need to verify that the RMAN script runs properly; if it does, a next step may be to investigate why NetBackup could not contact the client to run the script.

RMAN output is technical and requires knowledge of RMAN to interpret.

RMAN Logs

RMAN uses a command language interpreter, and it can be run in interactive or batch mode. You can use the following syntax to specify a log file on the command line to record significant RMAN actions:

msglog 'logfile_name'

RMAN Queries

To find the RMAN backup progress

SELECT sid, serial#, context, sofar, totalwork,

round(sofar/totalwork*100,2) "% Complete"

FROM v$session_longops WHERE opname LIKE 'RMAN%'

AND opname NOT LIKE '%aggregate%'

AND totalwork != 0

AND sofar <> totalwork

/

To find RMAN session info

SELECT sid, spid, client_info

FROM v$process p, v$session s

WHERE p.addr = s.paddr

AND client_info LIKE '%rman%';

REM RMAN Progress

alter session set nls_date_format='dd/mm/yy hh24:mi:ss'

/

select SID, START_TIME,TOTALWORK, sofar, (sofar/totalwork) * 100 done,

sysdate + TIME_REMAINING/3600/24 end_at

from v$session_longops

where totalwork > sofar

AND opname NOT LIKE '%aggregate%'

AND opname like 'RMAN%'

/

REM RMAN waits

set lines 120

column sid format 9999

column spid format 99999

column client_info format a25

column event format a30

column secs format 9999

SELECT SID, SPID, CLIENT_INFO, event, seconds_in_wait secs, p1, p2, p3

FROM V$PROCESS p, V$SESSION s

WHERE p.ADDR = s.PADDR

and CLIENT_INFO like 'rman channel=%'

/

Useful RMAN LIST commands (run at the RMAN prompt):

LIST BACKUP;

LIST BACKUP OF DATABASE;

LIST BACKUP SUMMARY;

LIST INCARNATION;

LIST BACKUP BY FILE;

LIST COPY OF DATABASE ARCHIVELOG ALL;

LIST COPY OF DATAFILE 1, 2, 3;

LIST BACKUP OF DATAFILE 11 SUMMARY;

LIST BACKUP OF ARCHIVELOG FROM SEQUENCE 1437;

LIST CONTROLFILECOPY "/tmp/cntrlfile.copy";

LIST BACKUPSET OF DATAFILE 1;

NBU Logs

More in-depth logging for NetBackup can be found in directories under /usr/openv/netbackup/logs/.
Some of these directories are specific to the client, media server, and/or master server, and will only produce logs on their respective servers.

The information in these log files can help you troubleshoot problems that occur outside of either the database agent or RMAN. Your best sources for Oracle or RMAN error information are the logs provided by Oracle.

Generally, each debug log corresponds to a NetBackup process and executable. However, for an RMAN backup, the debug log is created in the dbclient directory, which has no corresponding executable.

For more details see page 182 of the NetBackup documentation. For more troubleshooting information on which logs to check, see pages 186 to 188.

To enable these logs, simply create the following directories (under /usr/openv/netbackup/logs/) on their respective systems:

  • On the Client (Oracle Server):

bpdbsbora

bporaexp (or bporaexp64)

bporaimp (or bporaimp64)

dbclient

bphdb (may contain RMAN shell script output)

bpcd

  • On the Master Server:

bprd

bpdbm

  • On the Media Server:

bpbrm

bptm

  • Other Logs that may also be useful on the Client/Oracle Server:

user_ops/oracle

user_ops/dbext

These directories must have "777" permissions. The VERBOSE level may need to be set to 5 (set within the bp.conf configuration file) in order to see enough details as to why a failure is occurring.
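Creating the directories and setting their permissions can be scripted. The following is a minimal sketch; create_nbu_log_dirs is a hypothetical name, and the list of directories to pass depends on whether the host is a client, master server, or media server, as listed above.

```shell
#!/bin/sh
# Hypothetical helper: create NetBackup debug log directories with 777
# permissions.
#   $1   = logs base directory (e.g. /usr/openv/netbackup/logs)
#   $2.. = directory names to create under it
create_nbu_log_dirs() {
    base=$1
    shift
    for name in "$@"; do
        # mkdir -p also creates a missing parent such as user_ops;
        # only the named leaf directory is set to 777 here.
        mkdir -p "$base/$name" || return 1
        chmod 777 "$base/$name" || return 1
    done
}
```

On an Oracle client this might be run as root as: create_nbu_log_dirs /usr/openv/netbackup/logs bpdbsbora dbclient bphdb bpcd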

On the Master Server, you may also need to enable logging for the nbpem, nbjm, and nbrb scheduling processes, which use unified logging. NetBackup writes unified logs to /usr/openv/logs. You do not need to create log directories for processes that use unified logging, but rather use vxlogview and vxlogcfg commands to manage them. See the NetBackup Troubleshooting Guide for more information.

NBU Recommended Steps

For more details see page 180 of the NetBackup documentation.

1. Verifying your installation

Ensure that the following NetBackup for Oracle binaries exist.
These are located in /usr/openv/netbackup/bin

· bphdb binary resides on the client and is used by both the NetBackup scheduler and the graphical interface to start backups.
The main purpose of bphdb is to run an Oracle template or shell script that in turn calls rman, bporaexp, or bporaimp.

· libobk is a shared library module that contains functions callable by RMAN. The name of this binary depends on the operating system.
This library is loaded when RMAN is started, so Oracle must be restarted if the library is refreshed or reinstalled.
Also see the section on linking to this library.

2. For XML export/import, verify that the bporaexp and bporaimp binaries exist along with the appropriate libbpora library (found under /usr/openv/lib/).

3. For the Backup, Archive, and Restore interface, verify that the following binaries exist.

· /usr/openv/netbackup/bin/bpdbsbora

· /usr/openv/netbackup/bin/bpubsora

· /usr/openv/lib/libdbsbrman.so

· /usr/openv/lib/libnbberman.so

4. Check that both the NetBackup server and client software are working properly. That is, check that normal operating system files can be backed up and restored from the client. It’s best if the NetBackup client is running the same version of software as the NetBackup server.

6. Check the NetBackup JAC’s Activity Monitor for error codes related to the oracle backup.

Keep in mind that there are two different schedules in use, and that the "Application Backup" schedule can have multiple streams (or backup jobs) per a single "Automatic Backup" schedule.
If any one of the "Application Backup" (or child) jobs fails, it will trigger the Oracle RMAN script to finish with a non-zero status and thus cause NetBackup to consider the parent job a failure (usually indicated by an error code of 6).
You must look for the error code from the "Application Backup" schedule jobs, unless they do not appear to have run, in which case you should focus on the "Automatic Backup" schedule Error Code.

In this picture, you can see an example of how the highlighted jobs are all part of the same backup.

In this example the schedule named Hourly_<dbname> is the "Automatic Backup" (or parent) job, and the schedules named Hourly are the "Application Backup" (or child) jobs.

Depending on the error you get from jobs in the activity monitor you may want to check logs for additional clues.

7. If Backup jobs do not make it to the Activity monitor, you’ll need to check logs instead.

You may also need to check Control-m to ensure the schedules are set up correctly, and that Control-m is running them as expected.
You should not run more than one backup job on a given database at a time, as this can cause them to fail.
Control-m should have resource constraints in place to prevent more than one backup from running against a given database at a time
(there are other ways to run these backups that Control-m has no control over).

RMAN Recommended Steps

For more details see page 186 of the NetBackup documentation.

On the Oracle side, an error can be from RMAN or from the target database.

  1. First try to run rman from the command line, rather than via NetBackup or the RMAN shell script.
    You may even try to run an RMAN backup directly to disk so that nothing related to NetBackup is used.
  2. Check the logs under /usr/openv/netbackup/logs/dbclient.
    If a current log exists, rman is communicating with NetBackup. If not, there may be a problem with how the libraries are linked, or the directory may not have the correct permissions to allow Oracle to log to it.
  3. If the backup worked via rman but not from the shell script, the RMAN shell script should be checked for any errors that may have occurred during its execution.
    Its output is logged to:

/XXXXX/dba/xxxxx/backups/<DATABASE-SID-NAME>/log

If the error(s) does not make it to this log or the log is missing, you may see information sent to Standard out or Standard Error.
If the script was run by NetBackup, the Standard out and Standard error get logged in files found in the /usr/openv/netbackup/logs/bphdb directory on the Oracle server.

  4. If rman has issues communicating with the database, you’ll need to have an Oracle DBA check that the database is set up correctly.
  5. If communications appear to be working fine, then you’ll first want to check the NetBackup Activity Monitor to see if it suggests any obvious errors or reasons for failure.
    You may also need to check the other NetBackup logs for reasons why the backup may have failed.
    If the Problem appears to be with NetBackup, you’ll need to contact a NetBackup Admin to have it addressed.

Common RMAN error messages

If you see output from RMAN similar to the following, it’s likely because NetBackup and RMAN are not properly linked:

RMAN-00571:============================================================

RMAN-00569:=============== ERROR MESSAGE STACK FOLLOWS: ===============

RMAN-00571:============================================================

RMAN-03009: failure of allocate commands on t1 channel at 05/11/2005 09:29:37

ORA-19554: error allocating device, device type: SBT_TAPE, device name:

ORA-27211: Failed to load Media Management Library

Additional information: 25

If you see a "Server Status:" message at the end of the "Error Message Stack", this is typically an issue related to the NetBackup Media Server generating an error. The actual status message will vary, but is usually the same message you would get for the related error code.

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on a1 channel at 06/09/2011 09:02:34

ORA-19506: failed to create sequential file, name="PRDCNDW1_arch_a3meehcq_1_1_753354138", parms=""

ORA-27028: skgfqcre: sbtbackup returned error

ORA-19511: Error received from media manager layer, error text:

VxBSACreateObject: Failed with error:

Server Status: ????????????????????????????
