BNFD - Your DBA NoteBook On The Net!!!
Showing posts with label Backup Recovery. Show all posts
Showing posts with label Backup Recovery. Show all posts

Saturday, February 9, 2008

What Backup policies your Company Need?

A brief overview of Backup policy Matrics and how you can take advantage of latest Tools and Technologies for safeguarding your data

Ten Best Recovery Scenarios I have ever faced!!!!

Today I am sharing the most difficult and challenging assignments I have experienced in my past jobs - The data recovery scenarios.

You know what , these are the best times in my job profile I would like to tell you today.

Data , the most important and valuable data of any organization when need to be required from your back up sets , then even experienced dbas find it most difficult to get what the companies want.

Tuesday, February 5, 2008

Myths and Realities about Recovery

Last septmber I was asked if we a complete recovery is possible if we donot have a pcf(present control file) and on lin redo log file. - Its a myth - We cannot do a complete recovery with out a pcf or olrdf(online redo log file)

Why ?

Saturday, January 26, 2008

FAQ about Backup & recovery

difference between restoring and recovering?
Restoring involves copying backup files from secondary storage (backup media) to disk. This can be done to replace damaged files or to copy/move a database to a new location.
Recovery is the process of applying redo logs to the database to roll it forward. One can roll-forward until a specific point-in-time (before the disaster occurred), or roll-forward until the last transaction recorded in the log files.

difference between online and offline backups?

A hot (or on-line) backup is a backup performed while the database is open and available for use (read and write activity). Except for Oracle exports, one can only do on-line backups when the database is ARCHIVELOG mode.
A cold (or off-line) backup is a backup performed while the database is off-line and unavailable to its users. Cold backups can be taken regardless if the database is in ARCHIVELOG or NOARCHIVELOG mode.
It is easier to restore from off-line backups as no recovery (from archived logs) would be required to make the database consistent. Nevertheless, on-line backups are less disruptive and doesn't require database downtime.
Point-in-time recovery (regardless if you do on-line or off-line backups) is only available when the database is in ARCHIVELOG mode.

strategies are available for backing-up an Oracle database?
The following methods are valid for backing-up an Oracle database:
Export/Import - Exports are "logical" database backups in that they extract logical definitions and data from the database to a file.
Cold or Off-line Backups - Shut the database down and backup up ALL data, log, and control files.
Hot or On-line Backups - If the database is available and in ARCHIVELOG mode, set the tablespaces into backup mode and backup their files. Also remember to backup the control files and archived redo log files.
RMAN Backups - While the database is off-line or on-line, use the "rman" utility to backup the database.

But Regardless of strategy, take backup of all required software libraries, parameter files, password files, etc. If your database is in ARCHIVELOG mode, you also need to backup archived log files.


If Someone lost an archived/online REDO LOG file, How to restoreDB back?
The following INIT.ORA/SPFILE parameter can be used if your current redologs are corrupted or blown away. It may also be handy if you do database recovery and one of the archived log files are missing and cannot be restored.

_allow_resetlogs_corruption = true

This should allow you to open the database. However, after using this parameter your database will be inconsistent (some committed transactions may be lost or partially applied).
Steps:
--Do a "SHUTDOWN NORMAL" of the database
--Set the above parameter
--Do a "STARTUP MOUNT" and "ALTER DATATBASE OPEN RESETLOGS;"
--If the database asks for recovery, use an UNTIL CANCEL type recovery and apply all available archive and on-line redo logs, then issue CANCEL and reissue the "ALTER DATATBASE OPEN RESETLOGS;" command.
--Wait a couple of minutes for Oracle to sort itself out
--Do a "SHUTDOWN NORMAL"
--Remove the above parameter!
--Do a database "STARTUP" and check your ALERT.LOG file for errors.
--Extract the data and rebuild the entire database

BASICS OF DATABASE BACKUP AND RECOVERY

A backup is a representative copy of data. This copy can include important parts of a database such as the control file, redo logs, and datafiles. A backup protects data from application error and acts as a safeguard against unexpected data loss, by providing a way to restore original data. Backups are divided into physical backups and logical backups. Physical backups are copies of physical database files. The phrase "backup and recovery" usually refers to the transfer of copied files from one location to another, along with the various operations performed on these files.

In contrast, logical backups contain data that is exported using SQL commands and stored in a binary file. Oracle records both committed and uncommitted changes in redo log buffers. Logical backups are used to supplement physical backups. Restoring a physical backup means reconstructing it and making it available to the Oracle server. To recover a restored backup, data is updated using redo records from the transaction log. The transaction log records changes made to the database after the backup was taken.

Oracle performs crash recovery and instance recovery automatically after an instance failure. In the case of media failure, a database administrator (DBA) must initiate a recovery operation. Recovering a backup involves two distinct operations: rolling the backup forward to a more recent time by applying redo data, and rolling back all changes made in uncommitted transactions to their original state. In general, recovery refers to the various operations involved in restoring, rolling forward, and rolling back a backup. Backup and recovery refers to the various strategies and operations involved in protecting the database against data loss and reconstructing the database should a loss occur.

BACKUP AND RECOVERY OPERATIONS:
A backup is a snapshot of a datafile, tablespace, or database at a certain time. If periodic backups of the database have been made and data is lost, users can apply the stored redo information to their latest backup to make the database current again. Oracle enables users to restore an older backup and apply only some redo data, thereby recovering the database to an earlier point in time. This type of recovery is called incomplete media recovery. If the backup was consistent, then users are not required to apply any redo data, at all.

A simple example of media recovery illustrates the concept. Suppose a user makes a backup of the database at noon. Starting at noon, one change to the database is made every minute. At 1 p.m. one of the disk drives fails, causing the loss of all data on that disk. Fortunately, Oracle records all changes in the redo log. The user can then restore the noon backup onto a functional disk drive and use redo data to recover the database to 1 p.m., reconstructing the lost changes.
ELEMENTS OF A BACKUP AND RECOVERY STRATEGY

Although backup and recovery operations can be intricate and vary from one business to another, the basic principles follow these four simple steps:

1.Multiplex the online redo logs
2.Run the database in ARCHIVELOG mode and archive redo logs to multiple locations
3.Maintain multiple concurrent backups of the control file
4.Take frequent backups of physical datafiles and store them in a safe place, making multiple copies if possible

As long as users have backups of the database and archive redo logs in safe storage, the original database can be recreated.
KEY DATA STRUCTURES FOR BACKUP AND RECOVERY Before users begin to think seriously about backup and recovery strategy, the physical data structures relevant for backup and recovery operations must be identified. This section discusses the following physical data structures:

Datafiles
Controlfiles
Online Redo Log Files
Archieved Redo Log Files
Automatic managed Undo

ARCHIVED REDO LOG FILES
Archived log files are redo logs that Oracle has filled with redo entries, rendered inactive, and copied to one or more log archive destinations. Oracle can be run in either of two modes:
ARCHIVELOG - Oracle archives the filled online redo log files before reusing them in the cycle.
NOARCHIVELOG - Oracle does not archive the filled online redo log files before reusing them in the cycle.
Running the database in ARCHIVELOG mode has the following benefits:
The database can be completely recovered from both instance and media failure.
The user can perform online backups, i.e., back up tablespaces while the database is open and available for use.
Archived redo logs can be transmitted and applied to the standby database
Oracle supports multiplexed archive logs to avoid any possible single point of failure on the archive logs.
The user has more recovery options, such as the ability to perform tablespace-point-in-time recovery (TSPITR)
Running the database in NOARCHIVELOG mode has the following consequences:
The user can only back up the database while it is completely closed after a clean shutdown.
Typically, the only media recovery option is to restore the whole database, which causes the loss of all transactions issued since the last backup.

Automatic Managed Undo
Every Oracle database must have a method of maintaining information that is used to roll back, or undo, changes to the database. Such information consists of records of the actions of transactions, primarily before they are committed. Oracle refers to these records collectively as undo. Historically, Oracle has used rollback segments to store undo. Space management for these rollback segments has proven to be quite complex. Oracle now offers another method of storing undo that eliminates the complexities of managing rollback segment space, and allows DBAs to exert control over how long undo is retained before being overwritten. This method uses an undo tablespace. Undo records are used to:
Roll back transactions when a ROLLBACK statement is issued
Recover the database
Provide read consistency
When a rollback statement is issued, undo records are used to undo changes that were made to the database by the uncommitted transaction. During database recovery, undo records are used to undo any uncommitted changes applied from the redo log to the datafiles. Undo records provide read consistency by maintaining the before image of the data for users who are accessing the data at the same time that another user is changing it.

Understanding Basic Backup
A backup strategy provides a safeguard against data loss. Answering the following questions can help database administrators develop a strong backup strategy:
What types of failures can occur?
What information should be backed up?
Which backup method should be used?
Should backups be made online or offline?
How often should backups be made?
How can dangerous backup techniques be avoided?

WHAT TYPES OF FAILURES CAN OCCUR?
Data loss can occur for various reasons. Here are some of the most common types of failures that can lead to data loss.

A statement failure is a logical failure in the handling statement in an Oracle program. For example, a user issues a statement that is not a valid SQL construction. When statement failure occurs, Oracle automatically undoes any effects of the statement and returns control to the user.
A process failure is a failure in a user process accessing Oracle, i.e., an abnormal disconnection or process termination. The failed user process cannot continue work, although Oracle and other user processes can. If the user process fails while modifying the database, Oracle background processes undo the effects of uncommitted transactions.

An instance failure is a problem that prevents an Oracle instance, i.e., the SGA and background processes, from continuing to function. Instance failure can result from a hardware problem such as a power outage, or a software problem such as an operating system crash. When an instance fails, Oracle does not write the data in the buffers of the SGA to the datafiles.

A user or application error is a user mistake that results in the loss of data. For example, a user can accidentally delete data from a payroll table. Such user errors can require the database or object to be recovered to a point in time before the error occurred. To allow recovery from user error and accommodate other unique recovery requirements, Oracle provides Flashback Technology.

A media failure is a physical problem that arises when Oracle tries to write or read a file that is required to operate the database. A common example is a disk head crash that causes the loss of all data on a disk drive. Disk failure can affect a variety of files, including datafiles, redo log files, and control files. Because the database instance cannot continue to function properly, it cannot write the data in the database buffers of the SGA to the datafiles.

Configuration Files Configuration files may consist of spfile or init.ora, password file, tnsnames.ora, and sqlnet.ora. Since these files do not change often, then they require a less frequent backup schedule. If you lost a configuration file it can be easily recreated manually. When restore time is a premium, it will be faster to restore a backup of the configuration file then manually creating a file with a specific format.

WHICH BACKUP METHOD SHOULD BE USED?
Oracle provides users a choice of several basic methods for making backups. The methods include:
Recovery Manager (RMAN) - A component that establishes a connection with a server process and automates the movement of data for backup and recovery operations.
Oracle Enterprise Manager - A GUI interface that invokes Recovery Manager.
Oracle Data Pump - The utility makes logical backups by writing data from an Oracle database to operating system files in a proprietary format. This data can later be imported into a database.
User Managed - The database is backed up manually by executing commands specific to the user's operating system.


UNDERSTANDING BASIC RECOVERY STRATEGY

Basic recovery involves two parts: restoring a physical backup and then updating it with the changes made to the database since the last backup. The most important aspect of recovery is making sure all data files are consistent with respect to the same point in time. Oracle has integrity checks that prevent the user from opening the database until all data files are consistent with one another. When preparing a recovery strategy, it is critical to understand the answers to these questions:
How does recovery work?
What are the types of recovery?
Which recovery method should be used?

HOW DOES RECOVERY WORK?
In every type of recovery, Oracle sequentially applies redo data to data blocks. Oracle uses information in the control file and datafile headers to ascertain whether recovery is necessary. Recovery has two parts: rolling forward and rolling back. When Oracle rolls forward, it applies redo records to the corresponding data blocks. Oracle systematically goes through the redo log to determine which changes it needs to apply to which blocks, and then changes the blocks. For example, if a user adds a row to a table, but the server crashes before it can save the change to disk, Oracle can use the redo record for this transaction to update the data block to reflect the new row.
Once Oracle has completed the rolling forward stage, the Oracle database can be opened. The rollback phase begins after the database is open. The rollback information is stored in transaction tables. Oracle searches through the table for uncommitted transactions, undoing any that it finds. For example, if the user never committed the SQL statement that added the row, then Oracle will discover this fact in a transaction table and undo the change.

WHAT ARE THE TYPES OF RECOVERY?
There are three basic types of recovery: instance recovery, crash recovery, and media recovery. Oracle performs the first two types of recovery automatically at instance startup. Only media recovery requires the user to issue commands.

An instance recovery, which is only possible in an Oracle Real Applications Cluster configuration, occurs in an open database when one instance discovers that another instance has crashed. A surviving instance automatically uses the redo log to recover the committed data in the database buffers that was lost when the instance failed.Oracle also undoes any transactions that were in progress on the failed instance when it crashed, then clears any locks held by the crashed instance after recovery is complete.

A crash recovery occurs when either a single-instance database crashes or all instances of a multi-instance database crash. In crash recovery, an instance must first open the database and then execute recovery operations. In general, the first instance to open the database after a crash or SHUTDOWN ABORT automatically performs crash recovery.

Unlike crash and instance recovery, a media recovery is executed on the user's command, usually in response to media failure. In media recovery, online or archived redo logs can be used to make a restored backup current or to update it to a specific point in time. Media recovery can restore the whole database, a tablespace or a datafile and recover them to a specified time. Whenever redo logs are used or a database is recovered to some non-current time, media recovery is being performed.
A restored backup can always be used to perform the recovery. The principal division in media recovery is between complete and incomplete recovery. Complete recovery involves using redo data combined with a backup of a database, tablespace, or datafile to update it to the most current point in time. It is called complete because Oracle applies all of the redo changes to the backup. Typically, media recovery is performed after a media failure damages datafiles or the control file.

Recovery Options
If the user does not completely recover the database to the most current time, Oracle must be instructed how far to recover. The user can perform:

1. Tablespace point-in-time recovery (TSPITR), which enables users to recover one or more tablespaces to a point-in-time that is different from the rest of the database.

2.Time-based recovery, also called point-in-time recovery (PITR), which recovers the data up to a specified point in time.

3.Cancel-based recovery, which recovers until the CANCEL command is issued.

4.Change-based recovery or log sequence recovery. If O/S commands are used, change-based recovery recovers up to a specified SCN in the redo record.

5.Flashback from human error

WHICH RECOVERY METHOD SHOULD BE USED?

Users have a choice between two basic methods for recovering physical files. They can:

1.Use Recovery Manager to automate recovery.
2.Execute SQL commands.

Recovering with Recovery Manager
The basic RMAN commands are RESTORE and RECOVER. RMAN can be used to restore datafiles from backup sets or image copes, either to their current location or to a new location. If any archived redo logs are required to complete the recovery operation, RMAN automatically restores and applies them. In a recovery catalog, RMAN keeps a record containing all the essential information concerning every backup ever taken. If a recovery catalog is not used, RMAN uses the control file for necessary information. The RMAN RECOVER command can be used to perform complete media recovery and apply incremental backups, and to perform incomplete media recovery.

Recovering with SQL*Plus
Administrators can use the SQL*Plus utility at the command line to restore and perform media recovery on your files. Users can recover a database, tablespace, or datafile. Before performing recovery, users need to:
Determine which files to recover. Often the table V$RECOVER_FILE can be used.
Restore backups of files permanently damaged by media failure. If the user does not have a backup, recovery can still be performed if the user has the necessary redo log files and the control file contains the name of the damaged file.
If a file cannot be restored to its original location, then the user must relocate the restored file and inform the control file of the new location.
Restore necessary archived redo log files.