RECOVERY
========
This document will discuss the various scenarios relating to recovering an
Oracle Database from a mixture of datafiles and/or archived redo log-files.
Most of this information is available in the current DBA guide and the online
RECOVERY.DOC file. The location of the latter is OS dependant.
A degree of understanding of the Oracle Version 6 architecture is assumed.
Before we begin lets take a look at how Oracle manages and tracks the various
elements involved in the recovery process :
The CONTROL FILE
----------------
The Control file is in effect the central repository regarding the physical
layout of the database. More importantly,for our discussion, it also records
the SEQUENCE information related to the REDO LOG Files and their
relationship to the one or more DATABASE Files.
Whenever a LOG Switch occurs (i.e. the current on-line REDO LOG is full) the
control file is updated with the new SEQUENCE number - at the same time ALL
ON-LINE Database file headers are updated with the same SEQUENCE number.
There is one important exception - Database File Headers of any TABLESPACE in
BACKUP mode are not updated until the END BACKUP command is issued AND
a LOG Switch occurs.
For each Database File - the Control File also records - the last LOG SEQUENCE
Number applied to this Datafile.
The SEQUENCE Number of each REDO LOG File is held as part of the
REDO LOG HEADER. (see appendix A)
N.B. We are using the term LOG Switch above to indicate the occurence of a
CHECKPOINT. It is of course possible to have a CHECKPOINT occur
multiple times within a Log file depending on the INIT.ORA parameter
LOG_CHECKPOINT_INTERVAL =
however remains and most systems are set to CHECKPOINT when a
LOG Switch occurs.
Given the above information lets now take a look at a Database which we are
about to STARTUP - do some work - and then SHUTDOWN.
Assuming the following :
There are 2 on-line log files - SEQUENCE Numbers 10 and 11
(we establised this via the command ARCHIVE LOG LIST)
The Current On-Line LOG is sequence 11
The DATABASE was Shutdown cleanly
All the Database Files are on-line
At this point (before we startup) the Control File would record the value 11
as the being Current LOG SEQUENCE Number as well as the Last Log Sequence
Number applied to each Database File.
All the Database File Headers would have the same value for the LOG SEQUENCE
Number.
On Startup Oracle will compare the value of the LOG SEQUENCE Number in each
on-line Database File Header against those held in the Control File - if both
these values are the same no recovery is necessary.
The Control File is then updated and the LOG SEQUNCE Number held against
each Database File is set to INFINITY and the Database opened.
We now do some amount of work which results in,say 5, LOG Switches occuring
thus bringing the current on-line LOG SEQUENCE Number to 16. At each LOG
Switch the Control File and all on-line Database File Headers would be
updated to reflect this increment in the SEQUENCE Number.
CLEAN SHUTDOWN
--------------
The Database is now Shutdown cleanly - at this point the LOG SEQUENCE
Numbers against each on-line Database File,in the Control File, are updated
to 16 and the Database is Closed and Dismounted.
N.B. Both SHUTDOWN NORMAL | IMMEDIATE are considered CLEAN.
ABNORMAL SHUTDOWN
-----------------
If instead the Instance terminates abnormally - the LOG SEQUENCE Numbers held
against each Database File in the Control File would be left at INFINITY. When
we attempt to Startup - the SEQUENCE number check would fail and Oracle would
automatically Recover all on-line Database Files starting from the LOG SEQUENCE
Number held in the Datafile Headers upto the Current on-line LOG SEQUENCE
Number stored in the Control File - in our example - 16.
The mechanisms described above can now be applied to any form of Recovery we
wish to analyse.
TERMINOLOGY
-----------
Before taking a look at some Recovery scenarios - lets briefly look at some of
the terminology we will be using.
COLD BACKUP
-----------
A Cold Backup is assumed to be a copy of ALL Database,On-Line Redo-Log and
Control Files taken when the Database is Shutdown.
HOT BACKUP
----------
A Hot Backup is assumed to be a copy of one or more Datafiles belonging to
one or more Tablespaces which have been placed in Backup mode i.e the command
ALTER TABLESPACE
has been issued.
ARCHIVE LOGGING
---------------
Archive Logging is a mechanism whereby On-line Redo Logs are copied to an
Off-Line location (disk or tape) when they are filled. These Off-Line Logs
are exact replicas of the On-Line Logs. Oracle will not perform a Log Switch
operation if the Redo Log it would switch to has not been archived.
1. LOSS OF CURRENT ON-LINE REDO LOG FILE
----------------------------------------
Loss of the Curent on-line Redo Log File is a single-point of failure in
Oracle Version 6. This can occur due to media failure or user error. In either
case the Database cannot be started without some remedial action being taken.
The simplest course of action is to restore ALL files from the last Cold
Backup and start the Database. If this option is chosen the rest of this
discussion may be ignored.
Other options are discussed below :
The first step is to restore a Redo Log File with the same name and size as
the one that is missing. This can be done by either :
a) Copy of an Archived Redo-Log
b) Copy of another on-line Redo Log.
It is important in either case that the file size is identical and is part of
the same Database (the Database name is stored in the Redo Log Header).
If you cannot do either a) or b) the only other option is to add a new Log File
and take copies of it as required. This can be achieved by :
STARTUP MOUNT
CONNECT INTERNAL
ALTER DATABASE ADD LOGFILE '
OS copies of the new file can then be taken.
There are now 2 courses of action open to us :
a) Only available if running in ARCHIVELOG Mode.
Restore ONLY the Database Files from your last Cold/Hot Backup and Roll Forward
through till the last Archived Redo Log. This approach will guarantee the
consistency of the Database at the end of the recovery operation.
STARTUP MOUNT -> We are using the current Control File at this time
CONNECT INTERNAL
RECOVER DATABASE MANUAL
We will now be prompted for the Log File Sequence as determined by the
information the the Datafile Headers.
As each Log File is applied the Sequence Number in the Datafile Headers is
updated to reflect the fact.
Apply all Archived Redo Logs until you are prompted for one that does not exist:
e.g Enter name for archive log sequence number 72 ("CANCEL" to cancel):
N.B. It is possible that the Redo Log you are being prompted for has not yet
been archived - if you are unsure you can now enter the name of each
of the On-line Redo Log files. Oracle will reject them if they do not
have the correct SEQUENCE Number.
Enter CANCEL at this point
ALTER DATABASE OPEN RESETLOGS
This command first initializes all the On-line Redo Log Files and then resets
the SEQUENCE Number of the Current On-line Log to be that of that last
Archive Log File applied to the Database. The Database is then opened.
b) Using all the current files - 'Fake' Manual Recovery.
STARTUP MOUNT
CONNECT INTERNAL
RECOVER DATABASE MANUAL
When prompted for a Log File Sequence - enter the name of non-existant file
(JUNK is normally a good bet) - you will get a couple of informational
messages telling you that the file does not exist and asking you to re-enter.
Enter CANCEL at this point.
ALTER DATABASE OPEN RESETLOGS.
This will perform as above.
If the Instance had terminated abnormally the above steps will NOT guarantee
the consistency of the Database and the safest thing to do would be an
Export -> Re-create the Database -> Import.
The reason for the uncertainty is that the RESETLOGS option,as mentioned above,
initializes the On-line log files. In our case we did not have one but that is
immmaterial as there was potentially information in the Log which should have
been applied to the Database.
If however the Database was Shutdown cleanly there should be no problem.
A Cold Backup of the Database should be taken at this point.
2. LOSS OF ONE OR MORE DATA FILES
---------------------------------
Recovering from the loss of one or more Datafile(s) depends very much on the
File(s) in question.
The worst-case scenario is the loss of a file(s) from the SYSTEM Tablespace.
a) If the Database is not in ARCHIVELOG Mode you have NO choice except to return
to your last Cold Backup. In this instance you would restore ALL the files.
b) If you are running Archiving :
- Restore the lost/damaged File(s) ONLY from the last Cold/Hot Backup
- STARTUP MOUNT -> using current Control File
- CONNECT INTERNAL
- RECOVER DATABASE
You will be prompted for all necessary Archived Redo Log Files
- ALTER DATABASE OPEN
If the file(s) lost/damaged are from a non-SYSTEM Tablespace:
a) If the Database is not in ARCHIVELOG Mode you have 2 choices :
i) Restore ALL files from your last Cold Backup.
ii) STARTUP MOUNT -> using current Control File
CONNECT INTERNAL
ALTER DATABASE DATAFILE '
(The above statement may be issued for as many files as necessary
ALTER DATABASE OPEN
The Database is now running minus one or more files. It is strongly
recommended that at this stage you decide to either do a full EXPORT and
rebuild the database or take the Tablespace(s) containing the missing
Datafile(s) offline and drop them. This is to avoid Oracle attempting to
allocate space in the offline Datafile(s).
b) If the Database is in ARCHIVELOG Mode you have 2 choices :
i) Restore ONLY the missing/damaged File(s) from the last Cold/Hot Backup.
STARTUP MOUNT -> using current Control File
CONNECT INTERNAL
RECOVER DATABASE
You will be prompted for all the necessary Archive Log Files
ALTER DATABASE OPEN
ii) Restore ONLY the missing/damaged File(s) from the last Cold/Hot Backup
STARTUP MOUNT -> using current Control File
CONNECT INTERNAL
ALTER DATABASE DATAFILE '
Issue this for as many Datafiles as necessary
ALTER DATABASE OPEN
ALTER TABLESPACE
Issue this for all Tablespaces which have missing/damaged Files
The Database can now be used except for the parts OFFLINE
You can now issue one of 2 commands :
- RECOVER TABLESPACE
- RECOVER DATAFILE '
In each case you will be prompted for the necessary Archive Log Files
You can run multiple recovery sessions in Parallel using either of the
above commands.
N.B. This implies that the remainder of the Database can be used while
Datafile/Tablespace Recovery is in progress.
When all the Archive Logs have been applied you can issue
ALTER TABLESPACE
This will bring the Tablespaces and associated Datafiles ONLINE.
3. TABLESPACE(S) LEFT IN BACKUP MODE
------------------------------------
This is only applicable if the Database is in ARCHIVELOG Mode. Having put
one or more Tablespaces in Backup mode via the command :
ALTER TABLESPACE
The files are now backed up at the OS level but the Tablespaces are NOT taken
out of Backup Mode via :
ALTER TABLESPACE
The Instance is now Shutdown or terminates abnormally.
As mentioned earlier this results in the LOG SEQUENCE Number in the associated
Datafile(s) header remaining 'frozen' at the time the BEGIN BACKUP was issued.
The Control File has meanwhile been updated as Log Switches have occured.
Regardless of how the Instance terminated - any attempt to Open the Database
will fail as Oracle will assume that media recovery is necessary starting with
the earliest Log Sequence Number in the 'frozen' Datafile Headers.
There are 2 alternatives :
a) The safest option is to perform the Recovery as demanded by Oracle. There
is no need to restore any files apart from Archive Log Files which are not
on-line.
- STARTUP MOUNT -> using current Control File
- CONNECT INTERNAL
- RECOVER DATABASE
You will be prompted for all required Archive Redo Log files
- ALTER DATABASE OPEN
- ALTER TABLESPACE
This should be issued for all Tablespaces left in Backup Mode. If unsure
the command can be issued for all Tablespaces with no ill effects.
- ALTER SYSTEM SWITCH LOGFILE
b) This option will NOT guarantee the consistency of the Database and should
be used with the greatest discretion ONLY under the guidance of support staff.
If the necessary Archive Redo Logs are not available this is the only option
possible unless a Cold Backup can be used.
Do not restore any files from a backup.
Set the INIT.ORA parameter - _ALLOW_RESETLOGS_CORRUPTION = TRUE
Follow the 'fake' Manual Recovery Procedures documented in 1(b).
Issue the END BACKUP commands for the necessary Tablespaces (see above)
- ALTER SYSTEM SWITCH LOGFILE
A Cold Backup of the Database should now be taken.
ADDITIONAL INFORMATION
----------------------
The contents of the Datafile/Redo File Headers and Control File are an
invaluable source of information when problems occur.
Getting this information is relatively easy. The following commands should
all be issued from SQLDBA while connected INTERNAL.
The output will be sent to the default Trace directory.
ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME CONTROLF LEVEL 10'
- dumps the contents of the Control File
ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME FILE_HDRS LEVEL 10'
- dumps the Headers of all Database Files
ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME REDOHDR LEVEL 10'
- dumps the Headers of all on-line Redo Log Files
See appendix A for annotated samples of the output of the above commands.
Annotations are preceded by ***.
In all the previous Recovery examples you will note that unless restoring
from a Cold Backup it is never necessary to restore the Control File. This is
almost always the case as it is very unusual to be into a Recovery situation
and find that the physical structure of the Database has changed in any way.
If the physical structure of the Database is being changed it is recommended
that a Backup of the Control File be taken prior to the change being made to
allow for Recovery to that point in time. See Chapter 15 of the DBA Guide.
The situation where all the current Control Files have been lost/damaged
should never arise but if it should there is an undocumented command to
allow for the re-creation of the Control File. This should be used with
care as there have been problems in the past AND it is necessary to know
the location of all the Database and On-line Redo Log Files.
The Database should ALWAYS be shutdown when taking copies of control files.
Oracle versions > 6.0.33.1 where the Database has been opened at least once
will work fine.
Oracle version 6.0.33.0 will mark all Datafiles as being part of the
SYSTEM Tablespace.
It is not recommended that this option is used with any Version < 6.0.33.1
See Appendix B for information on the CREATE CONTROFILE command.
Appendix A
----------
Control File Dump (Database Open)
---------------------------------
DUMP OF CONTROL FILES, Seq # 586 = 24a
- FILE HEADER:
- Software vsn = 100798722 = 0x6021102
- Compatibility Vsn = 100786944 = 0x601e300
- DatabaseID = 1655535898 = 0x62ad791a
- Db Name = "S6A"
- File Seq = 1 = 0x1
- Control Seq = 586 = 0x24a
- File size = 114 = 0x72
- File Number = 0
- Blksiz = 512
- File Type = CONTROL (1)
- DBINFO: (offset = 0xec, size = 220, max = 1, hi = 1)
- DB Version = 100798722 = 0x6021102
- Compat Version = 101822976 = 0x611b200
- Date = 1992 5 8 14 29 46 0 0
- Media rcvry = No
- Cross check = No
- DB Name = "S6A" *** Note the Database Name
- DB Description = "something-constructive-should-come-here"
- SCNHI = ffff.ffffffff
- Man rcvy seq # = 0
- Invalid seq # = 0
- Current log # = 1 *** The File Number of the current Log
- Client generic bytes: *** will vary between 1 and number of Log Files
- 04 00 01 16 00 00 00 00 00 00 00 00 00 00 00 00
- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- 00 00 00 00 00 00 00 00
- Cache-layer client info field:
- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- REDO THREAD ENTRIES: (offset = 0x200, size = 40, max = 1, hi = 1)
- 1
- INSTANCE = s6a
- STATUS = 1
- CKPT = 0000004b.00000003.ffff *** 4b is the current Sequence Number
- LOG FILE ENTRIES: (offset = 0x228, size = 272, max = 50, hi = 2)
- 1
- Name = /usr/oracle/v6033/dbs/log1s6a.dbf
- Size
- Bytes = 2097152
- Blocks = 4096
- Block Size = 512
- Spooled = No
- Sequence # = 75 *** Log Sequence Number (4B)
- 2
- Name = /usr/oracle/v6033/dbs/log2s6a.dbf
- Size
- Bytes = 2097152
- Blocks = 4096
- Block Size = 512
- Spooled = No
- Sequence # = 74
- DB FILE ENTRIES: (offset = 0x3748, size = 288, max = 50, hi = 5)
- 1
- Name = /usr/oracle/v6033/dbs/dbss6a.dbf
- Creation RBA
- Seq # = 0
- Block # = 0
- Offset = 0
- Size
- Bytes = 20971520
- Blocks = 10240
- Block size = 2048
- Inc # = 1
- Status (5)
- Online = Yes -> NOTE
- Offline Automatically = No
- System Table Space = Yes -> NOTE
- 2
- Name = /usr/oracle/v6033/dbs/ts1dbs.dbf
- Creation RBA
- Seq # = 0
- Block # = 0
- Offset = 0
- Size
- Bytes = 2097152
- Blocks = 1024
- Block size = 2048
- Inc # = 1
- Status (1)
- Online = Yes
- Offline Automatically = No
- System Table Space = No
- 3
- Name = /usr/oracle/v6033/dbs/ts1dbs2.dbf
- Creation RBA
- Seq # = 0
- Block # = 0
- Offset = 0
- Size
- Bytes = 2097152
- Blocks = 1024
- Block size = 2048
- Inc # = 1
- Status (1)
- Online = Yes
- Offline Automatically = No
- System Table Space = No
- 4 (empty)
- 5 (empty)
- DB FILE STOP RBA ENTRIES: (offset = 0x6f88, size = 12, max = 50, hi = 50)
- FILE NUMBER THREAD_NUMBER STOP_RBA
1
1 (ffffffff.ffffffff.ffff)
2
1 (ffffffff.ffffffff.ffff)
3
1 (ffffffff.ffffffff.ffff)
***
Note the first set of HEX digits - if the database was shutdown when we did
this dump it would have been set to - 0000004B.
***
File Header Dump
----------------
Dump of file # 1
- FILE HEADER:
- Software vsn = 100798722 = 0x6021102
- Compatibility Vsn = 101822976 = 0x611b200
- DatabaseID = 1655535898 = 0x62ad791a
- Db Name = "S6A"
- File Seq = 1 = 0x1
- Control Seq = 584 = 0x248
- File size = 10240 = 0x2800
- File Number = 1
- Blksiz = 2048
- File Type = DATA (3)
Status: x0
Start Redo Points Per-Thread:
(4b.3.ffff) *** 4B(75) is Curent On-Line Log Sequence Number
Dump of file # 2
- FILE HEADER:
- Software vsn = 100798722 = 0x6021102
- Compatibility Vsn = 101822976 = 0x611b200
- DatabaseID = 1655535898 = 0x62ad791a
- Db Name = "S6A"
- File Seq = 1 = 0x1
- Control Seq = 584 = 0x248
- File size = 1024 = 0x400
- File Number = 2
- Blksiz = 2048
- File Type = DATA (3)
Status: x0
Start Redo Points Per-Thread:
(4b.3.ffff)
Dump of file # 3
- FILE HEADER:
- Software vsn = 100798722 = 0x6021102
- Compatibility Vsn = 101822976 = 0x611b200
- DatabaseID = 1655535898 = 0x62ad791a
- Db Name = "S6A"
- File Seq = 1 = 0x1
- Control Seq = 584 = 0x248
- File size = 1024 = 0x400
- File Number = 3
- Blksiz = 2048
- File Type = DATA (3)
Status: x0
Start Redo Points Per-Thread:
(4b.3.ffff)
On-Line Redo Log File Headers Dump
----------------------------------
2 logs in database
Log #1 < deleted > size=4096 seq=0x4b blksz=512 spl=0 - FILE HEADER:
- Software vsn = 100798723 = 0x6021103
- Compatibility Vsn = 101823232 = 0x611b300
- DatabaseID = 1655535898 = 0x62ad791a
- Db Name = "S6A"
- File Seq = 75 = 0x4b *** Log File Sequence Number
- Control Seq = 583 = 0x247
- File size = 4096 = 0x1000
- File Number = 1
- Blksiz = 512
- File Type = LOG (2)
Stored Sequence number = ##00000000000000004B
Next Available Block: 1001
Redo Thread Allocation Table Entries:
THRD FCB NAB ECC
------------------------------------- 1 0x00000003 0xffffffff 0x00001000
Log #2 < deleted > size=4096 seq=0x4a blksz=512 spl=0 - FILE HEADER:
- Software vsn = 100798723 = 0x6021103
- Compatibility Vsn = 101823232 = 0x611b300
- DatabaseID = 1655535898 = 0x62ad791a
- Db Name = "S6A"
- File Seq = 74 = 0x4a
- Control Seq = 571 = 0x23b
- File size = 4096 = 0x1000
- File Number = 2
- Blksiz = 512
- File Type = LOG (2)
Stored Sequence number = ##00000000000000004A
Next Available Block: 1001
Redo Thread Allocation Table Entries:
THRD FCB NAB ECC
------------------------------------- 1 0x00000003 0x00000b34 0x00001000
Appendix B
----------
CREATE CONTROLFILE [REUSE]
DATABASE name
[LOGFILE filespec [, filespec] ...]
RESETLOGS | NORESETLOGS
[MAXLOGFILES integer]
[DATAFILE filespec [, filespec] ...]
[MAXDATAFILES integer]
[MAXINSTANCES integer]
[ARCHIVELOG | NOARCHIVELOG]
[SHARED | EXCLUSIVE]
Parameters:
REUSE: If present the control files may already exist. The
new control files will overwrite the existing files. If
this option is missing, the new control files must not
yet exist. As in CREATE DATABASE, the names of the
control files are determined by the init.ora parameter
control_files.
DATABASE: Must match the database names in the data and log
files.
LOGFILE: This clause lists all the online logs that will be
used for this database. If not specified the port
dependant defaults will be assumed. The interpretation
of the filespecs depends on the next parameter.
RESETLOGS: If this flag is present the current contents of
the online logs are ignored. The new control files will
contain flags requiring ALTER DATABASE OPEN RESETLOGS,
which initializes the logs. Media recovery may be
applied as needed before the open. Note that either
RESETLOGS or NORESETLOGS must be specified. It is safest
to choose RESETLOGS and follow it with normal media
recovery.
NORESETLOGS: If specified, the log files must be the current
online logs. They must not be restored backups, and all
log files must be listed. Their headers are read to
construct the control file entries. They are used for
recovery. If archiving is enabled all the online logs
must be archived, even if they were already archived.
The SIZE option in the filespecs, if present, will be
used to validate the size of the file named.
MAXLOGFILES: Same as for CREATE DATABASE. May be different
than the value in the original control file, but it may
not be smaller the maximum number of log files the
database ever contained - including ones that have been
dropped. Set it greater than or equal to the value used
at CREATE DATABASE time.
DATAFILE: To ensure proper behavior, all datafiles for the
database must be listed. It is possible to omit a non
system tablespace file only if media recovery is enabled
and you will not be doing an open reset logs on the first
open after the create controlfile. If the omitted
file(s) contain(s) active rollback segments, the open
will most likely fail, in which case the missing
datafile(s) must be found, and the controlfile recreated.
All datafiles listed must be accessible since they are
assumed to be online. They MAY be backup copies needing
recovery. Their headers are read to construct the
control file records. The SIZE option in the filespecs,
if present, is used to validate the size of the file
named. The reuse option is ignored. The next database
open validates that all the files are specified and that
the sizes match.
MAXDATAFILES: Same as for CREATE DATABASE. May be different
than the value in the original control file, but it may
not be smaller the maximum number of data files the
database ever contained - including ones that have been
dropped.
MAXINSTANCES: Same as for CREATE DATABASE. May be different
than the value in the original control file.
[NO]ARCHIVELOG: Same as for CREATE DATABASE. May be
different than the value in the original control file.
If you wish to archive logs, it is recommended that the
ARCHIVELOG option be used with CREATE CONTROLFILE even
though the option can later be enabled with an ALTER
DATABASE command. NOARCHIVELOG is the default.
SHARED: Same as for CREATE DATABASE.
EXCLUSIVE: Same as for CREATE DATABASE.
Usage:
The CREATE CONTROLFILE command should be executed after a
STARTUP NOMOUNT. If it is successful, a new control file is
created and the database is then mounted. Any required
recovery may then be applied. It is strongly advised that a
complete backup of all available files be taken before using
this command. Since this overwrites the flags indicating
which logs were archived, they all will be archived again.
This is not necessary if the RESETLOGS option is chosen, as
the logs are all cleared. The next database open will go
through the data dictionary / controlfile integrity checker
to ensure a consistent database. After the database is
opened and verified to be consistent, it should be shutdown
cleanly. A complete backup should then be taken. This is
particularly important if the RESET flag was used on open.
.
No comments:
Post a Comment