Damir Vadas, Oracle as I learned: How to find correct SCN?

Monday, February 22, 2010

How to find correct SCN?

Recent online help to certain Mr. Muthu, who had problems with database recover, force me to write this blog topic. Even thought there are many problematic situations in this area one is very common and most often. So I'll try to explain what is the right path to solve such a situation (if it can be solved!).

The scenario

To describe starting situation, let us suppose this restore/recover scenario (we are talking about incomplete recover).

startup nomount
restore control file
alter database mount
restore database

Next step should be database recover (with or without until clause). But this action finishes with:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 01/11/2010 12:08:19
ORA-01152: file 1 was not restored from a sufficiently old backup 
ORA-01110: data file 1: '+DATA/hac/datafile/system.366.708004385'

So we are stacked because recover could not be consistent. The main problem is that restored data files need ALL to be recovered at least to largest SCN number+1.

This situation is explained world wide but main main question is: what is the lowest SCN that can recover database enough to be open?

Solution-RMAN list backup

First solution is to look in RMAN repository or use control file records to retrieve backup of datafile 1. In a case of RMAN catalog you'll not have too much trouble because you'll find enough information to see backup of wanted datafile and it's SCN. Output may looks like:

RMAN> list backup of datafile 1;

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
645     Full    127.30M    DISK        00:01:02     06-DEC-09
        BP Key: 645   Status: AVAILABLE  Compressed: YES  Tag: TAG20091206T183018
        Piece Name: +BCKP/hac/backupset/2009_12_06/tag20091206t183018_0.7966.704917821
  List of Datafiles in backup set 645
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- --------- ----
  1       Full 35728924   06-DEC-09 +DATA/hac/datafile/system.282.694017165
...

If you are lucky and get result look in column "Ckp SCN" and you'll get missing SCN. Unfortunately this may not help you always.

In nocatalog mode it is more then likely that your command may retrieve no data:

RMAN> list backup of datafile 1;


RMAN>

No backup! The reason is no backup was taken in time provided with control_file_record_keep_time parameter.

Solution-RMAN list copy

Next solution is usage of RMAN's "list copy", which shows only backup of "datafile 1", with the same retention as described in previously explained in "List backup" solution. However if you are lucky and get result it is more or less like:

RMAN> list copy of datafile 1;

List of Datafile Copies
Key     File S Completion Time Ckp SCN    Ckp Time        Name
------- ---- - --------------- ---------- --------------- ---------------------------------------------
1917    1    A 08-JAN-10       49152352   08-JAN-10       /u01/HAC_I-1799195971_TS-SYSTEM_FNO-1_nvl2urjr
1867    1    X 07-JAN-10       49125664   07-JAN-10       /u01/HAC_I-1799195971_TS-SYSTEM_FNO-1_m4l2sii8
1841    1    X 07-JAN-10       49117957   07-JAN-10       /u01/HAC_I-1799195971_TS-SYSTEM_FNO-1_l7l2sb2u
1813    1    X 18-SEP-09       9643175    18-SEP-09       /u01/HAC_I-1799195971_TS-SYSTEM_FNO-1_edkpitiv
1666    1    A 17-SEP-09       9174432    17-SEP-09       /u01/HAC_I-1799195971_TS-SYSTEM_FNO-1_ahkpfgq1

Again, regardless shown several SCNs this might not be enough to help you.

Solution-use x$ (C based) tables

x$ tables are Oracle's core tables based on pure C data and should be handle with additional care. The beauty of them is they are available when database in mounted and when all "dba_*" views are not available as well as some v$ based ones.

To see status of all restored datafiles can be done with this SQL

set linesize 200;
set pagesize 100;
col inst_id for 9999999 heading 'Instance #'
col file_nr for 9999999 heading 'File #'
col file_name for A50 heading 'File name'
col checkpoint_change_nr for 99999999999999 heading 'Checkpoint #'
col checkpoint_change_time for A20 heading 'Checkpoint time'
col last_change_nr for 99999999999999 heading 'Last change #'
SELECT
      fe.inst_id,
      fe.fenum file_nr,
      fn.fnnam file_name,
      TO_NUMBER (fe.fecps) checkpoint_change_nr,
      fe.fecpt checkpoint_change_time,
      fe.fests last_change_nr,
      DECODE (
              fe.fetsn,
              0, DECODE (BITAND (fe.festa, 2), 0, 'SYSOFF', 'SYSTEM'),
              DECODE (BITAND (fe.festa, 18),
                      0, 'OFFLINE',
                      2, 'ONLINE',
                      'RECOVER')
      ) status
FROM x$kccfe fe,
     x$kccfn fn
WHERE    (   (fe.fepax != 65535 AND fe.fepax != 0 )
          OR (fe.fepax = 65535 OR fe.fepax = 0)
         )
     AND fn.fnfno = fe.fenum
     AND fe.fefnh = fn.fnnum
     AND fe.fedup != 0
     AND fn.fntyp = 4
     AND fn.fnnam IS NOT NULL
     AND BITAND (fn.fnflg, 4) != 4
ORDER BY fe.fenum
;

The output might looks like:

Instance #   File # File name                                             Checkpoint # Checkpoint time      Last change #    STATUS
---------- -------- -------------------------------------------------- --------------- -------------------- ---------------- ----------------------------
         1        1 /db/SOME_DB/dbf01/system01.dbf                         12039685200 03/10/2021 17:18:23                   SYSTEM
         1        2 /db/SOME_DB/dbf01/sysaux01.dbf                         12039685200 03/10/2021 17:18:23                   ONLINE
         1        3 /db/SOME_DB/und_01/undotbs01.dbf                       12039685200 03/10/2021 17:18:23                   ONLINE
         1        4 /db/SOME_DB/dbf01/users01.dbf                          12039685200 03/10/2021 17:18:23                   ONLINE
         1        5 /db/SOME_DB/dbf01/gen_gfc_prod01.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1        6 /db/SOME_DB/dbf01/gen_gfc_prod02.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1        7 /db/SOME_DB/dbf01/gen_gfc_prod03.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1        8 /db/SOME_DB/dbf01/gen_gfc_prod04.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1        9 /db/SOME_DB/dbf01/gen_gfc_prod05.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1       10 /db/SOME_DB/dbf01/gen_gfc_prod06.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1       11 /db/SOME_DB/dbf01/gen_gfc_prod07.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1       12 /db/SOME_DB/dbf01/sysaud01.dbf                         12039685200 03/10/2021 17:18:23                   ONLINE
         1       13 /db/SOME_DB/dbf01/gen_gfc_prod08                        9340774962 01/04/2021 11:04:30  9341443930       RECOVER
         1       14 /db/SOME_DB/dbf01/gen_gfc_prod08.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1       15 /db/SOME_DB/dbf01/gen_gfc_prod09.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE
         1       16 /db/SOME_DB/dbf01/gen_gfc_prod10.dbf                   12039685200 03/10/2021 17:18:23                   ONLINE

16 rows selected.

As you see File# 13 needs recovery and last applied change was in SCN=9341443930. If you cannot retrive exact time (scn_to_timestamp), here you have in column Checkpoint time date value which you can uses as well.
Remember that SCN for recover is always xxxxxxxx+1, for one bigger then one found in list!

To cover the whole story, for successful recover of any backup set like that, you need to apply archive logs which's SCN's are between min(xxxxxxxx)-max(xxxxxxxx)+1 but you do not need to specify min(xxxxxxxx) because RMAN will know that automatically.

So your restore should use until SCN clause in RMAN script. In pseudo language that looks like:

startup nomount;
restore controlfile;
alter database mount;

set until scn 9341443931

restore database;
recover database;

alter database open resetlogs;

If all SCNs from previous query are the same then (as they are in our case) then you are dealing with consistent (aka cold backup). Cold backup means that all datafiles are having same SCN so practicly they do not need recover but only as declarative term (you do not need any archived log file beside). In a cases like that, your restore RMAN script might looks like:

restore database;
recover database noredo;
alter database open resetlogs;

However, in some consistent backup cases, I remember that I found on some database version a bug as explained at the begginig (ORA-01152). Based on previously shown real numbers, your "full proof" restore script for consistent backup, should look like:

set until scn=9341443931;
restore database;
recover database noredo;
alter database open resetlogs;

So there is no affraid of use until clause even in restoring consistent backups. This is especially the case in "backup as copy" when you deal with backup outside of ASM storage.

Solution-RMAN restore database preview

To prevent searching for information in restore/recover time, there is elegant method that will tell you which SCN is needed for successful restore.

After regular database backup, use MAN command which will collect and show all important data for this backupset.

backup database;
restore database preview;

At the end of command shows exact SCN that you need for successful recover. At the end of log file output is like:

Media recovery start SCN is 9341443930 
Recovery must be done beyond SCN 9341443930 to clear data files fuzziness
Finished restore at 17-SEP-21
released channel: t1
released channel: t2

RMAN> 

Recovery Manager complete.

where xxxxxxxxx+1 represent SCN you are searching for.

Cheers!

22 comments :

AnonymousFebruary 23, 2010 at 3:34 AM
Very nice. So clear and easy to understand the main point. Appreciate a lot.
Leika
ReplyDelete
Replies
Damir VadasMay 15, 2010 at 10:26 PM
Thank you Leika.
Glad you'd find it useful.
ReplyDelete
Replies
AnonymousMay 26, 2011 at 5:09 AM
When I run a RESTORE PREVIEW, it ends with:

Media recovery start SCN is 2483662717
Recovery must be done beyond SCN 2484364327 to clear datafile fuzziness
Finished restore at 25-MAY-2011 22:03:08

Why are the SCNs different here?
Since this was a one off special bkup taken to disk (not tsm which is typical), how do I verfiy I have all my archive logs on disk before I begin restore & recovery?
ReplyDelete
Replies
Damir VadasDecember 28, 2011 at 6:35 PM
How do you obtain SCN in your post?
ReplyDelete
Replies
Damir VadasMay 25, 2012 at 3:14 PM
"Media recovery start SCN is 2483662717"
This is starting SCN
When backup was taking place, SCN are changed so when backup ends, then SCN was "2484364327".
This is the number of SCN that you must have in archived redo logs to have valid backup.
ReplyDelete
Replies
AnonymousJanuary 19, 2013 at 2:27 AM
Thank you. This post helped me do a successful restore.
ReplyDelete
Replies
AbdulApril 4, 2013 at 6:31 PM
Appreciated your efforts!! Thanks A lot for writing this block.
ReplyDelete
Replies
AbdulApril 4, 2013 at 6:34 PM
one question, do we need all the time to be increased the scn by 1 , suppose if the scn is 12345 while recovering the db should it be 12346? also, what about if I go through the recovery using time stamp?
ReplyDelete
Replies
AnonymousAugust 11, 2014 at 9:47 PM
Hi, can you explain how you got to these two tables/views?
FROM x$kccfe fe,
x$kccfn fn
ReplyDelete
Replies
AnonymousOctober 23, 2014 at 3:00 AM
Hi Damir, what about this scenario: We have a level 0 backup that starts on day 1 at 6:00 am and runs for over 30 hours, ending on day 2 at 13:00, during the time no archive log backups are taken but the LV0 includes archive log all and deletes them.

Now, we need to restore the DB back to day 1 at 14:00, obviously this LV0 would not help since it had not finished by then; but we do have previous LV0, previous LV1 and all archive logs to roll forward.

Question is: what is the best place to find the exact SCN to use here (the closest to day 1 14:00+), we don't have much info on the catalog because no archive logs were backed up during the LV0, so the RC_ARCHIVE_LOG view does not help much, we could use the v$log_history and list backup, and a few other places, but is there a better and more accurate way? I like the idea of using the x$ tables, although even querying them might be dangerous sometimes, but probably more accurate.

Any thoughts and guidance would be very much appreciated, you info here is great and I have enjoyed reading it very much. Keep up the good work.

Thanks

Orlando.
ReplyDelete
Replies
ashumDecember 24, 2015 at 6:26 PM
Hi, I was running backup to disk but now I want to switch to SBT_TAPE. I have run a full database backup to my SBT device using the following command: backup as backupset full check logical database
Then I am trying to restore from the SBT backup but I am getting errors "RMAN-06054: media recovery requesting unknown archived log for thread...".
Why is it looking for archived logs?
From the controlfile information I have noticed that are datafiles with different Ckp SCN. For instance, in BS 19160 the datafiles have the Ckp SCN = 12489436489235, and for the BS 19161 the Ckp SCN is 12489436489236 (lastest BS).
I am running the RESTORE/RECOVERY using the UNTIL SCN=12489436489237 (12489436489236 + 1).
ReplyDelete
Replies
AnonymousMarch 9, 2017 at 7:43 AM
sooper explanation. Still helped me.
ReplyDelete
Replies
shiveshdbaApril 17, 2019 at 5:20 AM
Good Blog. Nicely written.
ReplyDelete
Replies

Add comment

Subscribe to: Post Comments ( Atom )

Blog Archive

Zagreb u srcu!

Sign by Danasoft - Get Your Sign