Thursday, April 19, 2012

Materialized view, mine approach

Recently I have a case with Oracle materialized view (MV later in tekst), which was causing performance problems in our RAC database. In AWR report, SQL ordered by Gets, SQL ordered by CPU Time, SQL ordered by Reads, it was number one event in monitored period:
Quick view in general numbers, characteristic for any Oracle MV, gave a result:
select count(*) from "TAB"."MLOG$_PORTAL_KORISNIK"
38.773.887
select count(*) from "TAB"."PORTAL_KORISNIK"
207.180
Characteristic of this MV (and 99% of all used MV logs) is that materialized view log is not getting purged regularly. Fast refreshes on the materialized views do not purge the materialized view logs. As detailed in the Oracle documentation, complete refresh on the materialized views is not purging the materialized view log as well. But when purge process want to perform deletion in a way, that is logged in AWR:
delete from "TAB"."MLOG$_PORTAL_KORISNIK" where snaptime$$ <= :1
a massive full table scan is performed with huge I/O overhead (look closer in "Physical reads" column-2.1GB). Time to do something!

Before solution

Obviously MV purge needs to be performed. But before purging ... one important step should be done-get wider picture why log is dozen times bigger then parent table? For that I have used next SQL, which query MV log:
SELECT TRUNC(SNAPTIME$$) DATUM, COUNT(*) 
  FROM TAB.MLOG$_PORTAL_KORISNIK 
GROUP BY TRUNC(SNAPTIME$$) 
ORDER BY DATUM DESC;

DATUM               COUNT(*)                               
---------           ---------
1.1.4000            99083                                  
19.4.2012           198159                                 
18.4.2012           594574                                 
17.4.2012           594595                                 
16.4.2012           594574                                 
15.4.2012           594408                                 
14.4.2012           594415                                 
13.4.2012           594505                                 
12.4.2012           594578                                 
11.4.2012           594469                                 
10.4.2012           594517                                 
9.4.2012            594192                                 
8.4.2012            594192                                 
7.4.2012            594197                                 
6.4.2012            594249                                 
5.4.2012            594157                                 
4.4.2012            594075                                 
3.4.2012            594038                                 
2.4.2012            593951                                 
1.4.2012            593808                                 
31.3.2012           593817                                 
30.3.2012           593843                                 
29.3.2012           593863                                 
28.3.2012           593820                                 
27.3.2012           593811                                 
26.3.2012           593774                                 
25.3.2012           593622                                 
24.3.2012           593630                                 
23.3.2012           593678                                 
22.3.2012           593624                                 
21.3.2012           593578                                 
20.3.2012           593644                                 
19.3.2012           593613                                 
18.3.2012           593370                                 
17.3.2012           593373                                 
16.3.2012           593458                                 
15.3.2012           593448                                 
14.3.2012           593417                                 
13.3.2012           593375                                 
12.3.2012           593298                                 
11.3.2012           593172                                 
10.3.2012           593178                                 
9.3.2012            593241                                 
8.3.2012            593222                                 
7.3.2012            593153                                 
6.3.2012            593072                                 
5.3.2012            593133                                 
4.3.2012            592980                                 
3.3.2012            592982                                 
2.3.2012            593124                                 
1.3.2012            593054                                 
29.2.2012           593014                                 
28.2.2012           592984                                 
27.2.2012           592897                                 
26.2.2012           592644                                 
25.2.2012           592656                                 
24.2.2012           592727                                 
23.2.2012           592649                                 
22.2.2012           592645                                 
21.2.2012           592598                                 
20.2.2012           592516                                 
19.2.2012           592416                                 
18.2.2012           592417                                 
17.2.2012           592479                                 
16.2.2012           592453                                 
15.2.2012           592397                                 
14.2.2012           592378                                 
SUM             38.773.887
------------------------------------------
Well something is wrong because MV log was refreshed cca 600.000 times a day!! Closer log on smaller segment gave me better picture, in which part that happened. I have use one random day as pattern:
SELECT TRUNC(SNAPTIME$$) DATUM, DMLTYPE$$, COUNT(*) 
  FROM TAB.MLOG$_PORTAL_KORISNIK 
WHERE TRUNC (SNAPTIME$$)='18.04.2012' 
GROUP BY TRUNC(SNAPTIME$$), DMLTYPE$$ 
ORDER BY DATUM DESC;

DATUM     DMLTYPE$$ COUNT(*)  
--------- --------- --------                 
18.4.2012 I         49                                     
18.4.2012 U         594525                                 
So problem is in updating MV log part. This was enough for me because finding what is updating parent table through database job is simple task if you go to "OWNER" of PL/SQL code that is dealing with it. And I knew a guy!

The solution

Now I was prepared to purge MV. I have wait until evening (general advice is to avoid to much DBA actions on loaded database) and according previous value, perform next purge statement:
execute DBMS_MVIEW.PURGE_LOG( master => 'TAB.PORTAL_KORISNIK', num=>'40000000', flag => 'delete' ) ;
If you do not have enough time for one time execution repeat n times with smaller chunks (num parameter). If you leave MV in this state when first FAST refresh happened, you'll get this error:
BEGIN dbms_mview.refresh( list => 'TAB.MLOG$_PORTAL_KORISNIK', method => 'F' ); END;

*
ERROR at line 1:
ORA-12034: materialized view log on "TAB"."PORTAL_KORISNIK" younger than last refresh
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2537
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2743
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2712
ORA-06512: at line 1
This is because Oracle MV rules says that after MV log purge, first MV refresh must be COMPLETE, not FAST. So be it:
QL> execute dbms_mview.refresh( list => 'TAB.PORTAL_KORISNIK ', method => 'C' );
BEGIN dbms_mview.refresh( list => 'TAB.PORTAL_KORISNIK ', method => 'C' ); END;

*
ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-02292: integrity constraint (TAB.RAC_KOR_FK) violated - child record found
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745
ORA-06512: at line 1
This is first error that unprepared DBA might face (and frankly I was in this case-admit). Because COMPLETE refresh delete all records in MV, if this table is used as parent table for some other table, action cannot be performed before disable action on that constraint is performed. This constraint is, obviously, on child table, so let us disable:
SQL> ALTER TABLE TAB.SOME_CHILD_TABLE
  2    MODIFY CONSTRAINT RAC_KOR_FK
  3    DISABLE;
In this case I was lucky to have just one table with just one level connected constrains. In real life you might get in real trouble when trying to disable all needed constraints around database. But as stated, I was lucky. After that COMPLETE refresh pass:
SQL> execute dbms_mview.refresh( list => 'TAB.PORTAL_KORISNIK ', method => 'C' );

PL/SQL procedure successfully completed.
Do not forget to enable previously disabled constraint:
SQL> ALTER TABLE TAB.SOME_CHILD_TABLE
  2    MODIFY CONSTRAINT RAC_KOR_FK
  3    ENABLE;

Table altered.

SQL>
And for the end make test with FAST refresh type:
execute dbms_mview.refresh( list => 'PORTAL_KORISNIK', method => 'F' );

PL/SQL procedure successfully completed.
Looks OK. Promissing.

Is that all? No!

In this moment many MV purge guides finish, but this one goes one step beyond. Happy DBA might check MV log in a way to see number of records (that all are purged). And this is moment of happiness and sadness in the same moment.
COUNT(*)
----------
0

Elapsed: 00:00:31.00
SQL> 
0 records is expected result but time to gather that result is not-31 seconds for empty segment! MV left in this state will show minor performance gain-because reclaimable storage from previos data left and is still active.

So shrinking space is needed because space for MV is not shorten (remember, before purge was performed on MV log with 38 millions of records in it). This may be done in two steps.

First step is one performed where MV table is placed, where MV will be shrink:
SQL> alter table tab.portal_korisnik enable row movement;

Table altered.

SQL>  ALTER MATERIALIZED VIEW "TAB"."PORTAL_KORISNIK" shrink space cascade;

Materialized view altered.

SQL>
Next step should be done on side where MV log is placed. This is where MV log will be shrinked:
SQL> alter MATERIALIZED VIEW LOG ON TAB.PORTAL_KORISNIK ENABLE ROW MOVEMENT;

Materialized view log altered.

Elapsed: 00:00:00.87
SQL> alter MATERIALIZED VIEW LOG ON TAB.PORTAL_KORISNIK SHRINK SPACE CASCADE;

Materialized view log altered.

SQL> alter MATERIALIZED VIEW LOG ON TAB.PORTAL_KORISNIK DISABLE ROW MOVEMENT;

Materialized view log altered.

Elapsed: 00:00:00.15
SQL>

SQL> alter table tab.portal_korisnik disable row movement;

Table altered.

SQL>
Notice that these two sides actions, mostly differ in "LOG" part only. Now all seems to be OK, let us try once again problematic count(*) statement:
SQL> select count(*) from TAB.MLOG$_PORTAL_KORISNIK;

COUNT(*)
----------
0

Elapsed: 00:00:00.09
SQL>
This is just fine now.

Performance gain

If you ask how much this affect in performance, here are some jobs statistic query, based on custom jobs_diary tables, which has column SECS_RUN which shows how much seconds has job lasted. First let me show job which was inserting records (minor impact in MV log):
ID    INST_ID JOB_INFO                           SECS_RUN CREATE_D CREATE_USER                      EDIT_DAT EDIT_USER
---------- ---------- -------------------------------- ---------- -------- -------------------------------- -------- ----------
      2809          1 423 snapshots                           385 19.04.12 TAB                              19.04.12 TAB
      2816          1 423 snapshots                           277 19.04.12 TAB                              19.04.12 TAB
      2825          1 423 snapshots                           322 19.04.12 TAB                              19.04.12 TAB
      2832          1 423 snapshots                           326 19.04.12 TAB                              19.04.12 TAB
      2842          1 423 snapshots                           277 19.04.12 TAB                              19.04.12 TAB
      2849          1 423 snapshots                           342 19.04.12 TAB                              19.04.12 TAB
      2858          1 423 snapshots                           356 19.04.12 TAB                              19.04.12 TAB
      2866          1 423 snapshots                           286 19.04.12 TAB                              19.04.12 TAB
      2878          1 423 snapshots                           528 19.04.12 TAB                              19.04.12 TAB
      2885          1 423 snapshots                           415 19.04.12 TAB                              19.04.12 TAB
      2894          1 423 snapshots                           187 19.04.12 TAB                              19.04.12 TAB
      2901          1 423 snapshots                           263 19.04.12 TAB                              19.04.12 TAB
      2911          1 423 snapshots                           179 19.04.12 TAB                              19.04.12 TAB
      2918          1 423 snapshots                           339 19.04.12 TAB                              19.04.12 TAB
      2927          1 423 snapshots                           372 19.04.12 TAB                              19.04.12 TAB
      2934          1 423 snapshots                           535 19.04.12 TAB                              19.04.12 TAB
      2945          1 423 snapshots                           437 19.04.12 TAB                              19.04.12 TAB
      2952          1 423 snapshots                           252 19.04.12 TAB                              19.04.12 TAB
      2961          1 423 snapshots                           248 19.04.12 TAB                              19.04.12 TAB
      2969          1 423 snapshots                           268 19.04.12 TAB                              19.04.12 TAB
      2979          1 423 snapshots                           214 19.04.12 TAB                              19.04.12 TAB
      2986          1 423 snapshots                           268 19.04.12 TAB                              19.04.12 TAB
      2995          1 423 snapshots                           260 19.04.12 TAB                              19.04.12 TAB
      3003          1 423 snapshots                           258 19.04.12 TAB                              19.04.12 TAB
      3014          1 423 snapshots                               19.04.12 TAB
      3016          1 423 snapshots                               19.04.12 TAB
      3018          1 423 snapshots                               19.04.12 TAB
      3021          1 423 snapshots                           106 19.04.12 TAB                              19.04.12 TAB
      3030          1 423 snapshots                           159 19.04.12 TAB                              19.04.12 TAB
      3037          1 423 snapshots                           175 19.04.12 TAB                              19.04.12 TAB
      3047          1 423 snapshots                            64 19.04.12 TAB                              19.04.12 TAB
      3054          1 423 snapshots                            60 19.04.12 TAB                              19.04.12 TAB
      3063          1 423 snapshots                            67 19.04.12 TAB                              19.04.12 TAB

33 rows selected.
3014, 3016 and 3016 were jobs which failed to run because purging was taking place.
3021, 3030, 3037 were jobs before shrink space happened.
3047, 3054, 3063 were jobs after shrink space happened.

Generally speaking, it looks like:
SQL> select min(secs_run), max(secs_run), avg(secs_run) from tools.jobs_diary where job_info='423 snapshots' ;

MIN(SECS_RUN) MAX(SECS_RUN) AVG(SECS_RUN)
------------- ------------- -------------
           60           669    280.133739
This is, on average base, at least 4 times less then before.

Let us look for job which was doing that massive updates. Interesting, that job was running pretty fast (MV log is on remote database) so all seems nice all the time:
SQL> select * from tools.jobs_diary where job_info='3364 Ažuriranje korisnika' order by id;

        ID    INST_ID JOB_INFO                           SECS_RUN CREATE_D CREATE_USER                      EDIT_DAT EDIT_USER
---------- ---------- -------------------------------- ---------- -------- -------------------------------- -------- ----------
       710          3 3364 Ažuriranje korisnika                 8 13.04.12 TAB                              13.04.12 TAB
       780          1 3364 Ažuriranje korisnika                 8 13.04.12 TAB                              13.04.12 TAB
       849          2 3364 Ažuriranje korisnika                 8 13.04.12 TAB                              13.04.12 TAB
       919          2 3364 Ažuriranje korisnika                 8 13.04.12 TAB                              13.04.12 TAB
       989          2 3364 Ažuriranje korisnika                16 13.04.12 TAB                              13.04.12 TAB
      1058          1 3364 Ažuriranje korisnika                 9 13.04.12 TAB                              13.04.12 TAB
      1129          1 3364 Ažuriranje korisnika                 8 14.04.12 TAB                              14.04.12 TAB
      1198          3 3364 Ažuriranje korisnika                16 14.04.12 TAB                              14.04.12 TAB
      1267          1 3364 Ažuriranje korisnika                 7 14.04.12 TAB                              14.04.12 TAB
      1336          3 3364 Ažuriranje korisnika                13 14.04.12 TAB                              14.04.12 TAB
      1406          3 3364 Ažuriranje korisnika                 8 14.04.12 TAB                              14.04.12 TAB
      1474          2 3364 Ažuriranje korisnika                10 14.04.12 TAB                              14.04.12 TAB
      1548          3 3364 Ažuriranje korisnika                13 15.04.12 TAB                              15.04.12 TAB
      1618          4 3364 Ažuriranje korisnika                18 15.04.12 TAB                              15.04.12 TAB
      1686          4 3364 Ažuriranje korisnika                14 15.04.12 TAB                              15.04.12 TAB
      1756          4 3364 Ažuriranje korisnika                17 15.04.12 TAB                              15.04.12 TAB
      1826          3 3364 Ažuriranje korisnika                11 15.04.12 TAB                              15.04.12 TAB
      1894          4 3364 Ažuriranje korisnika                12 15.04.12 TAB                              15.04.12 TAB
      1966          3 3364 Ažuriranje korisnika                12 16.04.12 TAB                              16.04.12 TAB
      2036          3 3364 Ažuriranje korisnika                10 16.04.12 TAB                              16.04.12 TAB
      2105          1 3364 Ažuriranje korisnika                21 16.04.12 TAB                              16.04.12 TAB
      2175          1 3364 Ažuriranje korisnika                 6 16.04.12 TAB                              16.04.12 TAB
      2245          4 3364 Ažuriranje korisnika                19 16.04.12 TAB                              16.04.12 TAB
      2314          2 3364 Ažuriranje korisnika                 9 16.04.12 TAB                              16.04.12 TAB
      2387          1 3364 Ažuriranje korisnika                11 17.04.12 TAB                              17.04.12 TAB
      2456          3 3364 Ažuriranje korisnika                12 17.04.12 TAB                              17.04.12 TAB
      2525          3 3364 Ažuriranje korisnika                23 17.04.12 TAB                              17.04.12 TAB
      2595          2 3364 Ažuriranje korisnika                41 17.04.12 TAB                              17.04.12 TAB
      2665          3 3364 Ažuriranje korisnika                 9 17.04.12 TAB                              17.04.12 TAB
      2734          1 3364 Ažuriranje korisnika                24 17.04.12 TAB                              17.04.12 TAB
      2806          4 3364 Ažuriranje korisnika                 7 18.04.12 TAB                              18.04.12 TAB
      2875          2 3364 Ažuriranje korisnika                13 18.04.12 TAB                              18.04.12 TAB
      2944          2 3364 Ažuriranje korisnika                 8 18.04.12 TAB                              18.04.12 TAB
      3014          1 3364 Ažuriranje korisnika                 9 18.04.12 TAB                              18.04.12 TAB
      3114          4 3364 Ažuriranje korisnika                 7 18.04.12 TAB                              18.04.12 TAB
      3251          2 3364 Ažuriranje korisnika               108 18.04.12 TAB                              18.04.12 TAB
      3399          1 3364 Ažuriranje korisnika                13 19.04.12 TAB                              19.04.12 TAB
      3543          2 3364 Ažuriranje korisnika                 8 19.04.12 TAB                              19.04.12 TAB
      3683          3 3364 Ažuriranje korisnika                10 19.04.12 TAB                              19.04.12 TAB
      3840          2 3364 Ažuriranje korisnika                 0 19.04.12 TAB                              19.04.12 TAB
      3983          3 3364 Ažuriranje korisnika                 1 19.04.12 TAB                              19.04.12 TAB

41 rows selected.
3840 and 3983 were jobs that happened after shrink space happened. As stated before, performance gain seems small. Look in general data:
SQL> select min(secs_run), max(secs_run), avg(secs_run) from tools.jobs_diary where job_info='3364 Ažuriranje korisnika';

MIN(SECS_RUN) MAX(SECS_RUN) AVG(SECS_RUN)
------------- ------------- -------------
            0           108    14.2682927
If you look in average, then again performance gain is 10 times, even on small numbers this is huge percent value.

The end

But the best gain was when MV log was purged, because FTS was small. This gain may not be exposed in numbers in any mean.

For the end test once again FAST refresh and if all went OK, purging is done in all way with best result.

Hope this helps someone...

Cheers!

1 comment: