Data … as usual

All things about data by Laurent Leturgez

Category Archives: CBO

Oracle 11g, frequency histograms, and character columns …

Recently I worked on a performance issue on two Oracle 11g instances (11.2.0.3 and 11.2.0.4) and I hit a bug concerning histograms and character columns. Oracle 11g is now about 7 years old but it’s a version that is still used in lots of systems .. that’s why I decided to write this blog post.

Well, in Oracle 11g, you have two kind of histograms :

  • Frequency histograms: they are calculated when the number of distinct values (NDV) in a column is lower than the number of buckets in the histogram. Frequency histograms are the most accurate histogram, because they count the exact number of occurrences for each value.
  • Height balanced histograms: they are calculated when the NDV is greater than the number of buckets in the histogram. Height balanced histograms are less accurate but they give some precious information to the optimizer on data distribution.

Histograms help the cost optimizer to take the good decision when the CBO have to choose for an access path or a join. (You can find very interesting papers on this topic by Jonathan Lewis or Mohamed Houri)

The problem I met was related to frequency histograms on character columns (CHAR, VARCHAR2 and NVARCHAR2).

In the dictionary, there are two views that give interesting information about how data is distributed:

  • DBA_TAB_COL_STATISTICS: this view gives information about column statistics (density, NDV, sample_size etc.) and if there’s an histogram on the column.
  • DBA_HISTOGRAMS: this view gives information on column histograms and how data values are distributed

Let’s took an example:

LAURENT @ orcl >; create table foo (id number);

Table created.

LAURENT @ orcl > begin
 2 for i in 1..10
 3 loop
 4 insert into foo select i from dual connect by level <=i; 5 end loop; 6 end; 7 / PL/SQL procedure successfully completed. LAURENT @ orcl > commit;

Commit complete.

LAURENT @ orcl > select id,count(*) from foo group by id order by 1;

	ID   COUNT(*)
---------- ----------
	 1	    1
	 2	    2
	 3	    3
	 4	    4
	 5	    5
	 6	    6
	 7	    7
	 8	    8
	 9	    9
	10	   10

10 rows selected.

Now, we gather statistics with a frequency histogram (that’s why I chose 11 buckets):

LAURENT @ orcl > exec dbms_stats.gather_table_stats(ownname=>USER, tabname=>'FOO', method_opt=>'FOR COLUMNS ID SIZE 11', estimate_percent=>100);

PL/SQL procedure successfully completed.

LAURENT @ orcl > select table_name,column_name,num_distinct,density,num_nulls,sample_size,histogram
 2 from user_tab_col_statistics
 3 where table_name='FOO';

TABLE_NAME COLUMN_NAM NUM_DISTINCT    DENSITY  NUM_NULLS SAMPLE_SIZE HISTOGRAM
---------- ---------- ------------ ---------- ---------- ----------- ---------------
FOO        ID                   10 .009090909          0          55 FREQUENCY

Now let’s have a look to the DBA_HISTOGRAMS view (or USER or ALL depending on the context)


LAURENT @ orcl > select table_name,column_name,
  2  endpoint_number,endpoint_value,
  3  endpoint_actual_value
  4  from user_histograms
  5  where table_name='FOO';

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- --------------- -------------- ----------
FOO        ID                       1              1
FOO        ID                       3              2
FOO        ID                       6              3
FOO        ID                      10              4
FOO        ID                      15              5
FOO        ID                      21              6
FOO        ID                      28              7
FOO        ID                      36              8
FOO        ID                      45              9
FOO        ID                      55             10

The ENDPOINT_VALUE represents the column value, for example, and because it’s a frequency histogram, if we’d like to know how many values of 7 there is in the table, we have to take the endpoint_number for this value: 28 and to substract the endpoint_number of the preceding endpoint-value (6) … So there are 28-21=7 lines with the value 7.

Now, you’ve understood this basic … let’s take another example:


LAURENT @ orcl > begin
  2  for i in 1..10000
  3  loop
  4  case mod(i,10)
  5    when 0 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'01'));
  6    when 1 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'02'));
  7    when 2 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'03'));
  8    when 3 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'04'));
  9    when 4 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'05'));
 10    when 5 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'06'));
 11    when 6 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'07'));
 12    when 7 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'08'));
 13    when 8 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'09'));
 14    when 9 then insert into t values(mod(i,10), concat(lpad('X',30,'X'),'10'));
 15  end case;
 16  end loop;
 17  commit;
 18  end;
 19  /

PL/SQL procedure successfully completed.

LAURENT @ orcl > insert into t select 10,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' from dual connect by level <=100000; 

100000 rows created. 

LAURENT @ orcl > commit;

Commit complete.

LAURENT @ orcl > exec dbms_stats.gather_table_stats(ownname=>USER, tabname=>'T', method_opt=>'for all columns size 12', estimate_percent=>100);

PL/SQL procedure successfully completed.

LAURENT @ orcl > select table_name,column_name,density,histogram,sample_size from user_tab_col_statistics where table_name='T';

TABLE_NAME COLUMN_NAM    DENSITY HISTOGRAM       SAMPLE_SIZE
---------- ---------- ---------- --------------- -----------
T          ID         4.5455E-06 FREQUENCY            110000
T          V          4.5455E-06 FREQUENCY            110000

And if we have a look to the data distribution of the column V, no problem …


LAURENT @ orcl > select table_name,column_name,
  2  endpoint_number,endpoint_value,
  3  endpoint_actual_value
  4  from user_histograms
  5  where table_name='T' and column_name='V'
  6  order by ENDPOINT_ACTUAL_VALUE;

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
---------- ---------- --------------- -------------- --------------------------------
T          V                     1000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX01
T          V                     2000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX02
T          V                     3000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX03
T          V                     4000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX04
T          V                     5000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX05
T          V                     6000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX06
T          V                     7000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX07
T          V                     8000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX08
T          V                     9000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX09
T          V                    10000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX10
T          V                   110000     4.5871E+35 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

11 rows selected.

The problem comes when the size of the character column exceeds 32 bytes:


LAURENT @ orcl > drop table t purge;

Table dropped.

LAURENT @ orcl > create table t (id number, v varchar2(64));

Table created.

LAURENT @ orcl > begin
for i in 1..10000
loop
case mod(i,10)
  when 0 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'01'));
  when 1 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'02'));
  when 2 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'03'));
  when 3 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'04'));
  when 4 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'05'));
  when 5 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'06'));
  when 6 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'07'));
  when 7 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'08'));
  when 8 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'09'));
  when 9 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'10'));
end case;
end loop;
commit;
end;
/

LAURENT @ orcl > insert into t select 10,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' from dual connect by level <=100000; 100000 rows created. LAURENT @ orcl > commit;

Commit complete.

LAURENT @ orcl > exec dbms_stats.gather_table_stats(ownname=>USER, tabname=>'T', method_opt=>'for all columns size 12', estimate_percent=>100);

PL/SQL procedure successfully completed.

LAURENT @ orcl > select table_name,column_name,density,histogram,sample_size from user_tab_col_statistics where table_name='T';

TABLE_NAME COLUMN_NAM    DENSITY HISTOGRAM       SAMPLE_SIZE
---------- ---------- ---------- --------------- -----------
T          ID         4.5455E-06 FREQUENCY            110000
T          V          4.5455E-06 FREQUENCY            110000

But … if we have a look to the DBA_HISTOGRAMS view …


LAURENT @ orcl > select table_name,column_name,
  2  endpoint_number,endpoint_value,
  3  endpoint_actual_value
  4  from user_histograms
  5  where table_name='T'
  6  order by COLUMN_NAME,ENDPOINT_ACTUAL_VALUE;

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
---------- ---------- --------------- -------------- --------------------------------
T          ID                    1000              0
T          ID                    2000              1
T          ID                    3000              2
T          ID                    4000              3
T          ID                    5000              4
T          ID                    6000              5
T          ID                    7000              6
T          ID                    8000              7
T          ID                    9000              8
T          ID                   10000              9
T          ID                  110000             10
T          V                   110000     4.5871E+35

LAURENT @ orcl > select count(*) from t where v=lpad('X',32,'X')||'05';

  COUNT(*)
----------
      1000

LAURENT @ orcl > select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
SQL_ID  aass8rg64h38b, child number 0
-------------------------------------
select count(*) from t where v=lpad('X',32,'X')||'05'

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      2 |        |      2 |00:00:00.01 |    1416 |
|   1 |  SORT AGGREGATE    |      |      2 |      1 |      2 |00:00:00.01 |    1416 |
|*  2 |   TABLE ACCESS FULL| T    |      2 |   110K |   2000 |00:00:00.01 |    1416 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("V"='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX05')

19 rows selected.

LAURENT @ orcl > select count(*) from t where v='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX';

  COUNT(*)
----------
    100000

LAURENT @ orcl > select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
SQL_ID  adnfzsc6buqa3, child number 0
-------------------------------------
select count(*) from t where v='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     708 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     708 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |   110K |   100K |00:00:00.01 |     708 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("V"='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')

19 rows selected.

So, with this kind of difference between E-rows and A-rows, you can easily face performance problems, specially if this column has been indexed. So we have a frequency histogram and it doesn’t help us to determine the cardinality in the character column…

This behaviour has been reported in the Bug 18377553 : POOR CARDINALITY ESTIMATE WITH HISTOGRAMS AND VALUES > 32 BYTES.

In my case, the problem occurred with NVARCHAR2 columns (national character set was AL16UTF16), so when the column was containing more that 16 characters, performance problems occurred.

Hopefully, Oracle has published a one-off patch for this bug (available for 11.2.0.3 and 11.2.0.4 on linux x86-64), or as a workaround, you can remove histogram on these kind of columns (with the method_opt: FOR COLUMNS … SIZE 1)

In 12.1, the problem occurs only when the size of the character column exceeds 64 bits (32 characters under AL16UTF16 character set)


SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE	12.1.0.2.0	Production                                                        0
TNS for Linux: Version 12.1.0.2.0 - Production                                            0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

SQL> create table t (id number, v nvarchar2(64));

Table created.

SQL> begin
for i in 1..10000
loop
case mod(i,10)
  when 0 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'01'));
  when 1 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'02'));
  when 2 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'03'));
  when 3 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'04'));
  when 4 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'05'));
  when 5 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'06'));
  when 6 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'07'));
  when 7 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'08'));
  when 8 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'09'));
  when 9 then insert into t values(mod(i,10), concat(lpad('X',32,'X'),'10'));
end case;
end loop;
commit;
end;
/

PL/SQL procedure successfully completed.

SQL> insert into t select 10,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' from dual connect by level <=100000; 100000 rows created. SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>USER, tabname=>'T', method_opt=>'for all columns size 12', estimate_percent=>100);

PL/SQL procedure successfully completed.

SQL> select table_name,column_name,density,histogram,sample_size from user_tab_col_statistics where table_name='T';

TABLE_NAME COLUMN_NAM    DENSITY HISTOGRAM       SAMPLE_SIZE
---------- ---------- ---------- --------------- -----------
T          ID         4.5455E-06 FREQUENCY            110000
T          V          4.5455E-06 FREQUENCY            110000

SQL> select table_name,column_name,
  2  endpoint_number,endpoint_value,
  3  endpoint_actual_value
  4  from user_histograms
  5  where table_name='T'
  6  order by COLUMN_NAME,ENDPOINT_NUMBER;

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
---------- ---------- --------------- -------------- ----------------------------------
T          ID                    1000              0 0
T          ID                    2000              1 1
T          ID                    3000              2 2
T          ID                    4000              3 3
T          ID                    5000              4 4
T          ID                    6000              5 5
T          ID                    7000              6 6
T          ID                    8000              7 7
T          ID                    9000              8 8
T          ID                   10000              9 9
T          ID                  110000             10 10
T          V                   110000     1.7849E+33  X X X X X X X X X X X X X X X X X
                                                      X X X X X X X X X X X X X X X

12 rows selected.

Problem with 12.1 databases, the one-off patch has not been released yet and It will be fixed in 12.2.

Oracle Database 12c CDB$VIEW function

In Oracle 12c multitenant container, there are some new views that complement the USER_, ALL_ and DBA_Views, the CDB_ views.

For example, CDB_TABLES, CDB_OBJECTS, CDB_VIEWS etc. Those views give information for all containers. For example, CDB_TABLES references all tables information in all containers (CDB$ROOT, PDB$SEED and all plugabble databases). Those views have a new column CON_ID that reference the container id, for example:

[SYS@CDB$ROOT | SID:CDB]> select con_id,count(*) from cdb_tables group by con_id;

    CON_ID   COUNT(*)
---------- ----------
         2       2316
         3       2316
         1       2324
 If you have a closer look to the view definition, you will find that every CDB_ view uses a specific function named CDB$VIEW:
[SYS@CDB$ROOT | SID:CDB]> select dbms_metadata.get_ddl('VIEW','CDB_TABLES') from dual;

DBMS_METADATA.GET_DDL('VIEW','CDB_TABLES')
--------------------------------------------------------------------------------
  CREATE OR REPLACE FORCE NONEDITIONABLE VIEW "SYS"."CDB_TABLES"  CONTAINER_DATA
 ("OWNER", "TABLE_NAME", "TABLESPACE_NAME", "CLUSTER_NAME", "IOT_NAME", "STATUS"
, "PCT_FREE", "PCT_USED", "INI_TRANS", "MAX_TRANS", "INITIAL_EXTENT", "NEXT_EXTE
NT", "MIN_EXTENTS", "MAX_EXTENTS", "PCT_INCREASE", "FREELISTS", "FREELIST_GROUPS
", "LOGGING", "BACKED_UP", "NUM_ROWS", "BLOCKS", "EMPTY_BLOCKS", "AVG_SPACE", "C
HAIN_CNT", "AVG_ROW_LEN", "AVG_SPACE_FREELIST_BLOCKS", "NUM_FREELIST_BLOCKS", "D
EGREE", "INSTANCES", "CACHE", "TABLE_LOCK", "SAMPLE_SIZE", "LAST_ANALYZED", "PAR
TITIONED", "IOT_TYPE", "TEMPORARY", "SECONDARY", "NESTED", "BUFFER_POOL", "FLASH
_CACHE", "CELL_FLASH_CACHE", "ROW_MOVEMENT", "GLOBAL_STATS", "USER_STATS", "DURA
TION", "SKIP_CORRUPT", "MONITORING", "CLUSTER_OWNER", "DEPENDENCIES", "COMPRESSI
ON", "COMPRESS_FOR", "DROPPED", "READ_ONLY", "SEGMENT_CREATED", "RESULT_CACHE",
"CLUSTERING", "ACTIVITY_TRACKING", "DML_TIMESTAMP", "HAS_IDENTITY", "CONTAINER_D
ATA", "CON_ID") AS
  SELECT "OWNER","TABLE_NAME","TABLESPACE_NAME","CLUSTER_NAME","IOT_NAME","STATU
S","PCT_FREE","PCT_USED","INI_TRANS","MAX_TRANS","INITIAL_EXTENT","NEXT_EXTENT",
"MIN_EXTENTS","MAX_EXTENTS","PCT_INCREASE","FREELISTS","FREELIST_GROUPS","LOGGIN
G","BACKED_UP","NUM_ROWS","BLOCKS","EMPTY_BLOCKS","AVG_SPACE","CHAIN_CNT","AVG_R
OW_LEN","AVG_SPACE_FREELIST_BLOCKS","NUM_FREELIST_BLOCKS","DEGREE","INSTANCES","
CACHE","TABLE_LOCK","SAMPLE_SIZE","LAST_ANALYZED","PARTITIONED","IOT_TYPE","TEMP
ORARY","SECONDARY","NESTED","BUFFER_POOL","FLASH_CACHE","CELL_FLASH_CACHE","ROW_
MOVEMENT","GLOBAL_STATS","USER_STATS","DURATION","SKIP_CORRUPT","MONITORING","CL
USTER_OWNER","DEPENDENCIES","COMPRESSION","COMPRESS_FOR","DROPPED","READ_ONLY","
SEGMENT_CREATED","RESULT_CACHE","CLUSTERING","ACTIVITY_TRACKING","DML_TIMESTAMP"
,"HAS_IDENTITY","CONTAINER_DATA","CON_ID" FROM CDB$VIEW("SYS"."DBA_TABLES")
After searching a long time, I didn’t find any definition of this function, so I decided to search more about its behaviour.
First of all, this function seems “to transform” the view or the table given in parameter by adding a CON_ID table. We will see later that it’s not a CBO transformation as we know it but a “low level” transformation performed before SQL parsing.
CDB$VIEW function can be used with static views (DBA_TABLES, DBA_OBJECTS etc.), dictionary tables (OBJ$, USER$, FILE$ etc.), on dynamic performance views and X$ fixed tables, but on those two last items, the result is a little bit different than other.
[SYS@CDB$ROOT | SID:CDB]> select count(*) from cdb$view("DBA_TABLES");

  COUNT(*)
----------
      6956

[SYS@CDB$ROOT | SID:CDB]> select count(*) from cdb$view("OBJ$");

  COUNT(*)
----------
    272326
In a multitenant database, each container has its own dictionary, for example, CDB$ROOT has its OBJ$ table, the PDB$SEED has its own one and every pluggable database has its own OBJ$ table. So, the goal of this function is to give a global view from all containers.
But, as the OBJ$ table has the same name in every container, and as there’s no method to access a specific dictionary table of a pluggable database from the root container, the CDB$VIEW function is used to aggregate data from a specific view or table executed in each container:
[SYS@CDB$ROOT | SID:CDB]> show con_id;

CON_ID
------------------------------
1
[SYS@CDB$ROOT | SID:CDB]> select count(*) from obj$;

  COUNT(*)
----------
     90847

[SYS@CDB$ROOT | SID:CDB]>  select con_id,count(*) from cdb$view("OBJ$") group by con_id;

    CON_ID   COUNT(*)
---------- ----------
         1      90847
         2      90716
         3      90763
Now, let’s have a closer look to the generated plan:
[SYS@CDB$ROOT | SID:CDB]> select count(*) from obj$;

  COUNT(*)
----------
     90847

Execution Plan
----------------------------------------------------------
Plan hash value: 3951003077

------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |     1 |    91   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |        |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| I_OBJ1 | 90910 |    91   (0)| 00:00:01 |
------------------------------------------------------------------------
[SYS@CDB$ROOT | SID:CDB]>  select count(*) from cdb$view("OBJ$");

  COUNT(*)
----------
    272326

Execution Plan
----------------------------------------------------------
Plan hash value: 2345629731

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name             | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                  |     1 |     2 (100)| 00:00:01 |       |       |        |      |            |
|   1 |  SORT AGGREGATE           |                  |     1 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR          |                  |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000         |     1 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE        |                  |     1 |            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      PX PARTITION LIST ALL|                  | 90910 |     2 (100)| 00:00:01 |     1 |   254 |  Q1,00 | PCWC |            |
|   6 |       FIXED TABLE FULL    | X$CDBVW$c2fad3da | 90910 |     2 (100)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------------------
There are two interesting things to note:
  • CDB$VIEW transform the statement to query an X$CDBVW$c2fad3da fixed table
  • This fixed table is read in parallel

About the transformation, I noticed many things.

First, if we query another table or view, another X$CDBVW$ is generated:

[SYS@CDB$ROOT | SID:CDB]> select count(*) from cdb$view("TAB$");

  COUNT(*)
----------
      7098

Execution Plan
----------------------------------------------------------
Plan hash value: 111784239

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name             | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                  |     1 |     0   (0)| 00:00:01 |       |       |        |      |            |
|   1 |  SORT AGGREGATE           |                  |     1 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR          |                  |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000         |     1 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE        |                  |     1 |            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      PX PARTITION LIST ALL|                  |  2372 |     0   (0)| 00:00:01 |     1 |   254 |  Q1,00 | PCWC |            |
|   6 |       FIXED TABLE FULL    | X$CDBVW$00303d4d |  2372 |     0   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------------------
So, it seems that there’s a hash added at the end of this X$CDBVW$. It’s interesting to see that the fixed table X$CDBVW$ exists in the instance but have no rows recorded in:
[SYS@CDB$ROOT | SID:CDB]> desc X$CDBVW$
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ADDR                                               RAW(8)
 INDX                                               NUMBER
 INST_ID                                            NUMBER
 CON_ID                                             NUMBER

[SYS@CDB$ROOT | SID:CDB]> select * from X$CDBVW$;

no rows selected
Second thing to note is this transformation is not a CBO transformation, indeed if you have a closer look to a 10053 trace file, the query has already been rewrited before parsing:
Stmt: ******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(*)" FROM "SYS"."X$CDBVW$00303d4d" "TAB$"
Objects referenced in the statement
  X$CDBVW$00303d4d[TAB$] 4, type = 1
Objects in the hash table
  Hash table Object 4, type = 1, ownerid = 795352840147398986:
    Dynamic Sampling Directives at location 1:
       dirid = 11259849835960924452, state = 5, flags = 1, loc = 1 {E(4)[30]}
Return code in qosdInitDirCtx: ENBLD
Concerning the parallel execution, I noticed partition pruning is made upon the CON_ID column:
[SYS@CDB$ROOT | SID:CDB]> select count(*) from cdb$view("TAB$") where con_id=1;

  COUNT(*)
----------
      2372

Execution Plan
----------------------------------------------------------
Plan hash value: 2579428923

----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    13 |     0   (0)| 00:00:01 |       |       |        |      |            |
|   1 |  SORT AGGREGATE              |                  |     1 |    13 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR             |                  |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)       | :TQ10000         |     1 |    13 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE           |                  |     1 |    13 |            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      PX PARTITION LIST SINGLE|                  |    24 |   312 |     0   (0)| 00:00:01 |     1 |     1 |  Q1,00 | PCWC |            |
|   6 |       FIXED TABLE FULL       | X$CDBVW$00303d4d |    24 |   312 |     0   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------------------------
 To sum up, cdb$view function creates a fixed table on the fly and load data taken from all containers, the CON_ID column in this fixed table references the container id from where data are read. This fixed table is partitioned by CON_ID and reading this table is made by default in parallel mode (partition pruning is made on the CON_ID column).
There are two hidden parameters used to change this behaviour:
  • _partition_cdb_view_enabled: this parameter seems to disable the partitioning of the X$CDBVW$ view
[SYS@CDB$ROOT | SID:CDB]> alter session set "_partition_cdb_view_enabled"=FALSE;

Session altered.

[SYS@CDB$ROOT | SID:CDB]> select count(*) from cdb$view("TAB$");

  COUNT(*)
----------
      7098

Execution Plan
----------------------------------------------------------
Plan hash value: 4209588238

------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |     1 |     0   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                  |     1 |            |          |
|   2 |   FIXED TABLE FULL| X$CDBVW$00303d4d |  2372 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------
  • _px_cdb_view_enabled: this parameter seems to disable parallel scan of the X$CDBVW$ view
[SYS@CDB$ROOT | SID:CDB]> alter session set "_px_cdb_view_enabled"=FALSE;

Session altered.

[SYS@CDB$ROOT | SID:CDB]> select count(*) from cdb$view("TAB$");

  COUNT(*)
----------
      7098

Execution Plan
----------------------------------------------------------
Plan hash value: 3630495286

------------------------------------------------------------------------------------------------
| Id  | Operation           | Name             | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                  |     1 |     0   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE     |                  |     1 |            |          |       |       |
|   2 |   PARTITION LIST ALL|                  |  2372 |     0   (0)| 00:00:01 |     1 |   254 |
|   3 |    FIXED TABLE FULL | X$CDBVW$00303d4d |  2372 |     0   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------

On the CBO side, I tried to have a closer look on how statistics are evaluated for this particular view.

It seems that the CBO uses the cardinality of the dictionary table available in the CDB$ROOT container and ignore the statistics of the other dictionary table (taken from the other containers):

[SYS@CDB$ROOT | SID:CDB]> select con_id,table_name,num_rows from cdb_tables where table_name='TAB$';

    CON_ID TABLE_NAME        NUM_ROWS
---------- --------------- ----------
         1 TAB$                  2372
         2 TAB$                  1462
         3 TAB$                  2363

[SYS@CDB$ROOT | SID:CDB]> alter session set events '10053 trace name context forever, level 1';

Session altered.

[SYS@CDB$ROOT | SID:CDB]> select /* PARSE_ME */ count(*) from cdb$view("TAB$");

  COUNT(*)
----------
      7098
 But If we have a closer look to the generated 10053 trace file, and the cardinalities seems to be a little bit different from reality:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: X$CDBVW$00303d4d  Alias: TAB$
  #Rows: 2372  #Blks:  1487  AvgRowLen:  134.00  ChainCnt:  0.00
  FixedTabRowCost:  1000.00
=======================================
SPD: BEGIN context at query block level
=======================================
Query Block SEL$1 (#0)
Return code in qosdSetupDirCtx4QB: NOCTX
=====================================
SPD: END context at query block level
=====================================
Access path analysis for X$CDBVW$00303d4d
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for X$CDBVW$00303d4d[TAB$]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Table: X$CDBVW$00303d4d  Alias: TAB$
    Card: Original: 2372.000000  Rounded: 2372  Computed: 2372.00  Non Adjusted: 2372.00
  Access Path: TableScan
    Cost:  0.06  Resp: 0.06  Degree: 0
      Cost_io: 0.00  Cost_cpu: 2372000
      Resp_io: 0.00  Resp_cpu: 2372000
  Best:: AccessPath: TableScan
         Cost: 0.06  Degree: 1  Resp: 0.06  Card: 2372.00  Bytes: 0

***************************************
 .../...
============
Plan Table
============
-----------------------------------------------------+-----------------------------------+-------------------------+---------------+
| Id  | Operation                  | Name            | Rows  | Bytes | Cost  | Time      |  TQ  |IN-OUT|PQ Distrib | Pstart| Pstop |
-----------------------------------------------------+-----------------------------------+-------------------------+---------------+
| 0   | SELECT STATEMENT           |                 |       |       |     1 |           |      |      |           |       |       |
| 1   |  SORT AGGREGATE            |                 |     1 |       |       |           |      |      |           |       |       |
| 2   |   PX COORDINATOR           |                 |       |       |       |           |      |      |           |       |       |
| 3   |    PX SEND QC (RANDOM)     | :TQ10000        |     1 |       |       |           |:Q1000| P->S |QC (RANDOM)|       |       |
| 4   |     SORT AGGREGATE         |                 |     1 |       |       |           |:Q1000| PCWP |           |       |       |
| 5   |      PX PARTITION LIST ALL |                 |  2372 |       |     0 |           |:Q1000| PCWC |           | 1     | 254   |
| 6   |       FIXED TABLE FULL     | X$CDBVW$00303d4d|  2372 |       |     0 |           |:Q1000| PCWP |           |       |       |
-----------------------------------------------------+-----------------------------------+-------------------------+---------------+
Now imagine, you have consolidated a multitenant database with 200 pdbs, each pdb host an ERP with tons of table, you will have a small difference between evaluation and real row count.
To conclude this part, If you perform the same analysis but you query the CDB_TABLES view, this one will be rewrite in CDB$VIEW(“DBA_TABLES”). As the DBA_TABLES is a view and is not analyzed, Oracle seems to set the cardinality to 10000 rows. I have made some tests on many views and it’s always the same cardinality : 10000 rows and 0 block.
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: X$CDBVW$73c6ce97  Alias: DBA_TABLES  (NOT ANALYZED)
  #Rows: 10000  #Blks:  0  AvgRowLen:  464.00  ChainCnt:  0.00
  FixedTabRowCost:  1000.00
=======================================
SPD: BEGIN context at query block level
=======================================
Query Block SEL$F5BB74E1 (#0)
Return code in qosdSetupDirCtx4QB: NOCTX
=====================================
SPD: END context at query block level
=====================================
Access path analysis for X$CDBVW$73c6ce97
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for X$CDBVW$73c6ce97[DBA_TABLES]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Table: X$CDBVW$73c6ce97  Alias: DBA_TABLES
    Card: Original: 10000.000000  Rounded: 10000  Computed: 10000.00  Non Adjusted: 10000.00
  Access Path: TableScan
    Cost:  0.25  Resp: 0.25  Degree: 0
      Cost_io: 0.00  Cost_cpu: 10000000
      Resp_io: 0.00  Resp_cpu: 10000000
  Best:: AccessPath: TableScan
         Cost: 0.25  Degree: 1  Resp: 0.25  Card: 10000.00  Bytes: 0

***************************************
To conclude this part, even if statistics on CDB$VIEW(“DBA_TABLES”) are always set to 10000 rows, it is safe to use CDB_ views directly instead on CDB$VIEW function on a dictionary table, even if I didn’t find any more accurate method.
The last researches I’ve made is on how the Oracle kernel will transform the query before parsing.
As we saw previously, the CBO receives an unparsed query which is already rewritten with the X$CDBVW$ view, so the transformation might be performed at a low level.
In Oracle 12c there’s a new object file fitted in libserver.a library (you can get this file in the 12.1 PSU2 files directory). This object file is named kpdbcv.o and it contains many function named kpdbcv* (I think this means Kernel PDB Cdb View).
[oracle@oel63 libserver12.a]$ pwd
/home/oracle/patch/17552800/files/lib/libserver12.a
[oracle@oel63 libserver12.a]$ readelf -a kpdbcv.o | grep FUNC
     7: 0000000000000270   448 FUNC    LOCAL  DEFAULT    4 kpdbcvRwtToInlineView
    10: 00000000000008f0  1216 FUNC    LOCAL  DEFAULT    4 kpdbcvLoadFragDescr
    13: 0000000000001c60    96 FUNC    LOCAL  DEFAULT    4 kpdbcvLoadBaseInfo
    14: 00000000000003b0    80 FUNC    LOCAL  DEFAULT    3 kpdbcvBuildColNames
    15: 0000000000009d50  1680 FUNC    LOCAL  DEFAULT    3 kpdbcvGetSysCtxCurrUser
    16: 0000000000001940  7616 FUNC    LOCAL  DEFAULT    3 kpdbcvFetchCbkCon
    17: 0000000000003700    96 FUNC    LOCAL  DEFAULT    3 kpdbcvFetchCbkSys
    18: 0000000000004ff0  1680 FUNC    LOCAL  DEFAULT    3 kpdbcvInsBaseViewStats
    19: 0000000000005680  1744 FUNC    LOCAL  DEFAULT    3 kpdbcvGetBaseViewRowCnt
    20: 0000000000006bc0  1792 FUNC    LOCAL  DEFAULT    3 kpdbcvGetBaseViewStats
    21: 00000000000072c0  1136 FUNC    LOCAL  DEFAULT    3 kpdbcvGetFxdViewStats
    22: 0000000000007730   544 FUNC    LOCAL  DEFAULT    3 kpdbcvFxdViewStatsCbk
    23: 0000000000007b30    80 FUNC    LOCAL  DEFAULT    3 kpdbcvGetConIdCnt
    24: 0000000000007b80   448 FUNC    LOCAL  DEFAULT    3 kpdbcvConIdMorph
    25: 0000000000007d40    80 FUNC    LOCAL  DEFAULT    3 kpdbcvGatherBind
    26: 0000000000007d90  2304 FUNC    LOCAL  DEFAULT    3 kpdbcvPrintPredText
    27: 0000000000008690   304 FUNC    LOCAL  DEFAULT    3 kpdbcvVisitPreds
    28: 00000000000087c0   384 FUNC    LOCAL  DEFAULT    3 kpdbcvVisitOpns
    29: 0000000000008940   384 FUNC    LOCAL  DEFAULT    3 kpdbcvReplaceBind
    30: 0000000000008ac0  2944 FUNC    LOCAL  DEFAULT    3 kpdbcvIsValidWhere
    31: 0000000000009640  1648 FUNC    LOCAL  DEFAULT    3 kpdbcvPredicateToText
    37: 0000000000000010    32 FUNC    GLOBAL DEFAULT    3 kpdbcdbvwcbk
    38: 0000000000000030    16 FUNC    GLOBAL DEFAULT    3 kpdbcvComDataCbk
    40: 0000000000000010   608 FUNC    GLOBAL DEFAULT    4 kpdbcvXformFrodef
    52: 0000000000000430  1216 FUNC    GLOBAL DEFAULT    4 kpdbcvLoadPartDescr
    73: 0000000000000db0   320 FUNC    GLOBAL DEFAULT    4 kpdbcvIsPartitioned
    88: 0000000000000050   784 FUNC    GLOBAL DEFAULT    3 kpdbcvIsParallelizable
    93: 0000000000000360    64 FUNC    GLOBAL DEFAULT    3 kpdbcvGetDOP
    99: 0000000000000f00  3232 FUNC    GLOBAL DEFAULT    4 kpdbcvLoad
   135: 0000000000000400  5440 FUNC    GLOBAL DEFAULT    3 kpdbcvFetchStart
   201: 0000000000003760  4560 FUNC    GLOBAL DEFAULT    3 kpdbcvFetch
   216: 0000000000004930   304 FUNC    GLOBAL DEFAULT    3 kpdbcvCleanup
   217: 0000000000001bb0   160 FUNC    GLOBAL DEFAULT    4 kpdbcvIsCDBViewOwner
   219: 0000000000004a70  1408 FUNC    GLOBAL DEFAULT    3 kpdbcvDelALLBaseViewStats
   221: 0000000000005d50  3696 FUNC    GLOBAL DEFAULT    3 kpdbcvInsALLBaseViewStats
   222: 0000000000007950    48 FUNC    GLOBAL DEFAULT    3 kpdbcvCDBViewStatsCbk
   223: 0000000000007980    48 FUNC    GLOBAL DEFAULT    3 kpdbcvComDataStatsCbk
   224: 00000000000079b0   384 FUNC    GLOBAL DEFAULT    3 kpdbcvGetParamValue
   241: 0000000000009cb0   160 FUNC    GLOBAL DEFAULT    3 kpdbcvAllocate
Depending on the context (cdb view partitoning enabled or not, parallel scan enabled or not), you will have a different call stack, but with gdb I found that the common point is the first function called which is kpdbcvXformFrodef function.
If you are curious and want to view the kpdbcv* function call stack during a select on CDB_ view, you can use gdb:
(gdb) rbreak ^kpdbcv
(gdb) commands
 Type commands for breakpoint(s) 1-38, one per line.
 End with a line saying just "end".
 >continue
 >end
(gdb) c
Continuing.

Breakpoint 1, 0x0000000002578620 in kpdbcvXformFrodef ()

Breakpoint 6, 0x0000000002579510 in kpdbcvLoad ()

Breakpoint 5, 0x00000000025793c0 in kpdbcvIsPartitioned ()

Breakpoint 5, 0x00000000025793c0 in kpdbcvIsPartitioned ()

Breakpoint 3, 0x0000000002578a40 in kpdbcvLoadPartDescr ()

Breakpoint 4, 0x0000000002578f00 in kpdbcvLoadFragDescr ()

Breakpoint 4, 0x0000000002578f00 in kpdbcvLoadFragDescr ()

.../...
Breakpoint 4, 0x0000000002578f00 in kpdbcvLoadFragDescr ()

Breakpoint 4, 0x0000000002578f00 in kpdbcvLoadFragDescr ()

Breakpoint 4, 0x0000000002578f00 in kpdbcvLoadFragDescr ()

Breakpoint 7, 0x000000000257a1c0 in kpdbcvIsCDBViewOwner ()

Breakpoint 25, 0x00000000083cd580 in kpdbcvCDBViewStatsCbk ()

Breakpoint 24, 0x00000000083cd360 in kpdbcvFxdViewStatsCbk ()

Breakpoint 23, 0x00000000083ccef0 in kpdbcvGetFxdViewStats ()

Breakpoint 22, 0x00000000083cc7f0 in kpdbcvGetBaseViewStats ()

Breakpoint 22, 0x00000000083cc7f0 in kpdbcvGetBaseViewStats ()

Breakpoint 10, 0x00000000083c5c80 in kpdbcvIsParallelizable ()

Breakpoint 5, 0x00000000025793c0 in kpdbcvIsPartitioned ()

Breakpoint 11, 0x00000000083c5f90 in kpdbcvGetDOP ()

Breakpoint 37, 0x00000000083cf8e0 in kpdbcvAllocate ()

Breakpoint 36, 0x00000000083cf270 in kpdbcvPredicateToText ()
To sum up this post, we have to keep in mind that queries on CDB_ views use an internal function CDB$VIEW that transform the sql text before parsing and as a result, there is a new fixed table X$CDBVW$ concatenated to a hash that might identify the source table or view. The statement that will run by default in parallel mode because the X$CDBVW$ fixed table is build as partitioned (by default).
On the CBO side, the result of the CDB_ view transformation give an object which have a default cardinality that seems to be always set at a value of 10000 rows. The result is quite different if you try to run CDB$VIEW on dictionary tables (OBJ$, TAB$ etc) and can give some trouble if you have a lot of PDBs opened in your container.
All this transformations seem to be hard coded (partially) in an object file kpdbcv.o included in the libserver12.a library.

Statistics on fixed objects

On a previous post, I spoke about statistics and I promised new posts about statistics on fixed objects and system stats and their potential impact.

Fixed objects are internal structures where oracle internal data are stored. This objects are the source of data for dynamic performance views (V$SQL, V$ACCESS, and many V$ views).

Let’s see how statistics can be important on this.

To demonstrate this, I used the V$ACCESS view which have the SQL above for definition:

SELECT DISTINCT s.inst_id,
  s.ksusenum,
  o.kglnaown,
  o.kglnaobj,
  o.kglobtyd,
  s.con_id
FROM x$ksuse s,
  x$kglob o,
  x$kgldp d,
  x$kgllk l
WHERE l.kgllkuse=s.addr
AND l.kgllkhdl  =d.kglhdadr
AND l.kglnahsh  =d.kglnahsh
AND o.kglnahsh  =d.kglrfhsh
AND o.kglhdadr  =d.kglrfhdl
First, we’ll have a look at a simple query on V$ACCESS on a 12cR1 instance without any statistics. Notice that I disable cardinality feedback … we’ll see why later.
SQL> exec dbms_stats.delete_fixed_objects_stats;

PL/SQL procedure successfully completed.

SQL> select rowcnt, blkcnt, analyzetime, samplesize
  2  from tab_stats$
  3  where obj# in (select OBJECT_ID from V$FIXED_TABLE)
  4  /

no rows selected

SQL> select /*+ opt_param('_OPTIMIZER_USE_FEEDBACK','FALSE') */ count(*) from v$access;

  COUNT(*)
----------
       845

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------
SQL_ID  bqjbmqr1xtypb, child number 0
-------------------------------------
select /*+ opt_param('_OPTIMIZER_USE_FEEDBACK','FALSE') */ count(*)
from v$access

Plan hash value: 2879213603

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |       |       |     1 (100)|
|   1 |  SORT AGGREGATE              |                 |     1 |       |            |
|   2 |   VIEW                       | GV$ACCESS       |     1 |       |     0   (0)|
|   3 |    HASH UNIQUE               |                 |     1 |   697 |     0   (0)|
|   4 |     NESTED LOOPS             |                 |     1 |   697 |     0   (0)|
|   5 |      NESTED LOOPS            |                 |     1 |   108 |     0   (0)|
|   6 |       HASH JOIN              |                 |     1 |    70 |     0   (0)|
|   7 |        FIXED TABLE FULL      | X$KSUSE         |     1 |    45 |     0   (0)|
|   8 |        FIXED TABLE FULL      | X$KGLLK         |   100 |  2500 |     0   (0)|
|   9 |       FIXED TABLE FIXED INDEX| X$KGLDP (ind:1) |     1 |    38 |     0   (0)|
|  10 |      FIXED TABLE FIXED INDEX | X$KGLOB (ind:1) |     1 |   589 |     0   (0)|
-------------------------------------------------------------------------------------

23 rows selected.

SQL> exec dbms_stats.gather_fixed_objects_stats;

PL/SQL procedure successfully completed.

SQL> select /*+ opt_param('_OPTIMIZER_USE_FEEDBACK','FALSE') */ count(*)
  2  from v$access;

  COUNT(*)
----------
       911

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------
SQL_ID  12qyw31bvmptb, child number 0
-------------------------------------
select /*+ opt_param('_OPTIMIZER_USE_FEEDBACK','FALSE') */ count(*)
from v$access

Plan hash value: 2296283840

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE             |                 |     1 |       |            |          |
|   2 |   VIEW                      | GV$ACCESS       |  1226 |       |     1 (100)| 00:00:01 |
|   3 |    HASH UNIQUE              |                 |  1226 |   335K|     1 (100)| 00:00:01 |
|   4 |     NESTED LOOPS            |                 |  1226 |   335K|     1 (100)| 00:00:01 |
|   5 |      HASH JOIN              |                 |  1226 | 91950 |     1 (100)| 00:00:01 |
|   6 |       HASH JOIN             |                 |  1226 | 52718 |     0   (0)|          |
|   7 |        FIXED TABLE FULL     | X$KSUSE         |   472 |  8496 |     0   (0)|          |
|   8 |        FIXED TABLE FULL     | X$KGLLK         |  1226 | 30650 |     0   (0)|          |
|   9 |       FIXED TABLE FULL      | X$KGLDP         |  4524 |   141K|     0   (0)|          |
|  10 |      FIXED TABLE FIXED INDEX| X$KGLOB (ind:1) |     1 |   205 |     0   (0)|          |
-----------------------------------------------------------------------------------------------

23 rows selected.
Notice that, the cardinality estimation have changed before and after statistics gathering. The second one is more precise and closer from the real value. The plan have changed too but without any effect on execution.
Starting in Oracle 11gR2, cardinality feedback can help to calculate a better cardinality based on previous execution. If a first execution is estimated as a bad execution (ie. estimated cardinality is significantly different from the real cardinality), oracle stores the real cardinality and uses it for a future execution. This feature can be very helpful for the CBO in the case of repeated statement and cardinalities which don’t evolve too much. (See this link for more information: https://blogs.oracle.com/optimizer/entry/cardinality_feedback)
So, you can choose to gather statistics or to leave cardinality feedback do the job.
The next example show how cardinality feedback can have a bad impact on fixed objects cardinality estimation.
SQL> exec dbms_stats.delete_fixed_objects_stats;

PL/SQL procedure successfully completed.

SQL> select count(*) from v$access;

  COUNT(*)
----------
       315

Elapsed: 00:00:00.03
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID  dw4h66wchcsm7, child number 0
-------------------------------------
select count(*) from v$access

Plan hash value: 2879213603

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |       |       |     1 (100)|
|   1 |  SORT AGGREGATE              |                 |     1 |       |            |
|   2 |   VIEW                       | GV$ACCESS       |     1 |       |     0   (0)|
|   3 |    HASH UNIQUE               |                 |     1 |   697 |     0   (0)|
|   4 |     NESTED LOOPS             |                 |     1 |   697 |     0   (0)|
|   5 |      NESTED LOOPS            |                 |     1 |   108 |     0   (0)|
|   6 |       HASH JOIN              |                 |     1 |    70 |     0   (0)|
|   7 |        FIXED TABLE FULL      | X$KSUSE         |     1 |    45 |     0   (0)|
|   8 |        FIXED TABLE FULL      | X$KGLLK         |   100 |  2500 |     0   (0)|
|   9 |       FIXED TABLE FIXED INDEX| X$KGLDP (ind:1) |     1 |    38 |     0   (0)|
|  10 |      FIXED TABLE FIXED INDEX | X$KGLOB (ind:1) |     1 |   589 |     0   (0)|
-------------------------------------------------------------------------------------

22 rows selected.

Elapsed: 00:00:00.23
 This takes less than a second.
I wait a while and take a coffee. Then, I execute a the same query, cardinality feedback will be used because there’s a significant difference between the first estimation and the real cardinality.
SQL> select count(*) from v$access;

  COUNT(*)
----------
       429

Elapsed: 00:00:10.44
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------
SQL_ID  dw4h66wchcsm7, child number 2
-------------------------------------
select count(*) from v$access

Plan hash value: 1711387798

------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |       |       |     1 (100)|
|   1 |  SORT AGGREGATE             |                 |     1 |       |            |
|   2 |   VIEW                      | GV$ACCESS       |   315 |       |     0   (0)|
|   3 |    HASH UNIQUE              |                 |   315 |   214K|     0   (0)|
|   4 |     NESTED LOOPS SEMI       |                 |   470 |   319K|     0   (0)|
|   5 |      HASH JOIN              |                 |   100 | 65900 |     0   (0)|
|   6 |       FIXED TABLE FULL      | X$KGLLK         |   100 |  2500 |     0   (0)|
|   7 |       MERGE JOIN CARTESIAN  |                 |   100 | 63400 |     0   (0)|
|   8 |        FIXED TABLE FULL     | X$KSUSE         |     1 |    45 |     0   (0)|
|   9 |        BUFFER SORT          |                 |   100 | 58900 |     0   (0)|
|  10 |         FIXED TABLE FULL    | X$KGLOB         |   100 | 58900 |     0   (0)|
|  11 |      FIXED TABLE FIXED INDEX| X$KGLDP (ind:1) |   470M|    16G|     0   (0)|
------------------------------------------------------------------------------------

Note
-----
   - statistics feedback used for this statement

27 rows selected.

Elapsed: 00:00:00.04
The query is executed in 10 seconds with a changed plan which is now using a MERGE JOIN CARTESIAN operation.
As these internal structures are used by many views and in internal processes, a misestimation can directly have an impact on the oracle kernel performance.
That’s why it’s recommended to gather these stats every time there’s a significant change on the instance configuration (memory configuration for example), new apps deployment etc. And, if you can, it’s recommended to gather these stats during a representative workload.

Write SQL statements on internal structures in multitenant databases.

In a multitenant configuration, if you are querying X$ structures and fixed tables (OBJ$, TAB$ etc.), you will face differents behaviour depending on the fact you are connected to the root container or to a pluggable database.

  • If you are connected to a root container (CDB$ROOT)
    • In fixed tables, for example OBJ$, there’s no CON_ID column defined in this table so this will give you all objects for your root container, and not for all containers.
SQL> show con_name
CON_NAME
------------------------------
CDB$ROOT
SQL> desc obj$
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OBJ#                                      NOT NULL NUMBER
 DATAOBJ#                                           NUMBER
 OWNER#                                    NOT NULL NUMBER
 NAME                                      NOT NULL VARCHAR2(128)
 NAMESPACE                                 NOT NULL NUMBER
 SUBNAME                                            VARCHAR2(128)
 TYPE#                                     NOT NULL NUMBER
 CTIME                                     NOT NULL DATE
 MTIME                                     NOT NULL DATE
 STIME                                     NOT NULL DATE
 STATUS                                    NOT NULL NUMBER
 REMOTEOWNER                                        VARCHAR2(128)
 LINKNAME                                           VARCHAR2(128)
 FLAGS                                              NUMBER
 OID$                                               RAW(16)
 SPARE1                                             NUMBER
 SPARE2                                             NUMBER
 SPARE3                                             NUMBER
 SPARE4                                             VARCHAR2(1000)
 SPARE5                                             VARCHAR2(1000)
 SPARE6                                             DATE
 SIGNATURE                                          RAW(16)
 SPARE7                                             NUMBER
 SPARE8                                             NUMBER
 SPARE9                                             NUMBER
    • X$ structures will give you information for all containers (ROOT$CDB, SEED and and all pluggable database). And there’s a CON_ID column defined in all those structures.
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT

SQL> desc x$ksppi
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ADDR                                               RAW(8)
 INDX                                               NUMBER
 INST_ID                                            NUMBER
 CON_ID                                             NUMBER
 KSPPINM                                            VARCHAR2(80)
 KSPPITY                                            NUMBER
 KSPPDESC                                           VARCHAR2(255)
 KSPPIFLG                                           NUMBER
 KSPPILRMFLG                                        NUMBER
 KSPPIHASH                                          NUMBER

SQL> select c.name,KSPPINM
  2  from x$ksppi x, v$containers c
  3  where x.con_id=c.con_id and  KSPPINM='open_cursors';

NAME                           KSPPINM
------------------------------ ------------------------------
CDB$ROOT                       open_cursors
PDB$SEED                       open_cursors
PDB1                           open_cursors
PDB2                           open_cursors
  • If you are connected to a PDB
    • In fixed tables, for example OBJ$, there’s still no CON_ID, so this will give you all objects on your PDB.
    • In X$ structures, there’s a CON_ID column, but if you are connected to a PDB, you will see only the data related to this PDB.
To summarize, static fixed tables contain data for the actual container (CDB$ROOT or PDB), but X$ structures contain data for all containers if you are connected to the CDB$ROOT container, and contain data related to the PDB you are connected to, in case of a PDB.
If you disassemble static views like CDB_TABLES, you will see a new function used to have data for all containers. This is CDB$VIEW function :
SQL> show con_id

CON_ID
------------------------------
1
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT
SQL> select count(*) from obj$;

  COUNT(*)
----------
     91005

SQL> select con_id,count(*) from CDB$VIEW("SYS"."OBJ$") group by con_id order by 1;

    CON_ID   COUNT(*)
---------- ----------
         1      91005
         2      90708
         3      90960
         4      90948
 Ok, now we have the information for all our containers. But if we want to join an X$ Structure and a CDB$VIEW transformed object, time for execute this is too long:
select x.con_id,o.name,count(*)
from x$bh x, (select name,dataobj#,con_id from CDB$VIEW("SYS"."OBJ$")) o
where x.con_id=o.con_id
and o.dataobj#=x.obj
and o.name like 'T_PDB%'
group by x.con_id,o.name
/
... never ends :(
To execute it in a better time, I used query factorization with a WITH block, and forced materialization of it:
SQL> with o as (select /*+ MATERIALIZE */ name,dataobj#,con_id from CDB$VIEW("SYS"."OBJ$"))
  2  select x.con_id,o.name,count(*)
  3  from x$bh x,o
  4  where x.con_id=o.con_id
  5  and o.dataobj#=x.obj
  6  and o.name like 'T_PDB%'
  7  group by x.con_id,o.name
  8  /

    CON_ID NAME                   COUNT(*)
---------- -------------------- ----------
         4 T_PDB2                        9
         3 T_PDB1                        9

Elapsed: 00:00:02.40

Trace Oracle CBO computations for a specific sql_id

In this post, I will explain how to trace CBO computation (aka 10053 event) for a specific sql_id and in another session.

To do this, we need to know two things:

1) How to trace another session? To do this, I will use undocumented oracle tool “oradebug”. More precisely, I will use the new event declaration syntax which is not based on event id.

2) How to trace CBO computation for a specific sql_id? To do this, I will use the new event declaration syntax (more details in the demonstration above)

To demonstrate this trick, I will consider two sessions:

– The first session (S1) is logged as an application user named LAURENT. This user owns two table T1 and T4, and we only want to trace a specific SQL Query (select count(*) from t4 where id between 500 and 550;).

– The second session (S2) is logged as SYS user who will launch oradebug commands.

* S1 (logged as LAURENT)

SQL> select count(*) from t4 where id between 500 and 550;
COUNT(*)
----------
 51

* S2 (logged as SYS) : querying the dictionary to obtain sql_id associated to the SQL query:

SQL> select sql_id,sql_text from v$sql
 2 where sql_text like 'select count(*) from t4 where id between 500 and 550%';
SQL_ID        SQL_TEXT
------------- --------------------------------------------------------------------------------
2zg40utr7a08n select count(*) from t4 where id between 500 and 550

* S1 (logged as LAURENT): obtain Oracle PID and system PID (spid) of the session we will trace. This information will be used for oradebug in the next step.

SQL> select pid,spid from v$process
 2 where addr=(select paddr from v$session
 3 where sid=(select sid from v$mystat where rownum=1));
PID        SPID
---------- ---------
25         4850

* S2 (Logged as SYS) : we will use oradebug new features to trace a specific sql_id by using the new syntax for tracing CBO Computations (trace[RDBMS.SQL_Optimizer.*])

SQL> ALTER SYSTEM FLUSH SHARED_POOL;
SQL> -- Setting Oracle PID to trace and verify by crossing the result of the system pid
SQL> oradebug setorapid 25
Oracle pid: 25, Unix process pid: 4850, image: oracle@oel (TNS V1-V3)
SQL> -- nolimit to tracefile
SQL> oradebug unlimit
Statement processed.
SQL> -- tracing SQL_Optimizer computation for a specific sql (here's our sql_id)
SQL> oradebug event trace[RDBMS.SQL_Optimizer.*][sql:2zg40utr7a08n]
Statement processed.
SQL> -- obtain the trace file name
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4850.trc

NB : Flushing shared pool is mandatory if you have already shared cursor for the statement to trace in the shared pool.

 

* S1 (Logged as LAURENT) : execute many sql statements in the session including our specific sql statement:

S1 (execute sql_id 2zg40utr7a08n one time, and others sql):
SQL> select count(*) from t4 where id between 500 and 550;
COUNT(*)
----------
 51

SQL> select count(*) from t4 ;
COUNT(*)
----------
 300000

SQL> select count(*) from t1;
COUNT(*)
----------
 294958

Finally, open the tracefile generated, you will only have the CBO computations and statistics for our specific sql_id:

[oracle@oel ~]$ vi /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4850.trc
Registered qb: SEL$1 0xe47325b8 (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
 signature (): qb_name=SEL$1 nbfros=1 flg=0
 fro(0): flg=4 objn=78460 hint_alias="T4"@"SEL$1"
SPM: statement not found in SMB
**************************
Automatic degree of parallelism (ADOP)
**************************
Automatic degree of parallelism is disabled: Parameter.
PM: Considering predicate move-around in query block SEL$1 (#0)
**************************
Predicate Move-Around (PM)
**************************
OPTIMIZER INFORMATION
******************************************
----- Current SQL Statement for this session (sql_id=2zg40utr7a08n) -----
select count(*) from t4 where id between 500 and 550
*******************************************

.../...

Query Block Registry:
SEL$1 0xe47325b8 (PARSER) [FINAL]
:
 call(in-use=13920, alloc=49184), compile(in-use=88336, alloc=152104), execution(in-use=6016, alloc=8088)
End of Optimizer State Dump
Dumping Hints
=============
====================== END SQL Statement Dump ======================

Update : Bertrand Drouvot has blogged a tricky way to flush a specific sql_id before generating its CBO computation trace file. See link : http://bdrouvot.wordpress.com/2013/09/16/flush-a-single-sql-statement-and-capture-a-10053-trace-for-it/