MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Substitutable Columns

with 2 comments

Oracle’s substitutable columns are interesting and substantially different than Oracle’s nested tables. The benefit of substitutable columns is that you can create one for an object type or any subtypes of that object type. Unfortunately, you can’t create the same behavior with nested tables because Oracle’s implementation of collection types are always final data types and you can’t extend their behaviors.

The Oracle Database has three types of collections. Two are SQL scoped collection types and the remaining one is a PL/SQL-only collection. You can only use the two SQL scoped collection types as column data types. One of the SQL-scoped collection types is an Attribute Data Type (ADT), which uses a base data type of DATA, NUMBER, or VARCHAR2.

The base data types of a UDT are scalar data types and scalar data types are data types that hold one thing. The other SQL-scoped collection type is a collection of User-Defined Types (UDTs), which are object types that you create like record structures by assembling sets of basic scalar data types. The elements of a UDT are known as members, whereas the instances of a collection are known as elements because they indexed in a set.

You can join a row with any nested table by using a cross join because they match the row with the nested table by using an ID-dependent join. An ID-dependent join is inexpensive because it relies on a structural dependency, the existence of the nested table in a column of a row. Typical joins on the other hand are joins between two tables or two copies of the same table. These non ID-dependent joins use at least matching values in one column of each table or one column of two copies of a table.

Joins between substitutable columns that hold UDTs are unlike joins between nested tables. The following sets up an example to demonstrate how you can join the non-substitutable columns of a row with the substitutable columns.

  1. You need a base UDT object type that you can extend, where extend means you can create a subtype of the base object type. While this is straight forward when you create an Oracle object type with methods, it isn’t necessarily straight forward when you want to simply create a base data structure as a generalized type with subtypes.

    The important clause is overriding the FINAL default by making the base type NOT FINAL. The example use BASE_T as the generalized type or data structure of a substitutable column:

    CREATE OR REPLACE TYPE base_t AS OBJECT
    ( base_id  NUMBER ) NOT FINAL;
    /
  2. After you create your base data structure, you create a specialized subtype. The following example creates a PERSON_T type and accepts the default of FINAL, which means you can’t create another subtype level.

    CREATE OR REPLACE TYPE person_t UNDER base_t
    ( first_name   VARCHAR2(20)
    , middle_name  VARCHAR2(20)
    , last_name    VARCHAR2(20));
    /
  3. With a generalized BASE_T type and a specialized PERSON_T subtype, you create a CUSTOMER table with a substitutable CUSTOMER_NAME column. The CUSTOMER_NAME column uses the generalized BASE_T data type. You should also create a CUSTOMER_S sequence that you can use as a surrogate key column for the table.

    CREATE TABLE customer
    ( customer_id    NUMBER
    , customer_name  BASE_T );
     
    CREATE SEQUENCE customer_s;
  4. You can now populate the table with instances of the BASE_T type or the PERSON_T subtype. The following inserts three rows into the CUSTOMER table. One for Hank Pym the original Ant-Man, one for Scott Lang the succeeding Ant-Man, and another for Darren Cross the original Yellowjacket.

    INSERT INTO customer
    VALUES
    ( customer_s.NEXTVAL
    , person_t( customer_s.CURRVAL
              , first_name => 'Hank'
              , middle_name => NULL
              , last_name => 'Pym'));
     
    INSERT INTO customer
    VALUES
    ( customer_s.NEXTVAL
    , person_t( customer_s.CURRVAL
              , first_name => 'Scott'
              , middle_name => NULL
              , last_name => 'Lang'));
     
    INSERT INTO customer
    VALUES
    ( customer_s.NEXTVAL
    , person_t( customer_s.CURRVAL
              , first_name => 'Darren'
              , middle_name => NULL
              , last_name => 'Cross'));
  5. The significance or problem associated with substitutable columns is that the actual columns of the object data type are hidden, which means you can’t query them like they’re nested elements of the substitutable column. The following query demonstrates what happens when you try to access those hidden member columns:

    SELECT customer_id
    ,      customer_name.base_id
    ,      customer_name.first_name
    ,      customer_name.middle_name
    ,      customer_name.last_name
    FROM   customer;

    It returns the following error message:

    ,      customer_name.last_name
           *
    ERROR at line 5:
    ORA-00904: "CUSTOMER_NAME"."LAST_NAME": invalid identifier
  6. It only raises the last column in the SELECT-list because that’s the first place where it fails to recognize an identifier, which is a valid column name in scope of the query.

  7. This error message may lead you to call the CUSTOMER_NAME column in a subquery and use the TABLE function to convert it to a result set. However, it also fails because a UDT object type by itself is an ordinary object type not a collection of object types. The TABLE function can’t promote the single instance to collection.

    SELECT *
    FROM   TABLE(SELECT TREAT(customer_name AS person_t) FROM customer);

    It returns the following error message:

    FROM   TABLE(SELECT TREAT(customer_name AS person_t) FROM customer)
           *
    ERROR at line 2:
    ORA-22905: cannot access rows from a non-nested table item
  8. The non-nested table error message should lead you to wrap the call to the TREAT function in a call to the COLLECT function, like this:

    COL base_id        FORMAT 9999  HEADING "Base|ID #"
    COL customer_name  FORMAT A38   HEADING "Customer Name"
    COL first_name     FORMAT A6    HEADING "First|Name"
    COL middle_name    FORMAT A6    HEADING "Middle|Name"
    COL last_name      FORMAT A6    HEADING "Last|Name"
    SELECT *
    FROM   TABLE(
             SELECT COLLECT(TREAT(customer_name AS person_t)) AS cte
             FROM customer);

    It returns the substitutable column’s hidden column labels and their values:

     Base First  Middle Last
     ID # Name   Name   Name
    ----- ------ ------ ------
        1 Hank	    Pym
        2 Scott	    Lang
        3 Darren	    Cross
  9. After learning how to unwrap the hidden columns of the substitutable column, you can now join the ordinary columns to the hidden columns like this:

    COL customer_id    FORMAT 9999  HEADING "Customer|ID #"
    COL base_id        FORMAT 9999  HEADING "Base|ID #"
    COL customer_name  FORMAT A38   HEADING "Customer Name"
    COL first_name     FORMAT A6    HEADING "First|Name"
    COL middle_name    FORMAT A6    HEADING "Middle|Name"
    COL last_name      FORMAT A6    HEADING "Last|Name"
    SELECT   c.customer_id
    ,        o.*
    FROM     customer c INNER JOIN
             TABLE(SELECT COLLECT(TREAT(customer_name AS person_t)) AS cte
                   FROM   customer) o
    ON       c.customer_id = o.base_id
    ORDER BY c.customer_id;

    It returns the ordinary column and substitutable column’s hidden column labels and their values:

    Customer  Base First  Middle Last
        ID #  ID # Name   Name   Name
    -------- ----- ------ ------ ------
           1     1 Hank	     Pym
           2     2 Scott	     Lang
           3     3 Darren	     Cross
  10. The preceding query only returns values when the substitutable column holds a value. It fails to return a value when the substitutable column holds a null value. You need to use a LEFT JOIN to ensure you see all ordinary columns whether or not the substitutable column holds a value.

    COL customer_id    FORMAT 9999  HEADING "Customer|ID #"
    COL base_id        FORMAT 9999  HEADING "Base|ID #"
    COL customer_name  FORMAT A38   HEADING "Customer Name"
    COL first_name     FORMAT A6    HEADING "First|Name"
    COL middle_name    FORMAT A6    HEADING "Middle|Name"
    COL last_name      FORMAT A6    HEADING "Last|Name"
    SELECT   c.customer_id
    ,        o.*
    FROM     customer c LEFT JOIN
             TABLE(SELECT COLLECT(TREAT(customer_name AS person_t)) AS cte
                   FROM   customer) o
    ON       c.customer_id = o.base_id
    ORDER BY c.customer_id;

    It returns the ordinary column and substitutable column’s hidden column labels and their values when the substitutable column holds an instance value. However, it only returns the ordinary column when the substitutable column holds a null value, as shown below:

    Customer  Base First  Middle Last
        ID #  ID # Name   Name   Name
    -------- ----- ------ ------ ------
           1     1 Hank	     Pym
           2     2 Scott	     Lang
           3     3 Darren	     Cross
           4
  11. It should be noted that queries like this have a cost, and that cost is high. So, you should only implement substitutable columns when the maintenance coding costs (or sustaining engineering) outweighs the processing cost.

    You can determine the cost like this:

    EXPLAIN PLAN
    SET STATEMENT_ID = 'Strange'
    FOR
    SELECT   c.customer_id
    ,        o.*
    FROM     customer c LEFT JOIN
             TABLE(SELECT COLLECT(TREAT(customer_name AS person_t)) AS cte
                   FROM   customer) o
    ON       c.customer_id = o.base_id
    ORDER BY c.customer_id;

    You can query the cost like this:

    SET LINESIZE 130
    SELECT *
    FROM   TABLE(dbms_xplan.display(NULL,'Strange'));

    It should return something like this for the sample table and solution:

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------------------
    Plan hash value: 2373055701
     
    ---------------------------------------------------------------------------------------------------------
    | Id  | Operation			     | Name	| Rows	| Bytes |TempSpc| Cost (%CPU)| Time	|
    ---------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT		     |		|  8168 |   550K|	|   167   (2)| 00:00:03 |
    |   1 |  SORT ORDER BY			     |		|  8168 |   550K|   624K|   167   (2)| 00:00:03 |
    |*  2 |   HASH JOIN OUTER		     |		|  8168 |   550K|	|    32   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS FULL		     | CUSTOMER |     5 |    15 |	|     2   (0)| 00:00:01 |
    |   4 |    VIEW 			     |		|  8168 |   526K|	|    29   (0)| 00:00:01 |
    |   5 |     COLLECTION ITERATOR PICKLER FETCH|		|  8168 |	|	|    29   (0)| 00:00:01 |
    |   6 |      SORT AGGREGATE		     |		|     1 |    14 |	|	     |		|
    |   7 |       TABLE ACCESS FULL 	     | CUSTOMER |     5 |    70 |	|     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - access("C"."CUSTOMER_ID"="O"."SYS_NC_ROWINFO$"."BASE_ID"(+))

As always, I hope this explains how to insert and query the hidden columns of a substitutable column, and how you join ordinary columns and hidden columns of a substitutable column from a table.

Written by maclochlainn

December 8th, 2017 at 11:17 pm