Archive for the ‘sql’ Category
Oracle CSV Imports
The first step in creating an effective import plan for comma-separated value (CSV) files is recognizing your options in the database. An Oracle database gives you several: you can read the file with Java, C/C++, C#, PL/SQL (through the UTL_FILE package), PHP, Perl, or any other C-callable programming language; or you can use SQL*Loader, either as a standalone utility or through externally managed tables (known as external tables). The most convenient non-programming solution is external tables.
Adopting external tables as your import solution means considering how to manage the security around them. Host hardening is a critical first step because it shuts down most, hopefully all, unauthorized use of the operating system where the database and external files reside. Next, you need to manage access to the external tables and minimize the exposure of business-sensitive information in the CSV files.
This post explains how to manage access and how to police (clean up) external files once they're read into the database. It assumes you have root-level permissions to the operating system and database. The SYS and SYSTEM accounts have the equivalent of root permissions for database configuration. The rule of thumb with these accounts is simple: manage as much as possible with the SYSTEM account before you use the SYS account.
Setting up the Import File System
While you can do all the setup of virtual directories in Oracle regardless of whether you've set them up in the operating system, it's a good idea to create the physical directories in the OS first. The examples use Windows 7, so you'll need to change the directories when working in Linux or Unix. Here are the directories:
C:\Imports\ImportFiles
C:\Imports\ImportLogs
Note that there are two directories. That's because you don't want to grant write privileges on the Oracle virtual directory where you put the data files. You grant read-only privileges on the file directory and read-write privileges on the log directory.
Setting up the Import User/Schema
This step creates an IMPORT user/schema in the Oracle database. You need to connect as the SYSTEM user (or another authorized DBA account with adequate privileges) to perform these steps:
CREATE USER import IDENTIFIED BY import
DEFAULT TABLESPACE users QUOTA 1000M ON users
TEMPORARY TABLESPACE temp;
After creating the user, grant the privileges like this as the SYSTEM user:
GRANT CREATE CLUSTER, CREATE INDEXTYPE, CREATE OPERATOR
,     CREATE PROCEDURE, CREATE SEQUENCE, CREATE SESSION
,     CREATE SYNONYM, CREATE TABLE, CREATE TRIGGER
,     CREATE TYPE, CREATE VIEW TO import;
Setting up Virtual Directories
A virtual directory in Oracle maps an internal database directory name to a physical directory of the operating system. You create two virtual directories in this example: one holds read-only permissions to the directory where you put the data file, and the other holds read-write permissions to the directory where the external file process writes its log files.
Log files are generated when you query the data from the external file. Any error in the file's conformity is written to a log file.
CREATE DIRECTORY upload_files AS 'C:\Imports\ImportFiles';
CREATE DIRECTORY upload_logs AS 'C:\Imports\ImportLogs';
After creating the virtual directories in the database, you must grant appropriate access to the user account that will access the data. This grants those permissions to the IMPORT user:
GRANT READ ON DIRECTORY upload_files TO import;
GRANT READ, WRITE ON DIRECTORY upload_logs TO import;
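If you want to confirm the grants took effect, directory privileges surface in the *_TAB_PRIVS data dictionary views. Here's a quick check you could run from the SYSTEM account; the view and column names are standard, and the filter values match this example:

SELECT grantee
,      table_name AS directory_name
,      privilege
FROM   dba_tab_privs
WHERE  grantee = 'IMPORT'
AND    table_name IN ('UPLOAD_FILES','UPLOAD_LOGS');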
Setting up an External Table
An external table references both the UPLOAD_FILES and UPLOAD_LOGS virtual directories, and those virtual directories must map to physical directories that the operating system's Oracle user can read and write. Here's the external table for this example:
CREATE TABLE item_import_ext_table
( asin_number        VARCHAR2(10)
, item_type          VARCHAR2(15)
, item_title         VARCHAR2(60)
, item_subtitle      VARCHAR2(60)
, item_rating        VARCHAR2(8)
, item_rating_agency VARCHAR2(4)
, item_release_date  DATE)
  ORGANIZATION EXTERNAL
  ( TYPE oracle_loader
    DEFAULT DIRECTORY upload_files
    ACCESS PARAMETERS
    ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
      BADFILE     'UPLOAD_LOGS':'item_import_ext_table.bad'
      DISCARDFILE 'UPLOAD_LOGS':'item_import_ext_table.dis'
      LOGFILE     'UPLOAD_LOGS':'item_import_ext_table.log'
      FIELDS TERMINATED BY ','
      OPTIONALLY ENCLOSED BY "'"
      MISSING FIELD VALUES ARE NULL )
    LOCATION ('item_import.csv'))
REJECT LIMIT UNLIMITED;
Setting up a Physical File
You should put the following in an item_import.csv physical file (case sensitivity of the file name won't matter on Windows 7 but will matter on Linux or Unix):
'B000W74EQC','DVD_WIDE_SCREEN','Harry Potter and the Sorcerer''s Stone',,'PG','MPAA','11-DEC-2007'
'B000W746GK','DVD_WIDE_SCREEN','Harry Potter and the Chamber of Secrets',,'PG','MPAA','11-DEC-2007'
'B000W796OM','DVD_WIDE_SCREEN','Harry Potter and the Prisoner of Azkaban',,'PG','MPAA','11-DEC-2007'
'B000E6EK2Y','DVD_WIDE_SCREEN','Harry Potter and the Goblet of Fire',,'PG-13','MPAA','07-MAR-2006'
'B000W7F5SS','DVD_WIDE_SCREEN','Harry Potter and the Order of the Phoenix',,'PG-13','MPAA','11-DEC-2007'
'B002PMV9FG','DVD_WIDE_SCREEN','Harry Potter and the Half-Blood Prince',,'PG','MPAA','08-DEC-2009'
'B001UV4XHY','DVD_WIDE_SCREEN','Harry Potter and the Deathly Hallows, Part 1',,'PG-13','MPAA','15-APR-2011'
'B001UV4XIS','DVD_WIDE_SCREEN','Harry Potter and the Deathly Hallows, Part 2',,'PG-13','MPAA','11-NOV-2011'
Testing the External Table
After putting the item_import.csv file in the C:\Imports\ImportFiles directory, you can test the process at this point by running the following query:
SET PAGESIZE 99
COLUMN asin_number       FORMAT A11 HEADING "ASIN #"
COLUMN item_title        FORMAT A46 HEADING "ITEM TITLE"
COLUMN item_rating       FORMAT A6  HEADING "RATING"
COLUMN item_release_date FORMAT A11 HEADING "RELEASE|DATE"

SELECT   asin_number
,        item_title
,        item_rating
,        TO_CHAR(item_release_date,'DD-MON-YYYY') AS item_release_date
FROM     item_import_ext_table;
It should return eight rows.
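If you just want a quick smoke test before building anything on top of the external table, a row count also works; this check is only a suggestion, not part of the framework:

SELECT COUNT(*) AS row_count
FROM   item_import_ext_table;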
Extending Access to the Data Dictionary
The physical directory names behind virtual directories are hidden from generic users. They're available in the ALL_DIRECTORIES and DBA_DIRECTORIES administrative views for queries by the SYS and SYSTEM users and any user with the DBA role.
While a privileged user can query the view directly, referencing it inside a function or procedure deployed in the privileged user's schema raises an ORA-00942 error. That error signals that the table or view does not exist, because privileges granted through roles don't apply inside definer's rights program units; the owner needs a direct grant.
This example deploys the GET_DIRECTORY_PATH function in the SYSTEM schema, which requires that you make the following grant as the SYS user:
GRANT SELECT ON sys.dba_directories TO system;
After making the grant from the SYS schema to the SYSTEM schema, connect to the SYSTEM schema. Then, create the following GET_DIRECTORY_PATH function:
CREATE OR REPLACE FUNCTION get_directory_path
( virtual_directory IN VARCHAR2 )
RETURN VARCHAR2 IS

  -- Define return variable.
  directory_path VARCHAR2(256) := '';

  -- Define dynamic cursor.
  CURSOR get_directory (virtual_directory VARCHAR2) IS
    SELECT directory_path
    FROM   sys.dba_directories
    WHERE  directory_name = virtual_directory;

  -- Define a local exception for a name violation.
  directory_name EXCEPTION;
  PRAGMA EXCEPTION_INIT(directory_name,-22284);

BEGIN

  OPEN get_directory (virtual_directory);
  FETCH get_directory INTO directory_path;
  CLOSE get_directory;

  -- Return the directory path.
  RETURN directory_path;

EXCEPTION
  WHEN directory_name THEN
    RETURN NULL;

END get_directory_path;
/
It's tempting to grant EXECUTE on this function to PUBLIC, but that would expose information any DBA should try to limit. Instead, grant the EXECUTE privilege only to the IMPORT schema.
This grant should be made as the SYSTEM user:
GRANT EXECUTE ON get_directory_path TO import;
After granting the EXECUTE privilege to the IMPORT user, connect to the IMPORT schema and create a synonym to the GET_DIRECTORY_PATH function. The syntax for that command is:
CREATE SYNONYM get_directory_path FOR system.get_directory_path;
You can now test your access to the function with the following query from the IMPORT schema:
SELECT get_directory_path('UPLOAD_FILES') FROM dual;
You should see the following if you've got everything working at this point:
GET_DIRECTORY_PATH('UPLOAD_FILES')
------------------------------------
C:\Imports\ImportFiles
At this point, you've completed the second major configuration component. You now need the ability to read files outside the database, which you can do with Java in Oracle 10g or Oracle 11g (that's not possible in Oracle 10g XE or Oracle 11g XE because they don't support an internal JVM).
Reading Virtual Directory Files
The GET_DIRECTORY_PATH function gives you the ability to read the Oracle data catalog and find the absolute directory path of a virtual directory. In this framework, you need that value to check whether the item_import.csv physical file is present in the file system before you read the file.
There doesn't appear to be a neat little function to read an external directory. At least, there's not one in the UTL_FILE or DBMS_LOB packages where you'd expect to find it. That leaves two alternatives: write an external library in C, C++, or C#, or write an internal Java library that reads the file system. This example takes the Java route, which requires granting the internal JVM permissions on a target directory or directories.
The first step is to create a scalar collection of VARCHAR2 variables, like this:
CREATE OR REPLACE TYPE file_list AS TABLE OF VARCHAR2(255);
/
The second step is to write the Java library file. You can write it three ways: one accepts default error handling, and the others override the default exception handling. If you're new to Java, you should take the basic library with default handling. If you have more experience, you may want to override the helpful default message with something that causes the developer to check with the DBA, or simply suppress the message to enhance security.
You should note that the database connection is an Oracle Database 11g internal database connection. The connection does only one thing: it lets you map the ArrayDescriptor to a schema-level SQL collection type. The element types of these collections should be scalar data types, like DATE, NUMBER, or VARCHAR2.
The more advanced method overrides exception handling by suppressing information about the java.properties settings. You do that by catching the natively thrown exception and either re-throwing or ignoring it. The example ignores it, because handling it in Java reports an unhandled exception at the PL/SQL or SQL layer, which leads end users to think you have a major design problem.
CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "ListVirtualDirectory" AS

  // Import required classes.
  import java.io.*;
  import java.security.AccessControlException;
  import java.sql.*;
  import java.util.Arrays;
  import oracle.sql.driver.*;
  import oracle.sql.ArrayDescriptor;
  import oracle.sql.ARRAY;

  // Define the class.
  public class ListVirtualDirectory {

    // Define the method.
    public static ARRAY getList(String path) throws SQLException {

      // Declare the variable as null, required because of the try-catch block.
      ARRAY listed = null;

      // Define a connection (this is for Oracle 11g).
      Connection conn = DriverManager.getConnection("jdbc:default:connection:");

      // Use a try-catch block to trap a Java permission error on the directory.
      try {
        // Declare a class with the file list.
        File directory = new File(path);

        // Declare a mapping to the schema-level SQL collection type.
        ArrayDescriptor arrayDescriptor = new ArrayDescriptor("FILE_LIST",conn);

        // Translate the Java String[] to the Oracle SQL collection type.
        listed = new ARRAY(arrayDescriptor,conn,((Object[]) directory.list())); }
      catch (AccessControlException e) {}
      return listed; }}
/
You can’t call an internal Java library without a PL/SQL wrapper function. Here’s the wrapper function for this Java library:
CREATE OR REPLACE FUNCTION list_files(path VARCHAR2) RETURN FILE_LIST IS
LANGUAGE JAVA
NAME 'ListVirtualDirectory.getList(java.lang.String) return oracle.sql.ARRAY';
/
You MUST grant the Oracle Database’s internal JVM authority to read the external directory before you can return the directory contents. Any attempt to read a directory without the proper permissions raises an ORA-29532 exception.
The following anonymous block grants permissions on a directory. You must grant a minimum of read permission, but since you'll also delete files later in the post, you should grant read, write, and delete. You must run it as the SYS user in the SYSDBA role.
BEGIN
  DBMS_JAVA.GRANT_PERMISSION('IMPORT'
                            ,'SYS:java.io.FilePermission'
                            ,'C:\Imports\ImportFiles'
                            ,'read,write,delete');
END;
/
While you’re connected, it’s a good idea to grant the same privileges to your log directory:
BEGIN
  DBMS_JAVA.GRANT_PERMISSION('IMPORT'
                            ,'SYS:java.io.FilePermission'
                            ,'C:\Imports\ImportLogs'
                            ,'read,write,delete');
END;
/
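If you want to confirm what the internal JVM has been granted, the grants are visible in the DBA_JAVA_POLICY view. A query along these lines (run as SYS or SYSTEM; the formatting is just a suggestion) shows the file permissions for the IMPORT schema:

SELECT grantee
,      type_name
,      name
,      action
FROM   dba_java_policy
WHERE  grantee = 'IMPORT';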
You should now be able to read the contents of an external file from another PL/SQL block or from a SQL statement. Here’s an example of the SQL statement call that uses everything developed to this point:
SELECT column_value AS "File Names"
FROM   TABLE(list_files(get_directory_path('UPLOAD_FILES')));
It should return the item_import.csv physical file as the only file in the physical directory, like:
File Names
-----------------
item_import.csv
Mapping an External Table to a Source File
The next step leverages the user segment of the Oracle data catalog and all the components developed above to find and display the external table and its external file. This query returns the results:
COLUMN table_name FORMAT A30
COLUMN file_name  FORMAT A30

SELECT   xt.table_name
,        xt.file_name
FROM    (SELECT   uxt.table_name
         ,        ixt.column_value AS file_name
         FROM     user_external_tables uxt CROSS JOIN
                  TABLE(list_files(get_directory_path(uxt.default_directory_name))) ixt) xt
JOIN     user_external_locations xl
ON       xt.table_name = xl.table_name
AND      xt.file_name = xl.location;
It should return the following:
TABLE_NAME                     FILE_NAME
------------------------------ ------------------------------
ITEM_IMPORT_EXT_TABLE          item_import.csv
You can migrate the query into the following function. It returns a zero when the file isn’t found and a one when it is found.
CREATE OR REPLACE FUNCTION external_file_found
( table_in VARCHAR2 ) RETURN NUMBER IS

  -- Define a default return value.
  retval NUMBER := 0;

  -- Declare a cursor to find external tables.
  CURSOR c (cv_table VARCHAR2) IS
    SELECT   xt.table_name
    ,        xt.file_name
    FROM    (SELECT   uxt.table_name
             ,        ixt.column_value AS file_name
             FROM     user_external_tables uxt CROSS JOIN
                      TABLE(list_files(get_directory_path(uxt.default_directory_name))) ixt) xt
    JOIN     user_external_locations xl
    ON       xt.table_name = xl.table_name
    AND      xt.file_name = xl.location
    AND      xt.table_name = UPPER(cv_table);

BEGIN

  FOR i IN c(table_in) LOOP
    retval := 1;
  END LOOP;

  RETURN retval;

END;
/
With the EXTERNAL_FILE_FOUND function, you can create a view that returns rows when the external file is found and no rows when it isn't. The following view hides the logic required to make that work:
CREATE OR REPLACE VIEW item_import AS
  SELECT *
  FROM   item_import_ext_table
  WHERE  external_file_found('ITEM_IMPORT_EXT_TABLE') = 1;
Conveniently, you can now query the ITEM_IMPORT view without the risk of raising the following error when the file is missing:
SELECT * FROM item_import_ext_table
              *
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04040: file item_import.csv in UPLOAD_FILES not found
You can now grant the SELECT privilege on the ITEM_IMPORT view to your application schema, like:
GRANT SELECT ON item_import TO application;
After granting the SELECT privilege on the ITEM_IMPORT view to the APPLICATION schema, you can create a synonym in the APPLICATION schema to hide the IMPORT schema:
CREATE SYNONYM item_import FOR import.item_import;
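Connected as the APPLICATION user, a quick query confirms the synonym and view behave as designed. This check is just a suggestion: the count should be 8 when the file is present and 0 when it's missing:

SELECT COUNT(*) AS available_rows
FROM   item_import;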
At this point, many developers feel they're done. However, enclosing the results in a schema-level function provides more utility than a view. The next section shows you how to replace the view with a schema-level function.
Replacing the View with an Object Table Function
Inside a schema-level function, you can assign the results from the query to a SQL collection of an object type. The object type should mirror the structure of the table, like the following:
CREATE OR REPLACE TYPE item_import_object IS OBJECT
( asin_number        VARCHAR2(10)
, item_type          VARCHAR2(15)
, item_title         VARCHAR2(60)
, item_subtitle      VARCHAR2(60)
, item_rating        VARCHAR2(8)
, item_rating_agency VARCHAR2(4)
, item_release_date  DATE);
/
After creating the object type that mirrors the structure of the ITEM_IMPORT_EXT_TABLE table, you need to create a list-like collection of the object type. A nested table collection type acts like a list in Oracle:
CREATE OR REPLACE TYPE item_import_object_table IS TABLE OF item_import_object;
/
After defining the object type and collection, you can access them in the following type of function:
CREATE OR REPLACE FUNCTION external_file_contents
( table_in VARCHAR2 ) RETURN item_import_object_table IS

  -- Define a local counter.
  lv_counter NUMBER := 1;

  -- Construct an empty collection of ITEM_IMPORT_OBJECT data types.
  lv_item_import_table ITEM_IMPORT_OBJECT_TABLE := item_import_object_table();

  -- Declare a cursor that returns rows only when the external file is found.
  CURSOR c (cv_table VARCHAR2) IS
    SELECT *
    FROM   item_import_ext_table
    WHERE  external_file_found(cv_table) = 1;

BEGIN

  FOR i IN c(table_in) LOOP
    lv_item_import_table.EXTEND;
    lv_item_import_table(lv_counter) := item_import_object(i.asin_number
                                                          ,i.item_type
                                                          ,i.item_title
                                                          ,i.item_subtitle
                                                          ,i.item_rating
                                                          ,i.item_rating_agency
                                                          ,i.item_release_date);
    lv_counter := lv_counter + 1;
  END LOOP;

  /*
   * This is where you can place autonomous function calls:
   * ======================================================
   *  - These can read source and log files, and write them
   *    to CLOB attributes for later inspection or review.
   *  - These can call Java libraries to delete files, but
   *    you should note that Java deletes any file rather
   *    than moving it to the trash bin (where you might
   *    recover it).
   */

  RETURN lv_item_import_table;

END;
/
Between the assignment to the collection and the return statement of the function, you can call any number of autonomous functions. Any schema-level function can call autonomous functions that read and write tables with DML statements, like INSERT, UPDATE, and DELETE. You can also call schema-level functions that wrap Java libraries to delete external files.
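A minimal sketch of such an autonomous routine follows. The IMPORT_AUDIT table and IMPORT_AUDIT_S sequence are hypothetical names invented here only to illustrate the pattern; the key pieces are the PRAGMA AUTONOMOUS_TRANSACTION declaration and the COMMIT before the RETURN:

CREATE OR REPLACE FUNCTION record_audit
( pv_table_name VARCHAR2 ) RETURN NUMBER IS

  -- Run this function in its own transaction scope.
  PRAGMA AUTONOMOUS_TRANSACTION;

BEGIN

  -- Write an audit row (hypothetical IMPORT_AUDIT table and sequence).
  INSERT INTO import_audit
  VALUES (import_audit_s.NEXTVAL, pv_table_name, SYSDATE);

  -- An autonomous block must commit (or roll back) its own work.
  COMMIT;

  RETURN 1;

END record_audit;
/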
You can confirm that the steps work by running the following query with or without the SQL*Plus formatting:
/*
 * SQL*Plus formatting.
 */
SET PAGESIZE 99
COLUMN asin_number       FORMAT A11 HEADING "ASIN #"
COLUMN item_title        FORMAT A46 HEADING "ITEM TITLE"
COLUMN item_rating       FORMAT A6  HEADING "RATING"
COLUMN item_release_date FORMAT A11 HEADING "RELEASE|DATE"

/*
 * Query works only when the item_import.csv file is present.
 */
SELECT   asin_number
,        item_title
,        item_rating
,        TO_CHAR(item_release_date,'DD-MON-YYYY') AS item_release_date
FROM     TABLE(external_file_contents('ITEM_IMPORT_EXT_TABLE'));
It should return the following from SQL*Plus:
                                                                  RELEASE
ASIN #      ITEM TITLE                                     RATING DATE
----------- ---------------------------------------------- ------ -----------
B000W74EQC  Harry Potter and the Sorcerer's Stone          PG     11-DEC-2007
B000W746GK  Harry Potter and the Chamber of Secrets        PG     11-DEC-2007
B000W796OM  Harry Potter and the Prisoner of Azkaban       PG     11-DEC-2007
B000E6EK2Y  Harry Potter and the Goblet of Fire            PG-13  07-MAR-2006
B000W7F5SS  Harry Potter and the Order of the Phoenix      PG-13  11-DEC-2007
B002PMV9FG  Harry Potter and the Half-Blood Prince         PG     08-DEC-2009
B001UV4XHY  Harry Potter and the Deathly Hallows, Part 1   PG-13  15-APR-2011
B001UV4XIS  Harry Potter and the Deathly Hallows, Part 2   PG-13  11-NOV-2011
The creation of the schema-level function lets you recreate the ITEM_IMPORT view. The following view encapsulates (or hides) the call to the function, which in turn hides all the infrastructure components developed before this section (see the cursor's WHERE clause in the EXTERNAL_FILE_CONTENTS function):
CREATE OR REPLACE VIEW item_import AS
  SELECT *
  FROM   TABLE(external_file_contents('ITEM_IMPORT_EXT_TABLE'));
Implementing a Managed Import Process
During any import, the information from the import process is exposed, and one or more items may fail. That means the source file and the loading log files must be preserved immediately after the data is read successfully. You do that by loading the data source file and the log, discard, and bad import files into database tables. Only the source and log files exist when all rows are well formed, and the log files are reused by any subsequent load, so they require human inspection to isolate a specific upload.
The best way to implement this is to create individual tables to hold each of the four potential large objects. The IMPORT_MASTER table holds a transactional primary key and a table name for the import table. The primary key of the IMPORT_MASTER table is the base key for imports, and the IMPORT_DATA, IMPORT_LOG, IMPORT_DISCARD, and IMPORT_BAD tables hold foreign keys that point back to the IMPORT_MASTER table's primary key. These tables also hold a character large object (CLOB) column, which holds the respective source data file or log, discard, or bad file.
The following creates the tables for the logging framework:
CREATE TABLE import_master
( import_master_id NUMBER CONSTRAINT pk_import_master PRIMARY KEY
, import_table     VARCHAR2(30));

-- Create sequence for import master.
CREATE SEQUENCE import_master_s;

-- Create import data table.
CREATE TABLE import_data
( import_data_id   NUMBER CONSTRAINT pk_import_data PRIMARY KEY
, import_master_id NUMBER
, import_data      CLOB
, CONSTRAINT fk_import_data FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_data) STORE AS BASICFILE item_import_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import data.
CREATE SEQUENCE import_data_s;

-- Create import log table.
CREATE TABLE import_log
( import_log_id    NUMBER CONSTRAINT pk_import_log PRIMARY KEY
, import_master_id NUMBER
, import_log       CLOB
, CONSTRAINT fk_import_log FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_log) STORE AS BASICFILE item_import_log_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import log.
CREATE SEQUENCE import_log_s;

-- Create import discard table.
CREATE TABLE import_discard
( import_discard_id NUMBER CONSTRAINT pk_import_discard PRIMARY KEY
, import_master_id  NUMBER
, import_discard    CLOB
, CONSTRAINT fk_import_discard FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_discard) STORE AS BASICFILE item_import_discard_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import discard.
CREATE SEQUENCE import_discard_s;

-- Create import bad table.
CREATE TABLE import_bad
( import_bad_id    NUMBER CONSTRAINT pk_import_bad PRIMARY KEY
, import_master_id NUMBER
, import_bad       CLOB
, CONSTRAINT fk_import_bad FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_bad) STORE AS BASICFILE item_import_bad_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import bad.
CREATE SEQUENCE import_bad_s;
The tables set the targets for uploading the source and log files. You should note that each table name doubles as the name of its CLOB column, which becomes convenient when supporting a Native Dynamic SQL (NDS) statement in a single autonomous function. The LOAD_CLOB_FROM_FILE function supports reading the external source and log files and writing them to their respective tables.
There is a deadlock possibility with this type of architecture. It requires that the base row in the IMPORT_MASTER table is committed before attempting inserts into one of the dependent tables. A call to the function raises an error when the primary key column hasn't been committed beforehand.
You already set the access privileges needed by the DBMS_LOB package when you made grants on the UPLOAD_FILES and UPLOAD_LOGS virtual directories. This function only requires read permission, which was granted on both.
CREATE OR REPLACE FUNCTION load_clob_from_file
( pv_src_file_name IN VARCHAR2
, pv_virtual_dir   IN VARCHAR2
, pv_table_name    IN VARCHAR2
, pv_column_name   IN VARCHAR2
, pv_foreign_key   IN NUMBER ) RETURN NUMBER IS

  -- Declare placeholder for sequence generated primary key.
  lv_primary_key NUMBER;

  -- Declare default return value.
  lv_retval NUMBER := 0;

  -- Declare local variables for the DBMS_LOB.LOADCLOBFROMFILE procedure.
  des_clob   CLOB;
  src_clob   BFILE := BFILENAME(pv_virtual_dir,pv_src_file_name);
  des_offset NUMBER := 1;
  src_offset NUMBER := 1;
  ctx_lang   NUMBER := dbms_lob.default_lang_ctx;
  warning    NUMBER;

  -- Declare pre-reading size.
  src_clob_size NUMBER;

  -- Declare variables for handling the NDS sequence value.
  lv_sequence         VARCHAR2(30);
  lv_sequence_output  NUMBER;
  lv_sequence_tagline VARCHAR2(10) := '_s.nextval';

  -- Define local variable for Native Dynamic SQL (NDS) statement.
  stmt VARCHAR2(2000);

  -- Declare the function as an autonomous transaction.
  PRAGMA AUTONOMOUS_TRANSACTION;

BEGIN

  -- Open the file only when found.
  IF dbms_lob.fileexists(src_clob) = 1 AND NOT dbms_lob.isopen(src_clob) = 1 THEN
    src_clob_size := dbms_lob.getlength(src_clob);
    dbms_lob.open(src_clob,dbms_lob.lob_readonly);
  END IF;

  -- Concatenate the sequence name with the tagline.
  lv_sequence := pv_table_name || lv_sequence_tagline;

  -- Assign the sequence through an anonymous block.
  stmt := 'BEGIN '
       || '  :output := '||lv_sequence||';'
       || 'END;';

  -- Run the statement to extract a sequence value through NDS.
  EXECUTE IMMEDIATE stmt USING IN OUT lv_sequence_output;

  -- Create a dynamic statement that works for all source and log files.
  -- ----------------------------------------------------------------------
  -- NOTE: This statement requires that the row holding the primary key
  --       has been committed, because otherwise it raises the following
  --       error: it can't verify the integrity of the foreign key
  --       constraint.
  -- ----------------------------------------------------------------------
  -- DECLARE
  -- *
  -- ERROR at line 1:
  -- ORA-00060: deadlock detected while waiting for resource
  -- ORA-06512: at "IMPORT.LOAD_CLOB_FROM_FILE", line 50
  -- ORA-06512: at line 20
  -- ----------------------------------------------------------------------
  stmt := 'INSERT INTO '||pv_table_name||' '||CHR(10)||
          'VALUES '||CHR(10)||
          '('||lv_sequence_output||CHR(10)||
          ','||pv_foreign_key||CHR(10)||
          ', empty_clob())'||CHR(10)||
          'RETURNING '||pv_column_name||' INTO :locator';

  -- Run the dynamic statement.
  EXECUTE IMMEDIATE stmt USING OUT des_clob;

  -- Read and write the file to the CLOB.
  dbms_lob.loadclobfromfile(
      dest_lob     => des_clob
    , src_bfile    => src_clob
    , amount       => dbms_lob.getlength(src_clob)
    , dest_offset  => des_offset
    , src_offset   => src_offset
    , bfile_csid   => dbms_lob.default_csid
    , lang_context => ctx_lang
    , warning      => warning );

  -- Close the open source file.
  dbms_lob.close(src_clob);

  -- Commit the write and conditionally acknowledge it.
  IF src_clob_size = dbms_lob.getlength(des_clob) THEN
    COMMIT;
    lv_retval := 1;
  ELSE
    RAISE dbms_lob.operation_failed;
  END IF;

  RETURN lv_retval;

END load_clob_from_file;
/
You can test this function against the data source file with the following script file:
-- Insert a sample row in the master table.
INSERT INTO import_master
VALUES (import_master_s.nextval,'ITEM_IMPORT_EXT_TABLE');

-- Commit the row value to avoid a deadlock on the uncommitted master record.
COMMIT;

-- Test program for loading CLOB files.
DECLARE

  -- Declare testing variables.
  lv_file_name   VARCHAR2(255) := 'item_import.csv';
  lv_virtual_dir VARCHAR2(255) := 'UPLOAD_FILES';
  lv_table_name  VARCHAR2(30)  := 'IMPORT_DATA';
  lv_column_name VARCHAR2(30)  := 'IMPORT_DATA';
  lv_foreign_key NUMBER;

BEGIN

  -- Assign the current value of the sequence to a local variable.
  lv_foreign_key := import_master_s.currval;

  -- Check if you can read and insert a CLOB column.
  IF load_clob_from_file(lv_file_name
                        ,lv_virtual_dir
                        ,lv_table_name
                        ,lv_column_name
                        ,lv_foreign_key) = 1 THEN

    -- Display a successful subordinate routine.
    dbms_output.put_line('Subordinate routine succeeds.');
  ELSE
    -- Display a failed subordinate routine.
    dbms_output.put_line('Subordinate routine fails.');
  END IF;

END;
/
You can test the function against the log file with the following script file:
DECLARE

  -- Declare testing variables.
  lv_file_name   VARCHAR2(255) := 'item_import_ext_table.log';
  lv_virtual_dir VARCHAR2(255) := 'UPLOAD_LOGS';
  lv_table_name  VARCHAR2(30)  := 'IMPORT_LOG';
  lv_column_name VARCHAR2(30)  := 'IMPORT_LOG';
  lv_foreign_key NUMBER;

BEGIN

  -- Assign the current value of the sequence to a local variable.
  lv_foreign_key := import_master_s.currval;
  dbms_output.put_line('Foreign key ['||lv_foreign_key||']');

  -- Check if you can read and insert a CLOB column.
  IF load_clob_from_file(lv_file_name
                        ,lv_virtual_dir
                        ,lv_table_name
                        ,lv_column_name
                        ,lv_foreign_key) = 1 THEN

    -- Display a successful subordinate routine.
    dbms_output.put_line('Subordinate routine succeeds.');
  ELSE
    -- Display a failed subordinate routine.
    dbms_output.put_line('Subordinate routine fails.');
  END IF;

END;
/
You now have the ability to read and store the source and log files in CLOB columns. The next step is to write a master function that writes the master row and calls the LOAD_CLOB_FROM_FILE function for the source file and each of the log files. That’s what the CLEANUP_EXTERNAL_FILES function provides.
One housekeeping note before you compile anything else: the Java logic later in this post uses the logical AND operator, which is two ampersands (&&), and SQL*Plus treats an ampersand as a substitution variable. You avoid surprises by turning off substitution variables in SQL*Plus. You do that by disabling DEFINE, like this:
SET DEFINE OFF
With DEFINE off, you can compile the CLEANUP_EXTERNAL_FILES function:
CREATE OR REPLACE FUNCTION cleanup_external_files
( table_in          VARCHAR2
, data_directory_in VARCHAR2
, log_directory_in  VARCHAR2 ) RETURN NUMBER IS

  -- Declare a local Attribute Data Type (ADT).
  TYPE list IS TABLE OF VARCHAR2(3);

  -- Declare a collection.
  lv_extension LIST := list('csv','log','bad','dis');

  -- Define a default return value.
  retval NUMBER := 0;

  -- Declare base target table name.
  lv_target_table VARCHAR2(30) := 'IMPORT';
  lv_foreign_key  NUMBER;

  -- Declare a cursor to find external tables.
  CURSOR check_source (cv_table_name VARCHAR2) IS
    SELECT   xt.file_name
    FROM    (SELECT   uxt.table_name
             ,        ixt.column_value AS file_name
             FROM     user_external_tables uxt CROSS JOIN
                      TABLE(list_files(get_directory_path(uxt.default_directory_name))) ixt) xt
    JOIN     user_external_locations xl
    ON       xt.table_name = xl.table_name
    AND      xt.file_name = xl.location
    AND      xt.table_name = UPPER(cv_table_name);

  -- Declare a cursor to find files and compare for one input file name.
  CURSOR check_logs (cv_file_name VARCHAR2) IS
    SELECT   list.column_value
    FROM     TABLE(list_files(get_directory_path('UPLOAD_LOGS'))) list
    JOIN    (SELECT cv_file_name AS file_name FROM dual) filter
    ON       list.column_value = filter.file_name;

  -- Declare the function as autonomous.
  PRAGMA AUTONOMOUS_TRANSACTION;

BEGIN

  -- Master loop to check for source and log files.
  FOR i IN check_source (table_in) LOOP

    -- Assign the next sequence value to a local variable.
    lv_foreign_key := import_master_s.nextval;

    -- Write the master record and commit it for the autonomous threads.
    INSERT INTO import_master
    VALUES (lv_foreign_key,'ITEM_IMPORT_EXT_TABLE');
    COMMIT;

    -- Process all file extensions.
    FOR j IN 1..lv_extension.COUNT LOOP

      -- The source data file is confirmed by the CHECK_SOURCE cursor.
      IF lv_extension(j) = 'csv' THEN

        -- Load the source data file.
        -- ----------------------------------------------------------
        --  The RETVAL holds success or failure; this approach
        --  suppresses an error when the file can't be loaded.
        --  It should only occur when there's no space available
        --  in the target table.
        retval := load_clob_from_file(i.file_name
                                     ,data_directory_in
                                     ,lv_target_table||'_DATA'
                                     ,lv_target_table||'_DATA'
                                     ,lv_foreign_key);
      ELSE

        -- Verify that the log file exists before attempting to load it.
        FOR k IN check_logs (LOWER(table_in)||'.'||lv_extension(j)) LOOP

          -- Load the log, bad, or dis(card) file.
          -- ----------------------------------------------------------
          --  The RETVAL holds success or failure, as mentioned above.
          retval := load_clob_from_file(LOWER(table_in)||'.'||lv_extension(j)
                                       ,log_directory_in
                                       ,lv_target_table||'_'||lv_extension(j)
                                       ,lv_target_table||'_'||lv_extension(j)
                                       ,lv_foreign_key);
        END LOOP;
      END IF;
    END LOOP;
    retval := 1;
  END LOOP;

  RETURN retval;

END;
/
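A hedged example of calling the master function from an anonymous block follows. Because the function is autonomous and commits its own work, you only need to inspect the return value:

DECLARE
  lv_result NUMBER;
BEGIN
  -- Load the source file and any log, bad, or discard files.
  lv_result := cleanup_external_files('ITEM_IMPORT_EXT_TABLE'
                                     ,'UPLOAD_FILES'
                                     ,'UPLOAD_LOGS');
  dbms_output.put_line('Cleanup result ['||lv_result||']');
END;
/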
Deleting Files from Virtual Directories
After you've read the files through a query and uploaded the source and log files to the database, you need to clean up the files. You can do that with another Java library function, provided you granted read, write, and delete privileges on the directories to the internal JVM.
The DeleteFile Java library deletes files from the file system. It doesn't put them in the trash can for a later, final delete; it removes them completely.
Now you can build the Java library that lets you delete a file. A quick caveat: this code includes the logical AND operator, which is two ampersands (&&), while SQL uses an ampersand (&) for substitution variables. If you haven't already suppressed that behavior, issue the following command to disable substitution variables in SQL*Plus:
SET DEFINE OFF
You create the DeleteFile library like this:
CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "DeleteFile" AS

  // Java import statements.
  import java.io.File;
  import java.security.AccessControlException;

  // Class definition.
  public class DeleteFile {

    // Define variable(s).
    private static File file;

    // Define deleteFile() method.
    public static void deleteFile(String fileName) throws AccessControlException {

      // Create a file from the canonical file name.
      file = new File(fileName);

      // Delete the file when it exists.
      if (file.isFile() && file.delete()) {}}}
/
You need a PL/SQL wrapper to call the library, and here it is:
CREATE OR REPLACE PROCEDURE delete_file (dfile VARCHAR2) IS
LANGUAGE JAVA
NAME 'DeleteFile.deleteFile(java.lang.String)';
/
You can call this separately or embed it inside the CLEANUP_EXTERNAL_FILES function, which saves rewriting the logic to find any source or log files.
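As a hedged example, here's how you might combine the earlier GET_DIRECTORY_PATH function with DELETE_FILE to remove the source file; the backslash separator assumes the Windows platform used throughout this post:

BEGIN
  -- Remove the source file from the import directory.
  delete_file(get_directory_path('UPLOAD_FILES')||'\item_import.csv');
END;
/

One caveat: a java.io.FilePermission grant on a bare directory path covers the directory itself, not the files inside it, so you may need an additional DBMS_JAVA.GRANT_PERMISSION call on 'C:\Imports\ImportFiles\*' before deletes of contained files succeed.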
This has provided you with an external table import framework. You can extend the framework by wrapping the query in an object table function, which would give you the opportunity to clean up the source and log files after the query operation.
How to use object types?
This is a tale of Oracle SQL object types, their constructors, and how you use them. It demonstrates what you can and can't do, with brief explanations of why.
The following creates a base SAMPLE_OBJECT data type and a SAMPLE_TABLE collection of the base SAMPLE_OBJECT data type:
CREATE OR REPLACE TYPE sample_object IS OBJECT
( id   NUMBER
, name VARCHAR2(30));
/

CREATE OR REPLACE TYPE sample_table IS TABLE OF sample_object;
/
If the base SAMPLE_OBJECT data type were a Java object, a default constructor with an empty parameter list would let you construct an instance variable. This doesn't work for an Oracle object type, because the default constructor is a formal parameter list of the object attributes in the positional order of their appearance in the declaration statement.
The test case for this concept is:
DECLARE
  lv_object_struct SAMPLE_OBJECT := sample_object();
BEGIN
  NULL;
END;
/
Running the program raises the following exception, which points to the object instance constructor from line 2 above:
  lv_object_struct SAMPLE_OBJECT := sample_object();
                                    *
ERROR at line 2:
ORA-06550: line 2, column 37:
PLS-00306: wrong number or types of arguments in call to 'SAMPLE_OBJECT'
ORA-06550: line 2, column 20:
PL/SQL: Item ignored
Changing the instantiation call to the Oracle design default, two null values, lets you create an instance of the SAMPLE_OBJECT type. The following shows that concept, which works when the base object type allows null values:
DECLARE
  lv_object_struct SAMPLE_OBJECT := sample_object(NULL, NULL);
BEGIN
  NULL;
END;
/
If you want a null parameter constructor for an object type, you must implement a type and type body with an overloaded no-argument constructor, like this:
CREATE OR REPLACE TYPE sample_object IS OBJECT
( id   NUMBER
, name VARCHAR2(30)
, CONSTRUCTOR FUNCTION sample_object RETURN SELF AS RESULT);
/
CREATE OR REPLACE TYPE BODY sample_object IS
  CONSTRUCTOR FUNCTION sample_object RETURN SELF AS RESULT IS
    sample_obj SAMPLE_OBJECT := sample_object(NULL,NULL);
  BEGIN
    SELF := sample_obj;
    RETURN;
  END sample_object;
END;
/
Unlike Java, the addition of an overloaded constructor doesn't drop the default constructor. You can also create a single-parameter constructor that leverages a sequence to populate the ID attribute.
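The constructor bodies below reference a SAMPLE_OBJECT_ID sequence, which the code assumes already exists (its values show up as the 1 and 2 in the test output later). If you're following along, create it first:

CREATE SEQUENCE sample_object_id;

With the sequence in place, the overloaded type specification and body look like this: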
CREATE OR REPLACE TYPE sample_object IS OBJECT
( id   NUMBER
, name VARCHAR2(30)
, CONSTRUCTOR FUNCTION sample_object RETURN SELF AS RESULT
, CONSTRUCTOR FUNCTION sample_object (pv_name VARCHAR2) RETURN SELF AS RESULT);
/
CREATE OR REPLACE TYPE BODY sample_object IS

  CONSTRUCTOR FUNCTION sample_object RETURN SELF AS RESULT IS
    sample_obj SAMPLE_OBJECT := sample_object(sample_object_id.NEXTVAL,NULL);
  BEGIN
    SELF := sample_obj;
    RETURN;
  END sample_object;

  CONSTRUCTOR FUNCTION sample_object (pv_name VARCHAR2) RETURN SELF AS RESULT IS
    sample_obj SAMPLE_OBJECT := sample_object(sample_object_id.NEXTVAL,pv_name);
  BEGIN
    SELF := sample_obj;
    RETURN;
  END sample_object;

END;
/
You can test the final object type and body with this anonymous block of code:
SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
  lv_object_struct1 SAMPLE_OBJECT := sample_object();
  lv_object_struct2 SAMPLE_OBJECT := sample_object('User Name');
  lv_object_struct3 SAMPLE_OBJECT := sample_object(1001,'User Name');
BEGIN
  dbms_output.put_line('lv_object_struct1.id   ['||lv_object_struct1.id||']');
  dbms_output.put_line('lv_object_struct1.name ['||lv_object_struct1.name||']');
  dbms_output.put_line('lv_object_struct2.id   ['||lv_object_struct2.id||']');
  dbms_output.put_line('lv_object_struct2.name ['||lv_object_struct2.name||']');
  lv_object_struct2.name := 'Changed Name';
  dbms_output.put_line('lv_object_struct2.id   ['||lv_object_struct2.id||']');
  dbms_output.put_line('lv_object_struct2.name ['||lv_object_struct2.name||']');
  dbms_output.put_line('lv_object_struct3.id   ['||lv_object_struct3.id||']');
  dbms_output.put_line('lv_object_struct3.name ['||lv_object_struct3.name||']');
END;
/
It prints to console:
lv_object_struct1.id   [1]
lv_object_struct1.name []
lv_object_struct2.id   [2]
lv_object_struct2.name [User Name]
lv_object_struct2.id   [2]
lv_object_struct2.name [Changed Name]
lv_object_struct3.id   [1001]
lv_object_struct3.name [User Name]
Hope this helps those looking for a quick syntax example and explanation.
Function or Procedure?
Somebody asked for a simple comparison between a PL/SQL pass-by-value function and a pass-by-reference procedure, where the procedure uses only an OUT mode parameter to return the result. This post provides examples of both, but please note that a pass-by-value function can be used in a SQL or PL/SQL context, while a pass-by-reference procedure can only be used in another anonymous or named PL/SQL block.
The function and procedure let you calculate the value of a number raised to the power of an exponent. The third parameter lets you convert the exponent to its inverse, like 2 to 1/2.
CREATE OR REPLACE FUNCTION find_root_function
( pv_number  BINARY_DOUBLE
, pv_power   BINARY_DOUBLE
, pv_inverse BINARY_INTEGER DEFAULT 0 ) RETURN BINARY_DOUBLE IS

  -- Declare local variable for return value.
  lv_result BINARY_DOUBLE;

BEGIN

  -- If the inverse value is anything but zero, calculate the inverse of the power.
  IF pv_inverse = 0 THEN
    lv_result := POWER(pv_number,pv_power);
  ELSE
    lv_result := POWER(pv_number,(1 / pv_power));
  END IF;

  RETURN lv_result;

END find_root_function;
/
You can test it with these two queries against the dual table:
SELECT TO_CHAR(find_root_function(4,3),'99,999.90') FROM dual;
SELECT TO_CHAR(find_root_function(125,3,1),'99,999.90') FROM dual;
The first returns 64.00 (4 cubed) and the second returns 5.00 (the cube root of 125). The procedure does the same thing as the function. The difference is that the fourth parameter of the procedure returns the value, rather than a formal return type like a function.
CREATE OR REPLACE PROCEDURE find_root_procedure
( pv_number  IN  BINARY_DOUBLE
, pv_power   IN  BINARY_DOUBLE
, pv_inverse IN  BINARY_INTEGER DEFAULT 0
, pv_return  OUT BINARY_DOUBLE ) IS

BEGIN

  -- If the inverse value is anything but zero, calculate the inverse of the power.
  IF pv_inverse = 0 THEN
    pv_return := POWER(pv_number,pv_power);
  ELSE
    pv_return := POWER(pv_number,(1 / pv_power));
  END IF;

END find_root_procedure;
/
You can test it inside an anonymous block PL/SQL program, like this:
DECLARE

  -- Declare input variables.
  lv_input  BINARY_DOUBLE;
  lv_power  BINARY_DOUBLE;
  lv_output BINARY_DOUBLE;

BEGIN

  -- Assign input values to variables.
  lv_input := '&1';
  lv_power := '&2';

  -- Test raising to a power.
  find_root_procedure(lv_input, lv_power, 0, lv_output);
  dbms_output.put_line(TO_CHAR(lv_output,'99,999.90'));

  -- Test raising to an inverse power.
  find_root_procedure(lv_input, lv_power, 1, lv_output);
  dbms_output.put_line(TO_CHAR(lv_output,'99,999.90'));

END;
/
You can also test it from SQL*Plus session variables, like the following example. For reference, the difference between PL/SQL and the SQL*Plus environment is large. The EXECUTE call is correct in SQL*Plus but would be incorrect inside a PL/SQL block for a Native Dynamic SQL (NDS) call. Inside a PL/SQL block, you would use EXECUTE IMMEDIATE, because it dispatches a call from the current running scope to a nested scope operation (see the comment in the script below).
-- SQL*Plus test.
VARIABLE sv_input   BINARY_DOUBLE
VARIABLE sv_power   BINARY_DOUBLE
VARIABLE sv_inverse BINARY_DOUBLE
VARIABLE sv_output  BINARY_DOUBLE

-- Verify the null value of the session variable.
SELECT :sv_output AS ":sv_output" FROM dual;

BEGIN
  -- Prompt for local assignments and initialize the output variable.
  :sv_input   := '&1';
  :sv_power   := '&2';
  :sv_inverse := '&3';
  :sv_output  := 0;
END;
/

-- Run the procedure in the SQL*Plus scope.
EXECUTE find_root_procedure(:sv_input, :sv_power, :sv_inverse, :sv_output);

-- Query the new value of the session variable.
SELECT TO_CHAR(:sv_output,'99,999.90') AS ":output" FROM dual;
As usual, I hope this helps folks beyond the one who asked. Comments are always welcome.
Updating Table View Columns
This post answers a reader's question: how can you sort data inside an Oracle table view column? It shows you how to perform the trick, but for the record, I'm not a fan of nested tables. A table view column is an Oracle-specific user-defined type (UDT): a nested table or varray of a scalar data type.
Oracle has assigned a formal name to this type of UDT; it's now labeled an Attribute Data Type (ADT). The ADT doesn't allow you to update nested elements outside of PL/SQL program units.
This blog post reviews table view columns and extends concepts from Oracle Database 11g & MySQL 5.6 Developer Handbook (by the way, virtually everything in the book is relevant from MySQL 5.1 forward). It demonstrates how you can use PL/SQL user-defined functions (UDFs) to supplement the SQL semantics for updating nested tables, and then it shows how you can reshuffle (sort) data and store the sorted data in table view columns.
Before you implement table view columns, you should answer two design questions and one relational modeling principle. You should also understand that this direction isn't portable across database implementations. It's currently supported fully by the Oracle database and mostly by the PostgreSQL database. You may find an earlier post on how to join nested tables helpful in understanding the UPDATE statements used in this post, along with an earlier post on UPDATE and DELETE statements.
Design Questions:
- Should you implement full object types with access methods in PL/SQL? The object type solution says there is no value in the nested data outside of the complete object, while choosing the table view column solution says that there is value in implementing just a nested list without element-handling methods.
- Should you embed the elements in an XMLTYPE? An XML solution supports hierarchical node structures more naturally, like when you only access child nodes through the parent node, while choosing the table view column solution says that you want to avoid the XML Software Development Kit and that the data set is small and more manageable in a table view column.
Design Principle:
- Should you implement an ID-dependent relational model? An ID-dependent model replaces the primary and foreign keys with the relative position of parent and child elements. This is the design you adopt when you choose a table view column, and it is more complex than single-subject relational tables.
You should note that table view columns are inherently static at creation. You must also update the entire nested table view column when using Oracle SQL. Oracle SQL does let you modify attributes of object types in nested tables, as qualified in my new book (page 252).
Any attempt to modify a table view column element in SQL raises an ORA-25015 error. The error message states that (you) cannot perform DML on this nested table view column.
You can update the table view column value by replacing it with a new collection, and that’s done with a PL/SQL function. This type of function preserves the ordered list in the table view column by finding and replacing an element in the collection.
Unfortunately, developers who use nested tables typically design table view columns with an internal ordering scheme. That means the collection is ordered during insert or update. This type of design relies on the fact that you can't change the order without rewriting the stored structure.
While common among those who use these, relying on the ordering of elements in a collection is a bad practice. At least, it's a bad practice when we're trying to work within the relational model. All that aside, here's how you ensure element updates while preserving element position:
CREATE OR REPLACE FUNCTION update_collection
( old_element_collection STREET_LIST
, old_element_value      VARCHAR2
, new_element_value      VARCHAR2 ) RETURN STREET_LIST IS

  -- Declare and initialize a new counter.
  lv_counter NUMBER := 1;

  -- Declare local return collection variable.
  lv_element_collection STREET_LIST := street_list();

BEGIN

  FOR i IN 1..old_element_collection.COUNT LOOP
    IF NOT old_element_collection(i) = old_element_value THEN
      lv_element_collection.EXTEND;
      lv_element_collection(lv_counter) := old_element_collection(i);
    ELSE
      lv_element_collection.EXTEND;
      lv_element_collection(lv_counter) := new_element_value;
    END IF;
    lv_counter := lv_counter + 1;
  END LOOP;

  RETURN lv_element_collection;

END update_collection;
/
Then, you can use the user-defined function (UDF) inside a SQL UPDATE statement, like this:
UPDATE TABLE (SELECT e.home_address
              FROM   employee e
              WHERE  e.employee_id = 1) e
SET    e.street_address = update_collection(e.street_address, 'Suite 525','Suite 522')
,      e.city = 'Oakland'
WHERE  e.address_id = 1;
The UPDATE_COLLECTION function replaces Suite 525 with Suite 522 and preserves the sequence of elements in a new nested table. The UPDATE statement assigns the modified nested table to the table view column. You can find the code to create the employee table in Chapter 6 (pages 148-149), and the code to insert the default data in Chapter 8 (page 229), of Oracle Database 11g & MySQL 5.6.
The lv_counter variable could be replaced with a reference to the for loop's iterator (i) because the counts of both collections are the same. I opted for the local variable to make the code easier to read.
Along the same line of thought, you can also remove elements from a table view column with a similar PL/SQL function. You could write the function like this:
CREATE OR REPLACE FUNCTION delete_from_collection
( old_element_collection STREET_LIST
, old_element_value      VARCHAR2 ) RETURN STREET_LIST IS

  -- Declare and initialize a new counter.
  lv_counter NUMBER := 1;

  -- Declare local return collection variable.
  lv_element_collection STREET_LIST := street_list();

BEGIN

  FOR i IN 1..old_element_collection.COUNT LOOP
    IF NOT old_element_collection(i) = old_element_value THEN
      lv_element_collection.EXTEND;
      lv_element_collection(lv_counter) := old_element_collection(i);
      lv_counter := lv_counter + 1;
    END IF;
  END LOOP;

  RETURN lv_element_collection;

END delete_from_collection;
/
Then, you can use the user-defined function (UDF) to delete an element from the collection inside a SQL UPDATE statement, like this:
UPDATE TABLE (SELECT e.home_address
              FROM   employee1 e
              WHERE  e.employee_id = 1) e
SET    e.street_address = delete_from_collection(e.street_address,'Suite 522')
,      e.city = 'Oakland'
WHERE  e.address_id = 1;
After understanding all that, let's examine how you sort data in a nested table or varray of a scalar data type (the basis of a table view column). The easiest way is a BULK COLLECT INTO statement nested inside a function, like this:
CREATE OR REPLACE FUNCTION sort_collection
( old_element_collection STREET_LIST ) RETURN STREET_LIST IS

  -- Declare local return collection variable.
  lv_element_collection STREET_LIST := street_list();

BEGIN

  -- Sort a collection alphabetically based on case-sensitive comparison.
  SELECT   column_value
  BULK COLLECT INTO lv_element_collection
  FROM     TABLE(old_element_collection)
  ORDER BY column_value;

  RETURN lv_element_collection;

END sort_collection;
/
You could test it with this:
SELECT column_value
FROM   TABLE(sort_collection(street_list('Adams', 'Lewis', 'Clark', 'Fallon')));
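The ORDER BY clause sorts on the implicit COLUMN_VALUE pseudocolumn, so the test should print the list alphabetically, along these lines (the exact column width depends on the STREET_LIST definition):

COLUMN_VALUE
------------------------------
Adams
Clark
Fallon
Lewis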
Then, you can use the user-defined function (UDF) to update a table view column like this:
UPDATE TABLE (SELECT e.home_address
              FROM   employee1 e
              WHERE  e.employee_id = 1) e
SET    e.street_address = sort_collection(e.street_address)
,      e.city = 'Oakland'
WHERE  e.address_id = 1;
The funny thing about database solutions these days is that some Java developers don't appreciate the simplicity of SQL and PL/SQL and would solve the problem with Java, especially if it were a case-insensitive sort operation. That's the hard way (the easy way is at the bottom), but I figured it should be thrown in because some folks think everything is generic if written in Java. Though, I thought making it proprietary would increase the irony, so I wrote it as a Java library for Oracle.
Here’s the Java library, which you can run from the SQL*Plus command line, SQL Developer, or that pricey Toad:
CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "SortOracleList" AS

  // Import required classes.
  import java.io.*;
  import java.security.AccessControlException;
  import java.sql.*;
  import java.util.Arrays;
  import oracle.sql.driver.*;
  import oracle.sql.ArrayDescriptor;
  import oracle.sql.ARRAY;

  // Define class.
  public class DemoSort {
    public static ARRAY getList(oracle.sql.ARRAY list) throws SQLException, AccessControlException {

      // Convert the Oracle data type to a Java data type.
      String[] unsorted = (String[])list.getArray();

      // Sort elements.
      Arrays.sort(unsorted, String.CASE_INSENSITIVE_ORDER);

      // Define a connection (this is for Oracle 11g).
      Connection conn = DriverManager.getConnection("jdbc:default:connection:");

      // Declare a mapping to the schema-level SQL collection type.
      ArrayDescriptor arrayDescriptor = new ArrayDescriptor("STRINGLIST",conn);

      // Translate the Java String[] to the Oracle SQL collection type.
      ARRAY sorted = new ARRAY(arrayDescriptor,conn,((Object[])unsorted));
      return sorted; }}
/
Then, you write the PL/SQL wrapper like this:
CREATE OR REPLACE FUNCTION sortTable(list STRINGLIST) RETURN STRINGLIST IS
LANGUAGE JAVA
NAME 'DemoSort.getList(oracle.sql.ARRAY) return oracle.sql.ARRAY';
/
You could test the case-insensitive sort with this query, which calls the SORTTABLE wrapper and assumes a STRINGLIST collection type (a table of VARCHAR2, like STREET_LIST) exists:
SELECT column_value
FROM   TABLE(sortTable(stringlist('Adams', 'adams', 'Lewis', 'Clark', 'Fallon')));
Naturally, this ignores the fact that you could skip Java altogether by using the UPPER function in the purely PL/SQL SORT_COLLECTION function shown earlier in this post. The relevant lines of that function change to:
  -- Sort a collection alphabetically based on case-insensitive comparison.
  SELECT   column_value
  BULK COLLECT INTO lv_element_collection
  FROM     TABLE(old_element_collection)
  ORDER BY UPPER(column_value);
Anyway, that's a bunch of thoughts about writing solutions for table view columns. I hope it helps those interested in nested tables.
Oracle SQL Tuning Example
Somebody wanted a post on SQL statement tuning, so here it is. SQL statement tuning requires that you understand table definitions, constraints, and indexes, plus how they interact through joins, before you can begin tuning. Oracle provides some of the most advanced tools for tuning SQL statements.
This example works through tuning one statement with multiple joins. The key element of tuning any SQL statement is understanding the table structures and data. The key worry is that optimizing for a query can harm OLTP processing by slowing the performance of INSERT and UPDATE statements, because any DML statement that adds or modifies data in tables also modifies the indexes built to support queries against those tables.
Oracle
You have two approaches in an Oracle database: the old way and the new way. They're really not that different, but the DBMS_XPLAN package provides a richer set of analytical data for less work.
The old way and the new way both generate the SQL execution plan the same way. You do that with the EXPLAIN PLAN statement.
The EXPLAIN PLAN statement is followed by a SET operator. You use the SET operator to assign a value to the STATEMENT_ID column, which uniquely identifies statements in the PLAN_TABLE. You must ensure that all STATEMENT_ID values are unique, or you’ll get incorrect results when you analyze plans; there’s no constraint that magically does this for you. After you set the statement identifier, you use the FOR keyword before the statement that you want to analyze.
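One hedged setup note: all of this assumes your schema already has a PLAN_TABLE. If it doesn’t, Oracle ships a script that creates one; a minimal sketch, assuming a standard Oracle home (the ? expands to ORACLE_HOME in SQL*Plus):

-- Create a local PLAN_TABLE in the current schema.
@?/rdbms/admin/utlxplan.sql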
An example query follows:
 1  EXPLAIN PLAN
 2  SET STATEMENT_ID = '&input'
 3  FOR
 4  SELECT DISTINCT
 5         r.rental_id
 6  ,      c.contact_id
 7  ,      tu.check_out_date AS check_out_date
 8  ,      tu.return_date AS return_date
 9  ,      3 AS created_by
10  ,      TRUNC(SYSDATE) AS creation_date
11  ,      3 AS last_updated_by
12  ,      TRUNC(SYSDATE) AS last_update_date
13  FROM   member m INNER JOIN contact c
14  ON     m.member_id = c.member_id INNER JOIN transaction_upload tu
15  ON     c.first_name = tu.first_name
16  AND    NVL(c.middle_name,'x') = NVL(tu.middle_name,'x')
17  AND    c.last_name = tu.last_name
18  AND    tu.account_number = m.account_number LEFT JOIN rental r
19  ON     c.contact_id = r.customer_id
20  AND    tu.check_out_date = r.check_out_date
21  AND    tu.return_date = r.return_date;
After you explain a statement, you need to analyze it, and Oracle uses hierarchical queries to do so. The following is a simple script that displays the execution plan by nesting the levels of the explained plan. Here’s where you choose to follow the old way or the new way. The old way appears in the next section if you’re interested, but I’d suggest you use the new way.
Oracle’s Old-Fashioned Way
This section shows the old way of analyzing explained plans. The first step after you’ve explained the statement is to analyze it. The following query performs that analysis and returns the cost of the join. More information is available through the DBMS_XPLAN package. You can also extend this query to pull more information from the PLAN_TABLE, but that’s a lot of work that’s already been done for you.
It’s nice to see the mechanics of how this analysis is performed, and that’s why this is in the post. This uses Oracle’s hierarchical query mechanism.
-- Clear any reporting breaks, columns, or computing.
CLEAR BREAKS
CLEAR COLUMNS
CLEAR COMPUTES

-- Set environment to capture explained plan.
SET FEEDBACK OFF
SET NULL ""
SET LINESIZE 120
SET PAGESIZE 9999
SET PAUSE OFF
SET VERIFY OFF

-- Override default substitution variable prompt.
ACCEPT input1 PROMPT "What do you want as a statement ID? "

-- Format the Query Plan based on a 120 character length.
COL c1 FORMAT A80 HEADING 'Query Plan'

-- Spool it to file.
SPOOL explain_it.log

-- Create the hierarchy of the join plan.
SELECT LPAD(' ',2*(LEVEL-1)) || operation||' '
    || options||' '
    || object_name||' '
    || DECODE(id,0,'Cost = '||POSITION) "Query Plan"
FROM plan_table
START WITH id = 0
AND statement_id = '&input1'
CONNECT BY PRIOR id = parent_id
AND statement_id = '&input1';

-- Reset operating environment to default.
SET FEEDBACK ON
SET NULL "<Null>"
SET LINESIZE 80
SET PAGESIZE 999
SET VERIFY ON

-- Close output file.
SPOOL OFF
Assuming you entered Query1 as the input response to the EXPLAIN PLAN statement, and you name the query above explain_it.sql, you’d call it with the following syntax:
@explain_it.sql
What do you want as a statement ID? Query1
The query starts at the top-most node, aligned to the left. Each level down, the node shifts two spaces to the right. You start moving back out of the process when the indenting reverses. It generates an output file like the following:
Query Plan
--------------------------------------------------------------------------------
SELECT STATEMENT Cost = 34
  HASH UNIQUE
    HASH JOIN OUTER
      VIEW
        NESTED LOOPS
          HASH JOIN
            TABLE ACCESS FULL MEMBER
            TABLE ACCESS FULL CONTACT
          INDEX RANGE SCAN IMPORT_DATE_RANGE
      TABLE ACCESS FULL RENTAL
This tells you that the join isn’t cheap from a system resource standpoint. You can improve the query by creating indexes. Those steps are shown using the new method.
Oracle’s New Way
Oracle’s new way uses the DBMS_XPLAN package. In this post, you’ll examine only how to display results and check the impact of small changes that lead to improved query performance.
Assuming you entered Query1 as the &input response to the EXPLAIN PLAN statement above, you’d call it with the following syntax:
-- Expand the width of displayed output.
SET LINESIZE 120

-- Query using the TABLE function that returns a user-defined object type (UDT)
-- into a SQL aggregate table, which is a fancy word for normal SQL output
-- (unfortunately, it's the technically accurate word).
SELECT * FROM TABLE(dbms_xplan.display(NULL,'Query1'));
The query using the DISPLAY function of the DBMS_XPLAN package returns a table of data. It should generate an output stream like the following:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
Plan hash value: 3289798709

------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |   229 | 24274 |    50   (8)| 00:00:01 |
|   1 |  HASH UNIQUE                 |                    |   229 | 24274 |    50   (8)| 00:00:01 |
|*  2 |   HASH JOIN OUTER            |                    |   229 | 24274 |    49   (7)| 00:00:01 |
|   3 |    VIEW                      |                    |     4 |   248 |    37   (6)| 00:00:01 |
|*  4 |     HASH JOIN                |                    |     4 |   572 |    37   (6)| 00:00:01 |
|*  5 |      HASH JOIN               |                    |    15 |  1230 |     7  (15)| 00:00:01 |
|   6 |       TABLE ACCESS FULL      | MEMBER             |     9 |   180 |     3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL      | CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|   8 |      EXTERNAL TABLE ACCESS FULL| TRANSACTION_UPLOAD |  8168 |   486K|    30   (4)| 00:00:01 |
|   9 |     TABLE ACCESS FULL        | RENTAL             |  4689 |   201K|    11   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   4 - access("C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("TU"."MIDDLE_NAME",'x') AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND
              "TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER")
   5 - access("C"."MEMBER_ID"="M"."MEMBER_ID")

Note
-----
   - dynamic sampling used for this statement

30 rows selected.
The first thing to do is move the data immediately from an external table into a production table. You do that with the following command:
-- Create a new table from an existing one.
CREATE TABLE transaction_uploaded AS
SELECT * FROM transaction_upload;
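As a hedged aside, the plans shown here rely on dynamic sampling (see the Note sections), so this step isn’t required for the demonstration, but you might also gather optimizer statistics on the new table so cost estimates reflect real data. The sketch assumes the table lives in the current schema:

-- Gather optimizer statistics on the new production table.
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TRANSACTION_UPLOADED')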
Truncating the PLAN_TABLE table is the easiest way to start over and reuse statement identifiers, like Query1.
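A minimal sketch of that cleanup, assuming you own a local PLAN_TABLE (with the seeded global temporary plan table, a DELETE FROM plan_table works just as well):

-- Clear prior explained plans so statement identifiers can be reused.
TRUNCATE TABLE plan_table;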
Then, you can edit your EXPLAIN PLAN script and change the TRANSACTION_UPLOAD table reference to TRANSACTION_UPLOADED. Now, the query using the DISPLAY function of the DBMS_XPLAN package returns a reduction in cost from 50 to 34:
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 3624831533

--------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    34   (6)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    34   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    33   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    22   (5)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    22   (5)| 00:00:01 |
|*  5 |      HASH JOIN         |                    |    15 |  1230 |     7  (15)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| MEMBER             |     9 |   180 |     3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    TABLE ACCESS FULL   | RENTAL             |  4689 |   201K|    11   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   5 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND
              "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))

Note
-----
   - dynamic sampling used for this statement

31 rows selected.
A downside or a quirk of thought, I’m never quite sure which, but folks look at the bottom of a plan first, not the middle. Maybe in this case you tackle the bottom first anyway, because the full table scan there is the attractive inefficiency; it returns the largest number of rows and is the most expensive operation.
You add a unique index that maps to the natural key for the RENTAL table, like this:
CREATE UNIQUE INDEX natural_key_rental
ON rental (rental_id, customer_id, check_out_date, return_date);
You run the EXPLAIN PLAN again, using Query2, and then analyze it. You’ll see that the query uses an index fast full scan and that the cost decreases by 2.
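For reference, a short sketch of the display call, assuming you reran the analysis with Query2 as the statement identifier:

-- Display the explained plan stored under the Query2 identifier.
SELECT * FROM TABLE(dbms_xplan.display(NULL,'Query2'));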
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
Plan hash value: 3402838417

---------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    32   (7)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    32   (7)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    31   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    22   (5)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    22   (5)| 00:00:01 |
|*  5 |      HASH JOIN         |                    |    15 |  1230 |     7  (15)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| MEMBER             |     9 |   180 |     3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    INDEX FAST FULL SCAN| NATURAL_KEY_RENTAL |  4689 |   201K|     9   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   5 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND
              "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))

Note
-----
   - dynamic sampling used for this statement

31 rows selected.
You add a unique index that maps the surrogate and natural key for the MEMBER table, like this:
CREATE UNIQUE INDEX member_account
ON member (member_id, account_number);
You run the EXPLAIN PLAN again, using Query3, and then analyze it. You’ll see that the query uses an index range scan instead of a full table access of the MEMBER table. This decreases the cost by 2.
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
Plan hash value: 656512492

---------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    30   (7)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    30   (7)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    29   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    19   (0)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    19   (0)| 00:00:01 |
|   5 |      NESTED LOOPS      |                    |    15 |  1230 |     4   (0)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| CONTACT            |    15 |   930 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN | MEMBER_ACCOUNT     |     1 |    20 |     1   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    INDEX FAST FULL SCAN| NATURAL_KEY_RENTAL |  4689 |   201K|     9   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   7 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND
              "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("C"."MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))

Note
-----
   - dynamic sampling used for this statement

31 rows selected.
You add a unique index that maps the surrogate key, the foreign key to the MEMBER table, and the natural key for the CONTACT table, like this:
CREATE UNIQUE INDEX contact_member
ON contact (contact_id, member_id, last_name, first_name, NVL(middle_name,'x'));
You run the EXPLAIN PLAN again, using Query4 (remember, statement identifiers must be unique), and then analyze it. You’ll see that the query uses an index full scan instead of a full table access of the CONTACT table. This decreases the cost by another 2.
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
Plan hash value: 1185696375

---------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   400 | 42400 |    28   (8)| 00:00:01 |
|   1 |  HASH UNIQUE           |                    |   400 | 42400 |    28   (8)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |                    |   400 | 42400 |    27   (4)| 00:00:01 |
|   3 |    VIEW                |                    |     7 |   434 |    17   (0)| 00:00:01 |
|   4 |     NESTED LOOPS       |                    |     7 |  1001 |    17   (0)| 00:00:01 |
|   5 |      NESTED LOOPS      |                    |    15 |  1230 |     2   (0)| 00:00:01 |
|   6 |       INDEX FULL SCAN  | CONTACT_MEMBER     |    15 |   930 |     1   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN | MEMBER_ACCOUNT     |     1 |    20 |     1   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | IMPORT_DATE_RANGE  |     1 |    61 |     1   (0)| 00:00:01 |
|   9 |    INDEX FAST FULL SCAN| NATURAL_KEY_RENTAL |  4689 |   201K|     9   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TU"."RETURN_DATE"="R"."RETURN_DATE"(+) AND
              "TU"."CHECK_OUT_DATE"="R"."CHECK_OUT_DATE"(+) AND
              "C"."CONTACT_ID"="R"."CUSTOMER_ID"(+))
   7 - access("C"."MEMBER_ID"="M"."MEMBER_ID")
   8 - access("TU"."ACCOUNT_NUMBER"="M"."ACCOUNT_NUMBER" AND
              "C"."LAST_NAME"="TU"."LAST_NAME" AND
              "C"."FIRST_NAME"="TU"."FIRST_NAME" AND
              NVL("MIDDLE_NAME",'x')=NVL("MIDDLE_NAME",'x'))

Note
-----
   - dynamic sampling used for this statement

31 rows selected.
The question evolves after you tune the first query: you must then check what impact the new indexes may have on other queries, like the statement that you use to merge data into the TRANSACTION table. You may have to suppress indexes in a subsequent query, or, after you tune that one, return and suppress an index here. You suppress an index by adding a zero to a NUMBER or DATE column, or by concatenating a null string to a character data type.
Examples of Suppressing Indexes
The following is an example of suppressing an index built on a string. It extends the tuning statement earlier in the blog post, and references line #17. The concatenation of a null string suppresses the index.
17  AND    c.last_name = tu.last_name||''
The following is an example of suppressing an index built on a date. It extends the tuning statement earlier in the blog post, and references line #20. Adding a zero leaves the value intact and suppresses the index.
20  AND    tu.check_out_date = r.check_out_date+0
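For completeness, here’s a sketch of suppressing an index built on a number. It references line #18 of the tuning statement, an assumption based on the statement as numbered above. Adding a zero leaves the value intact and suppresses the index:

18  AND    tu.account_number = m.account_number+0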
Fedora Install of Oracle 11g
After the installation of Oracle 11g XE on Windows, I anticipated problems installing on Fedora. It’s terrific to report that the Fedora install was by far simpler. This post provides the steps, and builds on the Fedora configuration published earlier this week.
- After you download the software from the Oracle web site, you’ll need to expand the compressed file. When you double click on the download item you will see the following screen. Click Extract at the top of the display.
- The Extract displays the following dialog. Click the Create Folder button and you’ll get an entry point for a new directory in your user’s directory. For this example, name it Oracle11gXE.
- After creating the directory, click the Extract button.
- The extract process completes and shows the following dialog.
- The Disk1 directory will show the following contents.
- You need to drop down into a Terminal session, which you can launch by clicking on Applications, System Tools, and Terminal in the menu. Then, from the Disk1 directory, run the following command:
sudo rpm -iv oracle-xe-11.2.0-1.0.x86_64.rpm
This command will install the packages in verbose syntax and display the following messages:
[sudo] password for mclaughlinm:
Preparing packages for installation...
oracle-xe-11.2.0-1.0
Executing post-install steps...
You must run '/etc/init.d/oracle-xe configure' as the root user to configure the database.
- This step requires that you assume the role of the root user, which can be done with this syntax:
sudo sh
In this root shell, you run the Oracle 11g XE configuration with this command:
/etc/init.d/oracle-xe configure
The following are the text prompts that you accept to configure Oracle 11g XE:
Oracle Database 11g Express Edition Configuration
-------------------------------------------------
This will configure on-boot properties of Oracle Database 11g Express
Edition. The following questions will determine whether the database should
be starting upon system boot, the ports it will use, and the passwords that
will be used for database accounts. Press <Enter> to accept the defaults.
Ctrl-C will abort.

Specify the HTTP port that will be used for Oracle Application Express [8080]:
Specify a port that will be used for the database listener [1521]:
Specify a password to be used for database accounts. Note that the same
password will be used for SYS and SYSTEM. Oracle recommends the use of
different passwords for each database account. This can be done after
initial configuration:
Confirm the password:

Do you want Oracle Database 11g Express Edition to be started on boot (y/n) [y]:

Starting Oracle Net Listener...Done
Configuring database...
Starting Oracle Database 11g Express Edition instance...Done
Installation completed successfully.
After completing the configuration, you need to do two things. First, you need to modify the .bashrc file for your user (covered in the next step). Second, you need to reboot your system.
- This step requires that you exit the root shell by typing the exit command. This should put you back into your administration account with sudoer privileges. You use vi to edit the .bashrc file and add the following line:
. /u01/app/oracle/product/11.2.0/xe/bin/oracle_env.sh
Or, you can write your own file, like this:
# Oracle Settings
TMP=/tmp; export TMP
TMPDIR=$TMP; export TMPDIR
ORACLE_HOSTNAME=localhost.localdomain; export ORACLE_HOSTNAME
ORACLE_UNQNAME=DB11G; export ORACLE_UNQNAME
ORACLE_BASE=/u01/app/oracle; export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/11.2.0/xe; export ORACLE_HOME
ORACLE_SID=XE; export ORACLE_SID
NLS_LANG=`$ORACLE_HOME/bin/nls_lang.sh`; export NLS_LANG
ORACLE_TERM=xterm; export ORACLE_TERM
PATH=/usr/sbin:$PATH; export PATH
PATH=$ORACLE_HOME/bin:$PATH; export PATH
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib; export LD_LIBRARY_PATH
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib; export CLASSPATH

if [ $USER = "oracle" ]; then
  if [ $SHELL = "/bin/ksh" ]; then
    ulimit -p 16384
    ulimit -n 65536
  else
    ulimit -u 16384 -n 65536
  fi
fi
You can now log in to the Oracle database with the following syntax from the command line. You’ll be prompted for the system password that you entered during the configuration steps. I’ll add another post on SQL Developer later this week.
sqlplus system
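Once connected, a quick sanity check confirms the instance is up; any simple data dictionary query works, something like:

-- Confirm the database edition and version.
SELECT banner FROM v$version;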
- This step is optional. You can configure an oracleenv.sh file in the oracle user’s home directory. While you could put this in the .bashrc file, the oracle account isn’t bootable. That’s why I recommend putting it in an environment file.
# .bashrc

# Source global definitions.
if [ -f /etc/bashrc ]; then
  . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions.
. /u01/app/oracle/product/11.2.0/xe/bin/oracle_env.sh

# Wrap sqlplus with rlwrap to edit prior lines with the
# up, down, left and right keys.
sqlplus() {
  if [ "$RLWRAP" = "0" ]; then
    command sqlplus "$@"    # command bypasses this function, avoiding recursion
  else
    rlwrap sqlplus "$@"
  fi
}

# Set vi as the command line editor.
set -o vi
If you want to use the menu choices added by the installation, you need to put your privileged sudoer in the dba group. I wrote a blog post on it here.
Oracle & MySQL Handbook
My new Oracle Database 11g & MySQL 5.6 Developer Handbook will be available at Oracle Open World 2011 (OOW2011). It’s a great book to compare and contrast approaches in Oracle and MySQL. It covers Oracle SQL & PL/SQL and MySQL SQL & SQL/PSM (Persistent Stored Modules – functions and procedures). Unfortunately, it seems like the book won’t be available on amazon.com until much later in the month, and not available from amazon.de until November. You can read about it at McGraw Hill Professional’s web site. They’ve made a special effort to get copies to OOW2011. Here’s the source code for the book because I don’t know when it’ll be on the publisher’s web site.
I’ll also be at OOW2011. They’ve scheduled me in the bookstore (probably the 2nd floor of Moscone North, as usual) from 10 to 10:30 A.M. on Monday and Wednesday for a book signing. If you’re at OOW2011 and you’d like to stop by and say hello, I look forward to meeting you. Many folks leave comments on the posts, but only a few suggest what they’d like me to write about when I’ve got a chance; you can do that if you stop by to chat.
It’s also interesting to know how many folks use both Oracle and MySQL (any updates on that are always appreciated). Last year at the Bioinformatics and Computational Biology (ACM-BCB 2010) Conference in Niagara Falls, I found it interesting to discover how many pharmaceutical companies and national labs were using both Oracle and MySQL. They appeared consistent about using Oracle for systems governed by legal compliance rules and MySQL for actual research.
The pharmaceutical companies also had clear barriers between the researchers and professional IT staff, specifically the DBAs. It seems that the DBAs don’t want to cede any control over installed Oracle instances, and they place barriers to research by denying additional Oracle instances when their site licenses would allow them to do so at no incremental cost. On the other hand, the DBAs are fine with letting researchers host and pilot with the MySQL Community Edition databases. This book supports those trying to figure out how to write portable SQL and how to port solutions from MySQL to Oracle and vice versa.
Hope to meet a few new folks at OOW2011. The Kindle version of the book became available 11/25/2011.
As an addendum to this original post, some folks asked for the summary of content for the new book, and the location of the errata (the errors of omission and commission in the book). Below is a summary of the book from page XVIII of the Introduction, and the errata is in the second comment to this post:
Part I: Development Components
- Chapter 1, “Architectures,” explains the Oracle 11g and MySQL 5.6 development architectures and highlights the comparative aspects of both client and server environments.
- Chapter 2, “Client Interfaces,” explains and demonstrates the basics of how you use SQL*Plus and MySQL Monitor client software.
- Chapter 3, “Security,” explains the security barriers for database servers and Data Control Language (DCL) commands that let you manage user and account privileges in the database servers.
- Chapter 4, “Transactions,” explains the nature of ACID-compliant transactions and the Two-phase Commit (2PC) process demonstrated by INSERT, UPDATE, and DELETE statements.
- Chapter 5, “Constraints,” explains the five primary database-level constraints and covers the check, not null, unique, primary key, and foreign key constraints.
Part II: SQL Development
- Chapter 6, “Creating Users and Structures,” explains how you can create users, databases, tables, sequences, and indexes.
- Chapter 7, “Modifying Users and Structures,” explains how you modify users, databases, tables, sequences, and indexes.
- Chapter 8, “Inserting Data,” explains how you insert data into tables.
- Chapter 9, “Updating Data,” explains how you update data in tables.
- Chapter 10, “Deleting Data,” explains how you delete data from tables.
- Chapter 11, “Querying Data,” explains how you query data from a single table, from a join of two or more tables, and from a join of two or more queries through set operators.
- Chapter 12, “Merging Data,” explains how you import denormalized data from external tables or source files and insert or update records in normalized tables.
Part III: Stored Program Development
- Chapter 13, “PL/SQL Basics,” explains the basics of using PL/SQL to write transactional blocks of code.
- Chapter 14, “SQL/PSM Basics,” explains the basics of using SQL/PSM to write transactional blocks of code.
- Chapter 15, “Triggers,” explains how to write database triggers in Oracle and MySQL databases.
Part IV: Appendix
- Appendix: covers the answers to the mastery questions at the end of the chapters.
MySQL Empty Set Answer
Somebody was complaining that you couldn’t just get a Yes/No answer from a query: Yes when rows are found and No when rows aren’t found, like an “In-stock” or “Out-of-stock” message combo from a query. He didn’t like having to handle an Empty set by writing logic in PHP to provide that “Out-of-stock” message.
I told him he was wrong; you can get a Yes/No answer from a query. You just write it differently. Instead of a query like the one below, which gets the “In-stock” message but forces you to handle the “Out-of-stock” message in the PHP code when no records are found:
SELECT 'In-stock'
FROM item
WHERE item_title = 'Star Wars II'
LIMIT 1;
It’s simpler to write it like the query below, which gets a Yes/No answer whether a row matches the query condition or not:
SELECT IF('Star Wars VII' IN (SELECT item_title FROM item)
         ,'In-stock','Out-of-stock') AS yes_no_answer;
You can also write it in a more generic way that works in both Oracle and MySQL:
SELECT CASE
         WHEN 'Star Wars VII' IN (SELECT item_title FROM item)
         THEN 'In-stock'
         ELSE 'Out-of-stock'
       END AS yes_no_answer
FROM dual;
There’s no Star Wars VII yet, but this returns the desired result when it’s not found in the data set. It also works when you find Star Wars II in the data set. Never say never … 🙂
A more useful and complete approach with this technique is shown below with data fabrication.
SELECT   inline.query_string
,        CASE
           WHEN inline.query_string IN (SELECT item_title FROM item)
           THEN 'In-stock'
           ELSE 'Out-of-stock'
         END AS yes_no_answer
FROM    (SELECT 'Star Wars II' AS query_string FROM dual
         UNION ALL
         SELECT 'Star Wars VII' AS query_string FROM dual) inline;
The query runs in an Oracle or MySQL database and returns the following result set:
+---------------+---------------+
| query_string  | yes_no_answer |
+---------------+---------------+
| Star Wars II  | In-stock      |
| Star Wars VII | Out-of-stock  |
+---------------+---------------+
Hope this helps somebody else too.
SQL 1st Day of Month
A couple of years ago I showed how to get the last day of the month while accounting for leap years. Somebody asked how to get the first day of the current month.
The two easiest ways I know to do that in Oracle combine the ADD_MONTHS and LAST_DAY functions, like this:
SELECT TO_CHAR(TRUNC(LAST_DAY(ADD_MONTHS(SYSDATE,-1))+1),'DD-MON-YY HH24:MI') AS "One Way"
,      TO_CHAR(TRUNC(ADD_MONTHS(LAST_DAY(SYSDATE)+1,-1)),'DD-MON-YY HH24:MI') AS "Another Way"
FROM dual;
It gives you a truncated datetime value, as you can see:
One Way         Another Way
--------------- ---------------
01-JUN-11 00:00 01-JUN-11 00:00
As the comments point out, the following is best for getting the first day of the month:
SELECT TO_CHAR(TRUNC(SYSDATE,'MM'),'DD-MON-YYYY HH24:MI') FROM dual;
You can also return the first day of the year like this:
SELECT TO_CHAR(TRUNC(SYSDATE,'YY'),'DD-MON-YYYY HH24:MI') FROM dual;
If you plan to do this in an application on Oracle Database 11g, you may want to write it as a deterministic function, like this:
CREATE OR REPLACE FUNCTION first_day
( date_in DATE ) RETURN DATE DETERMINISTIC IS
BEGIN
  RETURN TRUNC(date_in,'MM');
END;
/
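A quick usage sketch, assuming the function compiled in your schema:

-- Return the first day of the current month as a truncated DATE.
SELECT first_day(SYSDATE) AS month_start FROM dual;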
Naturally, you can also do this with the EXTRACT function, but it would get complicated quickly. Hope this answers the question.
Delete removes rows, eh?
It started as a smart remark: “The DELETE FROM statement always removes one or more rows from a table when any are found, except when the table isn’t a table.” That exception is true when the table isn’t a schema-level table but a nested table in an Oracle database. Then, the DELETE FROM statement may remove a row from the table or a row from a collection held by a row.
Naturally, everybody would like an example. It took more than a couple steps to work one up and the Oracle documentation isn’t very helpful on the topic.
- Create a SQL nested table collection type of 30-character variable length strings:
CREATE TYPE street_list IS TABLE OF VARCHAR2(30);
/
- Create a SQL user-defined object type for addresses:
CREATE OR REPLACE TYPE address_type AS OBJECT
( address_id      NUMBER
, street_address  STREET_LIST
, city            VARCHAR2(30)
, state           VARCHAR2(2)
, postal_code     VARCHAR2(10));
/
- Create a SQL nested table of the user-defined object type for addresses:
CREATE OR REPLACE TYPE address_list AS TABLE OF address_type;
/
- Create a schema-level table that uses the user-defined nested table of a user-defined object type, and a SEQUENCE for the table:
CREATE TABLE employee
( employee_id   NUMBER
, first_name    VARCHAR2(20)
, middle_name   VARCHAR2(20)
, last_name     VARCHAR2(20)
, home_address  ADDRESS_LIST)
NESTED TABLE home_address STORE AS address_table
(NESTED TABLE street_address STORE AS street_table);

CREATE SEQUENCE employee_s1;
- Insert data into the table:
INSERT INTO employee
( employee_id
, first_name
, last_name
, home_address )
VALUES
( employee_s1.NEXTVAL
,'Sam'
,'Yosemite'
, address_list(
    address_type(
      1
    , street_list('1111 Broadway','Suite 322')
    ,'Oakland'
    ,'CA'
    ,'94612')
  , address_type(
      2
    , street_list('1111 Broadway','Suite 525')
    ,'Oakland'
    ,'CA'
    ,'94612')));

INSERT INTO employee
( employee_id
, first_name
, last_name
, home_address )
VALUES
( employee_s1.NEXTVAL
,'Bugs'
,'Bunny'
, address_list(
    address_type(
      1
    , street_list('21 Rodeo Drive','Suite 1000')
    ,'Beverly Hills'
    ,'CA'
    ,'90210')
  , address_type(
      2
    , street_list('21 Rodeo Drive','Suite 2010')
    ,'Beverly Hills'
    ,'CA'
    ,'90210')));
- Create a SQL view to explode nesting into a meaningful data set:
CREATE OR REPLACE VIEW explode_employee AS
SELECT   e.employee_id
,        e.first_name || ' ' || e.last_name AS full_name
,        st.address_id
,        sa.column_value AS st_address
,        st.city
,        st.state
,        st.postal_code
FROM     employee e CROSS JOIN TABLE(e.home_address) st
                    CROSS JOIN TABLE(st.street_address) sa
ORDER BY 1, 2, 3;
- Format output and query view:
-- SQL*Plus formatting.
COLUMN employee_id FORMAT 999 HEADING "ID|EMP"
COLUMN full_name   FORMAT A16 HEADING "Full Name"
COLUMN address_id  FORMAT 999 HEADING "ID|UDT"
COLUMN st_address  FORMAT A16 HEADING "Street Address"
COLUMN city        FORMAT A14 HEADING "City"
COLUMN state       FORMAT A5  HEADING "State"
COLUMN postal_code FORMAT A5  HEADING "Zip|Code"

-- Query all from view.
SELECT * FROM explode_employee;
It displays the following (2 base rows times 2 nested table rows times 2 scalar nested table rows, or 8 rows):
  ID                    ID                                      Zip
 EMP Full Name         UDT Street Address   City          State Code
---- ---------------- ---- ---------------- -------------- ----- -----
   1 Sam Yosemite        1 1111 Broadway    Oakland        CA    94612
   1 Sam Yosemite        1 Suite 322        Oakland        CA    94612
   1 Sam Yosemite        2 1111 Broadway    Oakland        CA    94612
   1 Sam Yosemite        2 Suite 525        Oakland        CA    94612
   2 Bugs Bunny          1 21 Rodeo Drive   Beverly Hills  CA    90210
   2 Bugs Bunny          1 Suite 1000       Beverly Hills  CA    90210
   2 Bugs Bunny          2 21 Rodeo Drive   Beverly Hills  CA    90210
   2 Bugs Bunny          2 Suite 2010       Beverly Hills  CA    90210
- Run an UPDATE statement against a nested table:
UPDATE TABLE (SELECT e.home_address
              FROM employee e
              WHERE e.employee_id = 1) e
SET e.city = 'Fremont'
WHERE e.address_id = 1;
- Run an UPDATE statement against a nested table and the scalar nested table within it:
UPDATE TABLE (SELECT e.home_address
              FROM employee e
              WHERE e.employee_id = 1) e
SET e.street_address = street_list('1111 Broadway','Suite 522')
,   e.city = 'Oakland'
WHERE e.address_id = 1;
- Run a DELETE statement against a nested table, which removes only the nested row:
DELETE FROM TABLE (SELECT e.home_address
                   FROM employee e
                   WHERE e.employee_id = 1) ha
WHERE ha.address_id = 1;
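As a hedged check, requerying the view built earlier should show that only Sam’s second address survives the delete:

-- Only address_id 2 should remain for employee 1 after the DELETE.
SELECT * FROM explode_employee WHERE employee_id = 1;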
As always, hope this helps somebody looking for syntax or approach.