Archive for the ‘pl/sql’ Category
Conflict between identifiers
Sometimes an interesting problem leads to shock or dismay at why it occurs. Why an ORA-22979 is raised is one of those, and the error is typically:
ERROR at line 1:
ORA-22979: cannot INSERT object view REF or user-defined REF
This error occurs on an INSERT statement if you follow the example from the Oracle 11gR2 Object-Relational Developer's Guide, which also appears in slightly modified form in a couple of PL/SQL books. It also happens on an UPDATE statement that populates REF values.
The conflict is typically between the uniqueness of the reference and an attempt to make a non-reference column of the object type both a primary key constrained column and the embedded object view identifier. The source of the conflict is the OBJECT IDENTIFIER IS PRIMARY KEY clause associated with a primary key in the Oracle documentation. The two goals are mutually exclusive; only the reference or the non-reference column can be the object identifier. Unfortunately, the Oracle documentation contains both examples in different places without making any effective cross reference.
If you want to make a column of an object type a primary key for an object table (that is, a table that uses an object type to define its structure) and the object view (the content of the embedded object type), you can't include the OBJECT IDENTIFIER IS PRIMARY KEY clause when you want to populate the REF column of the object type. Here's an example that uses a column of the object type as a primary key and leaves the REF column empty:
-- Create the BASE_T type, or specification for IDL.
CREATE OR REPLACE TYPE base_t IS OBJECT
( obj_id    NUMBER
, obj_name  VARCHAR2(30)
, obj_ref   REF base_t) NOT FINAL;
/
You can then create a table like the following:
CREATE TABLE base OF base_t
( obj_id CONSTRAINT base_pk PRIMARY KEY )
OBJECT IDENTIFIER IS PRIMARY KEY;
Let’s insert some rows to test for ourselves that this fails when you try to assign references:
INSERT INTO base VALUES (base_t(1, 'Dwalin', NULL));
INSERT INTO base VALUES (base_t(2, 'Borfur', NULL));
INSERT INTO base VALUES (base_t(3, 'Gloin', NULL));
INSERT INTO base VALUES (base_t(4, 'Kili', NULL));
INSERT INTO base VALUES (base_t(5, 'Fili', NULL));
The following UPDATE statement attempts to assign references:
UPDATE base b
SET    obj_ref = REF(b);
The UPDATE fails as shown:
UPDATE base b
*
ERROR at line 1:
ORA-22979: cannot INSERT object view REF or user-defined REF
The simple fix redefines the object table by removing the OBJ_ID column as the object identifier. You do that by removing the OBJECT IDENTIFIER IS PRIMARY KEY clause, because a column of the object type can be a primary key for the table without being the object view identifier. After you make the change, you can successfully update the table with object references. Object identifiers and references are unique and serve the same purpose as a primary key for the object view, but they can't both exist.
CREATE TABLE base OF base_t;
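Re-running the same INSERT statements from above against the redefined table, the UPDATE that previously failed now succeeds:

INSERT INTO base VALUES (base_t(1, 'Dwalin', NULL));
INSERT INTO base VALUES (base_t(2, 'Borfur', NULL));
INSERT INTO base VALUES (base_t(3, 'Gloin', NULL));
INSERT INTO base VALUES (base_t(4, 'Kili', NULL));
INSERT INTO base VALUES (base_t(5, 'Fili', NULL));

UPDATE base b
SET    obj_ref = REF(b);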
With the rows inserted and the references assigned, the table now holds valid object references. Let's experiment with how they work, because that's also not as clear as I'd like in the Oracle documentation.
The next statement creates a CHILD table that holds a reference to the BASE (or parent) table and another instance of the same BASE_T object type:
CREATE TABLE child
( child_id  NUMBER CONSTRAINT child_pk PRIMARY KEY
, base_ref  REF base_t SCOPE IS base
, child     base_t);
The INSERT statement can't use a VALUES clause because we MUST capture the reference (or in this case primary key) from the BASE (or parent) table. An INSERT statement with a query does the trick:
INSERT INTO child
SELECT 1, obj_ref, base_t(1, 'Gimli', NULL)
FROM   base b
WHERE  b.obj_name = 'Gloin';
You should note that the reference for the CHILD table's CHILD column isn't set, but it is likewise not required for the example to work.
Now, let's perform a standard INNER JOIN (equijoin) between the two tables by using the references as primary and foreign keys. Please note the trick is referring to the table and column of the BASE (or parent) table, and to the table, column, and embedded OBJ_REF of the CHILD table.
COLUMN father FORMAT A10
COLUMN son    FORMAT A10
SELECT b.obj_name AS "Father"
,      c.child.obj_name AS "Son"
FROM   base b INNER JOIN child c
ON     b.obj_ref = c.base_ref.obj_ref;
You get the following results:
Father     Son
---------- ----------
Gloin      Gimli
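As a side note that goes beyond the original query, Oracle SQL also provides the DEREF function, which returns the full object instance behind a stored reference; here's a minimal sketch against the same CHILD table:

-- Dereference the parent REF into a BASE_T instance and read its attribute.
SELECT DEREF(c.base_ref).obj_name AS "Father"
,      c.child.obj_name AS "Son"
FROM   child c;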
You can make a view of this table with either of these syntaxes:
CREATE OR REPLACE VIEW base_v OF base_t
WITH OBJECT OID DEFAULT AS
SELECT * FROM base;
or,
CREATE OR REPLACE VIEW base_v OF base_t
WITH OBJECT OID (obj_id) AS
SELECT * FROM base;
Hope it helps anybody trying it. Personally, I think it's better to use collections of object types, but that's a much bigger discussion that I'll save for the Oracle Database 12c PL/SQL Programming book that I'm writing.
Object Table Function View
Somebody was trying to create a striped view based on a table's start_date and end_date temporal columns. They asked for some help, so here are the steps (a two-minute tech-tip).
Basically, you create a user-defined data type, or structure:
CREATE OR REPLACE TYPE item_structure IS OBJECT
( id      NUMBER
, lookup  VARCHAR2(30));
/
Then, you create a list (an Oracle table) of the structure, like:
CREATE OR REPLACE TYPE item_lookup IS TABLE OF item_structure;
/
Lastly, you create an object table function, like:
CREATE OR REPLACE FUNCTION get_item_types
RETURN item_lookup IS

  -- Declare a counter variable.
  lv_counter PLS_INTEGER := 1;

  -- Declare a variable that uses the collection structure.
  lv_lookup_table ITEM_LOOKUP := item_lookup();

  -- Declare static cursor structure.
  CURSOR c IS
    SELECT   cl.common_lookup_id AS lookup_id
    ,        SUBSTR(cl.common_lookup_meaning,1,60) AS lookup_meaning
    FROM     common_lookup cl
    WHERE    cl.common_lookup_table = 'ITEM'
    AND      cl.common_lookup_column = 'ITEM_TYPE'
    AND      SYSDATE BETWEEN cl.start_date AND NVL(cl.end_date,TRUNC(SYSDATE) + 1)
    ORDER BY cl.common_lookup_meaning;

BEGIN
  FOR i IN c LOOP
    lv_lookup_table.EXTEND;

    /* The assignment pattern for a SQL collection is incompatible
       with the cursor return type, and you must construct an instance
       of the object type before assigning it to a collection. */
    lv_lookup_table(lv_counter) := item_structure( i.lookup_id
                                                 , i.lookup_meaning );
    lv_counter := lv_counter + 1;
  END LOOP;

  /* Call an autonomous function or procedure here! It would allow
     you to capture who queried what and when, and it acts like a
     pseudo trigger for queries. */
  RETURN lv_lookup_table;
END;
/
Now you can embed the object table function in a view, like this:
CREATE OR REPLACE VIEW item_lookup_view AS
SELECT *
FROM   TABLE(get_item_types);
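Querying the view then works like querying any other view; for example:

SELECT id
,      lookup
FROM   item_lookup_view;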
Why not simply use an embedded query in the view, like the following?
SQL> CREATE OR REPLACE VIEW normal_view AS
  2  SELECT   cl.common_lookup_id AS lookup_id
  3  ,        SUBSTR(cl.common_lookup_meaning,1,60) AS lookup_meaning
  4  FROM     common_lookup cl
  5  WHERE    cl.common_lookup_table = 'ITEM'
  6  AND      cl.common_lookup_column = 'ITEM_TYPE'
  7  AND      SYSDATE BETWEEN cl.start_date AND NVL(cl.end_date,TRUNC(SYSDATE) + 1)
  8  ORDER BY cl.common_lookup_meaning;
My guess is that it was too easy, but who knows, maybe they found a real need. The only need that I see occurs when you're enforcing something like HIPAA and you want to capture unauthorized queries along with who performed them.
Naturally, I hope this helps those looking to resolve syntax errors when they need the more complex solution.
A T-SQL Table Function
I had an interesting conversation about table functions in Oracle's PL/SQL, and the fact that they're not available in MySQL. When I explained that they're available as Microsoft T-SQL user-defined functions (UDFs), my students wanted a small example. One of them said they'd tried to do it but couldn't get it to work because they found the Microsoft web pages difficult to read and use. Specifically, they didn't like the sparseness of this one on how to create a function.
Here's a quick definition of a UDF table function that runs in the studentdb schema (created in this post for migrating SQL Server into a MySQL database). The following getConquistador function takes a single string, which acts to filter the result set from a query positioned as the return value of the function. You should note that this is an implementation of Microsoft's Common Language Infrastructure (CLI).
CREATE FUNCTION studentdb.getConquistador
(@nationality AS VARCHAR(30))
RETURNS TABLE
RETURN SELECT *
       FROM   studentdb.conquistador
       WHERE  nationality = @nationality;
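For context, the function presumes a studentdb.conquistador table. The original post doesn't show that table's DDL, so this sketch simply infers plausible columns from the query results shown below:

CREATE TABLE studentdb.conquistador
( conquistador_id  INT PRIMARY KEY
, conquistador     VARCHAR(30)
, actual_name      VARCHAR(30)
, nationality      VARCHAR(30));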
Unlike Oracle SQL, where you need to use the TABLE function to read the content of a table result from a function, you don't need anything other than the function call in the FROM clause of a T-SQL query. Here's an example of calling the table function:
SELECT * FROM studentdb.getConquistador('German');
The complete query would produce these results when run from the sqlcmd command-line interface:
conquistador_id conquistador          actual_name          nationality
--------------- --------------------- -------------------- ------------
             11 Nicolas de Federman   Nikolaus Federmann   German
             13 Jorge de la Espira    George von Speyer    German

(2 rows affected)
However, you also have the ability to query only rows of interest without any specialized syntax, like this:
1> USE studentdb;
2> SELECT conquistador AS "Conquistador"
3> ,      actual_name AS "Name"
4> FROM   studentdb.getConquistador('German');
5> GO
This produces the following two-column result set:
Conquistador          Name
--------------------- --------------------
Nicolas de Federman   Nikolaus Federmann
Jorge de la Espira    George von Speyer

(2 rows affected)
Hope this helps those interested in T-SQL UDFs.
SELECT-INTO variables
Somebody said I shouldn't have buried so much information in an answer to a question posed in a comment on a summary blog page. They suggested I put it in a regular blog post, so here it is with as little editing as possible.
User Ron Quizon provided this sample code and asked, "What's wrong with this PL/SQL program?"
DECLARE
  v_name  friends.fname%TYPE;
  v_grade friends.id%TYPE;
BEGIN
  SELECT fname, grade
  INTO   &ssv_name, v_grade
  FROM   friends
  WHERE  v_name = fname;
  DBMS_OUTPUT.PUT_LINE(NVL(v_name,'No Name ')||' has an id of '||NVL(v_grade, 0));
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE ('There is no record with '||'id 123');
END;
While this certainly looks like a question from a class on PL/SQL or something from Steven Feuerstein's PL/SQL quizzes, I paused before answering it. The giveaway is that the style is what Steven has advocated for two decades. My guess is that it's for Steven's Q&A stuff, which means there's no harm in answering it because I'm likely not defeating a teacher's learning objective.
There are two core errors. The first is an inappropriate assignment target on line #6, and the second is failing to assign a value to the local v_name variable. If you take the time to create the tables and try it, you'll generate an error like this:
SQL> /
Enter value for ssv_name: Harry
old   6:   INTO   &ssv_name, v_grade
new   6:   INTO   Harry, v_grade
  INTO   Harry, v_grade
         *
ERROR at line 6:
ORA-06550: line 6, column 12:
PLS-00201: identifier 'HARRY' must be declared
ORA-06550: line 7, column 4:
PL/SQL: ORA-00904: : invalid identifier
ORA-06550: line 5, column 4:
PL/SQL: SQL Statement ignored
At this point, the error should be transparent even if you couldn't see it by quick inspection. However, I don't believe in playing games. Here's the answer you need:
- The SELECT-INTO statement is a left-to-right assignment operator in SQL (available in all dialects), and the right operand (variable) or list of operands (variables) must be identifiers. "Identifiers are words. They can be reserved words, predefined identifiers, quoted identifiers, user-defined variables, subroutines, or user-defined types." (Oracle Database 11g PL/SQL Programming, page #51.) In this case, as the right operands, they are user-defined variables.
- The & (ampersand) preceding ssv_name makes it a substitution placeholder or target, which is a SQL*Plus prompt for a value. The value provided at run-time is assigned to the SQL*Plus placeholder as a string literal during the preparing phase. That phase precedes the anonymous block's parse, fetch, and execute cycle. Therefore, you raise a parsing error while running the anonymous block unless the &ssv_name input value is a valid locally declared variable or a defined session bind variable name.
- Assuming you input a valid identifier, the next problem is that the WHERE clause uses an equality comparison operator against the v_name local variable, which holds a null value. That means the SELECT-INTO always fails on a no-data-found error unless you add an assignment statement for the v_name variable (see the corrected block after this list).
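Putting both fixes together, a corrected version of the block might look like this sketch; it assumes the same FRIENDS table, and it seeds v_name from the substitution variable instead of using the placeholder as the INTO target:

DECLARE
  -- Seed the local variable from the SQL*Plus substitution value.
  v_name  friends.fname%TYPE := '&ssv_name';
  v_grade friends.id%TYPE;
BEGIN
  SELECT fname, grade
  INTO   v_name, v_grade  -- Local variables are valid INTO targets.
  FROM   friends
  WHERE  v_name = fname;
  DBMS_OUTPUT.PUT_LINE(NVL(v_name,'No Name')||' has an id of '||NVL(v_grade,0));
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE('There is no record for '||v_name);
END;
/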
Anyway, I hope spelling it out in a formal blog post was helpful to someone in the future. If so, leave a comment and let me know.
Reset sequence START WITH
Two things contributed to this post. One was a student question about the difference between the MAX_VALUE of a sequence and the actual sequence value. The other was a comment on an earlier post addressing an NDS approach to resetting sequences.
The student wanted to understand why there were gaps in the sequence, since they created it with the default values, like this:
CREATE SEQUENCE sequence_name;
A default sequence in an Oracle database sets the cache value to 20, which means you can get gaps whenever an allocated set of cached values is discarded. This Ask Tom question and answer holds that information.
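You can confirm the cache setting of an existing sequence with a quick data dictionary query, like:

SELECT sequence_name
,      cache_size
,      last_number
FROM   user_sequences;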
The blog comment was on an NDS statement post. I wasn’t quite sure what the comment wanted because there really wasn’t a question or a complete code example. Having demonstrated how to dynamically drop and recreate a new sequence without a gap in a comment reply, I thought that was adequate.
Having pointed the student to the Ask Tom column and my answer, he wanted to know how to create a stored procedure to reset sequences dynamically. It took me a couple of weeks to get back to this, but here's the procedure. The tricky element of the procedure is the "no binding values allowed" restriction placed on NDS statements that process DDL statements.
The procedure uses two local procedures and two local functions. The local find_sequence function finds a sequence name in the schema, and the find_next_sequence function returns the next value. The local create_sequence and drop_sequence procedures respectively isolate the logic for dynamic creates and drops of the sequence.
Warning: The restriction with this design assumes that the table name and sequence name are linked by using the table name and an _ID suffix.
-- Declare an autonomous procedure to reset sequences.
CREATE OR REPLACE PROCEDURE reset_sequence
( pv_table_name     VARCHAR2
, pv_sequence_name  VARCHAR2
, pv_cache_value    NUMBER DEFAULT 0 ) IS

  -- Declare stubs to remove forward reference limits.
  PROCEDURE create_sequence ( pv_sequence_name VARCHAR2, pv_cache_value NUMBER );
  PROCEDURE drop_sequence ( pv_sequence_name VARCHAR2 );
  FUNCTION find_sequence ( pv_sequence_name VARCHAR2 ) RETURN VARCHAR2;
  FUNCTION find_next_sequence ( pv_table_name VARCHAR2 ) RETURN VARCHAR2;

  -- Drops a sequence.
  PROCEDURE drop_sequence ( pv_sequence_name VARCHAR2 ) IS
    -- Declare local variable(s).
    lv_statement      VARCHAR2(200);
    lv_sequence_name  VARCHAR2(30);
  BEGIN
    /* Conditionally drop any sequence using a local function to find
       a valid sequence name before attempting to drop it. */
    IF find_sequence(DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name)) > 0 THEN
      /* Dynamically drop the sequence, which requires concatenating the
         sanitized sequence name because you can't bind values on a DDL
         statement. */
      lv_statement := 'DROP SEQUENCE '||DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name);
      -- Execute statement immediately.
      EXECUTE IMMEDIATE lv_statement;
    END IF;
  END drop_sequence;

  -- Creates a sequence.
  PROCEDURE create_sequence ( pv_sequence_name VARCHAR2, pv_cache_value NUMBER ) IS
    -- Declare local variable(s).
    lv_statement      VARCHAR2(200);
    lv_next_sequence  NUMBER;
  BEGIN
    -- Assign the next sequence value if one exists.
    lv_next_sequence := find_next_sequence(pv_table_name);

    -- Check whether a cache value has been provided.
    IF pv_cache_value > 0 THEN
      /* Dynamically create a sequence statement with a new start value
         that is one greater than the highest value in the table that
         uses the sequence. */
      lv_statement := 'CREATE SEQUENCE '||DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name)||CHR(10)
                   || 'MINVALUE 1 NOMAXVALUE INCREMENT BY 1'||CHR(10)
                   || 'START WITH '||lv_next_sequence||' CACHE '||pv_cache_value||' NOORDER NOCYCLE';
    ELSE
      /* Dynamically create a sequence statement with a new start value
         that is one greater than the highest value in the table that
         uses the sequence. */
      lv_statement := 'CREATE SEQUENCE '||DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name)||CHR(10)
                   || 'MINVALUE 1 NOMAXVALUE'||CHR(10)
                   || 'INCREMENT BY 1 START WITH '||lv_next_sequence||' NOCACHE NOORDER NOCYCLE';
    END IF;

    -- Execute create sequence statement.
    EXECUTE IMMEDIATE lv_statement;
  END create_sequence;

  -- Finds whether a sequence exists.
  FUNCTION find_sequence ( pv_sequence_name VARCHAR2 ) RETURN VARCHAR2 IS
    -- Declare local return variable(s).
    lv_statement      VARCHAR2(200);
    lv_sequence_name  VARCHAR2(30);
    lv_return_value   NUMBER := 0;

    -- Declare local system reference cursor.
    lv_cursor SYS_REFCURSOR;
  BEGIN
    -- Dynamically find the sequence.
    lv_statement := 'SELECT sequence_name'||CHR(10)
                 || 'FROM user_sequences'||CHR(10)
                 || 'WHERE sequence_name = :seq_name';
    OPEN lv_cursor FOR lv_statement USING DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name);
    FETCH lv_cursor INTO lv_sequence_name;
    CLOSE lv_cursor;

    -- Convert a valid sequence name to a positive integer, which represents truth.
    lv_return_value := LENGTH(lv_sequence_name);

    -- Return value.
    RETURN lv_return_value;
  EXCEPTION
    -- Reached when the DBMS_ASSERT check fails.
    WHEN OTHERS THEN RETURN lv_return_value;
  END find_sequence;

  -- Finds the next sequence value.
  FUNCTION find_next_sequence ( pv_table_name VARCHAR2 ) RETURN VARCHAR2 IS
    -- Declare local return variable(s).
    lv_statement       VARCHAR2(200);
    lv_sequence_value  NUMBER;
    lv_return_value    NUMBER := 1;

    -- Declare local system reference cursor.
    lv_cursor SYS_REFCURSOR;
  BEGIN
    -- Dynamically find the next sequence value.
    lv_statement := 'SELECT MAX('||DBMS_ASSERT.SIMPLE_SQL_NAME(pv_table_name)||'_ID) + 1'||CHR(10)
                 || 'FROM '||DBMS_ASSERT.SIMPLE_SQL_NAME(pv_table_name);
    OPEN lv_cursor FOR lv_statement;
    FETCH lv_cursor INTO lv_sequence_value;
    CLOSE lv_cursor;

    -- Assign the return value from the NDS statement to a local return variable.
    lv_return_value := lv_sequence_value;

    -- Return value.
    RETURN lv_return_value;
  EXCEPTION
    -- Reached when the DBMS_ASSERT check fails.
    WHEN OTHERS THEN RETURN lv_return_value;
  END find_next_sequence;

  -- Set precompiler instruction to run in a discrete process.
  PRAGMA AUTONOMOUS_TRANSACTION;

BEGIN
  -- Conditionally drop the sequence.
  drop_sequence(DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name));

  -- Create the sequence.
  create_sequence(DBMS_ASSERT.SIMPLE_SQL_NAME(pv_sequence_name), pv_cache_value);
END;
/
You can test this procedure with the following script, which includes COMMIT statements. Those are required because the standalone procedure runs independently of the current session as an autonomous procedure.
-- Conditionally drop existing tables and sequences.
BEGIN
  FOR i IN (SELECT object_name
            ,      object_type
            FROM   user_objects
            WHERE  object_name IN ('SAMPLE','SAMPLE_S')) LOOP
    IF i.object_type = 'TABLE' THEN
      EXECUTE IMMEDIATE 'DROP TABLE '||i.object_name||' CASCADE CONSTRAINTS';
      dbms_output.put_line(i.object_name);
    ELSIF i.object_type = 'SEQUENCE' THEN
      EXECUTE IMMEDIATE 'DROP SEQUENCE '||i.object_name;
      dbms_output.put_line(i.object_name);
    END IF;
  END LOOP;
END;
/

-- Create test table.
CREATE TABLE sample (sample_id NUMBER);

-- Create test sequence.
CREATE SEQUENCE sample_s;

-- Insert a seeding row.
INSERT INTO sample VALUES (10);

-- Commit the values because the procedure is autonomous.
COMMIT;

-- Reset the sequence value.
EXECUTE reset_sequence('SAMPLE','SAMPLE_S');

-- Insert the next sequence value.
INSERT INTO sample VALUES (sample_s.nextval);

-- Commit the values because the procedure is autonomous.
COMMIT;

-- Query the result to ensure the numbers are sequential.
SELECT * FROM sample;

EXECUTE reset_sequence('SAMPLE','SAMPLE_S',500);

-- Insert the next sequence value.
INSERT INTO sample VALUES (sample_s.nextval);

-- Query the result to ensure the numbers are sequential.
SELECT * FROM sample;
Hope this helps anybody interested in automating the process of resetting the START WITH value of an Oracle sequence.
Why Stored Programs?
Why should you use stored programs? Great question, here’s my little insight into a situation that I heard about in a large organization.
A very large organization is having a technology argument. In some ways, like politics, half-truths drive this type of discussion. This company has hundreds of databases, split about evenly between SQL Server and Oracle. The argument (half-truth) states that using T-SQL or PL/SQL yields "spaghetti" code!
It seems like an old argument from my perspective. After all, I've been working with T-SQL and PL/SQL for a long time. Spaghetti code exists in every language when unskilled programmers solve problems, but the point here is one of software architecture and an attempt to malign stored programming in general. Let's examine the merit of the argument against stored programs.
First of all, the argument against stored programs is simply not true. SQL DML statements, like the INSERT, UPDATE, and DELETE statements, should maintain ACID-compliant interactions with a single table in a database. Unfortunately, the same statements create anomalies (errors) in a poorly designed database.
Stored programs provide the ability to perform ACID-compliant interactions across a series of tables in a database. They may also hide database design errors and protect the data from corruption. The same can't be said for Java or C# developers, who frequently fail to see database design errors or overlook them as inconsequential. This type of behavior results in corrupt data.
It typically raises cost, errors, and overall application complexity when key logic migrates outside the database. If you’re asking why, that’s great. Here are my thoughts on why:
- Making a Java or C# programmer responsible for managing the transaction scope across multiple tables in a database is not trivial. It requires a Java programmer who has truly mastered SQL. As a rule, it means a programmer writes many more lines of logic in their code because they don't understand how to use SQL. It often prevents joins from being performed in the database, where they would considerably outperform external language operations.
- Identifying bottlenecks and poor usage of data becomes much more complex for DBAs because small queries that avoid joins don't appear problematic inside the database. DBAs don't look at the execution or scope of transactions running outside of the database, and you're generally left with anecdotal customer complaints about the inefficiency of the application. Therefore, you have diminished accountability.
- Developing a library of stored procedures (and functions) ensures the integrity of transaction management. It also provides a series of published interfaces to developers writing the application logic. The published interface provides a modular interface, and lets developers focus on delivering quality applications without worrying about the database design. It lowers costs and increases quality by focusing developers on their strengths rather than trying to make them generalists. That having been said, it should never mask a poorly designed database!
- Service level agreements are critical metrics in any organization because they compel efficiency. If you mix the logic of the database and the application layer together, you can’t hold the development team responsible for the interface or batch processing metrics because they’ll always “blame” the database. Likewise, you can’t hold the database team responsible for performance when their metrics will only show trivial DML statement processing. Moreover, the DBA team will always show you that it’s not their fault because they’ve got metrics!
- Removing transaction controls from the database server generally means you increase the analysis and design costs. That's because few developers have a deep understanding of both a non-database programming language and the database. Likewise, input from DBAs is marginalized because the solution that makes sense is disallowed by design fiat. Systems designed in this type of disparate way often evolve into extremely awkward application models.
Interestingly, the effective use of T-SQL or PL/SQL often identifies, isolates, and manages issues in poorly designed database models. That’s because they focus on the integrity of transactions across tables and leverage native database features. They also act like CSS files, effectively avoiding the use of inline style or embedded SQL and transaction control statements.
Let's face this fact: any person who makes the "spaghetti" code claim in the original context is poorly informed. They're typically trying to sidestep blame for an existing bad application design or drive a change of platform without cost justification.
My take on this argument is twofold. Technologists in the organization may want to dump what they have and play with something else; or business and IT management may want to sidestep the wrath of angry users by blaming their failure on technology instead of how they didn't design, manage, or deliver it.
Oh, wait … isn't that last paragraph the reason for the existence of pre-packaged software? 😉 Don't hesitate to chime in; after all, it's just my off-the-cuff opinion.
Encrypting a Column
A few years ago, I gave a session on Oracle's Data Vault. An attendee from that session and I happened to be seated at the same table for lunch last Thursday, when I presented at the Utah Oracle User Group Fall Symposium. He asked if I'd ever found a way to encrypt a column from the prying eyes of the SYS user. I said yes, and he asked how. It was a bit more than could be explained at lunch, so I promised to put it on the blog. (Disclaimer: It is possible to unwrap wrapped code, but the mechanics change with each release, and they require access to the SYS schema and substantial Application DBA knowledge; however, there are scripts published on the Internet to unwrap the code. There are also other vulnerabilities in the example, but I'll pass on exposing them. After all, this was written to illustrate an approach.)
It's important to note that you can now hide columns in Oracle Database 12c, but they're still visible to DBA-level staff. That's why I'll publish a new article on rewriting this encrypted object as a Java library and disclose how to hide password exchanges from the SGA area.
This demonstrates how you can encapsulate a column from even the SYS user. I also put it into Appendix D, PL/SQL Built-in Packages and Types, to illustrate the DBMS_CRYPTO package. It uses a user-defined type (UDT) and assumes you have a working knowledge of object types in Oracle 10g forward. If not, you can find them in:
- Chapter 11 of my Oracle Database 12c PL/SQL Programming book.
- Chapter 14 of my Oracle Database 11g PL/SQL Programming book.
It also assumes basic knowledge of Oracle’s encryption technology, some of which is explained in the new Oracle Database 12c PL/SQL Programming book.
For reference, a good DBA would simply find this clear-text password in the SGA. A more secure approach might hide the encryption keyword in a one-column table, or embed it as a position-specific substring within a common lookup table column value. Alas, I don't have time to write something so elaborate.
The steps are:
- You must create a user-defined type (UDT), which sets up a single salary column.
CREATE OR REPLACE TYPE masked IS OBJECT
( salary RAW(1000)
, CONSTRUCTOR FUNCTION masked RETURN SELF AS RESULT
, CONSTRUCTOR FUNCTION masked ( salary NUMBER ) RETURN SELF AS RESULT
, MEMBER FUNCTION get_raw_salary RETURN RAW
, MEMBER FUNCTION get_salary ( key VARCHAR2 ) RETURN NUMBER
, MEMBER PROCEDURE to_string
, ORDER MEMBER FUNCTION equals ( object MASKED ) RETURN NUMBER )
INSTANTIABLE FINAL;
/
- As the SYS user, you must grant the EXECUTE privilege on the DBMS_CRYPTO package to the target SCHEMA user.
GRANT EXECUTE ON dbms_crypto TO schema_name;
- You implement the MASKED UDT as follows below (source unencrypted; don't worry, I show you how to encrypt [wrap] it in a moment).
CREATE OR REPLACE TYPE BODY masked IS

  CONSTRUCTOR FUNCTION masked RETURN SELF AS RESULT IS
    /* Create a placeholder for a zero salary, like an empty_clob() call. */
    zero MASKED := masked(0);
  BEGIN
    /* Assign an encrypted zero salary to the instance. */
    self := zero;
    RETURN;
  END masked;

  CONSTRUCTOR FUNCTION masked ( salary NUMBER ) RETURN SELF AS RESULT IS
    /* Declare local variables for encryption; object types hold instance
       attributes and object body variables are method specific. Placing
       them inside the methods, while tedious, prevents their disclosure. */
    lv_key_string      VARCHAR2(4000) := 'Encrypt Me!';
    lv_key             RAW(1000);
    lv_raw             RAW(1000);
    lv_encrypted_data  RAW(1000);
  BEGIN
    /* Dynamic assignment. */
    lv_raw := UTL_RAW.cast_to_raw(NVL(salary,0));

    /* Convert to a RAW 64-character key. */
    lv_key := UTL_RAW.cast_to_raw(lv_key_string);
    lv_key := RPAD(lv_key,64,'0');

    /* Encrypt the salary before assigning it to the object type attribute. */
    lv_encrypted_data := DBMS_CRYPTO.ENCRYPT(lv_raw, dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5, lv_key);
    self.salary := lv_encrypted_data;
    RETURN;
  END masked;

  MEMBER FUNCTION get_raw_salary RETURN RAW IS
  BEGIN
    RETURN self.salary;
  END get_raw_salary;

  MEMBER FUNCTION get_salary ( key VARCHAR2 ) RETURN NUMBER IS
    /* Declare local variables for encryption inside the method,
       which, while tedious, prevents their disclosure. */
    lv_key_string      VARCHAR2(4000) := 'Encrypt Me!';
    lv_decrypted_data  RAW(4000);
    lv_key             RAW(1000);
    lv_return_value    NUMBER;
  BEGIN
    /* Verify the key value matches the local value before decrypting;
       substitute a zero value when the key doesn't match. */
    IF key = lv_key_string THEN
      lv_key := UTL_RAW.cast_to_raw(lv_key_string);
      lv_key := RPAD(lv_key,64,'0');
      lv_decrypted_data := DBMS_CRYPTO.DECRYPT(self.salary, dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5, lv_key);
      lv_return_value := TO_NUMBER(TO_CHAR(UTL_RAW.cast_to_number(lv_decrypted_data),'9999990.00'));
    ELSE
      lv_return_value := 0;
    END IF;
    RETURN lv_return_value;
  END get_salary;

  ORDER MEMBER FUNCTION equals ( object MASKED ) RETURN NUMBER IS
    /* Declare local variables for encryption inside the method,
       which, while tedious, prevents their disclosure. */
    lv_key_string      VARCHAR2(4000) := 'Encrypt Me!';
    lv_decrypted_self  RAW(4000);
    lv_decrypted_peer  RAW(4000);
    lv_key             RAW(1000);
  BEGIN
    /* Decrypt the current and peer object attribute values before
       comparing their values. */
    lv_key := UTL_RAW.cast_to_raw(lv_key_string);
    lv_key := RPAD(lv_key,64,'0');
    lv_decrypted_self := DBMS_CRYPTO.DECRYPT(self.salary, dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5, lv_key);
    lv_decrypted_peer := DBMS_CRYPTO.DECRYPT(object.get_raw_salary(), dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5, lv_key);

    /* Sort order key. */
    IF lv_decrypted_self < lv_decrypted_peer THEN
      RETURN 1;
    ELSE
      RETURN 0;
    END IF;
  END equals;

  MEMBER PROCEDURE to_string IS
  BEGIN
    /* Provide a to_string method for good practice. */
    DBMS_OUTPUT.put_line('Encrypted value');
  END to_string;

END;
/
- You implement the MASKED UDT encrypted by using the DBMS_DDL package, as follows below. This ensures that others can't read the source code by querying the ALL_, DBA_, or USER_SOURCE views. You should note that I've removed comments and unnecessary spaces.
BEGIN
  DBMS_DDL.create_wrapped(
     'CREATE OR REPLACE TYPE BODY masked IS '
  || 'CONSTRUCTOR FUNCTION masked RETURN SELF AS RESULT IS '
  || 'zero MASKED := masked(0); '
  || 'BEGIN '
  || 'self := zero; '
  || 'RETURN; '
  || 'END masked; '
  || 'CONSTRUCTOR FUNCTION masked ( salary NUMBER ) RETURN SELF AS RESULT IS '
  || 'lv_key_string VARCHAR2(4000) := ''Encrypt Me!''; '
  || 'lv_key RAW(1000); '
  || 'lv_raw RAW(1000) := RPAD(utl_raw.cast_from_number(salary),32,''0''); '
  || 'lv_encrypted_data RAW (1000); '
  || 'BEGIN '
  || 'lv_key := utl_raw.cast_to_raw(lv_key_string); '
  || 'lv_key := RPAD(lv_key,64,''0''); '
  || 'lv_encrypted_data := DBMS_CRYPTO.ENCRYPT(lv_raw, dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5,lv_key); '
  || 'self.salary := lv_encrypted_data; '
  || 'RETURN; '
  || 'END masked; '
  || 'MEMBER FUNCTION get_raw_salary RETURN RAW IS '
  || 'BEGIN '
  || 'RETURN self.salary; '
  || 'END get_raw_salary; '
  || 'MEMBER FUNCTION get_salary( key VARCHAR2 ) RETURN NUMBER IS '
  || 'lv_key_string VARCHAR2(4000) := ''Encrypt Me!''; '
  || 'lv_decrypted_data RAW(4000); '
  || 'lv_key RAW(1000); '
  || 'lv_return_value NUMBER; '
  || 'BEGIN '
  || 'IF key = lv_key_string THEN '
  || 'lv_key := utl_raw.cast_to_raw(lv_key_string); '
  || 'lv_key := RPAD(lv_key,64,''0''); '
  || 'lv_decrypted_data := DBMS_CRYPTO.DECRYPT(self.salary, dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5,lv_key); '
  || 'lv_return_value := TO_NUMBER(TO_CHAR(utl_raw.cast_to_number(lv_decrypted_data),''9999990.00'')); '
  || 'ELSE '
  || 'lv_return_value := 0; '
  || 'END IF; '
  || 'RETURN lv_return_value; '
  || 'END get_salary; '
  || 'ORDER MEMBER FUNCTION equals ( object MASKED ) RETURN NUMBER IS '
  || 'lv_key_string VARCHAR2(4000) := ''Encrypt Me!''; '
  || 'lv_decrypted_self RAW(4000); '
  || 'lv_decrypted_peer RAW(4000); '
  || 'lv_key RAW(1000); '
  || 'BEGIN '
  || 'lv_key := utl_raw.cast_to_raw(lv_key_string); '
  || 'lv_key := RPAD(lv_key,64,''0''); '
  || 'lv_decrypted_self := DBMS_CRYPTO.DECRYPT(self.salary, dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5, lv_key); '
  || 'lv_decrypted_peer := DBMS_CRYPTO.DECRYPT(object.get_raw_salary(), dbms_crypto.ENCRYPT_AES256 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5, lv_key); '
  || 'IF lv_decrypted_self < lv_decrypted_peer THEN '
  || 'RETURN 1; '
  || 'ELSE '
  || 'RETURN 0; '
  || 'END IF; '
  || 'END equals; '
  || 'MEMBER PROCEDURE to_string IS '
  || 'BEGIN '
  || 'dbms_output.put_line(''Encrypted value''); '
  || 'END to_string; '
  || 'END; ');
END;
/
You can read more about wrapping PL/SQL in Appendix F of Oracle Database 12c PL/SQL Programming.
- You can test a single instance with this anonymous PL/SQL block.
DECLARE
  o MASKED := masked(82000.12);
BEGIN
  DBMS_OUTPUT.put('Override: ');
  o.to_string();
  DBMS_OUTPUT.put_line('Decrypted: '||o.get_salary('Encrypt Me!'));
  DBMS_OUTPUT.put_line('Bad Key: '||o.get_salary('Incorrect'));
END;
/
Override: Encrypted value
Decrypted: 82000.12
Bad Key: 0
- You can test a series of instances by making them persistent objects, or columns in a table, and then query values from them. It also lets you test the sorting feature provided in the UDT.
CREATE TABLE sort_demo (salary MASKED);

INSERT INTO sort_demo VALUES (masked(82000.24));
INSERT INTO sort_demo VALUES (masked(61000.12));
INSERT INTO sort_demo VALUES (masked(93000.36));

SELECT salary AS "Encrypted" FROM sort_demo;

COLUMN unordered FORMAT 9,999,990.00 HEADING "Unordered|List"
SELECT TREAT(salary AS MASKED).get_salary('Encrypt Me!') AS Unordered
FROM   sort_demo;

COLUMN ordered FORMAT 9,999,990.00 HEADING "Ordered|List"
SELECT TREAT(salary AS MASKED).get_salary('Encrypt Me!') AS Ordered
FROM   sort_demo
ORDER BY 1;
Encrypted(SALARY)
--------------------------------------------------------------------------
MASKED('1798C04F8342C53A826144568075CBDB145D0C9BD226B410C8C7167B76382D86')
MASKED('82F783F2E117AA60955B0A2E73545506936D6F8FFBEC5D9E0D8E70B82D4B694D')
MASKED('1551F350AAEB30ADFC1527F25CAA935732243858AF1C5D724A78B997A4394EAD')

    Unordered
         List
-------------
    82,000.24
    61,000.12
    93,000.36

      Ordered
         List
-------------
    61,000.12
    82,000.24
    93,000.36
If you want to retest this, make sure you drop the SORT_DEMO table first. Hope this helps anyone interested in an elaborate solution.
NDS parameters as IN OUT?
A question posed by a student: "Why are Oracle Native Dynamic SQL (NDS) USING clause parameters IN, IN OUT, or OUT when the RETURNING INTO clause manages output values?" It's a great question, isn't it? The follow-up question was also great: "How do you implement an example of NDS IN OUT parameters?"
The answer is twofold. First, you should use the USING clause for parameter list input values and the RETURNING INTO clause for return values whenever possible. Second, when it's not possible, you're generally passing parameters into and out of an NDS PL/SQL anonymous block.
The basic prototype for passing and retrieving values from an NDS statement is:
EXECUTE IMMEDIATE sql_stmt
USING { IN | IN OUT | OUT } local_variable [, ...]
RETURNING INTO { IN OUT | OUT } local_variable [, ...];
A quick, and hopefully fun, example is this parody on Marvel's The Avengers. The program creates an anonymous block with a super hero of Thor and a super villain of Loki; then it uses a USING clause with IN OUT parameters on an anonymous block statement. That's basically the trick to how you use IN OUT parameters in NDS statements.
-- Enable SERVEROUTPUT.
SET SERVEROUTPUT ON SIZE UNLIMITED

-- Declare an anonymous testing block.
DECLARE
  -- Declare two local variables.
  lv_super_hero     VARCHAR2(20) := 'Thor';
  lv_super_villain  VARCHAR2(20) := 'Loki';

  -- Declare a null statement variable.
  lv_stmt VARCHAR2(32767);

  -- Declare a local procedure to parse the NDS block.
  PROCEDURE print_code_block (pv_block VARCHAR2) IS
    -- Declare local parsing variables.
    lv_length INTEGER := 1;
    lv_start  INTEGER := 1;
    lv_end    INTEGER := 1;
  BEGIN
    -- Read line by line on a line return character.
    WHILE NOT (lv_end = 0) LOOP
      -- Check for line returns.
      lv_end := INSTR(lv_stmt,CHR(10),lv_start);
      -- Check whether a line return has been read.
      IF NOT lv_end = 0 THEN
        -- Reset the ending substring value and print the substring.
        lv_end := INSTR(lv_stmt,CHR(10),lv_start);
        dbms_output.put_line('| '||SUBSTR(lv_stmt,lv_start,lv_end - lv_start));
      ELSE
        -- Print the last substring with a semicolon and exit the loop.
        dbms_output.put_line('| '||SUBSTR(lv_stmt,lv_start,LENGTH(lv_stmt) - lv_start)||';');
      END IF;
      -- Reset the beginning of the string.
      lv_start := lv_end + 1;
    END LOOP;
  END print_code_block;
BEGIN
  -- Demonstrate good triumphs over evil.
  dbms_output.put_line('The good '||lv_super_hero||' beats up the bad '||lv_super_villain||'!');

  -- Assign the anonymous block to the local statement variable.
  lv_stmt := 'DECLARE'||CHR(10)
          || '  lv_super_hero VARCHAR2(20);'||CHR(10)
          || '  lv_super_villain VARCHAR2(20);'||CHR(10)
          || 'BEGIN'||CHR(10)
          || '  lv_super_hero '||CHR(58)||'= :pv_super_hero;'||CHR(10)
          || '  lv_super_villain '||CHR(58)||'= :pv_super_villain;'||CHR(10)
          || '  :pv_super_hero '||CHR(58)||'= lv_super_villain;'||CHR(10)
          || '  :pv_super_villain '||CHR(58)||'= lv_super_hero;'||CHR(10)
          || 'END;';

  -- Run the NDS program.
  EXECUTE IMMEDIATE lv_stmt
  USING IN OUT lv_super_hero
  ,     IN OUT lv_super_villain;

  -- Print the diagnostic code block; that's why it used line returns after all.
  dbms_output.put_line('--------------------------------------------------');
  print_code_block(lv_stmt);
  dbms_output.put_line('--------------------------------------------------');

  -- Demonstrate the world is upside down without Johnny Depp playing Capt'n Jack.
  dbms_output.put_line('The good '||lv_super_hero||' beats up the bad '||lv_super_villain||'!');
END;
/
You’ll get the following printed output:
The good Thor beats up the bad Loki!
--------------------------------------------------
| DECLARE
|   lv_super_hero VARCHAR2(20);
|   lv_super_villain VARCHAR2(20);
| BEGIN
|   lv_super_hero := :pv_super_hero;
|   lv_super_villain := :pv_super_villain;
|   :pv_super_hero := lv_super_villain;
|   :pv_super_villain := lv_super_hero;
| END;
--------------------------------------------------
The good Loki beats up the bad Thor!
As always, I hope this helps you understand the concept of the USING clause with IN OUT parameters, but I always hope there's a better way.
Result Cache Functions
I finally got around to cleaning up old "contact me" messages. One of the messages raised a question about RESULT_CACHE functions. The writer wanted an example implementing both a standalone schema function and a package RESULT_CACHE function.
The question references a note from the Oracle Database 11g PL/SQL Programming book (on page 322). More or less, that note points out that at the time of writing a RESULT_CACHE function worked as a standalone function but failed inside a package. When you tried it, you raised the following error message:
PLS-00999: Implementation Restriction (may be temporary)
It's no longer true in Oracle 11gR2, but it was true in Oracle 11gR1. I actually mentioned it in a blog entry four years ago.
You can implement a schema RESULT_CACHE function like this:
CREATE OR REPLACE FUNCTION full_name
( pv_first_name  VARCHAR2
, pv_last_name   VARCHAR2 )
RETURN VARCHAR2 RESULT_CACHE IS
BEGIN
  RETURN pv_first_name || ' ' || pv_last_name;
END full_name;
/
You would call it like this from a query:
SELECT full_name(c.first_name, c.last_name)
FROM   contact c;
You can declare a published package RESULT_CACHE function like this:
CREATE OR REPLACE PACKAGE cached_function IS
  FUNCTION full_name
  ( pv_first_name  VARCHAR2
  , pv_last_name   VARCHAR2 )
  RETURN VARCHAR2 RESULT_CACHE;
END cached_function;
/
You would implement the function in a package body like this:
CREATE OR REPLACE PACKAGE BODY cached_function IS
  FUNCTION full_name
  ( pv_first_name  VARCHAR2
  , pv_last_name   VARCHAR2 )
  RETURN VARCHAR2 RESULT_CACHE IS
  BEGIN
    RETURN pv_first_name || ' ' || pv_last_name;
  END full_name;
END cached_function;
/
You would call the package function like this from a query:
SELECT cached_function.full_name(c.first_name, c.last_name)
FROM   contact c;
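If you want to verify that the result cache is actually used, one approach (assuming you have access to the V$ performance views) is to check V$RESULT_CACHE_OBJECTS after calling either function:

SELECT name
,      type
,      status
FROM   v$result_cache_objects
WHERE  name LIKE '%FULL_NAME%';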
I hope this answers the question.
Oracle CSV Imports
The first step in creating an effective import plan for comma-separated value (CSV) files is recognizing your options in a database. There are several options in an Oracle database. You can read the file with Java, C/C++, C#, PL/SQL (through the UTL_FILE package), PHP, Perl, or any other C-callable programming language; or you can use SQL*Loader as a standalone utility or through externally managed tables (known as external tables). The most convenient and non-programming solution is using external tables.
Adopting external tables as your import solution should drive you to consider how to manage the security surrounding this type of methodology. Host hardening is a critical security step because it shuts down most, hopefully all, unauthorized use of the operating system where the database and external files reside. Next, you need to manage the access to the external tables and ensure that exposure of business sensitive information in CSV files is minimized.
This post explains how to manage access and police (clean up) external files once they're read into the database. It assumes you have root-level permissions to the operating system and database. The SYS and SYSTEM accounts have the equivalent of root permissions for database configuration. The rule of thumb with these accounts is simple: manage as much as possible with the SYSTEM account before you use the SYS account.
Setting up the Import File System
While you can do all the setup of virtual directories in Oracle regardless of whether you’ve set them up in the operating system, it’s a good idea to set them up in the OS first. The example is using a Windows 7 OS, so you’ll need to change the directories when working in Linux or Unix. Here are the directories:
C:\Imports\ImportFiles
C:\Imports\ImportLogs
You may take note that there are two directories. That's because you don't want to grant write privileges on the Oracle virtual directory where you put the files. You can grant read-only privileges on that virtual directory and read-write privileges on the log directory.
Setting up the Import User/Schema
This step lets you create an IMPORT user/schema in the Oracle database. You need to connect as the SYSTEM user to perform these steps (or another authorized DBA account with adequate privileges):
CREATE USER import IDENTIFIED BY import
DEFAULT TABLESPACE users QUOTA 1000M ON users
TEMPORARY TABLESPACE temp;
After creating the user, grant the privileges like this as the SYSTEM user:
GRANT CREATE CLUSTER, CREATE INDEXTYPE, CREATE OPERATOR
,     CREATE PROCEDURE, CREATE SEQUENCE, CREATE SESSION
,     CREATE SYNONYM, CREATE TABLE, CREATE TRIGGER
,     CREATE TYPE, CREATE VIEW TO import;
Setting up Virtual Directories
A virtual directory in Oracle maps an internal database directory name (known as a virtual directory) to a physical directory of the operating system. You create two virtual directories in this example: one holds read-only permissions to the directory where you're putting the data file, and the other holds read-write permissions to the directory where you're writing any log files from the external file process.
Log files are generated by this process when you query the data from the external file. Any error in the file's conformity is written to a log file.
CREATE DIRECTORY upload_files AS 'C:\Imports\ImportFiles';
CREATE DIRECTORY upload_logs AS 'C:\Imports\ImportLogs';
After creating the virtual directories in the database, you must grant appropriate access to the user account that will access the data. This grants those permissions to the IMPORT user:
GRANT READ ON DIRECTORY upload_files TO import;
GRANT READ, WRITE ON DIRECTORY upload_logs TO import;
Setting up an External Table
An external table references both the UPLOAD_FILES and UPLOAD_LOGS virtual directories, and the virtual directories must map to physical directories that allow read and write privileges to the Oracle user. Here’s the external table for this example:
CREATE TABLE item_import_ext_table
( asin_number         VARCHAR2(10)
, item_type           VARCHAR2(15)
, item_title          VARCHAR2(60)
, item_subtitle       VARCHAR2(60)
, item_rating         VARCHAR2(8)
, item_rating_agency  VARCHAR2(4)
, item_release_date   DATE)
  ORGANIZATION EXTERNAL
  ( TYPE oracle_loader
    DEFAULT DIRECTORY upload_files
    ACCESS PARAMETERS
    ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
      BADFILE     'UPLOAD_LOGS':'item_import_ext_table.bad'
      DISCARDFILE 'UPLOAD_LOGS':'item_import_ext_table.dis'
      LOGFILE     'UPLOAD_LOGS':'item_import_ext_table.log'
      FIELDS TERMINATED BY ','
      OPTIONALLY ENCLOSED BY "'"
      MISSING FIELD VALUES ARE NULL )
    LOCATION ('item_import.csv'))
REJECT LIMIT UNLIMITED;
Setting up a Physical File
You should put the following in an item_import.csv physical file (case sensitivity won't matter on the Windows 7 platform but will matter on Linux or Unix platforms):
'B000W74EQC','DVD_WIDE_SCREEN','Harry Potter and the Sorcerer''s Stone',,'PG','MPAA','11-DEC-2007'
'B000W746GK','DVD_WIDE_SCREEN','Harry Potter and the Chamber of Secrets',,'PG','MPAA','11-DEC-2007'
'B000W796OM','DVD_WIDE_SCREEN','Harry Potter and the Prisoner of Azkaban',,'PG','MPAA','11-DEC-2007'
'B000E6EK2Y','DVD_WIDE_SCREEN','Harry Potter and the Goblet of Fire',,'PG-13','MPAA','07-MAR-2006'
'B000W7F5SS','DVD_WIDE_SCREEN','Harry Potter and the Order of the Phoenix',,'PG-13','MPAA','11-DEC-2007'
'B002PMV9FG','DVD_WIDE_SCREEN','Harry Potter and the Half-Blood Prince',,'PG','MPAA','08-DEC-2009'
'B001UV4XHY','DVD_WIDE_SCREEN','Harry Potter and the Deathly Hallows, Part 1',,'PG-13','MPAA','15-APR-2011'
'B001UV4XIS','DVD_WIDE_SCREEN','Harry Potter and the Deathly Hallows, Part 2',,'PG-13','MPAA','11-NOV-2011'
Testing the External Table
After putting the item_import.csv file in the C:\Imports\ImportFiles directory, you can test the process at this point by running the following query:
SET PAGESIZE 99
COLUMN asin_number       FORMAT A11 HEADING "ASIN #"
COLUMN item_title        FORMAT A46 HEADING "ITEM TITLE"
COLUMN item_rating       FORMAT A6  HEADING "RATING"
COLUMN item_release_date FORMAT A11 HEADING "RELEASE|DATE"
SELECT   asin_number
,        item_title
,        item_rating
,        TO_CHAR(item_release_date,'DD-MON-YYYY') AS item_release_date
FROM     item_import_ext_table;
It should return eight rows.
Extending Access to the Data Dictionary
The physical directory names of virtual directories are hidden from generic users. They're available in the ALL_DIRECTORIES and DBA_DIRECTORIES administrative views for queries by the SYS, SYSTEM, and any DBA-role privileged users.
While a privileged user can query the view directly, placing the view inside a function or procedure deployed in the privileged user's schema would raise an ORA-00942 error. That error signals that the table or view does not exist.
This example deploys the view in the SYSTEM schema. That means it requires you make the following grant as the SYS user:
GRANT SELECT ON sys.dba_directories TO system;
After making the grant from the SYS schema to the SYSTEM schema, connect to the SYSTEM schema. Then, create the following GET_DIRECTORY_PATH function:
CREATE OR REPLACE FUNCTION get_directory_path
( virtual_directory IN VARCHAR2 )
RETURN VARCHAR2 IS

  -- Define return variable.
  directory_path VARCHAR2(256) := '';

  -- Define dynamic cursor.
  CURSOR get_directory (virtual_directory VARCHAR2) IS
    SELECT directory_path
    FROM   sys.dba_directories
    WHERE  directory_name = virtual_directory;

  -- Define a local exception for name violation.
  directory_name EXCEPTION;
  PRAGMA EXCEPTION_INIT(directory_name,-22284);

BEGIN
  OPEN  get_directory (virtual_directory);
  FETCH get_directory INTO directory_path;
  CLOSE get_directory;

  -- Return file name.
  RETURN directory_path;
EXCEPTION
  WHEN directory_name THEN
    RETURN NULL;
END get_directory_path;
/
It's tempting to grant EXECUTE on this function to PUBLIC, but that would expose information that any DBA should try to limit. That means you grant the EXECUTE privilege only to the IMPORT schema.
This grant should be made as the SYSTEM user:
GRANT EXECUTE ON get_directory_path TO import;
After granting the EXECUTE privilege to the IMPORT user, connect to the IMPORT schema and create a synonym to the GET_DIRECTORY_PATH function. The syntax for that command is:
CREATE SYNONYM get_directory_path FOR system.get_directory_path;
You can now test your access to the function with the following query from the IMPORT schema:
SELECT get_directory_path('UPLOAD_FILES') FROM dual;
You should return the following if you’ve got everything working at this point:
GET_DIRECTORY_PATH('UPLOAD_FILES')
------------------------------------
C:\Imports\ImportFiles
At this point, you've completed the second major configuration component. You now need the ability to read files outside the database, which can be done with Java in Oracle 10g or Oracle 11g (that's not possible in Oracle 10g XE or Oracle 11g XE because they don't support an internal JVM).
Reading Virtual Directory Files
The GET_DIRECTORY_PATH function provides you with the ability to read the Oracle data catalog and find the absolute directory path of a virtual directory. In this framework, you need this value to find whether the item_import.csv physical file is present in the file system before you read the file.
There doesn’t appear to be a neat little function to read an external directory. At least, there’s not one in the UTL_FILE or DBMS_LOB packages where you’d think it should be found. Unfortunately, that leaves us with two alternatives. One is to write an external library in C, C++, or C#. Another is to write an internal Java library that reads the file system. You accomplish this by granting permissions to a target directory or directories.
The first step is to create a scalar array of VARCHAR2 variables, like:
CREATE OR REPLACE TYPE file_list AS TABLE OF VARCHAR2(255);
/
The second step is to write the Java library file. You can write it three ways. One accepts default error handling, and the others override the default exception handling. If you're new to Java, you should take the basic library with default handling. If you have more experience, you may want to override the helpful message with something that causes the developer to check with the DBA, or simply suppress the message to enhance security.
You should note that the database connection is an Oracle Database 11g internal database connection. The connection only does one thing. It allows you to map the ArrayDescriptor to a schema-level SQL collection type. The element types of these collections should be scalar variables, like DATE, NUMBER, or VARCHAR2 data types.
The more advanced method overrides exception handling by suppressing information about the java.properties settings. You can do it by catching the natively thrown exception and re-throw it or ignore it. The example ignores it because handling it in Java reports an unhandled exception at the PL/SQL or SQL layer, which leads end users to think you have a major design problem.
CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "ListVirtualDirectory" AS
// Import required classes.
import java.io.*;
import java.security.AccessControlException;
import java.sql.*;
import java.util.Arrays;
import oracle.sql.driver.*;
import oracle.sql.ArrayDescriptor;
import oracle.sql.ARRAY;

// Define the class.
public class ListVirtualDirectory {
  // Define the method.
  public static ARRAY getList(String path) throws SQLException {
    // Declare the variable as null, required because of the try-catch block.
    ARRAY listed = null;

    // Define a connection (this is for Oracle 11g).
    Connection conn = DriverManager.getConnection("jdbc:default:connection:");

    // Use a try-catch block to trap a Java permission error on the directory.
    try {
      // Declare a class with the file list.
      File directory = new File(path);

      // Declare a mapping to the schema-level SQL collection type.
      ArrayDescriptor arrayDescriptor = new ArrayDescriptor("FILE_LIST",conn);

      // Translate the Java String[] to the Oracle SQL collection type.
      listed = new ARRAY(arrayDescriptor,conn,((Object[]) directory.list()));
    }
    catch (AccessControlException e) {}
    return listed;
  }
}
/
You can’t call an internal Java library without a PL/SQL wrapper function. Here’s the wrapper function for this Java library:
CREATE OR REPLACE FUNCTION list_files(path VARCHAR2) RETURN FILE_LIST IS
LANGUAGE JAVA
NAME 'ListVirtualDirectory.getList(java.lang.String) return oracle.sql.ARRAY';
/
You MUST grant the Oracle Database’s internal JVM authority to read the external directory before you can return the directory contents. Any attempt to read a directory without the proper permissions raises an ORA-29532 exception.
The following is an anonymous block to grant permissions to a directory. You must grant a minimum of read permissions but since you’ll also delete this file later in the post you should grant read, write, and delete. You must run it from the SYSDBA role as the SYS user.
BEGIN
  DBMS_JAVA.GRANT_PERMISSION('IMPORT'
  ,'SYS:java.io.FilePermission'
  ,'C:\Imports\ImportFiles'
  ,'read,write,delete');
END;
/
While you’re connected, it’s a good idea to grant the same privileges to your log directory:
BEGIN
  DBMS_JAVA.GRANT_PERMISSION('IMPORT'
  ,'SYS:java.io.FilePermission'
  ,'C:\Imports\ImportLogs'
  ,'read,write,delete');
END;
/
You should now be able to read the contents of an external file from another PL/SQL block or from a SQL statement. Here’s an example of the SQL statement call that uses everything developed to this point:
SELECT   column_value AS "File Names"
FROM     TABLE(list_files(get_directory_path('UPLOAD_FILES')));
It should return item_import.csv as the only file in the physical directory, like this:
File Names
-----------------
item_import.csv
Mapping an External Table to a Source File
The next step leverages the user-level views of the Oracle Database's data catalog and all the components developed above to find and display the external table and its external file. This query returns the results:
COLUMN table_name FORMAT A30
COLUMN file_name  FORMAT A30
SELECT   xt.table_name
,        xt.file_name
FROM    (SELECT   uxt.table_name
         ,        ixt.column_value AS file_name
         FROM     user_external_tables uxt CROSS JOIN
                  TABLE(list_files(get_directory_path(uxt.default_directory_name))) ixt) xt
JOIN     user_external_locations xl
ON       xt.table_name = xl.table_name
AND      xt.file_name = xl.location;
It should return the following:
TABLE_NAME                     FILE_NAME
------------------------------ ------------------------------
ITEM_IMPORT_EXT_TABLE          item_import.csv
You can migrate the query into the following function. It returns a zero when the file isn’t found and a one when it is found.
CREATE OR REPLACE FUNCTION external_file_found
( table_in VARCHAR2 ) RETURN NUMBER IS

  -- Define a default return value.
  retval NUMBER := 0;

  -- Declare a cursor to find external tables.
  CURSOR c (cv_table VARCHAR2) IS
    SELECT   xt.table_name
    ,        xt.file_name
    FROM    (SELECT   uxt.table_name
             ,        ixt.column_value AS file_name
             FROM     user_external_tables uxt CROSS JOIN
                      TABLE(list_files(get_directory_path(uxt.default_directory_name))) ixt) xt
    JOIN     user_external_locations xl
    ON       xt.table_name = xl.table_name
    AND      xt.file_name = xl.location
    AND      xt.table_name = UPPER(cv_table);

BEGIN
  FOR i IN c(table_in) LOOP
    retval := 1;
  END LOOP;
  RETURN retval;
END;
/
With the EXTERNAL_FILE_FOUND function, you can create a view that returns rows when the external file is found and no rows when it isn't. The following view hides the logic required to make that work:
CREATE OR REPLACE VIEW item_import AS
  SELECT   *
  FROM     item_import_ext_table
  WHERE    external_file_found('ITEM_IMPORT_EXT_TABLE') = 1;
Conveniently, you can now query the ITEM_IMPORT view without the risk of raising the following error when the file is missing:
SELECT * FROM item_import_ext_table
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04040: file item_import.csv in UPLOAD_FILES not found
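For example, when the file is absent the view simply returns an empty result set instead of the error:

SELECT * FROM item_import;

no rows selected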
You can now grant the SELECT privilege on the ITEM_IMPORT view to your application schema, like:
GRANT SELECT ON item_import TO application;
After granting the SELECT privilege on the ITEM_IMPORT view to the APPLICATION schema, you can create a synonym to hide the IMPORT schema.
CREATE SYNONYM item_import FOR import.item_import;
At this point, many developers feel they're done. However, enclosing the results in a schema-level function provides more utility than a view. The next section shows you how to replace the view with a schema-level function.
Replacing the View with an Object Table Function
Inside a schema-level function, you can assign the results from the query to a SQL collection of an object type. The object type should mirror the structure of the table, like the following:
CREATE OR REPLACE TYPE item_import_object IS OBJECT
( asin_number         VARCHAR2(10)
, item_type           VARCHAR2(15)
, item_title          VARCHAR2(60)
, item_subtitle       VARCHAR2(60)
, item_rating         VARCHAR2(8)
, item_rating_agency  VARCHAR2(4)
, item_release_date   DATE);
/
After creating the object type that mirrors the structure of the ITEM_IMPORT_EXT_TABLE table, you need to create a list-like collection of the object type. A nested table collection type acts like a list in Oracle:
CREATE OR REPLACE TYPE item_import_object_table
  IS TABLE OF item_import_object;
/
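Before wiring the two types into a function, you can sanity-check them with a quick anonymous block like this; the sample attribute values are made up, and it requires SET SERVEROUTPUT ON in SQL*Plus:

DECLARE
  -- Build a one-element collection of the object type with dummy values.
  lv_list ITEM_IMPORT_OBJECT_TABLE :=
    item_import_object_table(
      item_import_object('B00000000X','DVD'
                        ,'Sample Title',NULL
                        ,'PG','MPAA',SYSDATE));
BEGIN
  dbms_output.put_line('Elements ['||lv_list.COUNT||']');
END;
/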
After defining the object type and collection, you can access them in the following type of function:
CREATE OR REPLACE FUNCTION external_file_contents
( table_in VARCHAR2 ) RETURN item_import_object_table IS

  -- Define a local counter.
  lv_counter NUMBER := 1;

  -- Construct an empty collection of ITEM_IMPORT_OBJECT data types.
  lv_item_import_table ITEM_IMPORT_OBJECT_TABLE := item_import_object_table();

  -- Declare a cursor to find external tables.
  CURSOR c (cv_table VARCHAR2) IS
    SELECT   *
    FROM     item_import_ext_table
    WHERE    external_file_found(cv_table) = 1;

BEGIN
  FOR i IN c(table_in) LOOP
    lv_item_import_table.EXTEND;
    lv_item_import_table(lv_counter) := item_import_object(i.asin_number
                                                          ,i.item_type
                                                          ,i.item_title
                                                          ,i.item_subtitle
                                                          ,i.item_rating
                                                          ,i.item_rating_agency
                                                          ,i.item_release_date);
    lv_counter := lv_counter + 1;
  END LOOP;

  /*
   *  This is where you can place autonomous function calls:
   *  ======================================================
   *   - These can read source and log files, and write them
   *     to CLOB attributes for later inspection or review.
   *   - These can call Java libraries to delete files, but
   *     you should note that Java deletes any file rather
   *     than moving it to the trash bin (where you might
   *     recover it).
   */

  RETURN lv_item_import_table;
END;
/
Between the assignment to the collection and the function's RETURN statement, you can call any number of autonomous functions. Any schema-level function can call autonomous functions that read and write tables with DML statements, like INSERT, UPDATE, and DELETE. You can also call schema-level functions that wrap Java libraries to delete external files.
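For illustration only, here's a minimal sketch of what such an autonomous routine might look like; the IMPORT_EVENT_LOG table and its columns are hypothetical, not part of this post's framework:

CREATE OR REPLACE PROCEDURE log_import_event
( pv_event VARCHAR2 ) IS
  -- Run in an independent transaction so the caller's query isn't affected.
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  -- Hypothetical event table; adjust to your own logging structure.
  INSERT INTO import_event_log (event_text, event_date)
  VALUES (pv_event, SYSDATE);
  COMMIT;
END;
/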
You can confirm that the steps work by running the following query with or without the SQL*Plus formatting:
/*
 * SQL*Plus formatting.
 */
SET PAGESIZE 99
COLUMN asin_number       FORMAT A11 HEADING "ASIN #"
COLUMN item_title        FORMAT A46 HEADING "ITEM TITLE"
COLUMN item_rating       FORMAT A6  HEADING "RATING"
COLUMN item_release_date FORMAT A11 HEADING "RELEASE|DATE"

/*
 * Query works only when the item_import.csv file is present.
 */
SELECT   asin_number
,        item_title
,        item_rating
,        TO_CHAR(item_release_date,'DD-MON-YYYY') AS item_release_date
FROM     TABLE(external_file_contents('ITEM_IMPORT_EXT_TABLE'));
It should return the following from SQL*Plus:
                                                                  RELEASE
ASIN #      ITEM TITLE                                     RATING DATE
----------- ---------------------------------------------- ------ -----------
B000W74EQC  Harry Potter and the Sorcerer's Stone          PG     11-DEC-2007
B000W746GK  Harry Potter and the Chamber of Secrets        PG     11-DEC-2007
B000W796OM  Harry Potter and the Prisoner of Azkaban       PG     11-DEC-2007
B000E6EK2Y  Harry Potter and the Goblet of Fire            PG-13  07-MAR-2006
B000W7F5SS  Harry Potter and the Order of the Phoenix      PG-13  11-DEC-2007
B002PMV9FG  Harry Potter and the Half-Blood Prince         PG     08-DEC-2009
B001UV4XHY  Harry Potter and the Deathly Hallows, Part 1   PG-13  15-APR-2011
B001UV4XIS  Harry Potter and the Deathly Hallows, Part 2   PG-13  11-NOV-2011
The creation of the schema-level function lets you recreate the ITEM_IMPORT view. The following view encapsulates (or hides) the function call, which in turn hides all the infrastructure components developed before this section (see the cursor's WHERE clause in the function above):
CREATE OR REPLACE VIEW item_import AS
  SELECT   *
  FROM     TABLE(external_file_contents('ITEM_IMPORT_EXT_TABLE'));
Implementing a Managed Import Process
During any import, information about the process is exposed, and one or more rows may fail to load. That means the source file and the loading log files must be preserved immediately after reading the data successfully. You do that by loading the data source file and the log, discard, and bad files into database tables. Only the source and log files exist when all rows are well formed, but the log files are reused by any subsequent load and would require human inspection to isolate a specific upload.
The best way to implement this requires creating individual tables to hold each of the four potential large objects. The IMPORT_MASTER table holds a transactional primary key and the name of the import table. The primary key of the IMPORT_MASTER table is the base key for imports, and the IMPORT_DATA, IMPORT_LOG, IMPORT_DISCARD, and IMPORT_BAD tables hold foreign keys that point back to it. These tables also hold a character large object (CLOB) column, which holds the respective source data file or log, discard, or bad file.
The following creates the tables and sequences for the logging framework:
-- Create import master table.
CREATE TABLE import_master
( import_master_id  NUMBER CONSTRAINT pk_import_master PRIMARY KEY
, import_table      VARCHAR2(30));

-- Create sequence for import master.
CREATE SEQUENCE import_master_s;

-- Create import data table.
CREATE TABLE import_data
( import_data_id    NUMBER CONSTRAINT pk_import_data PRIMARY KEY
, import_master_id  NUMBER
, import_data       CLOB
, CONSTRAINT fk_import_data FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_data) STORE AS BASICFILE item_import_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import data.
CREATE SEQUENCE import_data_s;

-- Create import log table.
CREATE TABLE import_log
( import_log_id     NUMBER CONSTRAINT pk_import_log PRIMARY KEY
, import_master_id  NUMBER
, import_log        CLOB
, CONSTRAINT fk_import_log FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_log) STORE AS BASICFILE item_import_log_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import log.
CREATE SEQUENCE import_log_s;

-- Create import discard table.
CREATE TABLE import_discard
( import_discard_id NUMBER CONSTRAINT pk_import_discard PRIMARY KEY
, import_master_id  NUMBER
, import_discard    CLOB
, CONSTRAINT fk_import_discard FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_discard) STORE AS BASICFILE item_import_discard_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import discard.
CREATE SEQUENCE import_discard_s;

-- Create import bad table.
CREATE TABLE import_bad
( import_bad_id     NUMBER CONSTRAINT pk_import_bad PRIMARY KEY
, import_master_id  NUMBER
, import_bad        CLOB
, CONSTRAINT fk_import_bad FOREIGN KEY (import_master_id)
  REFERENCES import_master (import_master_id))
  LOB (import_bad) STORE AS BASICFILE item_import_bad_clob
  (TABLESPACE users ENABLE STORAGE IN ROW CHUNK 32768 PCTVERSION 10
   NOCACHE LOGGING
   STORAGE (INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645));

-- Create sequence for import bad.
CREATE SEQUENCE import_bad_s;
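Before moving on, you can confirm that the five tables exist with a quick data dictionary query:

SELECT   table_name
FROM     user_tables
WHERE    table_name LIKE 'IMPORT\_%' ESCAPE '\'
ORDER BY table_name;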
The tables set the targets for uploading the source and log files. You should note that each table's name is also the name of its CLOB column; this becomes convenient when supporting a Native Dynamic SQL (NDS) statement in a single autonomous function. The LOAD_CLOB_FROM_FILE function supports reading the external source and log files and writing them to their respective tables.
There is a deadlock possibility with this type of architecture. It requires that the base row in the IMPORT_MASTER table is committed before you attempt inserts into one of the dependent tables. A call to the function raises an error when the primary key row hasn't been committed beforehand.
You already set the access privileges that the DBMS_LOB package needs when you granted permissions on the UPLOAD_FILES and UPLOAD_LOGS virtual directories. This function only requires read permission, which both virtual directories have.
CREATE OR REPLACE FUNCTION load_clob_from_file
( pv_src_file_name IN VARCHAR2
, pv_virtual_dir   IN VARCHAR2
, pv_table_name    IN VARCHAR2
, pv_column_name   IN VARCHAR2
, pv_foreign_key   IN NUMBER ) RETURN NUMBER IS

  -- Declare placeholder for sequence generated primary key.
  lv_primary_key NUMBER;

  -- Declare default return value.
  lv_retval NUMBER := 0;

  -- Declare local variables for DBMS_LOB.LOADCLOBFROMFILE procedure.
  des_clob   CLOB;
  src_clob   BFILE := BFILENAME(pv_virtual_dir,pv_src_file_name);
  des_offset NUMBER := 1;
  src_offset NUMBER := 1;
  ctx_lang   NUMBER := dbms_lob.default_lang_ctx;
  warning    NUMBER;

  -- Declare pre-reading size.
  src_clob_size NUMBER;

  -- Declare variables for handling NDS sequence value.
  lv_sequence         VARCHAR2(30);
  lv_sequence_output  NUMBER;
  lv_sequence_tagline VARCHAR2(10) := '_s.nextval';

  -- Define local variable for Native Dynamic SQL (NDS) statement.
  stmt VARCHAR2(2000);

  -- Declare the function as an autonomous transaction.
  PRAGMA AUTONOMOUS_TRANSACTION;

BEGIN

  -- Open file only when found.
  IF dbms_lob.fileexists(src_clob) = 1 AND NOT dbms_lob.isopen(src_clob) = 1 THEN
    src_clob_size := dbms_lob.getlength(src_clob);
    dbms_lob.open(src_clob,dbms_lob.lob_readonly);
  END IF;

  -- Concatenate the sequence name with the tagline.
  lv_sequence := pv_table_name || lv_sequence_tagline;

  -- Assign the sequence through an anonymous block.
  stmt := 'BEGIN '
       || ' :output := '||lv_sequence||';'
       || 'END;';

  -- Run the statement to extract a sequence value through NDS.
  EXECUTE IMMEDIATE stmt USING IN OUT lv_sequence_output;

  -- Create a dynamic statement that works for all source and log files.
  -- ----------------------------------------------------------------------
  -- NOTE: This statement requires that the row holding the primary key
  --       has been committed because otherwise it raises the following
  --       error because it can't verify the integrity of the foreign
  --       key constraint.
  -- ----------------------------------------------------------------------
  --   DECLARE
  --   *
  --   ERROR at line 1:
  --   ORA-00060: deadlock detected while waiting for resource
  --   ORA-06512: at "IMPORT.LOAD_CLOB_FROM_FILE", line 50
  --   ORA-06512: at line 20
  -- ----------------------------------------------------------------------
  stmt := 'INSERT INTO '||pv_table_name||' '||CHR(10)||
          'VALUES '||CHR(10)||
          '('||lv_sequence_output||CHR(10)||
          ','||pv_foreign_key||CHR(10)||
          ', empty_clob())'||CHR(10)||
          'RETURNING '||pv_column_name||' INTO :locator';

  -- Run dynamic statement.
  EXECUTE IMMEDIATE stmt USING OUT des_clob;

  -- Read and write file to CLOB, close source file and commit.
  dbms_lob.loadclobfromfile(
      dest_lob     => des_clob
    , src_bfile    => src_clob
    , amount       => dbms_lob.getlength(src_clob)
    , dest_offset  => des_offset
    , src_offset   => src_offset
    , bfile_csid   => dbms_lob.default_csid
    , lang_context => ctx_lang
    , warning      => warning );

  -- Close open source file.
  dbms_lob.close(src_clob);

  -- Commit write and conditionally acknowledge it.
  IF src_clob_size = dbms_lob.getlength(des_clob) THEN
    COMMIT;
    lv_retval := 1;
  ELSE
    RAISE dbms_lob.operation_failed;
  END IF;

  RETURN lv_retval;

END load_clob_from_file;
/
You can test this procedure against the data source file with the following script file:
-- Insert a sample row in the master table.
INSERT INTO import_master
VALUES (import_master_s.nextval,'ITEM_IMPORT_EXT_TABLE');

-- Commit the row to avoid a deadlock on the uncommitted master record.
COMMIT;

-- Test program for loading CLOB files.
DECLARE
  -- Declare testing variables.
  lv_file_name   VARCHAR2(255) := 'item_import.csv';
  lv_virtual_dir VARCHAR2(255) := 'UPLOAD_FILES';
  lv_table_name  VARCHAR2(30)  := 'IMPORT_DATA';
  lv_column_name VARCHAR2(30)  := 'IMPORT_DATA';
  lv_foreign_key NUMBER;
BEGIN
  -- Assign the current value of the sequence to a local variable.
  lv_foreign_key := import_master_s.currval;

  -- Check if you can read and insert a CLOB column.
  IF load_clob_from_file(lv_file_name
                        ,lv_virtual_dir
                        ,lv_table_name
                        ,lv_column_name
                        ,lv_foreign_key) = 1 THEN
    -- Display a successful subordinate routine.
    dbms_output.put_line('Subordinate routine succeeds.');
  ELSE
    -- Display a failed subordinate routine.
    dbms_output.put_line('Subordinate routine fails.');
  END IF;
END;
/
You can test this procedure against the log file with the following script file:
DECLARE
  -- Declare testing variables.
  lv_file_name   VARCHAR2(255) := 'item_import_ext_table.log';
  lv_virtual_dir VARCHAR2(255) := 'UPLOAD_LOGS';
  lv_table_name  VARCHAR2(30)  := 'IMPORT_LOG';
  lv_column_name VARCHAR2(30)  := 'IMPORT_LOG';
  lv_foreign_key NUMBER;
BEGIN
  -- Assign the current value of the sequence to a local variable.
  lv_foreign_key := import_master_s.currval;
  dbms_output.put_line('Foreign key ['||lv_foreign_key||']');

  -- Check if you can read and insert a CLOB column.
  IF load_clob_from_file(lv_file_name
                        ,lv_virtual_dir
                        ,lv_table_name
                        ,lv_column_name
                        ,lv_foreign_key) = 1 THEN
    -- Display a successful subordinate routine.
    dbms_output.put_line('Subordinate routine succeeds.');
  ELSE
    -- Display a failed subordinate routine.
    dbms_output.put_line('Subordinate routine fails.');
  END IF;
END;
/
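You can verify that both loads landed by checking the stored CLOB lengths, like this:

SELECT   import_data_id
,        dbms_lob.getlength(import_data) AS data_length
FROM     import_data;

SELECT   import_log_id
,        dbms_lob.getlength(import_log) AS log_length
FROM     import_log;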
You now have the ability to read and store the source and log files in CLOB columns. The next step is to write a master function that writes the master row and calls the LOAD_CLOB_FROM_FILE function for the source file and each of the log files. That’s what the CLEANUP_EXTERNAL_FILES function provides.
The CLEANUP_EXTERNAL_FILES function depends on the LOAD_CLOB_FROM_FILE function above, so make sure that compiles first. Also, if you embed calls to the Java deletion library from the next section, the Java logic uses the logical AND operator, which is two ampersands (&&). That requires you to turn off substitution variables in SQL*Plus, which you do by disabling DEFINE, like this:
SET DEFINE OFF
With DEFINE disabled, you can compile the CLEANUP_EXTERNAL_FILES function:
CREATE OR REPLACE FUNCTION cleanup_external_files
( table_in          VARCHAR2
, data_directory_in VARCHAR2
, log_directory_in  VARCHAR2 ) RETURN NUMBER IS

  -- Declare a local Attribute Data Type (ADT).
  TYPE list IS TABLE OF VARCHAR2(3);

  -- Declare a collection.
  lv_extension LIST := list('csv','log','bad','dis');

  -- Define a default return value.
  retval NUMBER := 0;

  -- Declare base target table name.
  lv_target_table VARCHAR2(30) := 'IMPORT';
  lv_foreign_key  NUMBER;

  -- Declare a cursor to find external tables.
  CURSOR check_source (cv_table_name VARCHAR2) IS
    SELECT   xt.file_name
    FROM    (SELECT   uxt.table_name
             ,        ixt.column_value AS file_name
             FROM     user_external_tables uxt CROSS JOIN
                      TABLE(list_files(get_directory_path(uxt.default_directory_name))) ixt) xt
    JOIN     user_external_locations xl
    ON       xt.table_name = xl.table_name
    AND      xt.file_name = xl.location
    AND      xt.table_name = UPPER(cv_table_name);

  -- Declare a cursor to find files and compare for one input file name.
  CURSOR check_logs (cv_file_name VARCHAR2) IS
    SELECT   list.column_value
    FROM     TABLE(list_files(get_directory_path('UPLOAD_LOGS'))) list
    JOIN    (SELECT cv_file_name AS file_name FROM dual) filter
    ON       list.column_value = filter.file_name;

  -- Declare the function as autonomous.
  PRAGMA AUTONOMOUS_TRANSACTION;

BEGIN

  -- Master loop to check for source and log files.
  FOR i IN check_source (table_in) LOOP

    -- Assign next sequence value to local variable.
    lv_foreign_key := import_master_s.nextval;

    -- Write the master record and commit it for the autonomous threads.
    INSERT INTO import_master
    VALUES (lv_foreign_key,'ITEM_IMPORT_EXT_TABLE');
    COMMIT;

    -- Process all file extensions.
    -- The same foreign key applies to every file loaded for this import.
    FOR j IN 1..lv_extension.COUNT LOOP

      -- The source data file is confirmed by the CHECK_SOURCE cursor.
      IF lv_extension(j) = 'csv' THEN

        -- Load the source data file.
        -- ----------------------------------------------------------
        --  The RETVAL holds success or failure; this approach
        --  suppresses an error when the file can't be loaded.
        --  It should only occur when there's no space available
        --  in the target table.
        retval := load_clob_from_file(i.file_name
                                     ,data_directory_in
                                     ,lv_target_table||'_DATA'
                                     ,lv_target_table||'_DATA'
                                     ,lv_foreign_key);

      ELSE

        -- Verify that the log file exists before attempting to load it.
        FOR k IN check_logs (LOWER(table_in)||'.'||lv_extension(j)) LOOP

          -- Load the log, bad, or dis(card) file.
          -- ----------------------------------------------------------
          --  The RETVAL holds success or failure, as mentioned above.
          retval := load_clob_from_file(LOWER(table_in)||'.'||lv_extension(j)
                                       ,log_directory_in
                                       ,lv_target_table||'_'||lv_extension(j)
                                       ,lv_target_table||'_'||lv_extension(j)
                                       ,lv_foreign_key);
        END LOOP;
      END IF;
    END LOOP;
    retval := 1;
  END LOOP;
  RETURN retval;
END;
/
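A call to the function from an anonymous block looks like the following; the directory names match the virtual directories used earlier, and you need SET SERVEROUTPUT ON to see the output:

DECLARE
  lv_result NUMBER;
BEGIN
  -- Load the source file and any log, bad, or discard files.
  lv_result := cleanup_external_files('ITEM_IMPORT_EXT_TABLE'
                                     ,'UPLOAD_FILES'
                                     ,'UPLOAD_LOGS');
  dbms_output.put_line('Cleanup result ['||lv_result||']');
END;
/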
Deleting Files from Virtual Directories
After you’ve read the files through a query and uploaded the source and log files to the database, you need to cleanup the files. This can be done by using another Java library function, provided you granted read, write, and delete privileges to the internal Java permissions file.
The DeleteFile Java library deletes files from the file system. It doesn't move them to the trash bin for possible recovery; it removes them completely.
Now you can build the Java library that lets you delete a file. A quick caveat: this code includes the logical AND operator, which is two ampersands (&&), while SQL*Plus uses an ampersand (&) for substitution variables. You'll need to suppress that behavior when you run this code.
You do that by issuing the following command to disable substitution variables in SQL*Plus:
SET DEFINE OFF
You create the DeleteFile library like this:
CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "DeleteFile" AS
  // Java import statements.
  import java.io.File;
  import java.security.AccessControlException;

  // Class definition.
  public class DeleteFile {
    // Define variable(s).
    private static File file;

    // Define deleteFile() method.
    public static void deleteFile(String fileName) throws AccessControlException {
      // Create a file from the canonical file name.
      file = new File(fileName);

      // Delete the file when it exists.
      if (file.isFile() && file.delete()) {}}}
/
You need a PL/SQL Wrapper to call the library, and here it is:
CREATE OR REPLACE PROCEDURE delete_file (dfile VARCHAR2) IS
LANGUAGE JAVA
NAME 'DeleteFile.deleteFile(java.lang.String)';
/
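Calling the wrapper is straightforward; the canonical file name below assumes the directory path used in the earlier permission grants:

BEGIN
  -- Remove the source file after its contents are stored in the database.
  delete_file('C:\Imports\ImportFiles\item_import.csv');
END;
/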
You can call this separately or embed the call inside the CLEANUP_EXTERNAL_FILES function, which saves re-writing the logic that finds the source and log files.
This has provided you with an external table import framework. You can extend the framework by wrapping the query in an object table function, which gives you the opportunity to clean up the source and log files after the query operation.