Archive for the ‘Unix’ Category
INSERT Statement
INSERT Statement
Learning Outcomes
- Learn how to use positional- and named-notation in INSERT statements.
- Learn how to use the VALUES clause in INSERT statements.
- Learn how to use subqueries in INSERT statements.
The INSERT statement lets you enter data into tables and views in two ways: via an INSERT statement with a VALUES clause and via an INSERT statement with a query. The VALUES clause takes a list of literal values (strings, numbers, and dates represented as strings), expression values (return values from functions), or variable values.
Query values are results from SELECT statements that are subqueries (covered earlier in this appendix). INSERT statements work with scalar, single-row, and multiple-row subqueries. The list of columns in the VALUES clause or SELECT clause of a query (a SELECT list) must map to the positional list of columns that defines the table. That list is found in the data dictionary or catalog. Alternatively to the list of columns from the data catalog, you can provide a named list of those columns. The named list overrides the positional (or default) order from the data catalog and must provide at least all mandatory columns in the table definition. Mandatory columns are those that are not null constrained.
Oracle databases differ from other databases in how they implement the INSERT statement. Oracle doesn’t support multiple-row inserts with a VALUES clause. Oracle does support default and override signatures as qualified in the ANSI SQL standards. Oracle also provides a multiple- table INSERT statement. This section covers how you enter data with an INSERT statement that is based on a VALUES clause or a subquery result statement. It also covers multiple-table INSERT statements.
The INSERT statement has one significant limitation: its default signature. The default signature is the list of columns that defines the table in the data catalog. The list is defined by the position and data type of columns. The CREATE statement defines the initial default signature, and the ALTER statement can change the number, data types, or ordering of columns in the default signature.
The default prototype for an INSERT statement allows for an optional column list that overrides the default list of columns. When you provide the column list you choose to implement named-notation, which is the right way to do it. Relying on the insertion order of the columns is a bad idea. An INSERT statement without a list of column names is a position-notation statement. Position-notation is bad because somebody can alter that order and previously written INSERT statements will break or put data in the wrong columns.
Like methods in OOPLs, an INSERT statement without the optional column list constructs an instance (or row) of the table using the default constructor. The override constructor for a row is defined by any INSERT statement when you provide an optional column list. That’s because it overrides the default constructor.
The generic prototype for an INSERT statement is confusing when it tries to capture both the VALUES clause and the result set from a query. Therefore, I’ve opted to provide two generic prototypes.
Insert by value
The first uses the VALUES clause:
INSERT INTO table_name [( column1, column2, column3, ...)] VALUES ( value1, value2, value3, ...); |
Notice that the prototype for an INSERT statement with the result set from a query doesn’t use the VALUES clause at all. A parsing error occurs when the VALUES clause and query both occur in an INSERT statement.
The second prototype uses a query and excludes the VALUES clause. The subquery may return one to many rows of data. The operative rule is that all columns in the query return the same number of rows of data, because query results should be rectangles—rectangles made up of one to many rows of columns.
Insert by subquery
Here’s the prototype for an INSERT statement that uses a subquery:
INSERT INTO table_name [( column1, column2, column3, ...)] ( SELECT value1, value2, value3, ... FROM table_name WHERE ...); |
A query, or SELECT statement, returns a SELECT list. The SELECT list is the list of columns, and it’s evaluated by position and data type. The SELECT list must match the definition of the table or the override signature provided.
Default signatures present a risk of data corruption through insertion anomalies, which occur when you enter bad data in tables. Mistakes transposing or misplacing values can occur more frequently with a default signature, because the underlying table structure can change. As a best practice, always use named notation by providing the optional list of values; this should help you avoid putting the right data in the wrong place.
The following subsections provide examples that use the default and override syntax for INSERT statements in Oracle databases. The subsections also cover multiple-table INSERT statements and a RETURNING INTO clause, which is an extension of the ANSI SQL standard. Oracle uses the RETURNING INTO clause to manage large objects, to return autogenerated identity column values, and to support some of the features of Oracle’s dynamic SQL. Note that Oracle also supports a bulk INSERT statement, which requires knowledge of PL/SQL.
Insert by Values →
An INSERT statement with a VALUES clause can only insert one row at a time in and Oracle database. Other databases, like Microsoft SQL Server and MySQL allow you to insert a comma delimited set of values inside the VALUES clause. Oracle adheres to the ANSI standard that support single row inserts with a VALUES clause and multiple row inserts with a subquery.
Inserting by the VALUES clause is the most common type of INSERT statement. It’s most useful when interacting with single-row inserts.
You typically use this type of INSERT statement when working with data entered through end-user web forms. In some cases, users can enter more than one row of data using a form, which occurs, for example, when a user places a meal order in a restaurant and the meal and drink are treated as order items. The restaurant order entry system would enter a single-row in the order table and two rows in the order_item table (one for the meal and the other for the drink). PL/SQL programmers usually handle the insertion of related rows typically inside a loop structure when they use dynamic INSERT statements. Dynamic inserts are typically performed using NDS (Native Dynamic SQL) statements.
Oracle supports only a single-row insert through the VALUES clause. Multiple-row inserts require an INSERT statement from a query.
The VALUES clause of an INSERT statement accepts scalar values, such as strings, numbers, and dates. It also accepts calls to arrays, lists, or user-defined object types, which are called flattened objects. Oracle supports VARRAY as arrays and nested tables as lists. They can both contain elements of a scalar data type or user-defined object type.
The following sections discuss how you use the VALUES clause with scalar data types, how you convert various data types, and how you use the VALUES clause with nested tables and user-defined object data types.
Inserting Scalar Data Types
Instruction Details →
This section shows you how to INSERT scalar values into tables.
The basic syntax for an INSERT statement with a VALUES clause can include an optional override signature between the table name and VALUES keyword. With an override signature, you designate the column names and the order of entry for the VALUES clause elements. Without an override signature, the INSERT signature checks the definition of the table in the database catalog. The positional order of the column in the data catalog defines the positional, or default, signature for the INSERT statement. As shown previously, you can discover the structure of a table in Oracle with the DESCRIBE command issued at the SQL*Plus command line:
DESCRIBE table_name |
You’ll see the following after describing the rental table in SQL*Plus:
Name Null? Type ------------------------------------ -------- -------- RENTAL_ID NOT NULL NUMBER CUSTOMER_ID NOT NULL NUMBER CHECK_OUT_DATE NOT NULL DATE RETURN_DATE DATE CREATED_BY NOT NULL NUMBER CREATION_DATE NOT NULL DATE LAST_UPDATED_BY NOT NULL NUMBER LAST_UPDATE_DATE NOT NULL DATE |
The rental_id column is a surrogate key, or an artificial numbering sequence. The combination of the customer_id and check_out_date columns serves as a natural key because a DATE data type is a date-time value. If it were only a date, the customer would be limited to a single entry for each day, and limiting customer rentals to one per day isn’t a good business model.
The basic INSERT statement would require that you look up the next sequence value before using it. You should also look up the surrogate key column value that maps to the row where your unique customer is stored in the contact table. For this example, assume the following facts:
- Next sequence value is 1086
- Customer’s surrogate key value is 1009
- Current date-time is represented by the value from the SYSDATE function
- Return date is the fifth date from today
- User adding and updating the row has a primary (surrogate) key value of 1
- Creation and last update date are the value returned from the SYSDATE function.
An INSERT statement must include a list of values that match the positional data types of the database catalog, or it must use an override signature for all mandatory columns.
You can now write the following INSERT statement, which relies on the default signature:
Name Null? Type ------------------------------------ -------- -------- RENTAL_ID NOT NULL NUMBER CUSTOMER_ID NOT NULL NUMBER CHECK_OUT_DATE NOT NULL DATE RETURN_DATE DATE CREATED_BY NOT NULL NUMBER CREATION_DATE NOT NULL DATE LAST_UPDATED_BY NOT NULL NUMBER LAST_UPDATE_DATE NOT NULL DATE |
The rental_id column is a surrogate key, or an artificial numbering sequence. The combination of the customer_id and check_out_date columns serves as a natural key because a DATE data type is a date-time value. If it were only a date, the customer would be limited to a single entry for each day, and limiting customer rentals to one per day isn’t a good business model.
The basic INSERT statement would require that you look up the next sequence value before using it. You should also look up the surrogate key column value that maps to the row where your unique customer is stored in the contact table. For this example, assume the following facts:
- Next sequence value is 1086
- Customer’s surrogate key value is 1009
- Current date-time is represented by the value from the SYSDATE function
- Return date is the fifth date from today
- User adding and updating the row has a primary (surrogate) key value of 1
- Creation and last update date are the value returned from the SYSDATE function.
An INSERT statement must include a list of values that match the positional data types of the database catalog, or it must use an override signature for all mandatory columns.
You can now write the following INSERT statement, which relies on the default signature:
SQL> INSERT INTO rental 2 VALUES 3 ( 1086 4 , 1009 5 , SYSDATE 6 , TRUNC(SYSDATE + 5) 7 ,1 8 , SYSDATE 9 , 1 10 , SYSDATE); |
If you weren’t using SYSDATE for the date-time value on line 5, you could manually enter a date-time with the following Oracle proprietary syntax:
5 , TO_DATE('15-APR-2011 12:53:01','DD-MON-YYYY HH24:MI:SS') |
The TO_DATE function is an Oracle-specific function. The generic conversion function would be the CAST function. The problem with a CAST function by itself is that it can’t handle a format mask other than the database defaults (‘DD-MON-RR‘ or ‘DD-MON-YYYY‘). For example, consider this syntax:
5 , CAST('15-APR-2011 12:53:02' AS DATE) |
It raises the following error:
5 , CAST('15-APR-2011 12:53:02' AS DATE) FROM dual * ERROR AT line 1: ORA-01830: DATE format picture ends before converting entire input string |
You actually need to double cast this type of format mask when you want to store it as a DATE data type. The working syntax casts the date-time string as a TIMESTAMP data type before recasting the TIMESTAMP to a DATE, like
5 , CAST(CAST('15-APR-2011 12:53:02' AS TIMESTAMP) AS DATE) |
Before you could write the preceding INSERT statement, you would need to run some queries to find the values. You would secure the next value from a rental_s1 sequence in an Oracle database with the following command:
SQL> SELECT rental_s1.NEXTVAL FROM dual; |
This assumes two things, because sequences are separate objects from tables. First, code from which the values in a table’s surrogate key column come must appear in the correct sequence. Second, a sequence value is inserted only once into a table as a primary key value.
In place of a query that finds the next sequence value, you would simply use a call against the .nextval pseudocolumn in the VALUES clause. You would replace line 3 with this:
3 ( rental_s1.NEXTVAL |
The .nextval is a pseudocolumn, and it instantiates an instance of a sequence in the current session. After a call to a sequence with the .nextval pseudocolumn, you can also call back the prior sequence value with the .currval pseudocolumn.
Assuming the following query would return a single-row, you can use the contact_id value as the customer_id value in the rental table:
SQL> SELECT contact_id 2 FROM contact 3 WHERE last_name = 'Potter' 4 AND first_name = 'Harry'; |
Taking three steps like this is unnecessary, however, because you can call the next sequence value and find the valid customer_id value inside the VALUES clause of the INSERT statement. The following INSERT statement uses an override signature and calls for the next sequence value on line 11. It also uses a scalar subquery to look up the correct customer_id value with a scalar subquery on lines 12 through 15.
SQL> INSERT INTO rental 2 ( rental_id 3 , customer_id 4 , check_out_date 5 , return_date 6 , created_by 7 , creation_date 8 , last_updated_by 9 , last_update_date ) 10 VALUES 11 ( rental_s1.NEXTVAL 12 ,(SELECT contact_id 13 FROM contact 14 WHERE last_name = 'Potter' 15 AND first_name = 'Harry') 16 , SYSDATE 17 , TRUNC(SYSDATE + 5) 18 , 1 19 , SYSDATE 20 , 3 21 , SYSDATE); |
When a subquery returns two or more rows because the conditions in the WHERE clause failed to find and return a unique row, the INSERT statement would fail with the following message:
,(SELECT contact_id * ERROR AT line 3: ORA-01427: single-ROW subquery returns more than one ROW |
In fact, the statement could fail when there are two or more “Harry Potter” names in the data set because three columns make up the natural key of the contact table. The third column is the member_id, and all three should be qualified inside a scalar subquery to guarantee that it returns only one row of data.
Handling Oracle’s Large Objects
Instruction Details →
This section shows you how to INSERT large object values into tables.
Oracle’s large objects present a small problem when they’re not null constrained in the table definition. You must insert empty object containers or references when you perform an INSERT statement.
Assume, for example, that you have the following three large object columns in a table:
Name Null? Type ------------------------------- -------- ----------------------- ITEM_DESC NOT NULL CLOB ITEM_ICON NOT NULL BLOB ITEM_PHOTO BINARY FILE LOB |
The item_desc column uses a CLOB (Character Large Object) data type, and it is a required column; it could hold a lengthy description of a movie, for example. The item_icon column uses a BLOB (Binary Large Object) data type, and it is also a required column. It could hold a graphic image. The item_photo column uses a binary file (an externally managed file) but is fortunately null allowed or an optional column in any INSERT statement. It can hold a null or a reference to an external graphic image.
Oracle provides two functions that let you enter an empty large object, and they are:
EMPTY_BLOB() EMPTY_CLOB() |
Although you could insert a null value in the item_photo column, you can also enter a reference to an Oracle database virtual directory file. Here’s the syntax to enter a valid BFILE name with the BFILENAME function call:
10 , BFILENAME('VIRTUAL_DIRECTORY_NAME', 'file_name.png') |
You can insert a large character or binary stream into BLOB and CLOB data types by using the stored procedures and functions available in the dbms_lob package. Chapter 13 covers the dbms_lob package.
You can use an empty_clob function or a string literal up to 32,767 bytes long in a VALUES clause. You must use the dbms_lob package when you insert a string that is longer than 32,767 bytes. That also changes the nature of the INSERT statement and requires that you append the RETURNING INTO clause. Here’s the prototype for this Oracle proprietary syntax:
INSERT INTO some_table [( column1, column2, column3, ...)] VALUES ( value1, value2, value3, ...) RETURNING column1 INTO local_variable; |
The local_variable is a reference to a procedural programming language. It lets you insert a character stream into a target CLOB column or insert a binary stream into a BLOB column.
Capturing the Last Sequence Value
Instruction Details →
This section shows you how to INSERT a new sequence in a parent table and a copy of that new sequence as a foreign key value in a child table.
Sometimes you insert into a series of tables in the scope of a transaction. In this scenario, one table gets the new sequence value (with a call to sequence_name.nextval) and enters it as the surrogate primary key, and another table needs a copy of that primary key to enter into a foreign key column. While scalar subqueries can solve this problem, Oracle provides the .currval pseudocolumn for this purpose.
The steps to demonstrate this behavior require a parent table and a child table. The parent table is defined as follows:
Name Null? Type ------------------------------------ -------- -------------- PARENT_ID NOT NULL NUMBER PARENT_NAME VARCHAR2(10) |
The parent_id column is the primary key for the parent table. You include the parent_id column in the child table. In the child table, the parent_id column holds a copy of a valid primary key column value as a foreign key to the parent table.
Name Null? Type ------------------------------------ -------- -------------- CHILD_ID NOT NULL NUMBER PARENT_ID NUMBER PARENT_NAME VARCHAR2(10) |
After creating the two tables, you can manage inserts into them with the .nextval and .currval pseudocolumns. The sequence calls with the .nextval pseudocolumn insert primary key values, and the sequence calls with the .currval pseudocolumn insert foreign key values.
You would perform these two INSERT statements as a group:
SQL> INSERT INTO parent 2 VALUES 3 ( parent_s1.NEXTVAL 4 ,'One Parent'); SQL> INSERT INTO child 2 VALUES 3 ( child_s1.NEXTVAL 4 , parent_s1.CURRVAL 5 ,'One Child'); |
The .currval pseudocolumn for any sequence fetches the value placed in memory by call to the .nextval pseudocolumn. Any attempt to call the .currval pseudocolumn before the .nextval pseudocolumn raises an ORA-02289 exception. The text message for that error says the sequence doesn’t exist, which actually means that it doesn’t exist in the scope of the current session. Line 4 in the insert into the child table depends on line 3 in the insert into the parent table.
You can use comments in INSERT statements to map to columns in the table. For example, the following shows the technique for the child table from the preceding example:
SQL> INSERT INTO child 2 VALUES 3 ( child_s1.NEXTVAL -- CHILD_ID 4 , parent_s1.CURRVAL -- PARENT_ID 5 ,'One Child') -- CHILD_NAME 6 / |
Comments on the lines of the VALUES clause identify the columns where the values are inserted. A semicolon doesn’t execute this statement, because a trailing comment would trigger a runtime exception. You must use the semicolon or forward slash on the line below the last VALUES element to include the last comment.
Insert by Subquery Results →
An INSERT statement with a subquery can insert one to many rows of data into any table provided the SELECT-list of the subquery matches the data dictionary definition of the table or the named-notation list provided by the INSERT statement. An INSERT statement with a subquery cannot have a VALUES keyword in it, or it raises an error.
The generic prototype for an INSERT statement follows the pattern of an INSERT statement by value prototype with one exception, it excludes the VALUES keyword and replaces the common delimited list of values with a SELECT-list from a subquery. If you want to rely on the positional definition of the table, exclude the list of comma delimited column values. The optional comma-delimited list of column values is necessary when you want to insert columns in a different order or exclude optional columns.
The generic prototype is:
INSERT INTO table_name [( column1, column2, column3, ...)] ( SELECT value1, value2, value3, ... FROM table_name WHERE ...); |
The subquery, or SELECT statement, must return a SELECT-list that maps to the column definition in the data dictionary or the optional comma-delimited column list.
MySQL 5-Table Procedure
A student wanted a better example of writing a MySQL Persistent Stored Module (PSM) that maintains transactional scope across a couple tables. Here’s the one I wrote about ten years ago to build the MySQL Video Store model. It looks I neglected to put it out there before, so here it is for reference.
-- Conditionally drop procedure if it exists. DROP PROCEDURE IF EXISTS contact_insert; -- Reset the delimiter so that a semicolon can be used as a statement and block terminator. DELIMITER $$ SELECT 'CREATE PROCEDURE contact_insert' AS "Statement"; CREATE PROCEDURE contact_insert ( pv_member_type CHAR(12) , pv_account_number CHAR(19) , pv_credit_card_number CHAR(19) , pv_credit_card_type CHAR(12) , pv_first_name CHAR(20) , pv_middle_name CHAR(20) , pv_last_name CHAR(20) , pv_contact_type CHAR(12) , pv_address_type CHAR(12) , pv_city CHAR(30) , pv_state_province CHAR(30) , pv_postal_code CHAR(20) , pv_street_address CHAR(30) , pv_telephone_type CHAR(12) , pv_country_code CHAR(3) , pv_area_code CHAR(6) , pv_telephone_number CHAR(10)) MODIFIES SQL DATA BEGIN /* Declare variables to manipulate auto generated sequence values. */ DECLARE member_id int unsigned; DECLARE contact_id int unsigned; DECLARE address_id int unsigned; DECLARE street_address_id int unsigned; DECLARE telephone_id int unsigned; /* Declare local constants for who-audit columns. */ DECLARE lv_created_by int unsigned DEFAULT 1001; DECLARE lv_creation_date DATE DEFAULT UTC_DATE(); DECLARE lv_last_updated_by int unsigned DEFAULT 1001; DECLARE lv_last_update_date DATE DEFAULT UTC_DATE(); /* Declare a locally scoped variable. */ DECLARE duplicate_key INT DEFAULT 0; /* Declare a duplicate key handler */ DECLARE CONTINUE HANDLER FOR 1062 SET duplicate_key = 1; /* Start the transaction context. */ START TRANSACTION; /* Create a SAVEPOINT as a recovery point. */ SAVEPOINT all_or_none; /* Insert into the first table in sequence based on inheritance of primary keys by foreign keys. */ INSERT INTO member ( member_type , account_number , credit_card_number , credit_card_type , created_by , creation_date , last_updated_by , last_update_date ) VALUES ((SELECT common_lookup_id FROM common_lookup WHERE common_lookup_context = 'MEMBER' AND common_lookup_type = pv_member_type) , pv_account_number , pv_credit_card_number ,(SELECT common_lookup_id FROM common_lookup WHERE common_lookup_context = 'MEMBER' AND common_lookup_type = pv_credit_card_type) , lv_created_by , lv_creation_date , lv_last_updated_by , lv_last_update_date ); /* Preserve the sequence by a table related variable name. */ SET member_id = last_insert_id(); /* Insert into the first table in sequence based on inheritance of primary keys by foreign keys. */ INSERT INTO contact VALUES ( null , member_id ,(SELECT common_lookup_id FROM common_lookup WHERE common_lookup_context = 'CONTACT' AND common_lookup_type = pv_contact_type) , pv_first_name , pv_middle_name , pv_last_name , lv_created_by , lv_creation_date , lv_last_updated_by , lv_last_update_date ); /* Preserve the sequence by a table related variable name. */ SET contact_id = last_insert_id(); /* Insert into the first table in sequence based on inheritance of primary keys by foreign keys. */ INSERT INTO address VALUES ( null , last_insert_id() ,(SELECT common_lookup_id FROM common_lookup WHERE common_lookup_context = 'MULTIPLE' AND common_lookup_type = pv_address_type) , pv_city , pv_state_province , pv_postal_code , lv_created_by , lv_creation_date , lv_last_updated_by , lv_last_update_date ); /* Preserve the sequence by a table related variable name. */ SET address_id = last_insert_id(); /* Insert into the first table in sequence based on inheritance of primary keys by foreign keys. */ INSERT INTO street_address VALUES ( null , last_insert_id() , pv_street_address , lv_created_by , lv_creation_date , lv_last_updated_by , lv_last_update_date ); /* Insert into the first table in sequence based on inheritance of primary keys by foreign keys. */ INSERT INTO telephone VALUES ( null , contact_id , address_id ,(SELECT common_lookup_id FROM common_lookup WHERE common_lookup_context = 'MULTIPLE' AND common_lookup_type = pv_telephone_type) , pv_country_code , pv_area_code , pv_telephone_number , lv_created_by , lv_creation_date , lv_last_updated_by , lv_last_update_date); /* This acts as an exception handling block. */ IF duplicate_key = 1 THEN /* This undoes all DML statements to this point in the procedure. */ ROLLBACK TO SAVEPOINT all_or_none; END IF; /* This commits the write when successful and is harmless otherwise. */ COMMIT; END; $$ -- Reset the standard delimiter to let the semicolon work as an execution command. DELIMITER ; |
You can then call the procedure, like:
SELECT 'CALL contact_insert() PROCEDURE 5 times' AS "Statement"; CALL contact_insert('INDIVIDUAL','R11-514-34','1111-1111-1111-1111','VISA_CARD','Goeffrey','Ward','Clinton','CUSTOMER','HOME','Provo','Utah','84606','118 South 9th East','HOME','011','801','423\-1234'); CALL contact_insert('INDIVIDUAL','R11-514-35','1111-2222-1111-1111','VISA_CARD','Wendy',null,'Moss','CUSTOMER','HOME','Provo','Utah','84606','1218 South 10th East','HOME','011','801','423-1234'); CALL contact_insert('INDIVIDUAL','R11-514-36','1111-1111-2222-1111','VISA_CARD','Simon','Jonah','Gretelz','CUSTOMER','HOME','Provo','Utah','84606','2118 South 7th East','HOME','011','801','423-1234'); CALL contact_insert('INDIVIDUAL','R11-514-37','1111-1111-1111-2222','MASTER_CARD','Elizabeth','Jane','Royal','CUSTOMER','HOME','Provo','Utah','84606','2228 South 14th East','HOME','011','801','423-1234'); CALL contact_insert('INDIVIDUAL','R11-514-38','1111-1111-3333-1111','VISA_CARD','Brian','Nathan','Smith','CUSTOMER','HOME','Spanish Fork','Utah','84606','333 North 2nd East','HOME','011','801','423-1234'); |
I hope this code complete approach helps those looking to learn how to write MySQL PSMs.
PostgreSQL Arrays
If you’re wondering about this post, it shows the basic array of a set of integers and strings before showing you how to create nested tables of data in PostgreSQL. By the way, they’re not called nested tables in PostgreSQL, like they are in Oracle but perform like their Oracle cousins.
Let’s create a table with an auto-incrementing column and two arrays, one array of integers and another of strings:
-- Conditionally drop the demo table. DROP TABLE IF EXISTS demo; -- Create the test table. CREATE TABLE demo ( demo_id serial , demo_number integer[5] , demo_string varchar(5)[7]); |
You can insert test values like this:
INSERT INTO demo (demo_number, demo_string) VALUES ( array[1,2,3,4,5] , array['One','Two','Three','Four','Five','Six','Seven']); |
Then, you can query them with this unnest function, like:
SELECT unnest(demo_number) AS numbers , unnest(demo_string) AS strings FROM demo; |
It returns:
numbers | strings ---------+--------- 1 | One 2 | Two 3 | Three 4 | Four 5 | Five | Six | Seven (7 rows) |
You may note that the two arrays are asymmetrical. It only becomes an issue when you navigate the result in a PL/pgSQL cursor or imperative programming language, like Python.
Now, let’s do something more interesting like work with a composite user-defined type, like the player structure. You would create the composite user-defined type with this syntax:
-- Conditionally drop the player type. DROP TYPE IF EXISTS player; -- Create the player type. CREATE TYPE player AS ( player_no integer , player_name varchar(24) , player_position varchar(14) , ab integer , r integer , h integer , bb integer , rbi integer ); |
You can create a world_series table that include a players column that uses an array of player type, like
-- Conditionally drop the world_series table. DROP TABLE IF EXISTS world_series; -- Create the player type. CREATE TABLE world_series ( world_series_id serial , team varchar(24) , players player[30] , game_no integer , year integer ); |
If you’re familiar with the Oracle Database, you’d have to specify a tested table in the syntax. Fortunately, PostgreSQL doesn’t require that.
Insert two rows with the following statement:
INSERT INTO world_series ( team , players , game_no , year ) VALUES ('San Francisco Giants' , array[(24,'Willie Mayes','Center Fielder',5,0,1,0,0)::player ,(5,'Tom Haller','Catcher',4,1,2,0,2)::player] , 4 , 1962 ); |
You can append to the array with the following syntax. A former student and I have a disagreement on whether this is shown in the PostgreSQL 8.15.4 Modifying Array documentation. I believe it can be inferred from the document and he doesn’t believe so. Anyway, here’s how you add an element to an existing array in a table with the UPDATE statement:
UPDATE world_series SET players = (SELECT array_append(players,(7,'Henry Kuenn','Right Fielder',3,0,0,1,0)::player) FROM world_series) WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4; |
Like Oracle’s nested tables, PostgreSQL’s arrays of composite user-defined types requires writing a PL/pgSQL function. I’ll try to add one of those shortly in another blog entry to show you how to edit and replace entries in stored arrays of composite user-defined types.
You can query the unnested rows and get a return set like a Python tuple with the following query:
SELECT unnest(players) AS player_list FROM world_series WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4; |
It returns the three rows from the players array:
player_list ---------------------------------------------- (24,"Willie Mayes","Center Field",5,0,1,0,0) (5,"Tom Haller",Catcher,4,1,2,0,2) (7,"Henry Kuenn","Right Fielde",3,0,0,1,0) (3 rows) |
It returns the data set in entry-order. If we step outside of the standard 8.15 Arrays PostgreSQL Documentation, you can do much more with arrays (or nested tables). The balance of this example demonstrates some new syntax that helps you achieve constructive outcomes in PostgreSQL.
You can use a Common Table Expression (CTE) to get the columnar display of the player composite user-defined type. This type of solution is beyond the standard , like:
WITH list AS (SELECT unnest(players) AS row_result FROM world_series WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4) SELECT (row_result).player_name , (row_result).player_no , (row_result).player_position FROM list; |
If you’re unfamiliar with accessing composite user-defined types, I wrote a post on that 7 years ago. You can find the older blog entry PostgreSQL Composites on my blog.
It returns only the three requested columns of the player composite user-defined type:
player_name | player_no | player_position --------------+-----------+----------------- Willie Mayes | 24 | Center Fielder Tom Haller | 5 | Catcher Henry Kuenn | 7 | Right Fielder (3 rows) |
You should note that the data is presented in an entry-ordered manner when unnested alone in the SELECT-list. That behavior changes when the SELECT-list includes non-array data.
The easiest way to display data from the non-array and array columns is to list them inside the SELECT-list of the CTE, like:
WITH list AS (SELECT game_no AS game , year , unnest(players) AS row_result FROM world_series WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4) SELECT game , year , (row_result).player_name , (row_result).player_no , (row_result).player_position FROM list; |
It returns an ordered set of unnested rows when you include non-array columns, like:
game | year | player_name | player_no | player_position ------+------+--------------+-----------+----------------- 4 | 1962 | Henry Kuenn | 7 | Right Fielder 4 | 1962 | Tom Haller | 5 | Catcher 4 | 1962 | Willie Mayes | 24 | Center Fielder (3 rows) |
While you can join the world_series table to the unnested array rows (returned as a derived table, its a bad idea. The mechanics to do it require you to return the primary key column in the same SELECT-list of the CTE. Then, you join the CTE list to the world_series table by using the world_series_id primary key.
However, there is no advantage to an inner join approach and it imposes unnecessary processing on the database server. The odd rationale that I heard when I noticed somebody was using a CTE to base-table join was: “That’s necessary so they could use column aliases for the non-array columns.” That’s not true because you can use the aliases inside the CTE, as shown above when game is an alias to the game_no column.
As always, I hope this helps those looking to solve a problem in PostgreSQL.
Setting SQL_MODE
In MySQL, the @@sql_mode parameter should generally use ONLY_FULL_GROUP_BY. If it doesn’t include it and you don’t have the ability to change the database parameters, you can use a MySQL PSM (Persistent Stored Module), like:
Create the set_full_group_by procedure:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | -- Drop procedure conditionally on whether it exists already. DROP PROCEDURE IF EXISTS set_full_group_by; -- Reset delimter to allow semicolons to terminate statements. DELIMITER $$ -- Create a procedure to verify and set connection parameter. CREATE PROCEDURE set_full_group_by() LANGUAGE SQL NOT DETERMINISTIC SQL SECURITY DEFINER COMMENT 'Set connection parameter when not set.' BEGIN /* Check whether full group by is set in the connection and if unset, set it in the scope of the connection. */ IF NOT EXISTS (SELECT NULL WHERE REGEXP_LIKE(@@SQL_MODE,'ONLY_FULL_GROUP_BY')) THEN SET SQL_MODE=(SELECT CONCAT(@@sql_mode,',ONLY_FULL_GROUP_BY')); END IF; END; $$ -- Reset the default delimiter. DELIMITER ; |
Run the following SQL command before you attempt the exercises in the same session scope:
CALL set_full_group_by(); |
As always, I hope this helps those looking for a solution. Naturally, you can simply use the SET command on line #21 above.
Drop Overloaded Routine
In October 2019, I wrote a post with anonymous block programs to drop tables, sequences, routines, and triggers. Two weeks later, I wrote another post to drop all overloaded routines. However, I recognized the other day that I should have written a function that let you target which function or procedure you want to drop.
The older code only let you drop all of your functions or procedures. That was overkill when you’re working on new functions or procedures.
This post provides a utility for those writing functions and procedures in a public schema of any database in a PostgreSQL installation. It is designed to drop functions or procedures from the public schema.
The code follows below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 | CREATE OR REPLACE FUNCTION drop_routine( IN pv_routine_name VARCHAR(64) , IN pv_routine_type VARCHAR(64)) RETURNS INTEGER AS $$ DECLARE /* Declare the current catalog. */ lv_local_catalog VARCHAR(64) := current_database(); /* Declare return type variable. */ lv_retval INTEGER := 1; /* Manage data dictionary case mechanics: ====================================== routine_name is always in lowercase. routine_type is always in uppercase. */ lv_routine_name VARCHAR(64) := LOWER(pv_routine_name); lv_routine_type VARCHAR(64) := UPPER(pv_routine_type); /* Declare an indefinite length string for SQL statement. */ sql VARCHAR; /* Declare variables to manage cursor return values. */ row RECORD; arg VARCHAR; /* Declare parameter list. */ list VARCHAR; /* Declare a routine cursor. */ routine_cursor CURSOR( cv_routine_name VARCHAR , cv_routine_type VARCHAR ) FOR SELECT r.routine_name , r.specific_name , r.routine_type FROM information_schema.routines r WHERE r.specific_catalog = current_database() AND r.routine_schema = 'public' AND r.routine_type = cv_routine_type AND r.routine_name = cv_routine_name; /* Declare a parameter cursor. */ parameter_cursor CURSOR( cv_specific_name VARCHAR ) FOR SELECT args.data_type FROM information_schema.parameters args WHERE args.specific_catalog = current_database() AND args.specific_schema = 'public' AND args.specific_name = cv_specific_name; BEGIN /* Open the cursor. */ OPEN routine_cursor(lv_routine_name, lv_routine_type); <<row_loop>> LOOP /* Fetch table names. */ FETCH routine_cursor INTO row; /* Exit when no more records are found. */ EXIT row_loop WHEN NOT FOUND; /* Initialize parameter list. */ list := '('; /* Open the parameter cursor. */ OPEN parameter_cursor(row.specific_name::varchar); <<parameter_loop>> LOOP FETCH parameter_cursor INTO arg; /* Exit the parameter loop. */ EXIT parameter_loop WHEN NOT FOUND; /* Add parameter and delimit more than one parameter with a comma. */ IF LENGTH(list) > 1 THEN list := CONCAT(list,',',arg); ELSE list := CONCAT(list,arg); END IF; END LOOP; /* Close the parameter list. */ list := CONCAT(list,')'); /* Close the parameter cursor. */ CLOSE parameter_cursor; /* Concatenate together a DDL to drop the table with prejudice. */ sql := 'DROP '||row.routine_type||' IF EXISTS '||row.routine_name||list; /* Execute the DDL statement. */ EXECUTE sql; /* Assign success flag of 0. */ lv_retval := 0; END LOOP; /* Close the routine_cursor. */ CLOSE routine_cursor; /* Return the output text variable. */ RETURN lv_retval; END $$ LANGUAGE plpgsql; |
If you now create a series of hello overloaded functions, like:
CREATE OR REPLACE FUNCTION hello() RETURNS text AS $$ DECLARE output VARCHAR; BEGIN SELECT 'Hello World!' INTO output; RETURN output; END $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION hello(whom text) RETURNS text AS $$ DECLARE output VARCHAR; BEGIN SELECT CONCAT('Hello ',whom,'!') INTO output; RETURN output; END $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION hello(id int, whom text) RETURNS text AS $$ DECLARE output VARCHAR; BEGIN SELECT CONCAT('[',id,'] Hello ',whom,'!') INTO output; RETURN output; END $$ LANGUAGE plpgsql; |
After you create the overloaded functions, you can query their status from the information_schema.routines table in the data dictionary:
SELECT routine_name , specific_name , routine_type FROM information_schema.routines WHERE specific_catalog = current_setting('videodb.catalog_name') AND routine_schema = 'public' AND routine_name = 'hello'; |
Which shows you the three versions of the hello function:
routine_name | specific_name | routine_type --------------+---------------+-------------- hello | hello_18100 | FUNCTION hello | hello_18101 | FUNCTION hello | hello_18102 | FUNCTION (3 rows) |
You can drop all versions of the hello functions by calling the drop_routine function:
SELECT CASE WHEN drop_routine('hello','function') = 0 THEN 'Success' ELSE 'Failure' END AS drop_routine; |
It returns the following:
drop_routine -------------- Success (1 row) |
As always, I hope this helps those looking for how to routinely test new functions and procedures.
PostgreSQL CLI Error
Problems get reported to me all the time on installations for my students, this one was interesting. They got an error complaining about a missing libpq.so.5 library.
psql: /usr/pgsql-11/lib/libpq.so.5: no version information available (required by psql) psql: /usr/pgsql-11/lib/libpq.so.5: no version information available (required by psql) could not change directory to "/root": Permission denied psql (11.7, server 11.8) Type "help" for help. postgres=# |
It appeared as a mismatch of libraries but it’s not that. For reference, this was a Fedora instance. I ran the rpm utility:
rpm -qa | grep postgres |
It returned:
postgresql11-libs-11.8-1PGDG.f30.x86_64 postgresql-11.7-2.fc30.x86_64 postgresql-server-11.7-2.fc30.x86_64 |
Then, I had them run the rpm utility again looking for the Python driver for PostgreSQL:
rpm -qa | grep psycopg2 |
It returned:
python3-psycopg2-2.7.7-1.fc30.x86_64 |
Then, it was easy to explain. The Python psycopg2 library uses both PostgreSQL 11.7 and 11.8 dependent libraries and the libpq.so.5 library is missing version information. You must ignore the error, which is really only a warning message, when you want to work on Fedora, PostgreSQL 11, and Python 3.
MySQL WITH Clause
When I went over my example of using the WITH clause to solve how to use a series of literal values in data sets, some students got it right away and some didn’t. The original post showed how to solve a problem where one value in the data set is returned in the SELECT-list and two values are used as the minimum and maximum values with a BETWEEN operator. It used three approaches with literal values:
- A list of Python dictionaries that require you to filter the return set from the database through a range loop and if statement that mimics a SQL BETWEEN operator.
- A WITH clause that accepts the literals as bind variables to filter the query results inside the query.
- A table design that holds the literals values that an analyst might use for reporting.
It was the last example that required elaboration. I explained you might build a web form that uses a table, and the table could allow a data analyst to enter parameter sets. That way the analyst could submit a flag value to use one or another set of values. I threw out the idea on the whiteboard of introducing a report column to the prior post’s level table. The student went off to try it.
Two problems occurred. The first was in the design of the new table and the second was how to properly use the MySQL Python driver.
Below is a formal table design that supports this extension of the first blog post as a list of parameter values. It uses a report column as a super key to return a set of possible values. One value will show in the SELECT-list and the other two values deploy as the minimum and maximum values in a BETWEEN operator. It is seeded with two sets of values. One of the report possibilities is Summary level with three possibilities and the other is the Detail level with five possibilities.
-- Conditionally drop the levels table. DROP TABLE IF EXISTS levels; -- Create the levels list. CREATE TABLE levels ( level VARCHAR(16) , report ENUM('Summary','Detail') , min_roles INT , max_roles INT ); -- Insert values into the list table. INSERT INTO levels ( level, report, min_roles, max_roles ) VALUES ('Hollywood Star','Summary', 30, 99999) ,('Prolific Actor','Summary', 20, 29) ,('Newcommer','Summary', 1, 19) ,('Hollywood Star','Detail', 30, 99999) ,('Prolific Actor','Detail', 20, 29) ,('Regular Actor','Detail', 10, 19) ,('Actor','Detail', 5, 9) ,('Newcommer','Detail', 1, 4); |
The foregoing table design uses an ENUM type because reporting parameter sets are typically fewer than 64 possibilities. If you use the table to support multiple reports, you should add a second super key column like report_type. The report_type column key would let you use the table to support a series of different report parameter lists.
While the student used a %s inside the query, they created a runtime error when trying to pass the single bind variable into the query. The student misunderstood how to convert a report column input parameter variable into a tuple, which shows up when the student calls the Python MySQL Driver, like this:
59 | cursor.execute(query, (report)) |
The student’s code generated the following error stack:
Traceback (most recent call last): File "./python-with-clause.py", line 59, in <module> cursor.execute(query,(report)) File "/usr/lib/python3.7/site-packages/mysql/connector/cursor_cext.py", line 248, in execute prepared = self._cnx.prepare_for_mysql(params) File "/usr/lib/python3.7/site-packages/mysql/connector/connection_cext.py", line 632, in prepare_for_mysql raise ValueError("Could not process parameters") ValueError: Could not process parameters |
The ValueError should indicate to the developer that they’ve used a wrong data type in the call to the method:
cursor.execute(<class 'str'>,<class 'tuple'>) |
This clearly was a misunderstanding of how to cast a single string to a tuple. A quick explanation of how Python casts a single string into a tuple can best be illustrated inside an interactive Python shell, like:
>>> # Define a variable. >>> x = 'Detail' >>> # An incorrect attempt to make a string a tuple. >>> y = (x) >>> # Check type of y after assignment. >>> print(type(y)) <class 'str'> >>> # A correct attempt to make a string a tuple. >>> y = tuple(x) >>> # Check type of y after assignment. >>> print(type(y)) <class 'tuple'> >>> # An alternative to make a string a tuple. >>> z = (x,) >>> # Check type of z after assignment. >>> print(type(z)) <class 'tuple'> |
So, the fix was quite simple to line 59:
59 | cursor.execute(query, (report,)) |
The student started with a copy of a Python program that I provided. I fixed the argument handling and added some comments. The line 59 reference above maps to this code example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 | # Import the library. import sys import mysql.connector from mysql.connector import errorcode # Capture argument list. fullCmdArguments = sys.argv # Assign argument list to variable. argumentList = fullCmdArguments[1:] # Define a standard report variable. report = "Summary" # Check and process argument list. # ============================================================ # If there are less than two arguments provide default values. # Else enumerate and convert strings to dates. # ============================================================ if (len(argumentList) == 1): # Set a default start date. if (isinstance(report,str)): report = argumentList[0] # Attempt the query. # ============================================================ # Use a try-catch block to manage the connection. # ============================================================ try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='sakila') # Create cursor. cursor = cnx.cursor() # Set the query statement. query = ("WITH actors AS " "(SELECT a.actor_id " " , a.first_name " " , a.last_name " " , COUNT(fa.actor_id) AS num_roles " " FROM actor a INNER JOIN film_actor fa " " ON a.actor_id = fa.actor_id " " GROUP BY a.actor_id " " , a.first_name " " , a.last_name ) " " SELECT a.first_name " " , a.last_name " " , l.level " " , a.num_roles " " FROM actors a CROSS JOIN levels l " " WHERE a.num_roles BETWEEN l.min_roles AND l.max_roles " " AND l.report = %s " " ORDER BY a.last_name " " , a.first_name") # Execute cursor. cursor.execute(query,(report,)) # Display the rows returned by the query. for (first_name, last_name, level, num_roles) in cursor: print('{0} {1} is a {2} with {3} films.'.format( first_name.title() , last_name.title() , level.title() , num_roles)) # Close cursor. cursor.close() # ------------------------------------------------------------ # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print("Error code:", e.errno) # error number print("SQLSTATE value:", e.sqlstate) # SQLSTATE value print("Error message:", e.msg) # error message # Close the connection when the try block completes. else: cnx.close() |
A Linux shell program like the following (provided the name of the shell script and Python program are the same) can run the Python program with or without a parameter. It works without a parameter because it sets a default value for the report variable.
# Switch the file extension and run the python program. file=${0/%sh/py} python3 ${file} "${@}" |
You call the shell script like this:
./python-with-clause.sh Detail |
As always, I hope this helps those looking for a solution.
Linux sqlplus wrapper
Here’s a quick way to ensure you can use the up-arrows and navigation keys when using the sqlplus command-line interface. You can just add it to your .bashrc file.
sqlplus () { path=`which rlwrap 2>/dev/null`; file=''; if [ -n ${path} ]; then file=${path##/*/}; fi; if [ -n ${file} ] && [[ ${file} = "rlwrap" ]]; then rlwrap sqlplus "${@}"; else echo "Command-line history unavailable: Install the rlwrap package."; $ORACLE_HOME/bin/sqlplus "${@}"; fi } |
As always, I hope this helps those looking of solutions.
Waking up the Network
Interesting problems seem to plague me from time to time. The current problem isn’t yet solved but I’m working on it. After a sleep cycle, IPV6 networking isn’t starting.
Try:
- Checking the network cables, modem, and router
- Reconnecting to Wi-Fi
ERR_INTERNET_DISCONNECTED |
In the broken Fedora 30 VM, I checked the status with the nmcli tool:
sudo nmcli general status |
It returned:
STATE CONNECTIVITY WIFI-HW WIFI WWAN-HW WWAN asleep none enabled enabled enabled enabled |
The STATE should return connected and connectivity return full. Unfortunately, that’s not the case.
There was little surprise that the next check:
sudo nmcli device |
Returned the following:
DEVICE TYPE STATE CONNECTION virbr0 bridge unmanaged -- ens33 ethernet unmanaged -- lo loopback unmanaged -- virbr0-nic tun unmanaged -- |
In a working instance, it should return:
DEVICE TYPE STATE CONNECTION ens33 ethernet connected ens33 virbr0 bridge connected virbr0 lo loopback unmanaged -- virbr0-nic tun unmanaged -- |
I’m currently troubleshooting what failed by leveraging an article on How to Configure Network Connection Using ‘nmcli’ Tool and the Gnome nmcli documentation. Naturally, when I get it fixed, I’ll finish this article.
Title Case Anyone?
Sometimes life is too surreal. Like when somebody says, “How do you get title case in an Oracle database?” That’s when you know three things about the individual, while suppressing laughter. They’re not very experienced with SQL, likely lazy, and don’t read the documentation.
I had a little fun with somebody today by taking them down a small rat-hole. “Oh, gosh … ” I said, “… let’s write a function for that.” Here’s the joke function, like:
CREATE OR REPLACE FUNCTION title_case ( string VARCHAR2 ) RETURN VARCHAR2 IS BEGIN /* Change upper case to title case. */ RETURN UPPER(SUBSTR(string,1,1)) || LOWER(SUBSTR(string,2,LENGTH(string))); END title_case; / |
Then, we tested it with a query from the pseudo dual table:
SELECT title_case('incredible') AS "Proper Name" FROM dual; |
It returned:
Proper Name ---------- Incredible |
Then, I said “Oh, that’s not his proper name in the Pixar World.” It should be: Mr. Incredible. Let’s try that:
SELECT title_case('mr. incredible') AS "Proper Name" FROM dual; |
It returned:
Proper Name -------------- Mr. incredible |
Then, I said: “That’s not what we want at all. Should we rewrite our function or simply use the INITCAP built-in function?” Then, I wrote:
SELECT INITCAP('mr. incredible') AS "Proper Name" FROM dual; |
It returned:
Proper Name -------------- Mr. Incredible |
Well, needless to say my acquaintance got it immediately and said “I should have Googled it or read the documentation.” I concurred with his observation.
Just sharing a cute day’s event that made me laugh and cry at the same time because there are too many who say SQL isn’t worth learning.