Archive for the ‘PL/pgSQL’ Category
PostgreSQL Arrays
If you’re wondering about this post, it shows the basic array of a set of integers and strings before showing you how to create nested tables of data in PostgreSQL. By the way, they’re not called nested tables in PostgreSQL, like they are in Oracle but perform like their Oracle cousins.
Let’s create a table with an auto-incrementing column and two arrays, one array of integers and another of strings:
-- Conditionally drop the demo table. DROP TABLE IF EXISTS demo; -- Create the test table. CREATE TABLE demo ( demo_id serial , demo_number integer[5] , demo_string varchar(5)[7]); |
You can insert test values like this:
INSERT INTO demo (demo_number, demo_string) VALUES ( array[1,2,3,4,5] , array['One','Two','Three','Four','Five','Six','Seven']); |
Then, you can query them with this unnest function, like:
SELECT unnest(demo_number) AS numbers , unnest(demo_string) AS strings FROM demo; |
It returns:
numbers | strings ---------+--------- 1 | One 2 | Two 3 | Three 4 | Four 5 | Five | Six | Seven (7 rows) |
You may note that the two arrays are asymmetrical. It only becomes an issue when you navigate the result in a PL/pgSQL cursor or imperative programming language, like Python.
Now, let’s do something more interesting like work with a composite user-defined type, like the player structure. You would create the composite user-defined type with this syntax:
-- Conditionally drop the player type. DROP TYPE IF EXISTS player; -- Create the player type. CREATE TYPE player AS ( player_no integer , player_name varchar(24) , player_position varchar(14) , ab integer , r integer , h integer , bb integer , rbi integer ); |
You can create a world_series table that include a players column that uses an array of player type, like
-- Conditionally drop the world_series table. DROP TABLE IF EXISTS world_series; -- Create the player type. CREATE TABLE world_series ( world_series_id serial , team varchar(24) , players player[30] , game_no integer , year integer ); |
If you’re familiar with the Oracle Database, you’d have to specify a tested table in the syntax. Fortunately, PostgreSQL doesn’t require that.
Insert two rows with the following statement:
INSERT INTO world_series ( team , players , game_no , year ) VALUES ('San Francisco Giants' , array[(24,'Willie Mayes','Center Fielder',5,0,1,0,0)::player ,(5,'Tom Haller','Catcher',4,1,2,0,2)::player] , 4 , 1962 ); |
You can append to the array with the following syntax. A former student and I have a disagreement on whether this is shown in the PostgreSQL 8.15.4 Modifying Array documentation. I believe it can be inferred from the document and he doesn’t believe so. Anyway, here’s how you add an element to an existing array in a table with the UPDATE statement:
UPDATE world_series SET players = (SELECT array_append(players,(7,'Henry Kuenn','Right Fielder',3,0,0,1,0)::player) FROM world_series) WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4; |
Like Oracle’s nested tables, PostgreSQL’s arrays of composite user-defined types requires writing a PL/pgSQL function. I’ll try to add one of those shortly in another blog entry to show you how to edit and replace entries in stored arrays of composite user-defined types.
You can query the unnested rows and get a return set like a Python tuple with the following query:
SELECT unnest(players) AS player_list FROM world_series WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4; |
It returns the three rows from the players array:
player_list ---------------------------------------------- (24,"Willie Mayes","Center Field",5,0,1,0,0) (5,"Tom Haller",Catcher,4,1,2,0,2) (7,"Henry Kuenn","Right Fielde",3,0,0,1,0) (3 rows) |
It returns the data set in entry-order. If we step outside of the standard 8.15 Arrays PostgreSQL Documentation, you can do much more with arrays (or nested tables). The balance of this example demonstrates some new syntax that helps you achieve constructive outcomes in PostgreSQL.
You can use a Common Table Expression (CTE) to get the columnar display of the player composite user-defined type. This type of solution is beyond the standard , like:
WITH list AS (SELECT unnest(players) AS row_result FROM world_series WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4) SELECT (row_result).player_name , (row_result).player_no , (row_result).player_position FROM list; |
If you’re unfamiliar with accessing composite user-defined types, I wrote a post on that 7 years ago. You can find the older blog entry PostgreSQL Composites on my blog.
It returns only the three requested columns of the player composite user-defined type:
player_name | player_no | player_position --------------+-----------+----------------- Willie Mayes | 24 | Center Fielder Tom Haller | 5 | Catcher Henry Kuenn | 7 | Right Fielder (3 rows) |
You should note that the data is presented in an entry-ordered manner when unnested alone in the SELECT-list. That behavior changes when the SELECT-list includes non-array data.
The easiest way to display data from the non-array and array columns is to list them inside the SELECT-list of the CTE, like:
WITH list AS (SELECT game_no AS game , year , unnest(players) AS row_result FROM world_series WHERE team = 'San Francisco Giants' AND year = 1962 AND game_no = 4) SELECT game , year , (row_result).player_name , (row_result).player_no , (row_result).player_position FROM list; |
It returns an ordered set of unnested rows when you include non-array columns, like:
game | year | player_name | player_no | player_position ------+------+--------------+-----------+----------------- 4 | 1962 | Henry Kuenn | 7 | Right Fielder 4 | 1962 | Tom Haller | 5 | Catcher 4 | 1962 | Willie Mayes | 24 | Center Fielder (3 rows) |
While you can join the world_series table to the unnested array rows (returned as a derived table, its a bad idea. The mechanics to do it require you to return the primary key column in the same SELECT-list of the CTE. Then, you join the CTE list to the world_series table by using the world_series_id primary key.
However, there is no advantage to an inner join approach and it imposes unnecessary processing on the database server. The odd rationale that I heard when I noticed somebody was using a CTE to base-table join was: “That’s necessary so they could use column aliases for the non-array columns.” That’s not true because you can use the aliases inside the CTE, as shown above when game is an alias to the game_no column.
As always, I hope this helps those looking to solve a problem in PostgreSQL.
Drop Overloaded Routine
In October 2019, I wrote a post with anonymous block programs to drop tables, sequences, routines, and triggers. Two weeks later, I wrote another post to drop all overloaded routines. However, I recognized the other day that I should have written a function that let you target which function or procedure you want to drop.
The older code only let you drop all of your functions or procedures. That was overkill when you’re working on new functions or procedures.
This post provides a utility for those writing functions and procedures in a public schema of any database in a PostgreSQL installation. It is designed to drop functions or procedures from the public schema.
The code follows below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 | CREATE OR REPLACE FUNCTION drop_routine( IN pv_routine_name VARCHAR(64) , IN pv_routine_type VARCHAR(64)) RETURNS INTEGER AS $$ DECLARE /* Declare the current catalog. */ lv_local_catalog VARCHAR(64) := current_database(); /* Declare return type variable. */ lv_retval INTEGER := 1; /* Manage data dictionary case mechanics: ====================================== routine_name is always in lowercase. routine_type is always in uppercase. */ lv_routine_name VARCHAR(64) := LOWER(pv_routine_name); lv_routine_type VARCHAR(64) := UPPER(pv_routine_type); /* Declare an indefinite length string for SQL statement. */ sql VARCHAR; /* Declare variables to manage cursor return values. */ row RECORD; arg VARCHAR; /* Declare parameter list. */ list VARCHAR; /* Declare a routine cursor. */ routine_cursor CURSOR( cv_routine_name VARCHAR , cv_routine_type VARCHAR ) FOR SELECT r.routine_name , r.specific_name , r.routine_type FROM information_schema.routines r WHERE r.specific_catalog = current_database() AND r.routine_schema = 'public' AND r.routine_type = cv_routine_type AND r.routine_name = cv_routine_name; /* Declare a parameter cursor. */ parameter_cursor CURSOR( cv_specific_name VARCHAR ) FOR SELECT args.data_type FROM information_schema.parameters args WHERE args.specific_catalog = current_database() AND args.specific_schema = 'public' AND args.specific_name = cv_specific_name; BEGIN /* Open the cursor. */ OPEN routine_cursor(lv_routine_name, lv_routine_type); <<row_loop>> LOOP /* Fetch table names. */ FETCH routine_cursor INTO row; /* Exit when no more records are found. */ EXIT row_loop WHEN NOT FOUND; /* Initialize parameter list. */ list := '('; /* Open the parameter cursor. */ OPEN parameter_cursor(row.specific_name::varchar); <<parameter_loop>> LOOP FETCH parameter_cursor INTO arg; /* Exit the parameter loop. */ EXIT parameter_loop WHEN NOT FOUND; /* Add parameter and delimit more than one parameter with a comma. */ IF LENGTH(list) > 1 THEN list := CONCAT(list,',',arg); ELSE list := CONCAT(list,arg); END IF; END LOOP; /* Close the parameter list. */ list := CONCAT(list,')'); /* Close the parameter cursor. */ CLOSE parameter_cursor; /* Concatenate together a DDL to drop the table with prejudice. */ sql := 'DROP '||row.routine_type||' IF EXISTS '||row.routine_name||list; /* Execute the DDL statement. */ EXECUTE sql; /* Assign success flag of 0. */ lv_retval := 0; END LOOP; /* Close the routine_cursor. */ CLOSE routine_cursor; /* Return the output text variable. */ RETURN lv_retval; END $$ LANGUAGE plpgsql; |
If you now create a series of hello overloaded functions, like:
CREATE OR REPLACE FUNCTION hello() RETURNS text AS $$ DECLARE output VARCHAR; BEGIN SELECT 'Hello World!' INTO output; RETURN output; END $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION hello(whom text) RETURNS text AS $$ DECLARE output VARCHAR; BEGIN SELECT CONCAT('Hello ',whom,'!') INTO output; RETURN output; END $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION hello(id int, whom text) RETURNS text AS $$ DECLARE output VARCHAR; BEGIN SELECT CONCAT('[',id,'] Hello ',whom,'!') INTO output; RETURN output; END $$ LANGUAGE plpgsql; |
After you create the overloaded functions, you can query their status from the information_schema.routines table in the data dictionary:
SELECT routine_name , specific_name , routine_type FROM information_schema.routines WHERE specific_catalog = current_setting('videodb.catalog_name') AND routine_schema = 'public' AND routine_name = 'hello'; |
Which shows you the three versions of the hello function:
routine_name | specific_name | routine_type --------------+---------------+-------------- hello | hello_18100 | FUNCTION hello | hello_18101 | FUNCTION hello | hello_18102 | FUNCTION (3 rows) |
You can drop all versions of the hello functions by calling the drop_routine function:
SELECT CASE WHEN drop_routine('hello','function') = 0 THEN 'Success' ELSE 'Failure' END AS drop_routine; |
It returns the following:
drop_routine -------------- Success (1 row) |
As always, I hope this helps those looking for how to routinely test new functions and procedures.
Python on PostgreSQL
The ODBC library you use when connecting Python to PostgreSQL is the psycopg2 Python library. This blog post will show use how to use it in Python and install it on your Fedora Linux installation. It leverages a videodb database that I show you how to build in this earlier post on configuring PostgreSQL 14.
You would import psycopg2 as follows in your Python code:
import psycopg2 |
Unfortunately, that only works on Linux servers when you’ve installed the library. That library isn’t installed with generic Python libraries. You get the following error when the psycopg2 library isn’t installed on your server.
Traceback (most recent call last): File "python_new_hire.sql", line 1, in <module> import psycopg2 ModuleNotFoundError: No module named 'psycopg2' |
You can install it on Fedora Linux with the following command:
yum install python3-psycopg2 |
It will install:
==================================================================================== Package Architecture Version Repository Size ==================================================================================== Installing: python3-psycopg2 x86_64 2.7.7-1.fc30 fedora 160 k Transaction Summary ==================================================================================== Install 1 Package Total download size: 160 k Installed size: 593 k Is this ok [y/N]: y Downloading Packages: python3-psycopg2-2.7.7-1.fc30.x86_64.rpm 364 kB/s | 160 kB 00:00 ------------------------------------------------------------------------------------ Total 167 kB/s | 160 kB 00:00 Running transaction check Transaction check succeeded. Running transaction test Transaction test succeeded. Running transaction Preparing : 1/1 Installing : python3-psycopg2-2.7.7-1.fc30.x86_64 1/1 Running scriptlet: python3-psycopg2-2.7.7-1.fc30.x86_64 1/1 Verifying : python3-psycopg2-2.7.7-1.fc30.x86_64 1/1 Installed: python3-psycopg2-2.7.7-1.fc30.x86_64 Complete! |
Here’s a quick test case that you can run in PostgreSQL and Python to test all the pieces. The first SQL script creates a new_hire table and inserts two rows, and the Python program queries data from the new_hire table.
The new_hire.sql file creates the new_hire table and inserts two rows:
-- Environment settings for the script. SET SESSION "videodb.table_name" = 'new_hire'; SET CLIENT_MIN_MESSAGES TO ERROR; -- Verify table name. SELECT current_setting('videodb.table_name'); -- ------------------------------------------------------------------ -- Conditionally drop table. -- ------------------------------------------------------------------ DROP TABLE IF EXISTS new_hire CASCADE; -- ------------------------------------------------------------------ -- Create table. -- ------------------------------------------------------------------- CREATE TABLE new_hire ( new_hire_id SERIAL , first_name VARCHAR(20) NOT NULL , middle_name VARCHAR(20) , last_name VARCHAR(20) NOT NULL , hire_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP , PRIMARY KEY (new_hire_id)); -- Alter the sequence by restarting it at 1001. ALTER SEQUENCE new_hire_new_hire_id_seq RESTART WITH 1001; -- Display the table organization. SELECT tc.table_catalog || '.' || tc.constraint_name AS constraint_name , tc.table_catalog || '.' || tc.table_name AS table_name , kcu.column_name , ccu.table_catalog || '.' || ccu.table_name AS foreign_table_name , ccu.column_name AS foreign_column_name FROM information_schema.table_constraints AS tc JOIN information_schema.key_column_usage AS kcu ON tc.constraint_name = kcu.constraint_name AND tc.table_schema = kcu.table_schema JOIN information_schema.constraint_column_usage AS ccu ON ccu.constraint_name = tc.constraint_name AND ccu.table_schema = tc.table_schema WHERE tc.constraint_type = 'FOREIGN KEY' AND tc.table_name = current_setting('videodb.table_name') ORDER BY 1; SELECT c1.table_name , c1.ordinal_position , c1.column_name , CASE WHEN c1.is_nullable = 'NO' AND c2.column_name IS NOT NULL THEN 'PRIMARY KEY' WHEN c1.is_nullable = 'NO' AND c2.column_name IS NULL THEN 'NOT NULL' END AS is_nullable , CASE WHEN data_type = 'character varying' THEN data_type||'('||character_maximum_length||')' WHEN data_type = 'numeric' THEN CASE WHEN numeric_scale != 0 AND numeric_scale IS NOT NULL THEN data_type||'('||numeric_precision||','||numeric_scale||')' ELSE data_type||'('||numeric_precision||')' END ELSE data_type END AS data_type FROM information_schema.columns c1 LEFT JOIN (SELECT trim(regexp_matches(column_default,current_setting('videodb.table_name'))::text,'{}')||'_id' column_name FROM information_schema.columns) c2 ON c1.column_name = c2.column_name WHERE c1.table_name = current_setting('videodb.table_name') ORDER BY c1.ordinal_position; -- Display primary key and unique constraints. SELECT constraint_name , lower(constraint_type) AS constraint_type FROM information_schema.table_constraints WHERE table_name = current_setting('videodb.table_name') AND constraint_type IN ('PRIMARY KEY','UNIQUE'); -- Insert two test records. INSERT INTO new_hire ( first_name, middle_name, last_name, hire_date ) VALUES ('Malcolm','Jacob','Lewis','2018-2-14') ,('Henry',null,'Chabot','1990-07-31'); |
You can put it into a local directory, connect as the student user to a videodb database, and run the following command (or any database you’ve created).
\i new_hire.sql |
The new_hire.py file creates the new_hire table and inserts two rows:
# Import the PostgreSQL connector library. import psycopg2 try: # Open a connection to the database. connection = psycopg2.connect( user="student" , password="student" , port="5432" , dbname="videodb") # Open a cursor. cursor = connection.cursor() # Assign a static query. query = "SELECT new_hire_id, first_name, last_name " \ "FROM new_hire" # Parse and execute the query. cursor.execute(query) # Fetch all rows from a table. records = cursor.fetchall() # Read through and print the rows as tuples. for row in range(0, len(records)): print(records[row]) except (Exception, psycopg2.Error) as error : print("Error while fetching data from PostgreSQL", error) finally: # Close the database connection. if (connection): cursor.close() connection.close() |
You run it from the command line, like:
python3 ./new_hire.py |
It should print:
(1001, 'Malcolm', 'Lewis') (1002, 'Henry', 'Chabot') |
As always, I hope this helps those trying to sort out how to connect Python to PostgreSQL.
PostgreSQL CLI Error
Problems get reported to me all the time on installations for my students, this one was interesting. They got an error complaining about a missing libpq.so.5 library.
psql: /usr/pgsql-11/lib/libpq.so.5: no version information available (required by psql) psql: /usr/pgsql-11/lib/libpq.so.5: no version information available (required by psql) could not change directory to "/root": Permission denied psql (11.7, server 11.8) Type "help" for help. postgres=# |
It appeared as a mismatch of libraries but it’s not that. For reference, this was a Fedora instance. I ran the rpm utility:
rpm -qa | grep postgres |
It returned:
postgresql11-libs-11.8-1PGDG.f30.x86_64 postgresql-11.7-2.fc30.x86_64 postgresql-server-11.7-2.fc30.x86_64 |
Then, I had them run the rpm utility again looking for the Python driver for PostgreSQL:
rpm -qa | grep psycopg2 |
It returned:
python3-psycopg2-2.7.7-1.fc30.x86_64 |
Then, it was easy to explain. The Python psycopg2 library uses both PostgreSQL 11.7 and 11.8 dependent libraries and the libpq.so.5 library is missing version information. You must ignore the error, which is really only a warning message, when you want to work on Fedora, PostgreSQL 11, and Python 3.
PL/pgSQL OUT Mode
A friend asked me a question about using the OUT
mode parameter versus INOUT
mode parameters in functions. He formed an opinion that they didn’t work in PostgreSQL PL/pgSQL.
Unfortunately, there’s not a lot of material written about how to use the OUT
mode parameter in functions. I thought an article showing the standard example with a call to the function might help. The standard example function from the PostgreSQL documentation is:
1 2 3 4 5 6 7 8 9 | CREATE OR REPLACE FUNCTION sales_tax( IN amount real , OUT tax real ) AS $$ BEGIN /* Calculate the tax at 6%. */ tax := amount * 0.06; END; $$ LANGUAGE plpgsql; |
The RETURNS
clause is optional but here’s how you can include it. The following example works exactly like the former.
1 2 3 4 5 6 7 8 9 10 | CREATE OR REPLACE FUNCTION sales_tax( IN amount real , OUT tax real ) RETURNS real AS $$ BEGIN /* Calculate the tax at 6%. */ tax := amount * 0.06; END; $$ LANGUAGE plpgsql; |
You call the PostgreSQL sales_tax()
function like this:
SELECT 'Sales Tax ['|| sales_tax(200) ||']' AS "Return Value"; |
It should return the following:
Return Value ---------------- Sales Tax [12] (1 row) |
You can also call it in an inline code block (e.g., what Oracle documentation calls an anonymous block), like:
1 2 3 4 5 6 7 8 9 10 11 12 13 | DO $$ DECLARE /* Declare a local variable. */ tax_paid real := 0; BEGIN /* Query the results into a local variable. */ SELECT sales_tax(100) INTO tax_paid; /* Print the taxes paid. */ RAISE NOTICE 'Tax Paid [%]',tax_paid; END; $$; |
You can replace the sales_tax
function with its OUT
mode tax
parameter with the following classic sales_tax
function, which adds a tax_rate
parameter.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | CREATE OR REPLACE FUNCTION sales_tax( IN amount real , IN tax_rate real ) RETURNS real AS $$ DECLARE /* Declare a local variable. */ tax real; BEGIN /* Calculate the tax at 6%. */ tax := amount * (tax_rate / 100); /* Return the tax. */ RETURN tax; END; $$ LANGUAGE plpgsql; |
Let’s return the original approach with the OUT
parameter. Then, let’s expand the list of parameters to include an INOUT
mode state
variable, like:
1 2 3 4 5 6 7 8 9 10 11 12 | CREATE OR REPLACE FUNCTION sales_tax( IN amount real , IN tax_rate real , INOUT state VARCHAR(14) , OUT tax real ) RETURNS RECORD AS $$ BEGIN /* Calculate the tax at 6%. */ tax := amount * (tax_rate / 100); END; $$ LANGUAGE plpgsql; |
You don’t need to include the RETURNS RECORD
phrase because PL/pgSQL implements a very mature adapter pattern and it adjusts the return type automatically to the parameter list. On the other hand, many beginning programmers and support staff won’t know that. That’s why I recommend you include it for clarity.
You can call this in a query with a column alias, like:
SELECT 'Sales Tax ['|| sales_tax(100,8.25,'California') ||']' AS "Return Value"; |
It will return a tuple:
Return Value ------------------------------- Sales Tax [(California,8.25)] (1 row) |
You can implement it inside an inline block by adding a local variable of the RECORD
data type, like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | DO $$ DECLARE /* Declare a local variable. */ state varchar(14) := 'California'; tax_paid real := 0; result RECORD; BEGIN /* Query the results into a local variable. */ SELECT sales_tax(100,8.25,state) INTO result; /* Print the taxes paid. */ RAISE NOTICE 'Tax Paid [%]', result; END; $$; |
It prints the following:
NOTICE: Tax Paid [("(California,8.25)")] |
You can actually return the individual members of the tuple by putting the function call inside the FROM
clause, like:
SELECT * FROM sales_tax(100,8.25,'California'); |
It now returns the members of the tuple in separate columns:
state | tax ------------+------ California | 8.25 (1 row) |
Alternatively, you can call it from inside an inline block, like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | DO $$ DECLARE /* Declare a local variable. */ state varchar(14) := 'California'; tax_paid real := 0; BEGIN /* Query the results into a local variable. */ SELECT * INTO state, tax_paid FROM sales_tax(100,8.25,state); /* Print the taxes paid. */ RAISE NOTICE 'Tax Paid [%] [%]', state, tax_paid; END; $$; |
It prints:
NOTICE: Tax Paid [California] [8.25] |
Now, let’s rewrite the function into a traditional function with all IN
mode variables that returns a RECORD
structure with additional values. Just one quick caveat (the big but), you can only assign values to dynamically constructed RECORD
structures by using the SELECT-INTO
or FOR
statements. Below is the refactored sales_tax()
function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | CREATE OR REPLACE FUNCTION sales_tax( IN subtotal REAL , IN tax_rate REAL , IN state VARCHAR(14)) RETURNS RECORD AS $$ DECLARE /* Declare a local variable. */ tax REAL; tax_record RECORD; BEGIN /* Calculate the tax at 6%. */ tax := subtotal * (tax_rate / 100); /* Assign state to record. */ SELECT state, tax INTO tax_record; /* Return the tax. */ RETURN tax_record; END; $$ LANGUAGE plpgsql; |
It returns the same set of values as the early version with the four parameter example above but you only need three IN
-only mode variables to get the result. Other than the parameter lists, the biggest change appears to be the assignment line, which is required in the explicit and traditional function that has only IN
mode parameters:
16 | SELECT state, tax INTO tax_record; |
Given you can return any RECORD
structure you want, why use INOUT
and OUT
mode parameters? Don’t you loose clarity about what your stored function does? Or, at least, don’t you make understanding the program logic more difficult when you use INOUT
and OUT
mode variables? The only benefit appears to be when you shift your input variables from the SELECT
clause to the INTO
clause.
Hopefully, this shows folks how to use the OUT
mode parameter; and how closely related it is to a classic function.
PostgreSQL Upsert Advanced
Nine years after writing how to use the MERGE
statement in Oracle, I am writing how you implement an UPSERT
statement in PostgreSQL. I wrote an initial post going over the basics of PostgreSQL’s upsert implementation of the INSERT
statement with an DO UPDATE
clause and a DO NOTHING
clause.
I thought it was interesting that the PostgreSQL Upsert Using INSERT ON CONFLICT
Statement web page didn’t cover using a subquery as the source for an INSERT
statement.
Demonstration
Here are the steps to accomplish an import/upload with the COPY
statement and the INSERT
statement with DO UPDATE
and DO NOTHING
clauses.
Step #1 : Position your CSV file in the physical directory
The example uses the following upload directory:
/u01/app/postgres/upload |
Put the following text in to the kingdom_import.csv
file.
Narnia,77600,Peter the Magnificent,1272-03-20,1292-06-19 Narnia,77600,Edmund the Just,1272-03-20,1292-06-19 Narnia,77600,Susan the Gentle,1272-03-20,1292-06-19 Narnia,77600,Lucy the Valiant,1272-03-20,1292-06-19 Narnia,42100,Peter the Magnificent,1531-04-12,1531-05-31 Narnia,42100,Edmund the Just,1531-04-12,1531-05-31 Narnia,42100,Susan the Gentle,1531-04-12,1531-05-31 Narnia,42100,Lucy the Valiant,1531-04-12,1531-05-31 Camelot,15200,King Arthur,0631-03-10,0686-12-12 Camelot,15200,Sir Lionel,0631-03-10,0686-12-12 Camelot,15200,Sir Bors,0631-03-10,0635-12-12 Camelot,15200,Sir Bors,0640-03-10,0686-12-12 Camelot,15200,Sir Galahad,0631-03-10,0686-12-12 Camelot,15200,Sir Gawain,0631-03-10,0686-12-12 Camelot,15200,Sir Tristram,0631-03-10,0686-12-12 Camelot,15200,Sir Percival,0631-03-10,0686-12-12 Camelot,15200,Sir Lancelot,0670-09-30,0682-12-12 Camelot,15200,Modred,0681-09-30,0682-12-12 |
Step #2 : Run the script that creates tables and sequences
Copy the following code into a create_kingdom_knight_tables.sql
file within a directory of your choice. Then, you run it as the student
user. Assuming you put the code in the create_kingdom_knight_tables.sql
script, you can call it like so
\i create_kingdom_knight_tables.sql |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | -- Conditionally drop three tables and sequences. DO $$ DECLARE /* Declare an indefinite length string and record variable. */ SQL VARCHAR; ROW RECORD; /* Declare a cursor. */ table_cursor CURSOR FOR SELECT TABLE_NAME FROM information_schema.tables WHERE table_catalog = 'videodb' AND table_schema = 'public' AND TABLE_NAME IN ('kingdom','knight','kingdom_knight_import'); BEGIN /* Open the cursor. */ OPEN table_cursor; LOOP /* Fetch table names. */ FETCH table_cursor INTO ROW; /* Exit when no more records are found. */ EXIT WHEN NOT FOUND; /* Concatenate together a DDL to drop the table with prejudice. */ SQL := 'DROP TABLE IF EXISTS '||ROW.table_name||' CASCADE'; /* Execute the DDL statement. */ EXECUTE SQL; END LOOP; /* Close the cursor. */ CLOSE table_cursor; END; $$; -- Create normalized kingdom table. CREATE TABLE kingdom ( kingdom_id SERIAL , kingdom_name VARCHAR(20) , population INTEGER , CONSTRAINT kingdom_uq_key UNIQUE ( kingdom_name , population )); -- Create normalized knight table. CREATE TABLE knight ( knight_id SERIAL , knight_name VARCHAR(24) , kingdom_allegiance_id INTEGER , allegiance_start_date DATE , allegiance_end_date DATE , CONSTRAINT knight_uq_key UNIQUE ( knight_name , kingdom_allegiance_id , allegiance_start_date , allegiance_end_date )); -- Create external import table. CREATE TABLE kingdom_knight_import ( kingdom_name VARCHAR(20) , population INTEGER , knight_name VARCHAR(24) , allegiance_start_date DATE , allegiance_end_date DATE); |
Step #3 : Run the COPY
command.
Run the COPY
command to move the data from the Comma Separated Values (CSV) file to the kingdom_knight_import
table. Then, run it as the student
account.
1 2 | COPY kingdom_knight_import FROM '/u01/app/postgres/upload/kingdom_import1.csv' DELIMITERS ',' CSV; |
Step #4 : Create the upload_kingdom
procedure
Copy the following code into a create_kingdom_knight_procedure.sql
file within a directory of your choice. Assuming you put the code in the create_kingdom_knight_procedure.sql
script, you can call it like so
\i create_kingdom_knight_procedure.sql |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | CREATE PROCEDURE upload_kingdom() AS $$ DECLARE /* Declare error handling variables. */ err_num TEXT; err_msg INTEGER; BEGIN /* Insert only unique rows. The DISTINCT operator compresses the result set to a unique set and avoids consuming sequence values for non-unique result sets. */ INSERT INTO kingdom ( kingdom_name , population ) ( SELECT DISTINCT kki.kingdom_name , kki.population FROM kingdom_knight_import kki LEFT JOIN kingdom k ON kki.kingdom_name = k.kingdom_name AND kki.population = k.population) ON CONFLICT ON CONSTRAINT kingdom_uq_key DO NOTHING; /* Insert only unique rows. */ INSERT INTO knight ( knight_name , kingdom_allegiance_id , allegiance_start_date , allegiance_end_date ) (SELECT kki.knight_name , k.kingdom_id , kki.allegiance_start_date AS start_date , kki.allegiance_end_date AS end_date FROM kingdom_knight_import kki INNER JOIN kingdom k ON kki.kingdom_name = k.kingdom_name AND kki.population = k.population LEFT JOIN knight kn ON k.kingdom_id = kn.kingdom_allegiance_id AND kki.knight_name = kn.knight_name AND kki.allegiance_start_date = kn.allegiance_start_date AND kki.allegiance_end_date = kn.allegiance_end_date) ON CONFLICT ON CONSTRAINT knight_uq_key DO NOTHING; EXCEPTION WHEN OTHERS THEN err_num := SQLSTATE; err_msg := SUBSTR(SQLERRM,1,100); RAISE NOTICE 'Trapped Error: %', err_msg; END $$ LANGUAGE plpgsql; |
Step #5 : Run the upload_kingdom
procedure and query the results
You run the upload_kingdom
procedure with the CALL
statement and then query the results. Assuming you put the code in the call_kingdom_knight_procedure.sql
script, you can call it like so
\i call_kingdom_knight_procedure.sql |
/* Call the upload_kingdom procedure. */ CALL upload_kingdom(); /* Query the kingdom_knight_import table. */ SELECT kingdom_name , population , knight_name , date_trunc('second',allegiance_start_date AT TIME ZONE 'MST') AS allegiance_start_date , date_trunc('second',allegiance_end_date AT TIME ZONE 'MST') AS allegiance_end_date FROM kingdom_knight_import; /* Query the kingdom table. */ SELECT * FROM kingdom; /* Query the knight table. */ SELECT kn.knight_id , kki.knight_name , k.kingdom_id , date_trunc('second',kki.allegiance_start_date AT TIME ZONE 'MST') AS start_date , date_trunc('second',kki.allegiance_end_date AT TIME ZONE 'MST') AS end_date FROM kingdom_knight_import kki INNER JOIN kingdom k ON kki.kingdom_name = k.kingdom_name AND kki.population = k.population LEFT JOIN knight kn ON k.kingdom_id = kn.kingdom_allegiance_id AND kki.knight_name = kn.knight_name AND kki.allegiance_start_date = kn.allegiance_start_date AND kki.allegiance_end_date = kn.allegiance_end_date; |
It prints the following results:
CALL kingdom_name | population | knight_name | allegiance_start_date | allegiance_end_date --------------+------------+-----------------------+-----------------------+--------------------- Narnia | 77600 | Peter the Magnificent | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 Narnia | 77600 | Edmund the Just | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 Narnia | 77600 | Susan the Gentle | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 Narnia | 77600 | Lucy the Valiant | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 Narnia | 42100 | Peter the Magnificent | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 Narnia | 42100 | Edmund the Just | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 Narnia | 42100 | Susan the Gentle | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 Narnia | 42100 | Lucy the Valiant | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 Camelot | 15200 | King Arthur | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Lionel | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Bors | 0631-03-09 23:59:56 | 0635-12-11 23:59:56 Camelot | 15200 | Sir Bors | 0640-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Galahad | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Gawain | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Tristram | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Percival | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 Camelot | 15200 | Sir Lancelot | 0670-09-29 23:59:56 | 0682-12-11 23:59:56 (18 rows) kingdom_id | kingdom_name | population ------------+--------------+------------ 1 | Narnia | 42100 2 | Narnia | 77600 3 | Camelot | 15200 (3 rows) knight_id | knight_name | kingdom_id | start_date | end_date -----------+-----------------------+------------+---------------------+--------------------- 1 | Peter the Magnificent | 2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 2 | Edmund the Just | 2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 3 | Susan the Gentle | 2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 4 | Lucy the Valiant | 2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56 5 | Peter the Magnificent | 1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 6 | Edmund the Just | 1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 7 | Susan the Gentle | 1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 8 | Lucy the Valiant | 1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56 9 | King Arthur | 3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 10 | Sir Lionel | 3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 11 | Sir Bors | 3 | 0631-03-09 23:59:56 | 0635-12-11 23:59:56 12 | Sir Bors | 3 | 0640-03-09 23:59:56 | 0686-12-11 23:59:56 13 | Sir Galahad | 3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 14 | Sir Gawain | 3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 15 | Sir Tristram | 3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 16 | Sir Percival | 3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56 17 | Sir Lancelot | 3 | 0670-09-29 23:59:56 | 0682-12-11 23:59:56 69 | Modred | 3 | 0681-09-29 23:59:56 | 0682-12-11 23:59:56 (18 rows) |
As always, I hope this works to help those trying to solve a similar problem.
PostgreSQL and LPAD
While porting my Oracle code to PostgreSQL I encountered a little quirk. It’s probably not a quirk except for the fact that I’ve worked in Oracle so long. Oracle implicitly type casts so well that we seldom notice.
PostreSQL doesn’t work like Oracle. It does type cast sometimes but not very often. I tried porting the following segment from my Oracle stored procedure to PostgreSQL:
/* Add account number with zeros. */ FOR j IN 1..50 LOOP INSERT INTO account_list VALUES ( lv_airport_code||'-'||LPAD(j,6,'0') , NULL , NULL , 1002 , 1002 ); END LOOP; |
Oracle implicitly casts the integer j
to a text
string before running the LPAD function. In PostgreSQL, it doesn’t cast the integer
to a text
string before calling the LPAD function. Since the LPAD
function is not overloaded, calling it with an integer
, integer
, and text
set of parameters fails at runtime but the code doesn’t raise an exception when compiling it as a stored procedure.
At runtime, it raises the following error:
ERROR: invalid INPUT syntax FOR INTEGER: "function lpad(integer, integer, unknown) does not exist" CONTEXT: PL/pgSQL FUNCTION seed_account_list() line 48 at assignment |
You can fix the code by explicitly casting the LPAD
function’s first parameter to a text
string. You do that as follows below:
/* Add account number with zeros. */ FOR j IN 1..50 LOOP INSERT INTO account_list VALUES ( lv_airport_code||'-'||LPAD(j::text,6,'0') , NULL , NULL , 1002 , 1002 ); END LOOP; |
This fixes the casting problem. As always, I hope this helps those looking for a solution.
DBeaver for PostgreSQL
I’m migrating my database classes from the Oracle database to the PostgreSQL database. Using the Oracle Express Edition has always required a virtualized image because students use Windows and Mac OS. Also, the university doesn’t like my use of a virtualized image. Virtualization imposes incremental cost on students to have high end laptops.
The available Docker images don’t typically support the Oracle Express Edition. That means there are licensing implications tied to Oracle.
As a committee, we decided to use PostgreSQL as our new database platform. We opted to use PostgreSQL over MySQL because it supports arrays and stored procedures. PostgreSQL PL/pgSQL are a solid implementation of stored procedures. While MySQL supports a stored procedure language, it effectively discourages using it.
We use sqlplus
, mysql
, or psql
Command-Line Interface (CLI) as the primary interface. That’s because CLI supports production-oriented deployment and DevOps practices. The Open Source DBeaver project provides a clean native GUI management console for PostgreSQL. DBeaver also supports other databases, which pgAdmin doesn’t.
You click the Download button from the DBeaver home page. It support Windows, Mac OS, and Linux distributions. I downloaded the Linux RPM package to my Downloads directory. You can run the RPM with the following syntax:
rpm -ivh dbeaver-ce-6.2.3-stable.x86_64.rpm |
It doesn’t check for dependencies. The installation is quick. You can verify that it’s installed with the following command:
which dbeaver |
It installs here:
/usr/bin/dbeaver |
If you failed to install the PostgreSQL JAR file, you’ll see the following error message:
It will look like the following on Linux:
You should have a user in the pg_hba.conf
file, like the following:
host all all 127.0.0.1/32 trust |
As always, I hope this helps those solving problems.
Postgres Print Debug Notes
A student asked how you print output from PL/pgSQL blocks. The student wanted to know if there was something like the following in Oracle’s PL/SQL programming language:
dbms_output.put_line('some string'); |
or, in Java programming the:
System.out.println("some string"); |
The RAISE NOTICE
is the equivalent to these in Postgres PL/pgSQL, as shown in the following anonymous block:
do $$ BEGIN raise notice 'Hello World!'; END; $$; |
It prints:
NOTICE: Hello World! |
You can write a hello_world function as a named PL/pgSQL block:
CREATE FUNCTION hello_world() RETURNS text AS $$ DECLARE output VARCHAR(20); BEGIN /* Query the string into a local variable. */ SELECT 'Hello World!' INTO output; /* Return the output text variable. */ RETURN output; END $$ LANGUAGE plpgsql; |
You can call it with the following:
SELECT hello_world(); |
It prints:
hello_world -------------- Hello World! (1 row) |
Here’s a full test case with stored procedure in PL/pgSQL:
-- Drop the msg table. DROP TABLE msg; -- Create the msg table. CREATE TABLE msg ( comment VARCHAR(400) ); -- Transaction Management Example. DROP PROCEDURE IF EXISTS testing ( IN pv_one VARCHAR(30) , IN pv_two VARCHAR(10)); -- Transaction Management Example. CREATE OR REPLACE PROCEDURE testing ( IN pv_one VARCHAR(30) , IN pv_two VARCHAR(10)) AS $$ DECLARE /* Declare error handling variables. */ err_num TEXT; err_msg INTEGER; BEGIN /* Log actdual parameter values. */ INSERT INTO msg VALUES (pv_one||'.'||pv_two); EXCEPTION WHEN OTHERS THEN err_num := SQLSTATE; err_msg := SUBSTR(SQLERRM,1,100); RAISE NOTICE 'Trapped Error: %', err_msg; END $$ LANGUAGE plpgsql; do $$ DECLARE lv_one VARCHAR(30) := 'INDIVIDUAL'; lv_two VARCHAR(19) := 'R11-514-34'; BEGIN RAISE NOTICE '[%]', lv_one; RAISE NOTICE '[%]', lv_two; CALL testing( pv_one := lv_one, pv_two := lv_two ); END $$; -- Query any logged results. SELECT * FROM msg; |
It prints:
DROP TABLE CREATE TABLE DROP PROCEDURE CREATE PROCEDURE psql:fixed.sql:61: NOTICE: [INDIVIDUAL] psql:fixed.sql:61: NOTICE: [R11-514-34] DO comment ----------------------- INDIVIDUAL.R11-514-34 (1 row) |
I hope this helps those looking for a solution.