MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Archive for the ‘Postgres’ Category

PostgreSQL Table Function

without comments

This shows how to write a PL/pgSQL function that returns a filtered table result set while writing to a debug log file. The example requires a small character table, like:

DROP TABLE IF EXISTS CHARACTER;
CREATE TABLE CHARACTER
( character_id    SERIAL
, character_name  VARCHAR );

and, a logger table:

DROP TABLE IF EXISTS logger;
CREATE TABLE logger
( logger_id     SERIAL
, message_text  VARCHAR );

Now, let’s insert a couple rows into the character table. The following query inserts one row:

INSERT INTO CHARACTER
( character_name )
VALUES
('Harry Potter');

It was simply too much fun to write this tricky insert statement to the character table. It only submits a value when it doesn’t already exist in the set of character_name column values. While it eliminates the need for a unique constraint on the character_name column, it makes every insert statement more costly in terms of machine resources.

WITH cte AS
( SELECT 'Harry Potter' AS character_name
  UNION ALL
  SELECT 'Hermione Granger' AS character_name
  UNION ALL
  SELECT 'Ronald Weasily' AS character_name )
INSERT INTO CHARACTER
( character_name )
( SELECT character_name
  FROM   cte
  WHERE  NOT EXISTS
          (SELECT NULL
           FROM   CHARACTER c
           WHERE  c.character_name = cte.character_name));

You can verify these insert statements work with the following query:

SELECT * FROM CHARACTER;

It returns:

 character_id |  character_name  
--------------+------------------
            1 | Harry Potter
            2 | Hermione Granger
            3 | Ronald Weasily
(3 rows)

The following character_query PL/pgSQL function filters table results and returns a table of values. The function defines the future query return results, which is a full fledged object-oriented programming adapter pattern.

CREATE OR REPLACE
  FUNCTION character_query (pattern VARCHAR)
  RETURNS TABLE ( character_id    INTEGER
                , character_text  VARCHAR ) AS
$$
BEGIN
  RETURN QUERY
  SELECT c.character_id
  ,      c.character_name
  FROM   CHARACTER c
  WHERE  c.character_name SIMILAR TO '%'||pattern||'%';
END;
$$ LANGUAGE plpgsql;

You can test the character_query function, like this:

SELECT * FROM character_query('Hermione');

It returns:

 character_id |  character_text  
--------------+------------------
            2 | Hermione Granger
(1 row)

Building on what we did, let’s log our query word in the logger table. You add an insert statement after the BEGIN keyword and before the RETURN QUERY phrases, like:

CREATE OR REPLACE
  FUNCTION character_query (pattern VARCHAR)
  RETURNS TABLE ( character_id    INTEGER
                , character_text  VARCHAR ) AS
$$
BEGIN
  INSERT INTO logger
  ( message_text )
  VALUES
  ( pattern );
 
  RETURN QUERY
  SELECT c.character_id
  ,      c.character_name
  FROM   CHARACTER c
  WHERE  c.character_name SIMILAR TO '%'||pattern||'%';
END;
$$ LANGUAGE plpgsql;

Now let’s test the new character_query function with this test case:

SELECT * FROM character_query('Ron');

Then, let’s check the logger table with this query:

SELECT * FROM logger;

It displays:

 logger_id | message_text 
-----------+--------------
         1 | Hermione
(1 row)

As always, I hope this helps those looking for a solution.

Written by maclochlainn

February 21st, 2020 at 12:58 am

PostgreSQL Write JSON File

without comments

In the process of working through issues to support ETL transformations from JSON with Python. The first step was extracting a the data from a PostgreSQL table into a JSON file. The syntax wasn’t exactly clear and the PostgreSQL Tutorial was a bit misleading but I worked it out. My initial sample deals with writing the data from the item table of my videodb database to a item.json file in the /tmp directory.

The query retrieves the data but casts the creation_date and last_update_date timestamp data types into date data types, as follows:

SELECT row_to_json(
 (SELECT ROW
  FROM (SELECT item_id
        ,      item_barcode
        ,      item_type
        ,      item_title
        ,      item_subtitle
        ,      item_rating
        ,      item_release_date
        ,      created_by
        ,      creation_date::DATE
        ,      last_updated_by
        ,      last_update_date::DATE) ROW)) AS DATA
        FROM   item;

The internal SELECT-list creates the name element of name:value pairs in the JSON structure. Naturally, it does exclude the type casting ::date portion of the query from the name elements. The row_to_json() maps the name list to the list of value results from in each row, and returns a result set.

The query returns the following 21 JSON objects:

{"item_id":1001,"item_barcode":"9736-05640-4","item_type":1016,"item_title":"The Hunt for Red October","item_subtitle":"Special Collector's Edition","item_rating":"PG","item_release_date":"1990-03-02","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1002,"item_barcode":"24543-02392","item_type":1016,"item_title":"Star Wars I","item_subtitle":"Phantom Menace","item_rating":"PG","item_release_date":"1999-05-04","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1003,"item_barcode":"24543-5615","item_type":1015,"item_title":"Star Wars II","item_subtitle":"Attack of the Clones","item_rating":"PG","item_release_date":"2002-05-16","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1004,"item_barcode":"24543-05539","item_type":1016,"item_title":"Star Wars II","item_subtitle":"Attack of the Clones","item_rating":"PG","item_release_date":"2002-05-16","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1005,"item_barcode":"24543-20309","item_type":1016,"item_title":"Star Wars III","item_subtitle":"Revenge of the Sith","item_rating":"PG13","item_release_date":"2005-05-19","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1006,"item_barcode":"86936-70380","item_type":1016,"item_title":"The Chronicles of Narnia","item_subtitle":"The Lion, the Witch and the Wardrobe","item_rating":"PG","item_release_date":"2002-05-16","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1007,"item_barcode":"91493-06475","item_type":1019,"item_title":"RoboCop","item_subtitle":"","item_rating":"Mature","item_release_date":"2003-07-24","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1008,"item_barcode":"93155-11810","item_type":1019,"item_title":"Pirates of the Caribbean","item_subtitle":"","item_rating":"Teen","item_release_date":"2003-06-30","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1009,"item_barcode":"12725-00173","item_type":1019,"item_title":"The Chronicles of Narnia","item_subtitle":"The Lion, the Witch and the Wardrobe","item_rating":"Everyone","item_release_date":"2003-06-30","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1010,"item_barcode":"45496-96128","item_type":1017,"item_title":"MarioKart","item_subtitle":"Double Dash","item_rating":"Everyone","item_release_date":"2003-11-17","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1011,"item_barcode":"08888-32214","item_type":1018,"item_title":"Splinter Cell","item_subtitle":"Chaos Theory","item_rating":"Teen","item_release_date":"2003-04-08","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1012,"item_barcode":"14633-14821","item_type":1018,"item_title":"Need for Speed","item_subtitle":"Most Wanted","item_rating":"Everyone","item_release_date":"2004-11-15","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1013,"item_barcode":"10425-29944","item_type":1019,"item_title":"The DaVinci Code","item_subtitle":"","item_rating":"Teen","item_release_date":"2006-05-19","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1014,"item_barcode":"52919-52057","item_type":1019,"item_title":"Cars","item_subtitle":"","item_rating":"Everyone","item_release_date":"2006-04-28","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1015,"item_barcode":"9689-80547-3","item_type":1020,"item_title":"Beau Geste","item_subtitle":"","item_rating":"PG","item_release_date":"1992-03-01","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1016,"item_barcode":"53939-64103","item_type":1020,"item_title":"I Remember Mama","item_subtitle":"","item_rating":"NR","item_release_date":"1998-01-05","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1017,"item_barcode":"24543-01292","item_type":1020,"item_title":"Tora! Tora! Tora!","item_subtitle":"The Attack on Pearl Harbor","item_rating":"G","item_release_date":"1999-11-02","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1018,"item_barcode":"43396-60047","item_type":1020,"item_title":"A Man for All Seasons","item_subtitle":"","item_rating":"G","item_release_date":"1994-06-28","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1019,"item_barcode":"43396-70603","item_type":1020,"item_title":"Hook","item_subtitle":"","item_rating":"PG","item_release_date":"1991-12-11","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1020,"item_barcode":"85391-13213","item_type":1020,"item_title":"Around the World in 80 Days","item_subtitle":"","item_rating":"G","item_release_date":"1992-12-04","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}
{"item_id":1021,"item_barcode":"85391-10843","item_type":1020,"item_title":"Camelot","item_subtitle":"","item_rating":"G","item_release_date":"1998-05-15","created_by":1001,"creation_date":"2019-12-13","last_updated_by":1001,"last_update_date":"2019-12-13"}

Moving on to the next step you enclose the query in the copy command syntax, like:

COPY
(SELECT row_to_json(
  (SELECT ROW
   FROM (SELECT item_id
         ,      item_barcode
         ,      item_type
         ,      item_title
         ,      item_subtitle
         ,      item_rating
         ,      item_release_date
         ,      created_by
         ,      creation_date::DATE
         ,      last_updated_by
         ,      last_update_date::DATE) ROW)) AS DATA
         FROM   item) TO '/tmp/sample.json';

When you run it, it will fail with the following error message:

ERROR:  must be superuser or a member of the pg_write_server_files role to COPY to a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

You can fix this problem by connecting as the postgres user and granting the privilege to the student user:

GRANT pg_write_server_files TO student;

It then runs successfully as the student user and creates the sample.json file in the /tmp directory. As always, I hope this helps those looking for a solution.

Written by maclochlainn

January 5th, 2020 at 12:24 am

PostgreSQL Creating Schema

without comments

The process of creating a schema requires you grant the CREATE ON DATABASE privilege to the user as the postgres user. You use the following syntax:

GRANT CREATE ON DATABASE videodb TO student;

As the student user, you create the app schema with the following syntax:

CREATE SCHEMA app;

Then, you can query the result as follows:

SELECT   * 
FROM     pg_catalog.pg_namespace
ORDER BY nspname;

You should see the following:

      nspname       | nspowner |               nspacl                
--------------------+----------+-------------------------------------
 app                |    16390 | 
 information_schema |       10 | {postgres=UC/postgres,=U/postgres}
 pg_catalog         |       10 | {postgres=UC/postgres,=U/postgres}
 pg_temp_1          |       10 | 
 pg_toast           |       10 | 
 pg_toast_temp_1    |       10 | 
 public             |       10 | {postgres=UC/postgres,=UC/postgres}
(7 rows)

If you create a revision_history table without a schema name, it is automatically placed in the public schema, which means an attempt to describe the table will return an error. For example, you create a revision_history table with the following command:

CREATE TABLE app.revision_history
( revision_history_id serial
, session_id          VARCHAR
, TABLE_NAME          VARCHAR
, revision_id         INTEGER );

You describe the revision_history table with the following command:

\d revision_history

It will show the following because there is no revision_history table in the public schema and the default search path only include a schema that shares the name with the student user and the public schema.

Did not find any relation named "revision_history".

You can show the search path with the following:

show search_path;

It should return the following, which is a schema that shares the user’s name and public.

   search_path   
-----------------
 "$user", public
(1 row)

You set the search path as follows:

SET search_path TO app, "$user", public;

After you set the search_path, a standard attempt to describe the table will find the table whether it is in the app or public schema. That means the following command:

\d revision_history

Shows:

                                                  Table "app.revision_history"
       Column        |       Type        | Collation | Nullable |                            Default                            
---------------------+-------------------+-----------+----------+---------------------------------------------------------------
 revision_history_id | integer           |           | not null | nextval('revision_history_revision_history_id_seq'::regclass)
 session_id          | character varying |           |          | 
 table_name          | character varying |           |          | 
 revision_id         | integer           |           |          |

As always, I hope this helps somebody looking for an answer.

Written by maclochlainn

December 10th, 2019 at 9:37 pm

PostgreSQL WITH to INSERT

without comments

This demonstrates how you insert results from a common table expression (CTE) in a leading WITH clause. I thought it would be a nice add since the existing tutorials didn’t have an example.

Create the message table, like this:

CREATE TABLE message
( message_id    SERIAL
, message_text  VARCHAR );

Now, here’s a CTE with a two fabricated rows:

WITH cte AS
( SELECT 'x-ray' AS msg
  UNION ALL
  SELECT 'MRI' AS msg )
INSERT INTO message
( message_text )
( SELECT msg FROM cte );

Then, you can query the results from the message table, like this:

SELECT * FROM message;

You should get the following results:

 message_id | message_text 
------------+--------------
          1 | x-ray
          2 | MRI
(2 rows)

Unfortunately, if you re-ran that it would insert a duplicate set of rows. You can avoid that behavior by using correlation between the subquery in the WITH clause and target table of the INSERT statement.

The following only allows you to INSERT new rows in the table.

WITH cte AS
( SELECT 'x-ray' AS msg
  UNION ALL
  SELECT 'MRI' AS msg )
INSERT INTO message
( message_text )
( SELECT msg
  FROM   cte 
  WHERE  NOT EXISTS
          (SELECT NULL
           FROM   message 
           WHERE  message.message_text = cte.msg ));

You can add revision control column to the message table to verify there’s no change with subsequent attempts. The first step requires you create an independent message_rev_id_seq sequence, like this:

CREATE SEQUENCE message_rev_id_seq
  INCREMENT 1
  MINVALUE -9223372036854775807
  MAXVALUE  9223372036854775806
  START    -9223372036854765807;

You can alter the table by using the standard rev_id column name:

ALTER TABLE message
ADD COLUMN rev_id BIGINT NOT NULL DEFAULT NEXTVAL('public.message_rev_id_seq');

The ALTER statement will automatically add the first two values of the message_rev_id_seq to the existing rows. The ALTER statement is an immediate action because all Data Definition Language (DDL) commands are 1-Phase Commit operations.

The next step is to create an update_message_rev_id function that return a trigger. You do this using PL/pgSQL but you must specify the language value explicitly when returning a trigger data type, like:

1
2
3
4
5
6
7
8
CREATE OR REPLACE FUNCTION update_message_rev_id()
RETURNS TRIGGER AS
$$
BEGIN
  NEW.rev_id = NEXTVAL(‘public.message_rev_id_seq’);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

The last step requires you create an update_message_revision trigger that calls the update_message_rev_id function, like:

CREATE TRIGGER update_message_revision BEFORE UPDATE
  ON message FOR EACH ROW
  EXECUTE PROCEDURE update_message_rev_id();

You can test it by running the correlated INSERT statement again. You’ll find that no rows are touched by the INSERT statement because the revision numbers are unchanged. You should see these two rows:

 message_id | message_text |        rev_id        
------------+--------------+----------------------
          1 | x-ray        | -9223372036854765807
          2 | MRI          | -9223372036854765806
(2 rows)

As always, I hope this helps those looking for a example of how to solve a problem.

Written by maclochlainn

December 3rd, 2019 at 10:15 pm

PostgreSQL Upsert Advanced

without comments

Nine years after writing how to use the MERGE statement in Oracle, I am writing how you implement an UPSERT statement in PostgreSQL. I wrote an initial post going over the basics of PostgreSQL’s upsert implementation of the INSERT statement with an DO UPDATE clause and a DO NOTHING clause.

I thought it was interesting that the PostgreSQL Upsert Using INSERT ON CONFLICT Statement web page didn’t cover using a subquery as the source for an INSERT statement.

Demonstration

Here are the steps to accomplish an import/upload with the COPY statement and the INSERT statement with DO UPDATE and DO NOTHING clauses.

Step #1 : Position your CSV file in the physical directory

The example uses the following upload directory:

/u01/app/postgres/upload

Put the following text in to the kingdom_import.csv file.

Narnia,77600,Peter the Magnificent,1272-03-20,1292-06-19
Narnia,77600,Edmund the Just,1272-03-20,1292-06-19
Narnia,77600,Susan the Gentle,1272-03-20,1292-06-19
Narnia,77600,Lucy the Valiant,1272-03-20,1292-06-19
Narnia,42100,Peter the Magnificent,1531-04-12,1531-05-31
Narnia,42100,Edmund the Just,1531-04-12,1531-05-31
Narnia,42100,Susan the Gentle,1531-04-12,1531-05-31
Narnia,42100,Lucy the Valiant,1531-04-12,1531-05-31
Camelot,15200,King Arthur,0631-03-10,0686-12-12
Camelot,15200,Sir Lionel,0631-03-10,0686-12-12
Camelot,15200,Sir Bors,0631-03-10,0635-12-12
Camelot,15200,Sir Bors,0640-03-10,0686-12-12
Camelot,15200,Sir Galahad,0631-03-10,0686-12-12
Camelot,15200,Sir Gawain,0631-03-10,0686-12-12
Camelot,15200,Sir Tristram,0631-03-10,0686-12-12
Camelot,15200,Sir Percival,0631-03-10,0686-12-12
Camelot,15200,Sir Lancelot,0670-09-30,0682-12-12
Camelot,15200,Modred,0681-09-30,0682-12-12

Step #2 : Run the script that creates tables and sequences

Copy the following code into a create_kingdom_knight_tables.sql file within a directory of your choice. Then, you run it as the student user. Assuming you put the code in the create_kingdom_knight_tables.sql script, you can call it like so

\i create_kingdom_knight_tables.sql

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
-- Conditionally drop three tables and sequences.
DO $$
DECLARE
  /* Declare an indefinite length string and record variable. */
  SQL  VARCHAR;
  ROW  RECORD;
 
  /* Declare a cursor. */
  table_cursor CURSOR FOR
    SELECT TABLE_NAME
    FROM   information_schema.tables
    WHERE  table_catalog = 'videodb'
    AND    table_schema = 'public'
    AND    TABLE_NAME IN ('kingdom','knight','kingdom_knight_import');
BEGIN
  /* Open the cursor. */
  OPEN table_cursor;
  LOOP
    /* Fetch table names. */
    FETCH table_cursor INTO ROW;
 
    /* Exit when no more records are found. */
    EXIT WHEN NOT FOUND;
 
    /* Concatenate together a DDL to drop the table with prejudice. */
    SQL := 'DROP TABLE IF EXISTS '||ROW.table_name||' CASCADE';
 
    /* Execute the DDL statement. */
    EXECUTE SQL;
  END LOOP;
 
  /* Close the cursor. */
  CLOSE table_cursor;
END;
$$;
 
-- Create normalized kingdom table.
CREATE TABLE kingdom
( kingdom_id    SERIAL
, kingdom_name  VARCHAR(20)
, population    INTEGER
, CONSTRAINT kingdom_uq_key
  UNIQUE ( kingdom_name
         , population ));
 
-- Create normalized knight table.
CREATE TABLE knight
( knight_id             SERIAL
, knight_name           VARCHAR(24)
, kingdom_allegiance_id INTEGER
, allegiance_start_date DATE
, allegiance_end_date   DATE
, CONSTRAINT knight_uq_key 
  UNIQUE ( knight_name
         , kingdom_allegiance_id
         , allegiance_start_date
         , allegiance_end_date ));
 
-- Create external import table.
CREATE TABLE kingdom_knight_import
( kingdom_name          VARCHAR(20)
, population            INTEGER
, knight_name           VARCHAR(24)
, allegiance_start_date DATE
, allegiance_end_date   DATE);

Step #3 : Run the COPY command.

Run the COPY command to move the data from the Comma Separated Values (CSV) file to the kingdom_knight_import table. Then, run it as the student account.

1
2
COPY kingdom_knight_import
FROM '/u01/app/postgres/upload/kingdom_import1.csv' DELIMITERS ',' CSV;

Step #4 : Create the upload_kingdom procedure

Copy the following code into a create_kingdom_knight_procedure.sql file within a directory of your choice. Assuming you put the code in the create_kingdom_knight_procedure.sql script, you can call it like so

\i create_kingdom_knight_procedure.sql

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
CREATE PROCEDURE upload_kingdom() AS 
$$
DECLARE
 
  /* Declare error handling variables. */
  err_num      TEXT;
  err_msg      INTEGER;
 
BEGIN
 
  /* Insert only unique rows. The DISTINCT operator compresses the
     result set to a unique set and avoids consuming sequence values
     for non-unique result sets. */
  INSERT INTO kingdom
  ( kingdom_name
  , population )
  ( SELECT   DISTINCT
             kki.kingdom_name
    ,        kki.population
    FROM     kingdom_knight_import kki LEFT JOIN kingdom k
    ON       kki.kingdom_name = k.kingdom_name
    AND      kki.population = k.population)
  ON CONFLICT ON CONSTRAINT kingdom_uq_key
  DO NOTHING;
 
  /* Insert only unique rows. */
  INSERT INTO knight
  ( knight_name
  , kingdom_allegiance_id
  , allegiance_start_date
  , allegiance_end_date )
  (SELECT kki.knight_name
   ,      k.kingdom_id
   ,      kki.allegiance_start_date AS start_date
   ,      kki.allegiance_end_date AS end_date
   FROM   kingdom_knight_import kki INNER JOIN kingdom k
   ON     kki.kingdom_name = k.kingdom_name
   AND    kki.population = k.population LEFT JOIN knight kn 
   ON     k.kingdom_id = kn.kingdom_allegiance_id
   AND    kki.knight_name = kn.knight_name
   AND    kki.allegiance_start_date = kn.allegiance_start_date
   AND    kki.allegiance_end_date = kn.allegiance_end_date)
  ON CONFLICT ON CONSTRAINT knight_uq_key
  DO NOTHING;
 
EXCEPTION
  WHEN OTHERS THEN
    err_num := SQLSTATE;
    err_msg := SUBSTR(SQLERRM,1,100);
    RAISE NOTICE 'Trapped Error: %', err_msg;
END
$$ LANGUAGE plpgsql;

Step #5 : Run the upload_kingdom procedure and query the results

You run the upload_kingdom procedure with the CALL statement and then query the results. Assuming you put the code in the call_kingdom_knight_procedure.sql script, you can call it like so

\i call_kingdom_knight_procedure.sql

/* Call the upload_kingdom procedure. */
CALL upload_kingdom();
 
/* Query the kingdom_knight_import table. */
SELECT   kingdom_name
,        population
,        knight_name
,        date_trunc('second',allegiance_start_date AT TIME ZONE 'MST') AS allegiance_start_date
,        date_trunc('second',allegiance_end_date AT TIME ZONE 'MST') AS allegiance_end_date
FROM     kingdom_knight_import;
 
/* Query the kingdom table. */
SELECT   *
FROM     kingdom;
 
/* Query the knight table. */
SELECT   kn.knight_id
,        kki.knight_name
,        k.kingdom_id
,        date_trunc('second',kki.allegiance_start_date AT TIME ZONE 'MST') AS start_date
,        date_trunc('second',kki.allegiance_end_date AT TIME ZONE 'MST') AS end_date
FROM     kingdom_knight_import kki INNER JOIN kingdom k
ON       kki.kingdom_name = k.kingdom_name
AND      kki.population = k.population LEFT JOIN knight kn
ON       k.kingdom_id = kn.kingdom_allegiance_id
AND      kki.knight_name = kn.knight_name
AND      kki.allegiance_start_date = kn.allegiance_start_date
AND      kki.allegiance_end_date = kn.allegiance_end_date;

It prints the following results:

CALL
 kingdom_name | population |      knight_name      | allegiance_start_date | allegiance_end_date 
--------------+------------+-----------------------+-----------------------+---------------------
 Narnia       |      77600 | Peter the Magnificent | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Edmund the Just       | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Susan the Gentle      | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Lucy the Valiant      | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      42100 | Peter the Magnificent | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Edmund the Just       | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Susan the Gentle      | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Lucy the Valiant      | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Camelot      |      15200 | King Arthur           | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Lionel            | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Bors              | 0631-03-09 23:59:56   | 0635-12-11 23:59:56
 Camelot      |      15200 | Sir Bors              | 0640-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Galahad           | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Gawain            | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Tristram          | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Percival          | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Lancelot          | 0670-09-29 23:59:56   | 0682-12-11 23:59:56
(18 rows)
 
 kingdom_id | kingdom_name | population 
------------+--------------+------------
          1 | Narnia       |      42100
          2 | Narnia       |      77600
          3 | Camelot      |      15200
(3 rows)
 
 knight_id |      knight_name      | kingdom_id |     start_date      |      end_date       
-----------+-----------------------+------------+---------------------+---------------------
         1 | Peter the Magnificent |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         2 | Edmund the Just       |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         3 | Susan the Gentle      |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         4 | Lucy the Valiant      |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         5 | Peter the Magnificent |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         6 | Edmund the Just       |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         7 | Susan the Gentle      |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         8 | Lucy the Valiant      |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         9 | King Arthur           |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        10 | Sir Lionel            |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        11 | Sir Bors              |          3 | 0631-03-09 23:59:56 | 0635-12-11 23:59:56
        12 | Sir Bors              |          3 | 0640-03-09 23:59:56 | 0686-12-11 23:59:56
        13 | Sir Galahad           |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        14 | Sir Gawain            |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        15 | Sir Tristram          |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        16 | Sir Percival          |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        17 | Sir Lancelot          |          3 | 0670-09-29 23:59:56 | 0682-12-11 23:59:56
        69 | Modred                |          3 | 0681-09-29 23:59:56 | 0682-12-11 23:59:56
(18 rows)

As always, I hope this works to help those trying to solve a similar problem.

Written by maclochlainn

November 25th, 2019 at 1:24 am

PostgreSQL Upsert Intro

with one comment

Oracle and SQL Server use the MERGE statement, MySQL uses the REPLACE INTO statement or ON DUPLICATE KEY, but PostgreSQL uses an upsert. The upsert isn’t a statement per se. It is like MySQL’s INSERT statement with the ON DUPLICATE KEY clause. PostgreSQL uses an ON CONFLICT clause in the INSERT statement and there anonymous block without the $$ delimiters.

The general behaviors of upserts is covered in the PostgreSQL Tutorial. It has the following prototype:

INSERT INTO TABLE_NAME(column_list) VALUES(value_list)
ON CONFLICT target action;

The target can be a column name, an ON CONSTRAINT constraint name, or a WHERE predicate, while the action can be DO NOTHING (or ignore) or a DO UPDATE statement. I wrote the following example to show how to leverage a unique constraint with a DO NOTHING and DO UPDATE behavior.

My example conditionally drops a table, creates a table with a unique constraint, inserts a few rows, updates with a DO UPDATE clause, updates with DO NOTHING clause, and queries the results with a bit of formatting.

  1. Conditionally drop the test table.

    /* Suppress warnings from the log file. */
    SET client_min_messages = 'error';
     
    /* Conditionally drop table. */
    DROP TABLE IF EXISTS test;

  2. Create the test table.

    /* Create a test table. */
    CREATE TABLE test
    ( test_id      SERIAL
    , first_name   VARCHAR(20)
    , middle_name  VARCHAR(20)
    , last_name    VARCHAR(20)
    , updated      INTEGER DEFAULT 0
    , update_time  TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP
    , CONSTRAINT test_uq_key UNIQUE (first_name,middle_name,last_name));

  3. Insert six rows into the test table.

    /* Insert six rows. */
    INSERT INTO test
    ( first_name, middle_name, last_name )
    VALUES
     ('Harry','James','Potter')
    ,('Ginerva','Molly','Potter')
    ,('Lily','Luna','Potter')
    ,('Albus','Severus','Potter')
    ,('James',NULL,'Potter')
    ,('Lily',NULL,'Potter');

  4. Create a five second delay.

    /* Sleep for 5 seconds. */
    DO $$
    BEGIN
      PERFORM pg_sleep(5);
    END;
    $$;

  5. Use the INSERT statement with a DO UPDATE clause that increments the updated column of the test table.

    /* Upsert on unique key constraint conflict. */
    INSERT INTO test
    ( first_name
    , middle_name
    , last_name )
    VALUES
    ('Harry'
    ,'James'
    ,'Potter')
    ON CONFLICT ON CONSTRAINT test_uq_key
    DO
      UPDATE
      SET    updated = excluded.updated + 1
      ,      update_time = CURRENT_TIMESTAMP;

  6. Use the INSERT statement with a DO NOTHING clause.

    /* Upsert on unique key constraint ignore update. */
    INSERT INTO test
    ( first_name
    , middle_name
    , last_name )
    VALUES
    ('Harry'
    ,'James'
    ,'Potter')
    ON CONFLICT ON CONSTRAINT test_uq_key
    DO NOTHING;

  7. Query the test table.

    /* Formatted query to demonstrate result of UPSERT statement. */
    SELECT   test_id
    ,        last_name || ', '
    ||       CASE
               WHEN middle_name IS NOT NULL THEN first_name || ' ' || middle_name
               ELSE first_name
             END AS full_name
    ,        updated
    ,        date_trunc('second',update_time AT TIME ZONE 'MST') AS "timestamp"
    FROM     test
    ORDER BY last_name
    ,        first_name
    ,        CASE
               WHEN middle_name IS NOT NULL THEN middle_name
               ELSE 'A'
             END;

    Display results:

     test_id |       full_name       | updated |      timestamp      
    ---------+-----------------------+---------+---------------------
           4 | Potter, Albus Severus |       0 | 2019-11-24 19:23:10
           2 | Potter, Ginerva Molly |       0 | 2019-11-24 19:23:10
           1 | Potter, Harry James   |       1 | 2019-11-24 19:23:15
           5 | Potter, James         |       0 | 2019-11-24 19:23:10
           6 | Potter, Lily          |       0 | 2019-11-24 19:23:10
           3 | Potter, Lily Luna     |       0 | 2019-11-24 19:23:10
    (6 rows)

As always, I hope this helps those looking for clear examples to solve problems.

Written by maclochlainn

November 24th, 2019 at 7:26 pm

Postgres Reading Files

without comments

I’m working on items for migrating my database class from Oracle to PostgreSQL. I ran into an interesting limitation when I tried using the COPY command to read an external CSV file.

I had prepared the system by creating a new directory hierarchy owned by the postgres user on top of a /u01/app mount point. I set the ownership of the directories and files with the following command from the /u01/app mount point:

chown -R postgres:postgres postgres

After running the following command:

COPY transaction_upload
FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;

The command raised the following error:

COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;
ERROR:  must be superuser or a member of the <code>pg_read_server_files</code> role to COPY from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

The two options for fixing the problem are: Changing the student user to a superuser, and granting the pg_read_server_files role to the student user. Changing the student user to a superuser isn’t really a practical option. So, I connected as the postgres superuser and granted the pg_read_server_files role to the student user. It is a system level role and therefore doesn’t limit the role to only the videodb database.

As the postgres user, type the following command to grant the pg_read_server_files role to the system user:

GRANT pg_read_server_files TO student;

After granting the role to the student user, I created a small test case. The test table definition is:

CREATE TABLE test
( id          INTEGER
, first_name  VARCHAR(20)
, last_name   VARCHAR(20));

I created a test.csv file in the /u01/app/upload/postgres directory, like:

1,Simon,Bolivar
2,Peter,Davenport
3,Michael,Swan

The test.csv file requires the following permissions and ownerships:

-rw-r--r--. 1 postgres postgres 49 Nov 13 10:56 test.csv

The permissions are user read-write, groups read, and others read. The ownership should be granted to postgres and the primary group for the postgres user, which should also be postgres.

You can then connect to psql as the student user with the database set to videodb and run the following copy command:

COPY test
FROM '/u01/app/upload/postgres/test.csv' DELIMITERS ',' CSV;

If you put a comma at the end of each line, like you would do in MySQL, it raises an error. The trailing comma raises the following error:

ERROR:  extra data after last expected column

If you forget a delimiting commas somewhere on a line, the copy command raises the following error:

ERROR:  missing data for column "last_name"
CONTEXT:  COPY tester, line 3: "3,Michael Swan"

The error points to the column after the missing column. The context points to the line number while displaying the text.

You should take careful note that the copy command is an appending command. If you run it a second time, you insert a duplicate set of values in the target table.

After experimenting, its time to fix my student instance. The transaction_upload_mysql.csv file has two critical errors that need to be fixed. They are:

  1. A comma terminates each line, which would raise an extra data after last expected column error.
  2. A comma terminates each line followed by some indefinite amount of whitespace, which would also raise an extra data after last expected column error.

Since I have students with little expertise in Unix or Linux commands, I must provide a single command that they can use to convert the file with problems to one without problems. However, they should copy the transaction_upload_mysql.csv file to ensure they don’t disable the equivalent functionality for the MySQL solution space.

They should copy two files as the root user from the mysql directory to the postgres directory, as follows:

cp /u01/app/mysql/upload/transaction_upload_mysql.csv /u01/app/postgres/upload/transaction_upload_postgres.csv
cp /u01/app/mysql/upload/transaction_upload2_mysql.csv /u01/app/postgres/upload/transaction_upload2_postgres.csv

As the root user in the /u01/app/upload/postgres directory, run the following command:

cat transaction_upload_postgres.csv | sed -e 's/\,$//g' > x; cat x | sed -e 's/\,[[:space:]]*$//g' > y; mv y transaction_upload_postgres.csv; rm x

Please check the file permissions and ownerships with the ll (long list) command. If the file isn’t like this:

-rw-r--r--. 1 postgres postgres 49 Nov 13 10:56 transaction_upload_postgres.csv

Then, they should be able to change it as the root user with these commands:

chown postgres:postgres transaction_upload_postgres.csv
chmod 544 transaction_upload_postgres.csv

Lastly, they should connect to the psql as the student user, using the videodb database and run the following command:

COPY transaction_upload
FROM '/u01/app/postgres/upload/transaction_upload_postgres.csv' DELIMITERS ',' CSV;

A query of the import table with this:

SELECT COUNT(*) FROM transaction_upload;

should return:

 count 
-------
 11520
(1 row)

As always, I hope this helps those looking for some explanation and example on the copy feature of PostgreSQL.

Written by maclochlainn

November 13th, 2019 at 7:38 am

PostgreSQL and LPAD

without comments

While porting my Oracle code to PostgreSQL I encountered a little quirk. It’s probably not a quirk except for the fact that I’ve worked in Oracle so long. Oracle implicitly type casts so well that we seldom notice.

PostreSQL doesn’t work like Oracle. It does type cast sometimes but not very often. I tried porting the following segment from my Oracle stored procedure to PostgreSQL:

/* Add account number with zeros. */
FOR j IN 1..50 LOOP
  INSERT INTO account_list
  VALUES
  ( lv_airport_code||'-'||LPAD(j,6,'0')
  , NULL
  , NULL
  , 1002
  , 1002 );
END LOOP;

Oracle implicitly casts the integer j to a text string before running the LPAD function. In PostgreSQL, it doesn’t cast the integer to a text string before calling the LPAD function. Since the LPAD function is not overloaded, calling it with an integer, integer, and text set of parameters fails at runtime but the code doesn’t raise an exception when compiling it as a stored procedure.

At runtime, it raises the following error:

ERROR:  invalid INPUT syntax FOR INTEGER: "function lpad(integer, integer, unknown) does not exist"
CONTEXT:  PL/pgSQL FUNCTION seed_account_list() line 48 at assignment

You can fix the code by explicitly casting the LPAD function’s first parameter to a text string. You do that as follows below:

/* Add account number with zeros. */
FOR j IN 1..50 LOOP
  INSERT INTO account_list
  VALUES
  ( lv_airport_code||'-'||LPAD(j::text,6,'0')
  , NULL
  , NULL
  , 1002
  , 1002 );
END LOOP;

This fixes the casting problem. As always, I hope this helps those looking for a solution.

Written by maclochlainn

November 12th, 2019 at 4:08 pm

Postgres Overloaded Routines

without comments

Earlier I showed how to write an anonymous block in PostgreSQL PL/pgSQL to drop routines, like functions and procedures. However, it would only work when they’re not overloaded functions or procedures. The following lets you drop all routines, including overloaded functions and procedures. Overloaded procedures are those that share the same name but have different parameter lists.

Before you can test the anonymous block, you need to create a set of overloaded functions or procedures. You can create a set of overloaded hello procedures with the following syntax:

CREATE FUNCTION hello()
RETURNS text AS
$$
DECLARE
  output  VARCHAR;
BEGIN
  SELECT 'Hello World!' INTO output;
  RETURN output;
END
$$ LANGUAGE plpgsql;
 
CREATE FUNCTION hello(whom text)
RETURNS text AS
$$
DECLARE
  output  VARCHAR;
BEGIN
  SELECT CONCAT('Hello ',whom,'!') INTO output;
  RETURN output;
END
$$ LANGUAGE plpgsql;
 
CREATE FUNCTION hello(id int, whom text)
RETURNS text AS
$$
DECLARE
  output  VARCHAR;
BEGIN
  SELECT CONCAT('[',id,'] Hello ',whom,'!') INTO output;
  RETURN output;
END
$$ LANGUAGE plpgsql;

You can test the overloaded hello function, like so from the videodb schema:

videodb=> SELECT hello();
    hello     
--------------
 Hello World!
(1 ROW)
 
videodb=> SELECT hello('Captain Marvel');
         hello         
-----------------------
 Hello Captain Marvel!
(1 ROW)
 
videodb=> SELECT hello(1,'Captain America');
           hello            
----------------------------
 [1] Hello Captain America!
(1 ROW)

Then, you can query the information_schema to verify that you’ve created a set of overloaded procedures with the following query:

SELECT   proc.specific_schema AS procedure_schema
,        proc.specific_name
,        proc.routine_name AS procedure_name
,        proc.external_language
,        args.parameter_name
,        args.parameter_mode
,        args.data_type
FROM     information_schema.routines proc left join information_schema.parameters args
ON       proc.specific_schema = args.specific_schema
AND      proc.specific_name = args.specific_name
WHERE    proc.routine_schema NOT IN ('pg_catalog', 'information_schema')
AND      proc.routine_type IN ('FUNCTION','PROCEDURE')
ORDER BY procedure_schema
,        specific_name
,        procedure_name
,        args.ordinal_position;

It should return the following:

 procedure_schema | specific_name | procedure_name | external_language | parameter_name | parameter_mode | data_type 
------------------+---------------+----------------+-------------------+----------------+----------------+-----------
 public           | hello_35451   | hello          | PLPGSQL           |                |                | 
 public           | hello_35452   | hello          | PLPGSQL           | whom           | IN             | text
 public           | hello_35453   | hello          | PLPGSQL           | id             | IN             | integer
 public           | hello_35453   | hello          | PLPGSQL           | whom           | IN             | text
(4 rows)

The set session command maps the videodb catalog for the following anonymous block program.

SET SESSION "videodb.catalog_name" = 'videodb';

The following anonymous block lets you get rid of any ordinary or overloaded function and procedure:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
DO $$
DECLARE
  /* Declare an indefinite length string for SQL statement. */
  sql  VARCHAR;
 
  /* Declare variables to manage cursor return values. */
  row  RECORD;
  arg  VARCHAR;
 
  /* Declare parameter list. */
  list VARCHAR;
 
  /* Declare a routine cursor. */
  routine_cursor CURSOR FOR
    SELECT routine_name
    ,      specific_name
    ,      routine_type
    FROM   information_schema.routines
    WHERE  specific_catalog = current_setting('videodb.catalog_name')
    AND    routine_schema = 'public';
 
  /* Declare a parameter cursor. */
  parameter_cursor CURSOR (cv_specific_name varchar) FOR
    SELECT args.data_type
    FROM   information_schema.parameters args
    WHERE  args.specific_schema = 'public'
    AND    args.specific_name = cv_specific_name;
 
BEGIN
  /* Open the cursor. */
  OPEN routine_cursor;
  <<row_loop>>
  LOOP
    /* Fetch table names. */
    FETCH routine_cursor INTO row;
 
    /* Exit when no more records are found. */
    EXIT row_loop WHEN NOT FOUND;
 
    /* Initialize parameter list. */
    list := '(';
 
    /* Open the parameter cursor. */
    OPEN parameter_cursor(row.specific_name::varchar);
    <<parameter_loop>>
    LOOP
      FETCH parameter_cursor INTO arg;
 
      /* Exit the parameter loop. */
      EXIT parameter_loop WHEN NOT FOUND;
 
      /* Add parameter and delimit more than one parameter with a comma. */
      IF LENGTH(list) > 1 THEN
        list := CONCAT(list,',',arg);
      ELSE
        list := CONCAT(list,arg);
      END IF;
    END LOOP;
 
    /* Close the parameter list. */
    list := CONCAT(list,')');
 
    /* Close the parameter cursor. */
    CLOSE parameter_cursor;
 
    /* Concatenate together a DDL to drop the table with prejudice. */
    sql := 'DROP '||row.routine_type||' IF EXISTS '||row.routine_name||list;
 
    /* Execute the DDL statement. */
    EXECUTE sql;
  END LOOP;
 
  /* Close the routine_cursor. */
  CLOSE routine_cursor;
END;
$$;

Now, you possess the magic to automate cleaning up your schema when you combine this with my earlier post on dynamically dropping tables, sequences, and triggers.

Written by maclochlainn

November 5th, 2019 at 12:11 am

DBeaver for PostgreSQL

without comments

I’m migrating my database classes from the Oracle database to the PostgreSQL database. Using the Oracle Express Edition has always required a virtualized image because students use Windows and Mac OS. Also, the university doesn’t like my use of a virtualized image. Virtualization imposes incremental cost on students to have high end laptops.

The available Docker images don’t typically support the Oracle Express Edition. That means there are licensing implications tied to Oracle.

As a committee, we decided to use PostgreSQL as our new database platform. We opted to use PostgreSQL over MySQL because it supports arrays and stored procedures. PostgreSQL PL/pgSQL are a solid implementation of stored procedures. While MySQL supports a stored procedure language, it effectively discourages using it.

We use sqlplus, mysql, or psql Command-Line Interface (CLI) as the primary interface. That’s because CLI supports production-oriented deployment and DevOps practices. The Open Source DBeaver project provides a clean native GUI management console for PostgreSQL. DBeaver also supports other databases, which pgAdmin doesn’t.

You click the Download button from the DBeaver home page. It support Windows, Mac OS, and Linux distributions. I downloaded the Linux RPM package to my Downloads directory. You can run the RPM with the following syntax:

rpm -ivh dbeaver-ce-6.2.3-stable.x86_64.rpm

It doesn’t check for dependencies. The installation is quick. You can verify that it’s installed with the following command:

which dbeaver

It installs here:

/usr/bin/dbeaver

If you failed to install the PostgreSQL JAR file, you’ll see the following error message:

It will look like the following on Linux:

You should have a user in the pg_hba.conf file, like the following:

host    all             all             127.0.0.1/32            trust

As always, I hope this helps those solving problems.

Written by maclochlainn

November 3rd, 2019 at 5:06 pm