Sql | MacLochlainns Weblog

Archive for the ‘sql’ Category

Selective Aggregation

Learning Outcomes

Learn how to combine CASE operators and aggregation functions.
Learn how to selectively aggregate values.
Learn how to use SQL to format report output.

Selective aggregation is the combination of the CASE operator and aggregation functions. An aggregation function counts, sums, or averages the values that it finds; and when you embed the result of a CASE operator inside an aggregation function, you get a selective result. The selectivity is determined by the WHEN clause of the CASE operator, which is more or less like an IF statement in an imperative programming language.

The prototype for selective aggregation is illustrated with a SUM function below:

SELECT   SUM(CASE
               WHEN left_operand = right_operand THEN result
               WHEN left_operand > right_operand THEN result
               WHEN left_operand IN (set of comma-delimited values) THEN result
               WHEN left_operand IN (query of results) THEN result
               ELSE alt_result
             END) AS selective_aggregate
FROM     some_table;

A small example lets you see how selective aggregation works. You create a PAYMENT table and PAYMENT_S sequence for this example, as follows:

-- Create a PAYMENT table.
CREATE TABLE payment
( payment_id     NUMBER
, payment_date   DATE	      CONSTRAINT nn_payment_1 NOT NULL
, payment_amount NUMBER(20,2) CONSTRAINT nn_payment_2 NOT NULL
, CONSTRAINT pk_payment PRIMARY KEY (payment_id));
 
-- Create a PAYMENT_S sequence.
CREATE SEQUENCE payment_s;

After you create the table and sequence, you should insert some data. You can use the script below or insert your own values; either way, insert enough rows to make the aggregation meaningful.

You can populate data with the following anonymous PL/SQL block, which creates 10,000 random rows in the payment table. Please note that you will get different payment dates and amounts each time you run the script.

DECLARE
  -- Create local collection data types.
  TYPE pmtval IS TABLE OF NUMBER(20,2);
  TYPE smonth IS TABLE OF VARCHAR2(3);
 
  -- Create variable to hold the list of payments.
  payments PMTVAL := pmtval();
 
  -- Declare month arrays.
  short_month SMONTH := smonth('JAN','FEB','MAR','APR','MAY','JUN'
                              ,'JUL','AUG','SEP','OCT','NOV','DEC');
 
  -- Declare variable values.
  month      VARCHAR2(3);
  year       NUMBER := 2019;
  pmt_date   DATE; 
  tpmt_date  VARCHAR2(11);
 
  -- Declare default number of random payments.
  payment_number  NUMBER := 10000;
BEGIN
  -- Populate payment list. 
  FOR i IN 1..payment_number LOOP
    payments.EXTEND;
    SELECT   ROUND(dbms_random.value() * 1000,0)
    ||       '.'
    ||       ROUND(dbms_random.value() * 100,0)
    INTO     payments(payments.COUNT)
    FROM     dual;
  END LOOP;
 
  -- Create and populate payment date and amount.
  FOR i IN 1..payment_number LOOP
    -- Assign random month value.
    month := short_month(dbms_random.value(1,short_month.COUNT));
 
    -- Assign random day of the month value and assemble random date.
    -- TO_DATE with an explicit mask avoids relying on the session's NLS_DATE_FORMAT.
    IF month IN ('JAN','MAR','MAY','JUL','AUG','OCT','DEC') THEN
      pmt_date := TO_DATE(ROUND(dbms_random.value(1,31),0) || '-' || month || '-' || year,'DD-MON-YYYY');
    ELSIF month IN ('APR','JUN','SEP','NOV') THEN
      pmt_date := TO_DATE(ROUND(dbms_random.value(1,30),0) || '-' || month || '-' || year,'DD-MON-YYYY');
    ELSE
      pmt_date := TO_DATE(ROUND(dbms_random.value(1,28),0) || '-' || month || '-' || year,'DD-MON-YYYY');
    END IF;
 
    -- Insert values into the PAYMENT table.
    INSERT INTO payment
    ( payment_id, payment_date, payment_amount )
    VALUES
    ( payment_s.NEXTVAL, pmt_date, payments(i));
  END LOOP;
 
  -- Commit the writes.
  COMMIT;
END;
/

After inserting 10,000 rows, you can get an unformatted total with the following query:

-- Query total amount.
SELECT   SUM(payment_amount) AS payment_total
FROM     payment;

It outputs the following:

PAYMENT_TOTAL
-------------
   5011091.75

You can nest the result inside the TO_CHAR function to format the output, like

-- Query total formatted amount.
SELECT   TO_CHAR(SUM(payment_amount),'999,999,999.00') AS payment_total
FROM     payment;

It outputs the following:

PAYMENT_TOTAL
---------------
   5,011,091.75

Somebody may suggest that you use the PIVOT clause to rotate the data into a summary by month, but the PIVOT clause has limits. In the query below, the pivoting key is the numeric month value, and the column headings use only those numeric values unless you alias each value in the IN list.

-- Pivoted summaries by numeric monthly value.
SELECT   *
FROM    (SELECT EXTRACT(MONTH FROM payment_date) payment_month
         ,      payment_amount
         FROM   payment)
         PIVOT (SUM(payment_amount) FOR payment_month IN
                 (1,2,3,4,5,6,7,8,9,10,11,12));

It outputs the following:

         1          2          3          4          5          6          7          8          9         10         11         12
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
 245896.55  430552.36  443742.63  457860.27  470467.18  466370.71  415158.28  439898.72  458998.09  461378.56  474499.22  246269.18

You can use selective aggregation to get the results by a character label, like

SELECT   SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 1
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "JAN"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 2
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "FEB"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 3
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "MAR"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) IN (1,2,3)
             AND  EXTRACT(YEAR FROM payment_date) = 2019 THEN payment_amount
           END) AS "1FQ"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 4
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "APR"
FROM     payment;

It outputs the following:

       JAN        FEB        MAR        1FQ        APR
---------- ---------- ---------- ---------- ----------
 245896.55  430552.36  443742.63 1120191.54  457860.27

You can format the output with a combination of the TO_CHAR and LPAD functions. The TO_CHAR allows you to add a formatting mask, complete with commas and two mandatory digits to the right of the decimal point. The reformatted query looks like

COL JAN FORMAT A13 HEADING "Jan"
COL FEB FORMAT A13 HEADING "Feb"
COL MAR FORMAT A13 HEADING "Mar"
COL 1FQ FORMAT A13 HEADING "1FQ"
COL APR FORMAT A13 HEADING "Apr"
SELECT   LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 1
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "JAN"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 2
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "FEB"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 3
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "MAR"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) IN (1,2,3)
             AND  EXTRACT(YEAR FROM payment_date) = 2019 THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "1FQ"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 4
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "APR"
FROM     payment;

It displays the formatted output:

Jan           Feb           Mar           1FQ           Apr
------------- ------------- ------------- ------------- -------------
   245,896.55    430,552.36    443,742.63  1,120,191.54    457,860.27
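
The same pattern should work with other aggregation functions. The following is a minimal sketch, assuming the payment table created above; the jan_count, jan_average, and q1_largest aliases are made up for illustration. Because the CASE operator has no ELSE clause, non-matching rows return a null, and the COUNT, AVG, and MAX functions ignore nulls:

```sql
-- Selective COUNT, AVG, and MAX against the PAYMENT table.
SELECT   COUNT(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 1
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_id
           END) AS jan_count
,        ROUND(AVG(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 1
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),2) AS jan_average
,        MAX(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) IN (1,2,3)
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS q1_largest
FROM     payment;
```

The values you see depend on the random rows generated by the anonymous block.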

Written by maclochlainn

April 5th, 2022 at 1:40 pm

Posted in Database,Database Design,DBA,Linux,Oracle,Oracle 12c,Oracle 18c,Oracle 21c,Oracle DBA,Oracle Developer,sql,Unix,Windows OS

Tagged with Oracle DBA, Oracle Developer

INSERT Statement

Learning Outcomes

Learn how to use positional- and named-notation in INSERT statements.
Learn how to use the VALUES clause in INSERT statements.
Learn how to use subqueries in INSERT statements.

The INSERT statement lets you enter data into tables and views in two ways: via an INSERT statement with a VALUES clause and via an INSERT statement with a query. The VALUES clause takes a list of literal values (strings, numbers, and dates represented as strings), expression values (return values from functions), or variable values.

Query values are results from SELECT statements that are subqueries (covered earlier in this appendix). INSERT statements work with scalar, single-row, and multiple-row subqueries. The list of columns in the VALUES clause or SELECT clause of a query (a SELECT list) must map to the positional list of columns that defines the table. That list is found in the data dictionary or catalog. As an alternative to the positional list from the data catalog, you can provide a named list of those columns. The named list overrides the positional (or default) order from the data catalog and must provide at least all mandatory columns in the table definition. Mandatory columns are those that have a NOT NULL constraint.

Oracle databases differ from other databases in how they implement the INSERT statement. Through Oracle Database 21c, Oracle doesn’t support multiple-row inserts with a VALUES clause. Oracle does support default and override signatures as qualified in the ANSI SQL standards. Oracle also provides a multiple-table INSERT statement. This section covers how you enter data with an INSERT statement that is based on a VALUES clause or a subquery result statement. It also covers multiple-table INSERT statements.

The INSERT statement has one significant limitation: its default signature. The default signature is the list of columns that defines the table in the data catalog. The list is defined by the position and data type of columns. The CREATE statement defines the initial default signature, and the ALTER statement can change the number, data types, or ordering of columns in the default signature.

The default prototype for an INSERT statement allows for an optional column list that overrides the default list of columns. When you provide the column list you choose to implement named-notation, which is the right way to do it. Relying on the insertion order of the columns is a bad idea. An INSERT statement without a list of column names is a position-notation statement. Position-notation is bad because somebody can alter that order and previously written INSERT statements will break or put data in the wrong columns.

Like methods in OOPLs, an INSERT statement without the optional column list constructs an instance (or row) of the table using the default constructor. The override constructor for a row is defined by any INSERT statement when you provide an optional column list. That’s because it overrides the default constructor.
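
The difference between the two notations can be sketched with a small example. The member_demo table and its columns are hypothetical names used only for this illustration:

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE member_demo
( member_id    NUMBER        CONSTRAINT pk_member_demo PRIMARY KEY
, member_name  VARCHAR2(30)  CONSTRAINT nn_member_demo_1 NOT NULL
, member_note  VARCHAR2(60));

-- Position-notation: values must follow the catalog order of the columns.
INSERT INTO member_demo
VALUES
( 1, 'Harry Potter', 'Inserted by position');

-- Named-notation: the column list sets the order and may skip optional columns.
INSERT INTO member_demo
( member_name, member_id )
VALUES
('Ginny Potter', 2 );
```

The second statement keeps working if somebody later adds or reorders columns, provided the new columns are optional.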

The generic prototype for an INSERT statement is confusing when it tries to capture both the VALUES clause and the result set from a query. Therefore, I’ve opted to provide two generic prototypes.

Insert by value

The first uses the VALUES clause:

INSERT
INTO table_name
[( column1, column2, column3, ...)]
VALUES
( value1, value2, value3, ...);

Notice that the prototype for an INSERT statement with the result set from a query doesn’t use the VALUES clause at all. A parsing error occurs when the VALUES clause and query both occur in an INSERT statement.

The second prototype uses a query and excludes the VALUES clause. The subquery may return one to many rows of data. The operative rule is that all columns in the query return the same number of rows of data, because query results should be rectangles—rectangles made up of one to many rows of columns.

Insert by subquery

Here’s the prototype for an INSERT statement that uses a subquery:

INSERT
INTO table_name
[( column1, column2, column3, ...)]
( SELECT value1, value2, value3, ... FROM table_name WHERE ...);

A query, or SELECT statement, returns a SELECT list. The SELECT list is the list of columns, and it’s evaluated by position and data type. The SELECT list must match the definition of the table or the override signature provided.
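
As a minimal sketch of the subquery prototype, the following copies two of three rows from one table to another. The source_demo and target_demo tables are hypothetical and exist only for this illustration:

```sql
-- Hypothetical source and target tables for this illustration.
CREATE TABLE source_demo
( source_id    NUMBER
, source_name  VARCHAR2(30));

CREATE TABLE target_demo
( target_id    NUMBER
, target_name  VARCHAR2(30)
, created_on   DATE);

INSERT INTO source_demo VALUES (1,'One');
INSERT INTO source_demo VALUES (2,'Two');
INSERT INTO source_demo VALUES (3,'Three');

-- The SELECT list maps by position and data type to the override signature.
INSERT INTO target_demo
( target_id, target_name, created_on )
( SELECT source_id, source_name, SYSDATE
  FROM   source_demo
  WHERE  source_id > 1 );
```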

Default signatures present a risk of data corruption through insertion anomalies, which occur when you enter bad data in tables. Mistakes transposing or misplacing values can occur more frequently with a default signature, because the underlying table structure can change. As a best practice, always use named notation by providing the optional list of columns; this should help you avoid putting the right data in the wrong place.

The following subsections provide examples that use the default and override syntax for INSERT statements in Oracle databases. The subsections also cover multiple-table INSERT statements and a RETURNING INTO clause, which is an extension of the ANSI SQL standard. Oracle uses the RETURNING INTO clause to manage large objects, to return autogenerated identity column values, and to support some of the features of Oracle’s dynamic SQL. Note that Oracle also supports a bulk INSERT statement, which requires knowledge of PL/SQL.

Insert by Values

An INSERT statement with a VALUES clause can only insert one row at a time in an Oracle database. Other databases, like Microsoft SQL Server and MySQL, allow you to insert a comma-delimited set of rows with a single VALUES clause. Oracle supports single-row inserts with a VALUES clause and multiple-row inserts with a subquery.

Inserting by the VALUES clause is the most common type of INSERT statement. It’s most useful when interacting with single-row inserts.

You typically use this type of INSERT statement when working with data entered through end-user web forms. In some cases, users can enter more than one row of data using a form, which occurs, for example, when a user places a meal order in a restaurant and the meal and drink are treated as order items. The restaurant order entry system would enter a single row in the order table and two rows in the order_item table (one for the meal and the other for the drink). PL/SQL programmers typically handle the insertion of related rows inside a loop structure when they use dynamic INSERT statements. Dynamic inserts are typically performed using NDS (Native Dynamic SQL) statements.

Oracle supports only a single-row insert through the VALUES clause. Multiple-row inserts require an INSERT statement from a query.
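
One way to insert the meal and drink rows from the restaurant example in a single statement is a subquery that unions rows selected from the dual table. This is only a sketch; the order_item_demo table, its columns, and the literal key values are hypothetical:

```sql
-- Hypothetical table for this illustration.
CREATE TABLE order_item_demo
( order_item_id  NUMBER
, order_id       NUMBER
, item_name      VARCHAR2(20));

-- One INSERT statement adds both order items through a subquery.
INSERT INTO order_item_demo
( order_item_id, order_id, item_name )
( SELECT 1, 100, 'Meal'  FROM dual
  UNION ALL
  SELECT 2, 100, 'Drink' FROM dual );
```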

The VALUES clause of an INSERT statement accepts scalar values, such as strings, numbers, and dates. It also accepts calls to arrays, lists, or user-defined object types, which are called flattened objects. Oracle supports VARRAY as arrays and nested tables as lists. They can both contain elements of a scalar data type or user-defined object type.

The following sections discuss how you use the VALUES clause with scalar data types, how you convert various data types, and how you use the VALUES clause with nested tables and user-defined object data types.

Inserting Scalar Data Types

This section shows you how to INSERT scalar values into tables.

The basic syntax for an INSERT statement with a VALUES clause can include an optional override signature between the table name and VALUES keyword. With an override signature, you designate the column names and the order of entry for the VALUES clause elements. Without an override signature, the INSERT statement checks the definition of the table in the database catalog. The positional order of the columns in the data catalog defines the positional, or default, signature for the INSERT statement. As shown previously, you can discover the structure of a table in Oracle with the DESCRIBE command issued at the SQL*Plus command line:

DESCRIBE table_name

You’ll see the following after describing the rental table in SQL*Plus:

Name                                 Null?    Type
------------------------------------ -------- --------
RENTAL_ID                            NOT NULL NUMBER
CUSTOMER_ID                          NOT NULL NUMBER
CHECK_OUT_DATE                       NOT NULL DATE
RETURN_DATE                                   DATE
CREATED_BY                           NOT NULL NUMBER
CREATION_DATE                        NOT NULL DATE
LAST_UPDATED_BY                      NOT NULL NUMBER
LAST_UPDATE_DATE                     NOT NULL DATE

The rental_id column is a surrogate key, or an artificial numbering sequence. The combination of the customer_id and check_out_date columns serves as a natural key because a DATE data type is a date-time value. If it were only a date, the customer would be limited to a single entry for each day, and limiting customer rentals to one per day isn’t a good business model.

The basic INSERT statement would require that you look up the next sequence value before using it. You should also look up the surrogate key column value that maps to the row where your unique customer is stored in the contact table. For this example, assume the following facts:

Next sequence value is 1086
Customer’s surrogate key value is 1009
Current date-time is represented by the value from the SYSDATE function
Return date is the fifth day from today
User adding and updating the row has a primary (surrogate) key value of 1
Creation and last update date are the value returned from the SYSDATE function.

An INSERT statement must include a list of values that match the positional data types of the database catalog, or it must use an override signature for all mandatory columns.

You can now write the following INSERT statement, which relies on the default signature:

SQL> INSERT INTO rental
  2  VALUES
  3  ( 1086
  4  , 1009
  5  , SYSDATE
  6  , TRUNC(SYSDATE + 5)
  7  , 1
  8  , SYSDATE
  9  , 1
 10  , SYSDATE);

If you weren’t using SYSDATE for the date-time value on line 5, you could manually enter a date-time with the following Oracle proprietary syntax:

  5  , TO_DATE('15-APR-2011 12:53:01','DD-MON-YYYY HH24:MI:SS')

The TO_DATE function is an Oracle-specific function. The generic conversion function would be the CAST function. The problem with a CAST function by itself is that it can’t handle a format mask other than the database defaults ('DD-MON-RR' or 'DD-MON-YYYY'). For example, consider this syntax:

  5  , CAST('15-APR-2011 12:53:02' AS DATE)

It raises the following error:

  5  ,  CAST('15-APR-2011 12:53:02' AS DATE) FROM dual
        *
ERROR at line 1:
ORA-01830: date format picture ends before converting entire input string

You actually need to double cast this type of format mask when you want to store it as a DATE data type. The working syntax casts the date-time string as a TIMESTAMP data type before recasting the TIMESTAMP to a DATE, like

  5  ,  CAST(CAST('15-APR-2011 12:53:02' AS TIMESTAMP) AS DATE)

Before you could write the preceding INSERT statement, you would need to run some queries to find the values. You would secure the next value from a rental_s1 sequence in an Oracle database with the following command:

SQL> SELECT   rental_s1.NEXTVAL FROM dual;

This assumes two things, because sequences are separate objects from tables. First, the values in a table’s surrogate key column must come from the correct sequence. Second, a sequence value is inserted only once into a table as a primary key value.

In place of a query that finds the next sequence value, you would simply use a call against the .nextval pseudocolumn in the VALUES clause. You would replace line 3 with this:

  3  ( rental_s1.NEXTVAL

The .nextval is a pseudocolumn, and it instantiates an instance of a sequence in the current session. After a call to a sequence with the .nextval pseudocolumn, you can also call back the prior sequence value with the .currval pseudocolumn.

Assuming the following query would return a single row, you can use the contact_id value as the customer_id value in the rental table:

SQL> SELECT   contact_id
  2  FROM     contact
  3  WHERE    last_name = 'Potter'
  4  AND      first_name = 'Harry';

Taking three steps like this is unnecessary, however, because you can call the next sequence value and find the valid customer_id value inside the VALUES clause of the INSERT statement. The following INSERT statement uses an override signature and calls for the next sequence value on line 11. It also uses a scalar subquery on lines 12 through 15 to look up the correct customer_id value.

SQL> INSERT INTO rental
  2  ( rental_id
  3  , customer_id
  4  , check_out_date
  5  , return_date
  6  , created_by
  7  , creation_date
  8  , last_updated_by
  9  , last_update_date )
 10  VALUES
 11  ( rental_s1.NEXTVAL
 12  ,(SELECT   contact_id
 13    FROM     contact
 14    WHERE    last_name = 'Potter'
 15    AND      first_name = 'Harry')
 16  , SYSDATE
 17  , TRUNC(SYSDATE + 5)
 18  , 1
 19  , SYSDATE
 20  , 1
 21  , SYSDATE);

When a subquery returns two or more rows because the conditions in the WHERE clause failed to find and return a unique row, the INSERT statement would fail with the following message:

,(SELECT   contact_id
  *
ERROR at line 3:
ORA-01427: single-row subquery returns more than one row

In fact, the statement could fail when there are two or more “Harry Potter” names in the data set because three columns make up the natural key of the contact table. The third column is the member_id, and all three should be qualified inside a scalar subquery to guarantee that it returns only one row of data.
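
A scalar subquery that qualifies all three natural key columns might look like the following sketch. The member_id value of 1001 is a hypothetical value chosen for illustration:

```sql
-- The member_id value 1001 is a hypothetical value for illustration.
SELECT   contact_id
FROM     contact
WHERE    last_name = 'Potter'
AND      first_name = 'Harry'
AND      member_id = 1001;
```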

Handling Oracle’s Large Objects

This section shows you how to INSERT large object values into tables.

Oracle’s large objects present a small problem when they’re not null constrained in the table definition. You must insert empty object containers or references when you perform an INSERT statement.

Assume, for example, that you have the following three large object columns in a table:

Name                             Null?    Type
 ------------------------------- -------- -----------------------
ITEM_DESC                        NOT NULL CLOB
ITEM_ICON                        NOT NULL BLOB
ITEM_PHOTO                                BINARY FILE LOB

The item_desc column uses a CLOB (Character Large Object) data type, and it is a required column; it could hold a lengthy description of a movie, for example. The item_icon column uses a BLOB (Binary Large Object) data type, and it is also a required column. It could hold a graphic image. The item_photo column uses a binary file (an externally managed file) but is fortunately null allowed or an optional column in any INSERT statement. It can hold a null or a reference to an external graphic image.

Oracle provides two functions that let you enter an empty large object, and they are:

EMPTY_BLOB()
EMPTY_CLOB()

Although you could insert a null value in the item_photo column, you can also enter a reference to an Oracle database virtual directory file. Here’s the syntax to enter a valid BFILE name with the BFILENAME function call:

 10  , BFILENAME('VIRTUAL_DIRECTORY_NAME', 'file_name.png')

You can insert a large character or binary stream into BLOB and CLOB data types by using the stored procedures and functions available in the dbms_lob package. Chapter 13 covers the dbms_lob package.

You can use an empty_clob function or a string literal up to 32,767 bytes long in a VALUES clause. You must use the dbms_lob package when you insert a string that is longer than 32,767 bytes. That also changes the nature of the INSERT statement and requires that you append the RETURNING INTO clause. Here’s the prototype for this Oracle proprietary syntax:

INSERT INTO some_table
[( column1, column2, column3, ...)]
VALUES
( value1, value2, value3, ...)
RETURNING column1 INTO local_variable;

The local_variable is a reference to a variable in a procedural programming language. It lets you insert a character stream into a target CLOB column or insert a binary stream into a BLOB column.
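
The RETURNING INTO prototype can be sketched in an anonymous PL/SQL block. The item_demo table, its columns, and the lv_desc and lv_text variables are hypothetical names used only for this illustration; the block inserts an empty CLOB, returns its locator into a local variable, and writes text through that locator with the dbms_lob.writeappend procedure:

```sql
-- Hypothetical table for this illustration.
CREATE TABLE item_demo
( item_id    NUMBER
, item_desc  CLOB  CONSTRAINT nn_item_demo_1 NOT NULL);

DECLARE
  -- Local variable that receives the LOB locator.
  lv_desc  CLOB;
  lv_text  VARCHAR2(60) := 'A lengthy description would go here.';
BEGIN
  -- Insert an empty CLOB and return its locator into the local variable.
  INSERT INTO item_demo
  ( item_id, item_desc )
  VALUES
  ( 1, EMPTY_CLOB())
  RETURNING item_desc INTO lv_desc;

  -- Write the character stream through the locator.
  dbms_lob.writeappend(lv_desc, LENGTH(lv_text), lv_text);

  -- Commit the write.
  COMMIT;
END;
/
```

A real description would usually arrive in chunks, with one call to dbms_lob.writeappend per chunk.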

Capturing the Last Sequence Value

This section shows you how to INSERT a new sequence in a parent table and a copy of that new sequence as a foreign key value in a child table.

Sometimes you insert into a series of tables in the scope of a transaction. In this scenario, one table gets the new sequence value (with a call to sequence_name.nextval) and enters it as the surrogate primary key, and another table needs a copy of that primary key to enter into a foreign key column. While scalar subqueries can solve this problem, Oracle provides the .currval pseudocolumn for this purpose.

The steps to demonstrate this behavior require a parent table and a child table. The parent table is defined as follows:

Name                                 Null?    Type
------------------------------------ -------- --------------
PARENT_ID                            NOT NULL NUMBER
PARENT_NAME                                   VARCHAR2(10)

The parent_id column is the primary key for the parent table. You include the parent_id column in the child table. In the child table, the parent_id column holds a copy of a valid primary key column value as a foreign key to the parent table.

Name                                 Null?    Type
------------------------------------ -------- --------------
CHILD_ID                             NOT NULL NUMBER
PARENT_ID                                     NUMBER
CHILD_NAME                                    VARCHAR2(10)

After creating the two tables, you can manage inserts into them with the .nextval and .currval pseudocolumns. The sequence calls with the .nextval pseudocolumn insert primary key values, and the sequence calls with the .currval pseudocolumn insert foreign key values.

You would perform these two INSERT statements as a group:

SQL> INSERT INTO parent
  2  VALUES
  3  ( parent_s1.NEXTVAL
  4  ,'One Parent');
 
SQL> INSERT INTO child
  2  VALUES
  3  ( child_s1.NEXTVAL
  4  , parent_s1.CURRVAL
  5  ,'One Child');

The .currval pseudocolumn for any sequence fetches the value placed in memory by a call to the .nextval pseudocolumn. Any attempt to call the .currval pseudocolumn before the .nextval pseudocolumn in a session raises an ORA-08002 exception. The text message for that error says the sequence’s CURRVAL is not yet defined in this session. Line 4 in the insert into the child table depends on line 3 in the insert into the parent table.

You can use comments in INSERT statements to map to columns in the table. For example, the following shows the technique for the child table from the preceding example:

SQL> INSERT INTO child
  2  VALUES
  3  ( child_s1.NEXTVAL -- CHILD_ID
  4  , parent_s1.CURRVAL -- PARENT_ID
  5  ,'One Child') -- CHILD_NAME
  6  /

Comments on the lines of the VALUES clause identify the columns where the values are inserted. A semicolon at the end of line 5 doesn’t execute this statement, because the trailing comment would trigger a runtime exception. You must put the semicolon or forward slash on the line below the last VALUES element to include the last comment.

Insert by Subquery Results

An INSERT statement with a subquery can insert one to many rows of data into any table provided the SELECT-list of the subquery matches the data dictionary definition of the table or the named-notation list provided by the INSERT statement. An INSERT statement with a subquery cannot have a VALUES keyword in it, or it raises an error.

The generic prototype for an INSERT statement follows the pattern of the INSERT statement by value prototype with one exception: it excludes the VALUES keyword and replaces the comma-delimited list of values with a SELECT-list from a subquery. If you want to rely on the positional definition of the table, exclude the comma-delimited list of column names. The optional comma-delimited list of column names is necessary when you want to insert columns in a different order or exclude optional columns.

The generic prototype is:

INSERT
INTO table_name
[( column1, column2, column3, ...)]
( SELECT value1, value2, value3, ... FROM table_name WHERE ...);

The subquery, or SELECT statement, must return a SELECT-list that maps to the column definition in the data dictionary or the optional comma-delimited column list.

Written by maclochlainn

April 5th, 2022 at 1:23 pm

Posted in Database,Database Design,DBA,Linux,Oracle,Oracle 12c,Oracle 18c,Oracle 21c,sql,SQL Developer,Unix,Windows OS

Tagged with Oracle DBA, Oracle Developer

Dynamic Drop Table

I always get interesting feedback on some posts. On my test case for discovering the STR_TO_DATE function’s behavior, the comment was tragically valid: I failed to clean up after my test case. That was correct, and I should have dropped the param table and the two procedures.

While appending the drop statements is the easiest fix, I thought it was an opportunity to have a bit of fun and write another procedure that cleans up test case tables within the test_month_name procedure. Here’s a sample dynamic drop_table procedure that you can use in other MySQL stored procedures:

CREATE PROCEDURE drop_table
( table_name  VARCHAR(64))
BEGIN
 
  /* Declare a local variable for the SQL statement. */
  DECLARE stmt VARCHAR(1024);
 
  /* Set a session variable with the dynamic DROP TABLE statement. */
  SET @SQL := CONCAT('DROP TABLE ',table_name);
 
  /* Check if the table exists in the current database. */
  IF EXISTS (SELECT NULL
             FROM   information_schema.tables t
             WHERE  t.table_schema = database()
             AND    t.table_name = table_name)
  THEN
 
    /* Dynamically allocated and run statement. */
    PREPARE stmt FROM @SQL;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END IF;
 
END;
$$

You can now put a call to the drop_table procedure in the test_month_name procedure from the earlier post. For convenience, here’s the modified test_month_name procedure with the call on line #33 right before you leave the loop and procedure:

CREATE PROCEDURE test_month_name()
BEGIN
 
  /* Declare a variable to hold the fetched month name. */
  DECLARE month_name  VARCHAR(9);
 
  /* Declare a handler variable. */
  DECLARE fetched  INT DEFAULT 0;
 
  /* Cursors must come after variables and before event handlers. */
  DECLARE month_cursor CURSOR FOR
    SELECT m.month_name
    FROM   month m;
 
  /* Declare a not found record handler to close a cursor loop. */
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET fetched = 1;
 
  /* Open cursor and start simple loop. */
  OPEN month_cursor;
  cursor_loop:LOOP
 
    /* Fetch a record from the cursor. */
    FETCH month_cursor
    INTO  month_name;
 
    /* Place the catch handler for no more rows found
       immediately after the fetch operations. */
    IF fetched = 1 THEN 
      /* Fetch the partial strings that fail to find a month. */
      SELECT * FROM param;
 
      /* Conditionally drop the param table. */
      CALL drop_table('param');
 
      /* Leave the loop. */
      LEAVE cursor_loop;
    END IF;
 
    /* Call the procedure that holds the inner loop, which
       tests each leading fragment of the month name. */
    CALL read_string(month_name);
  END LOOP;
END;
$$

As always, I hope sample code examples help others solve problems.

Written by maclochlainn

February 12th, 2022 at 12:33 pm

Posted in MySQL,MySQL 8,sql

Tagged with MySQL DBA, MySQL Developer

str_to_date Function

with 3 comments

As many know, I’ve adopted Learning SQL by Alan Beaulieu as a core reference for my database class. Chapter 7 in the book focuses on data generation, manipulation, and conversion.

The last exercise question, which I use to check whether students read the chapter and experimented with some of the functions it discusses, is:

Use one or more temporal functions to write a query that converts the '29-FEB-2024' string value into the default MySQL date format. The result should display:
+--------------------+
| mysql_default_date |
+--------------------+
| 2024-02-29         |
+--------------------+
1 row in set, 1 warning (0.00 sec)

If you’re not familiar with the behavior of MySQL functions, this could look like a difficult problem to solve. If you’re inclined to take risks, you would probably just try the STR_TO_DATE function; if you’re not, the documented description of the %M specifier, which lists only full month names, might suggest that SQL has no built-in way to solve the problem.

I use the problem to teach the students how to solve problems in SQL queries. The first step requires putting the base '29-FEB-2024' string value into the mystring column of a strings table, like:

DROP TABLE IF EXISTS strings;
CREATE TABLE strings
(mystring  VARCHAR(11));
 
SELECT 'Insert' AS statement;
INSERT INTO strings
(mystring)
VALUES
('29-FEB-2024');

The next step requires creating a query with:

A list of parameters in a Common Table Expression (CTE)
A CASE statement to filter results in the SELECT-list
A CROSS JOIN between the strings table and params CTE

The query looks like this, and it resolves the comparison in the CASE statement through a case insensitive comparison:

SELECT 'Query' AS statement;
WITH params AS
(SELECT 'January' AS full_month
 UNION ALL
 SELECT 'February' AS full_month)
SELECT s.mystring
,      p.full_month
,      CASE
         WHEN SUBSTR(s.mystring,4,3) = SUBSTR(p.full_month,1,3) THEN
           STR_TO_DATE(REPLACE(s.mystring,SUBSTR(s.mystring,4,3),p.full_month),'%d-%M-%Y') 
       END AS converted_date
FROM   strings s CROSS JOIN params p;

It returns:

+-------------+------------+----------------+
| mystring    | full_month | converted_date |
+-------------+------------+----------------+
| 29-FEB-2024 | January    | NULL           |
| 29-FEB-2024 | February   | 2024-02-29     |
+-------------+------------+----------------+
2 rows in set (0.00 sec)

The problem with the result set, or derived table, is the CROSS JOIN. A CROSS JOIN matches every row in one table with every row in another table or derived table from prior joins. That means you need to add a filter in the WHERE clause to ensure you only get matches between the strings and parameters, like the modified query:

WITH params AS 
(SELECT 'January' AS full_month 
 UNION ALL
 SELECT 'February' AS full_month)
SELECT s.mystring
,      p.full_month
,      CASE
         WHEN SUBSTR(s.mystring,4,3) = SUBSTR(p.full_month,1,3) THEN
           STR_TO_DATE(REPLACE(s.mystring,SUBSTR(s.mystring,4,3),p.full_month),'%d-%M-%Y') 
       END AS converted_date
FROM   strings s CROSS JOIN params p
WHERE  SUBSTR(s.mystring,4,3) = SUBSTR(p.full_month,1,3);

It returns a single row, like:

+-------------+------------+----------------+
| mystring    | full_month | converted_date |
+-------------+------------+----------------+
| 29-FEB-2024 | February   | 2024-02-29     |
+-------------+------------+----------------+
1 row in set (0.00 sec)
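
To see why the WHERE clause matters, here is a small Python sketch of the same logic outside the database. It is only a model under stated assumptions: itertools.product stands in for the CROSS JOIN, str.casefold() stands in for MySQL’s default case insensitive comparison, and the cross_join and matches names are hypothetical helpers rather than anything MySQL provides.

```python
from itertools import product

# Rows from the strings table and the params CTE in the post.
strings = ['29-FEB-2024']
params = ['January', 'February']


def matches(mystring, full_month):
    """Mimic SUBSTR(s.mystring,4,3) = SUBSTR(p.full_month,1,3), ignoring case."""
    return mystring[3:6].casefold() == full_month[:3].casefold()


def cross_join(left, right):
    """Pair every left row with every right row, like a CROSS JOIN."""
    return list(product(left, right))


# Without a filter, every string is paired with every month.
all_rows = cross_join(strings, params)

# The list comprehension's condition plays the role of the WHERE clause.
filtered = [(s, m) for s, m in all_rows if matches(s, m)]

print(len(all_rows))
print(filtered)
```

The unfiltered list holds one pair per month in the parameter list, while the filtered list keeps only the pair where the three-letter fragment matches.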

However, none of this is necessary because the query can be written like this:

SELECT STR_TO_DATE('29-FEB-2024','%d-%M-%Y') AS mysql_date;

It returns:

+------------+
| mysql_date |
+------------+
| 2024-02-29 |
+------------+
1 row in set (0.00 sec)

That’s because the STR_TO_DATE() function with the %M specifier resolves all months with three or more characters. Three characters are required because March and May, and June and July, can only be told apart by their third character. If you provide too few characters to identify a single month, the function returns a null value.

Here’s a complete test case that lets you discover all the null values that may occur with too few characters:

/* Conditionally drop the table. */
DROP TABLE IF EXISTS month, param;
 
/* Create a table. */
CREATE TABLE month
( month_name  VARCHAR(9));
 
/* Insert into the month table. */
INSERT INTO month
( month_name )
VALUES
 ('January')
,('February')
,('March')
,('April')
,('May')
,('June')
,('July')
,('August')
,('September')
,('October')
,('November')
,('December');
 
/* Create a table. */
CREATE TABLE param
( month   VARCHAR(9)
, needle  VARCHAR(9));
 
/* Conditionally drop the procedure. */
DROP PROCEDURE IF EXISTS read_string;
DROP PROCEDURE IF EXISTS test_month_name;
 
/* Reset the execution delimiter to create a stored program. */
DELIMITER $$
 
/* Create a procedure. */
CREATE PROCEDURE read_string(month_name  VARCHAR(9))
BEGIN
 
  /* Declare local variables. */
  DECLARE display     VARCHAR(17);
  DECLARE evaluate    VARCHAR(17);
  DECLARE iterator    INT DEFAULT 1;
  DECLARE partial     VARCHAR(9);
 
  /* Read the list of characters. */
  character_loop:LOOP
 
    /* Leave the loop after the last character. */
    IF iterator > LENGTH(month_name) THEN
      LEAVE character_loop;
    END IF;
 
    /* Assign substring of month name. */
    SELECT SUBSTR(month_name,1,iterator) INTO partial;
    SELECT CONCAT('01-',partial,'-2024') INTO evaluate;
 
    /* Store only the strings too short to identify as the month. */
    IF STR_TO_DATE(evaluate,'%d-%M-%Y') IS NULL THEN
      INSERT INTO param
      ( month, needle )
      VALUES
      ( month_name, partial );
    END IF;
 
    /* Increment the counter. */
    SET iterator = iterator + 1;
 
  END LOOP;
END;
$$
 
/* Create a procedure. */
CREATE PROCEDURE test_month_name()
BEGIN
 
  /* Declare a handler variable. */
  DECLARE month_name  VARCHAR(9);
 
  /* Declare a handler variable. */
  DECLARE fetched  INT DEFAULT 0;
 
  /* Cursors must come after variables and before event handlers. */
  DECLARE month_cursor CURSOR FOR
    SELECT m.month_name
    FROM   month m;
 
  /* Declare a not found record handler to close a cursor loop. */
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET fetched = 1;
 
  /* Open cursor and start simple loop. */
  OPEN month_cursor;
  cursor_loop:LOOP
 
    /* Fetch a record from the cursor. */
    FETCH month_cursor
    INTO  month_name;
 
    /* Place the catch handler for no more rows found
       immediately after the fetch operations. */
    IF fetched = 1 THEN 
      /* Fetch the partial strings that fail to find a month. */
      SELECT * FROM param;
 
      /* Leave the loop. */
      LEAVE cursor_loop;
    END IF;
 
    /* Call the procedure that holds the inner loop, which
       tests each leading fragment of the month name. */
    CALL read_string(month_name);
  END LOOP;
END;
$$
 
/* Reset the delimiter. */
DELIMITER ;
 
CALL test_month_name();


It returns the list of character fragments that fail to resolve English months:

+---------+--------+
| month   | needle |
+---------+--------+
| January | J      |
| March   | M      |
| March   | Ma     |
| April   | A      |
| May     | M      |
| May     | Ma     |
| June    | J      |
| June    | Ju     |
| July    | J      |
| July    | Ju     |
| August  | A      |
+---------+--------+
11 rows in set (0.02 sec)
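
The same list can be reasoned about outside the database. The Python sketch below is an assumption-laden model, not a description of how MySQL implements STR_TO_DATE: it simply treats a fragment as unresolvable when it is the leading part of more than one English month name. The ambiguous_prefixes function name is hypothetical.

```python
MONTHS = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']


def ambiguous_prefixes(months):
    """Return (month, prefix) pairs where the prefix starts more than one month."""
    rows = []
    for month in months:
        for size in range(1, len(month) + 1):
            prefix = month[:size]
            # Count the month names that begin with this fragment, ignoring case.
            hits = [m for m in months if m.lower().startswith(prefix.lower())]
            if len(hits) > 1:
                rows.append((month, prefix))
    return rows


rows = ambiguous_prefixes(MONTHS)
for month, prefix in rows:
    print(f'{month:<9} {prefix}')
```

Under that assumption the sketch lists the same fragments as the table above, such as J for January and Ma for both March and May.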

The test case uses two procedures to keep each loop in its own program unit: the first read_string procedure is the inner loop and the second test_month_name procedure is the outer loop. MySQL stored programs can also nest one loop inside another, so splitting them is a design choice rather than a requirement.

I wrote a follow-up to this post because of a reader’s question about not cleaning up the test case. In the other post, you will find a drop_table procedure that lets you dynamically drop the param table created to store the inner loop procedure’s results.

As always, I hope this helps those looking to open the hood and check the engine.

Written by maclochlainn

February 11th, 2022 at 1:13 am

Posted in MySQL,MySQL 8,sql

Tagged with MySQL DBA, MySQL Developer

Case Sensitive Comparison

without comments

Sometimes you hear from some new developers that MySQL only makes case insensitive string comparisons. One of my students showed me their test case that they felt proved it:

SELECT STRCMP('a','A') WHERE 'a' = 'A';

Naturally, it returns 0, which means:

The STRCMP() function makes a case insensitive comparison, and
The WHERE clause also compares strings case insensitively.

As a teacher, you’re gratified that the student took the time to build their own use cases. However, in this case I had to explain that while he was right about the STRCMP() function, the case insensitive comparison he used in the WHERE clause was a choice and not the only option. The student was wrong to conclude that MySQL couldn’t make case sensitive string comparisons.

I modified his sample by adding the required BINARY keyword for a case sensitive comparison in the WHERE clause:

SELECT STRCMP('a','A') WHERE BINARY 'a' = 'A';

It returns an empty set, which means the binary comparison in the WHERE clause is a case sensitive comparison. Then, I explained that while the STRCMP() function performs a case insensitive match, the REPLACE() function performs a case sensitive one. Then, I gave him the following expanded use case for the two functions:

SELECT STRCMP('a','A')      AS test1
,      REPLACE('a','A','b') AS test2
,      REPLACE('a','a','b') AS test3;

It returns:

+-------+-------+-------+
| test1 | test2 | test3 |
+-------+-------+-------+
|     0 | a     | b     |
+-------+-------+-------+
1 row in set (0.00 sec)

One function may compare strings differently than another, and it’s the developer’s responsibility to understand a function’s behavior thoroughly before using it. The binary comparison was a win for the student since they were building a website that needed that behavior from MySQL.
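
If it helps to see the three behaviors side by side outside MySQL, here is a rough Python analogy. It is only an analogy under my own assumptions: casefold() stands in for a case insensitive collation, a byte-wise comparison stands in for the BINARY keyword, and str.replace() happens to be case sensitive like REPLACE(). None of this shows how MySQL evaluates collations internally.

```python
a, b = 'a', 'A'

# Case insensitive equality, similar in spirit to a *_ci collation.
insensitive_equal = a.casefold() == b.casefold()

# Byte-wise equality, similar in spirit to adding the BINARY keyword.
binary_equal = a.encode('utf-8') == b.encode('utf-8')

# str.replace() is case sensitive, like the REPLACE() calls above.
test2 = 'a'.replace('A', 'b')
test3 = 'a'.replace('a', 'b')

print(insensitive_equal, binary_equal, test2, test3)
```

The point of the analogy is the same as the post’s: each comparison has its own rules, so check them before relying on one.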

As always, I hope tidbits like this save folks time using MySQL.

Written by maclochlainn

February 10th, 2022 at 3:05 pm

Posted in MySQL,MySQL 8,sql

Tagged with MySQL DBA, MySQL Developer

Read CSV with Python

without comments

In 2009, I showed an example of how to use the MySQL LOAD DATA INFILE command. Last year, I updated the details to reset the secure_file_priv variable to use the LOAD DATA INFILE command, but you can avoid that approach with a simple Python 3 program like the one in this example. You also can use MySQL Shell’s new parallel table import feature, introduced in 8.0.17, as noted in a comment on this blog post.

The example requires an avenger table, an avenger.csv file, a readWriteFile.py Python script, a run of that script, and a query that validates the insertion of the avenger.csv file’s data into the avenger table. Here is the complete code in five steps, using the sakila demonstration database:

Create the avenger table with the create_avenger.sql script:

-- Conditionally drop the avenger table.
DROP TABLE IF EXISTS avenger;
 
-- Create the avenger table.
CREATE TABLE avenger
( avenger_id    int unsigned PRIMARY KEY AUTO_INCREMENT
, first_name    varchar(20)
, last_name     varchar(20)
, avenger_name  varchar(20))
  ENGINE=InnoDB
  AUTO_INCREMENT=1001
  DEFAULT CHARSET=utf8mb4
  COLLATE=utf8mb4_0900_ai_ci;

Create the avenger.csv file with the following data:

Anthony,Stark,Iron Man
Thor,Odinson,God of Thunder
Steven,Rogers,Captain America
Bruce,Banner,Hulk
Clinton,Barton,Hawkeye
Natasha,Romanoff,Black Widow
Peter,Parker,Spiderman
Steven,Strange,Dr. Strange
Scott,Lange,Ant-man
Hope,van Dyne,Wasp

Create the readWriteFile.py Python 3 script:

# Import libraries.
import csv
import mysql.connector
from mysql.connector import errorcode
from csv import reader
 
#  Attempt the statement.
# ============================================================
#  Use a try-except block to manage the connection.
# ============================================================
try:
  # Open connection.
  cnx = mysql.connector.connect( user='student'
                               , password='student'
                               , host='127.0.0.1'
                               , database='sakila')
  # Create cursor.
  cursor = cnx.cursor()
 
  # Open file in read mode and pass the file object to reader.
  with open('avenger.csv', 'r') as read_obj:
    csv_reader = reader(read_obj)
 
    # Declare the dynamic statement.
    stmt = ("INSERT INTO avenger "
            "(first_name, last_name, avenger_name) "
            "VALUES "
            "(%s, %s, %s)")
 
    # Iterate over each row in the csv using reader object
    for row in csv_reader:
      cursor.execute(stmt, row)
 
    # Commit the writes.
    cnx.commit()
 
    # Close the cursor.
    cursor.close()
 
# Handle exception and close connection.
except mysql.connector.Error as e:
  if e.errno == errorcode.ER_ACCESS_DENIED_ERROR:
    print("Something is wrong with your user name or password")
  elif e.errno == errorcode.ER_BAD_DB_ERROR:
    print("Database does not exist")
  else:
    print("Error code:", e.errno)        # error number
    print("SQLSTATE value:", e.sqlstate) # SQLSTATE value
    print("Error message:", e.msg)       # error message
 
# Close the connection when the try block completes.
else:
  cnx.close()

Run the readWriteFile.py file:
python3 readWriteFile.py

Query the avenger table:

SELECT * FROM avenger;

It returns:

+------------+------------+-----------+-----------------+
| avenger_id | first_name | last_name | avenger_name    |
+------------+------------+-----------+-----------------+
|       1001 | Anthony    | Stark     | Iron Man        |
|       1002 | Thor       | Odinson   | God of Thunder  |
|       1003 | Steven     | Rogers    | Captain America |
|       1004 | Bruce      | Banner    | Hulk            |
|       1005 | Clinton    | Barton    | Hawkeye         |
|       1006 | Natasha    | Romanoff  | Black Widow     |
|       1007 | Peter      | Parker    | Spiderman       |
|       1008 | Steven     | Strange   | Dr. Strange     |
|       1009 | Scott      | Lange     | Ant-man         |
|       1010 | Hope       | van Dyne  | Wasp            |
+------------+------------+-----------+-----------------+
10 rows in set (0.00 sec)
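
As a variation on the script, here is a minimal sketch that separates parsing from loading and sends all rows with one executemany() call rather than one execute() per row. The read_rows and load_rows names are hypothetical, and load_rows assumes you pass it an already open mysql.connector connection; I have kept the database call inside a function so the parsing part runs without a server.

```python
import csv
import io

# Same INSERT statement shape as the script above.
INSERT_STMT = ("INSERT INTO avenger "
               "(first_name, last_name, avenger_name) "
               "VALUES (%s, %s, %s)")


def read_rows(text):
    """Parse CSV text into a list of tuples, skipping blank lines."""
    return [tuple(row) for row in csv.reader(io.StringIO(text)) if row]


def load_rows(cnx, rows):
    """Insert all rows with one executemany() call and commit once.

    cnx is assumed to be an open mysql.connector connection.
    """
    cursor = cnx.cursor()
    try:
        cursor.executemany(INSERT_STMT, rows)
        cnx.commit()
    finally:
        # Close the cursor even when the insert raises an error.
        cursor.close()


sample = "Anthony,Stark,Iron Man\nThor,Odinson,God of Thunder\n"
rows = read_rows(sample)
print(rows)
```

With a real connection you would read the avenger.csv file’s text and call load_rows(cnx, read_rows(text)) inside the same try-except block the script uses.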

Written by maclochlainn

December 12th, 2021 at 12:17 am

Posted in Linux,MySQL,MySQL 8,MySQL Client,MySQL Connect/Python,Python 3.x,sql

Tagged with MySQL DBA, MySQL Developer, MySQL Techniques

MySQL Query Performance

without comments

Working through our chapter on MySQL views, I wrote a query two ways to introduce the idea of SQL tuning. That’s one of the final topics before introducing JSON types.

I gave the students this query based on the Sakila sample database after explaining how to use the EXPLAIN syntax. The query uses only inner joins, which as a rule of thumb are generally faster and more efficient than correlated subqueries.

SELECT   ctry.country AS country_name
,        SUM(p.amount) AS tot_payments
FROM     city c INNER JOIN address a
ON       c.city_id = a.city_id INNER JOIN customer cus
ON       a.address_id = cus.address_id INNER JOIN payment p
ON       cus.customer_id = p.customer_id INNER JOIN country ctry
ON       c.country_id = ctry.country_id
GROUP BY ctry.country;

It generated the following tabular explain plan output:

+----+-------------+-------+------------+--------+---------------------------+--------------------+---------+------------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type   | possible_keys             | key                | key_len | ref                    | rows | filtered | Extra                        |
+----+-------------+-------+------------+--------+---------------------------+--------------------+---------+------------------------+------+----------+------------------------------+
|  1 | SIMPLE      | cus   | NULL       | index  | PRIMARY,idx_fk_address_id | idx_fk_address_id  | 2       | NULL                   |  599 |   100.00 | Using index; Using temporary |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY,idx_fk_city_id    | PRIMARY            | 2       | sakila.cus.address_id  |    1 |   100.00 | NULL                         |
|  1 | SIMPLE      | c     | NULL       | eq_ref | PRIMARY,idx_fk_country_id | PRIMARY            | 2       | sakila.a.city_id       |    1 |   100.00 | NULL                         |
|  1 | SIMPLE      | ctry  | NULL       | eq_ref | PRIMARY                   | PRIMARY            | 2       | sakila.c.country_id    |    1 |   100.00 | NULL                         |
|  1 | SIMPLE      | p     | NULL       | ref    | idx_fk_customer_id        | idx_fk_customer_id | 2       | sakila.cus.customer_id |   26 |   100.00 | NULL                         |
+----+-------------+-------+------------+--------+---------------------------+--------------------+---------+------------------------+------+----------+------------------------------+
5 rows in set, 1 warning (0.02 sec)

Then, I used MySQL Workbench to generate the following visual explain plan:

Then, I compared it against a refactored version of the query that uses a correlated subquery in the SELECT-list. The example comes from Appendix B in Learning SQL, 3rd Edition by Alan Beaulieu.

SELECT ctry.country
,      (SELECT   SUM(p.amount)
        FROM     city c INNER JOIN address a
        ON       c.city_id = a.city_id INNER JOIN customer cus
        ON       a.address_id = cus.address_id INNER JOIN payment p
        ON       cus.customer_id = p.customer_id
        WHERE    c.country_id = ctry.country_id) AS tot_payments
FROM   country ctry;

It generated the following tabular explain plan output:

+----+--------------------+-------+------------+------+---------------------------+--------------------+---------+------------------------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys             | key                | key_len | ref                    | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------------------+--------------------+---------+------------------------+------+----------+-------------+
|  1 | PRIMARY            | ctry  | NULL       | ALL  | NULL                      | NULL               | NULL    | NULL                   |  109 |   100.00 | NULL        |
|  2 | DEPENDENT SUBQUERY | c     | NULL       | ref  | PRIMARY,idx_fk_country_id | idx_fk_country_id  | 2       | sakila.ctry.country_id |    5 |   100.00 | Using index |
|  2 | DEPENDENT SUBQUERY | a     | NULL       | ref  | PRIMARY,idx_fk_city_id    | idx_fk_city_id     | 2       | sakila.c.city_id       |    1 |   100.00 | Using index |
|  2 | DEPENDENT SUBQUERY | cus   | NULL       | ref  | PRIMARY,idx_fk_address_id | idx_fk_address_id  | 2       | sakila.a.address_id    |    1 |   100.00 | Using index |
|  2 | DEPENDENT SUBQUERY | p     | NULL       | ref  | idx_fk_customer_id        | idx_fk_customer_id | 2       | sakila.cus.customer_id |   26 |   100.00 | NULL        |
+----+--------------------+-------+------------+------+---------------------------+--------------------+---------+------------------------+------+----------+-------------+
5 rows in set, 2 warnings (0.00 sec)

MySQL Workbench generated the following visual explain plan:

The tabular explain plan identifies the better performing query to an experienced eye but the visual explain plan works better for those new to SQL tuning.
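
One rough way to read the two tabular plans is to multiply the values in the rows column, which gives an estimate of how many row combinations each plan examines. The Python sketch below does that arithmetic with the numbers copied from the two outputs above. It rests on assumptions: the values are optimizer estimates rather than measurements, the dependent subquery is assumed to run once per row of the outer query, and the temporary table in the first plan is ignored.

```python
from math import prod

# "rows" estimates copied from the first EXPLAIN output (inner joins only).
join_rows = [599, 1, 1, 1, 26]

# "rows" estimates from the second EXPLAIN output: the outer query on
# country, then the dependent subquery assumed to run once per outer row.
outer_rows = 109
subquery_rows = [5, 1, 1, 26]

join_estimate = prod(join_rows)
subquery_estimate = outer_rows * prod(subquery_rows)

print(join_estimate)      # 15574
print(subquery_estimate)  # 14170
```

The gap between the two estimates is modest, so treat it as a hint about where to look rather than as proof of which query runs faster.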

The second query performs best because it reads the least data by leveraging the indexes best. As always, I hope these examples help those looking at learning more about MySQL.

Written by maclochlainn

December 9th, 2021 at 1:01 am

Posted in Linux,MySQL,MySQL 8,sql

Tagged with MySQL DBA, MySQL DBA Techniques, MySQL Developer

MySQL DropIndexIfExists

without comments

This post replies to a question about how to conditionally drop an index on a table in MySQL. It appears the syntax doesn’t exist. However, maybe it does and I missed it. If I did miss it, I’m sure somebody will let me know. In the meantime, I simply use a dropIndexIfExists stored procedure for this type of database maintenance.

Below is my dropIndexIfExists stored procedure:

-- Conditionally drop the procedure.
DROP PROCEDURE IF EXISTS dropIndexIfExists;
 
-- Change the default semicolon delimiter to write a PSM
-- (Persistent Stored Module) or stored procedure.
DELIMITER $$
 
-- Create the procedure.
CREATE PROCEDURE dropIndexIfExists
( pv_table_name  VARCHAR(64)
, pv_index_name  VARCHAR(64))
BEGIN
 
  /* Declare a local variable for the SQL statement. */
  DECLARE stmt VARCHAR(1024);
 
  /* Set a session variable with the dynamic ALTER TABLE statement. */
  SET @SQL := CONCAT('ALTER TABLE ',pv_table_name,' DROP INDEX ',pv_index_name);
 
  /* Check if the index exists. */
  IF EXISTS (SELECT NULL
             FROM   information_schema.statistics s
             WHERE  s.index_schema = database()
             AND    s.table_name = pv_table_name
             AND    s.index_name = pv_index_name)
  THEN
 
    /* Dynamically allocated and run statement. */
    PREPARE stmt FROM @SQL;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END IF;
 
END;
$$
 
-- Reset the default semicolon delimiter.
DELIMITER ;

You call the procedure like:

CALL dropIndexIfExists('payment','idx_payment01');
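
The procedure concatenates the table and index names straight into the statement text. If you ever build the same statement from application code, one cautious option is to quote the identifiers first. The Python sketch below is a minimal illustration under my own assumptions: quote_identifier and build_drop_index are hypothetical helpers, and the 64-character limit simply mirrors the VARCHAR(64) parameters of the procedure.

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    if not name or len(name) > 64:
        raise ValueError('identifier must be 1 to 64 characters long')
    return '`' + name.replace('`', '``') + '`'


def build_drop_index(table_name, index_name):
    """Build the ALTER TABLE ... DROP INDEX statement text."""
    return 'ALTER TABLE {} DROP INDEX {}'.format(
        quote_identifier(table_name), quote_identifier(index_name))


print(build_drop_index('payment', 'idx_payment01'))
```

Quoting keeps an unusual or hostile name from changing the shape of the statement, and printing the text first is an easy way to check the spacing between its parts.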

As always, I hope this helps those looking for a solution.

Written by maclochlainn

December 1st, 2021 at 12:09 am

Posted in MySQL,MySQL 8,Persistent Stored Modules,sql

Tagged with MySQL DBA, MySQL Developer, MySQL Techniques

MySQL 8+ Catalog

without comments

I was working through some tutorials for my students and noticed that there was a change in how a WHERE clause must be written against the information_schema.table_constraints table. It might have been made in an earlier release; I hadn’t checked it since 2014, when I wrote this earlier post on capturing MySQL Foreign Keys.

Previously, you could use the following case insensitive WHERE clause:

WHERE    tc.constraint_type = 'foreign key'

Now, you must use a case sensitive WHERE clause:

WHERE    tc.constraint_type = 'FOREIGN KEY'

I’d love to know why but I can’t seem to find a note on the change. As always, I hope this helps those looking for an answer.

Written by maclochlainn

November 30th, 2021 at 11:06 pm

Posted in MySQL,MySQL 8,sql

MySQL WITH Clause

without comments

When I went over my example of using the WITH clause to solve how to use a series of literal values in data sets, some students got it right away and some didn’t. The original post showed how to solve a problem where one value in the data set is returned in the SELECT-list and two values are used as the minimum and maximum values with a BETWEEN operator. It used three approaches with literal values:

A list of Python dictionaries that require you to filter the return set from the database through a range loop and if statement that mimics a SQL BETWEEN operator.
A WITH clause that accepts the literals as bind variables to filter the query results inside the query.
A table design that holds the literal values that an analyst might use for reporting.

It was the last example that required elaboration. I explained you might build a web form that uses a table, and the table could allow a data analyst to enter parameter sets. That way the analyst could submit a flag value to use one or another set of values. I threw out the idea on the whiteboard of introducing a report column to the prior post’s level table. The student went off to try it.

Two problems occurred. The first was in the design of the new table and the second was how to properly use the MySQL Python driver.

Below is a formal table design that supports this extension of the first blog post as a list of parameter values. It uses a report column as a super key to return a set of possible values. One value will show in the SELECT-list and the other two values deploy as the minimum and maximum values in a BETWEEN operator. It is seeded with two sets of values. One of the report possibilities is Summary level with three possibilities and the other is the Detail level with five possibilities.

-- Conditionally drop the levels table.
DROP TABLE IF EXISTS levels;
 
-- Create the levels list.
CREATE TABLE levels
( level      VARCHAR(16)
, report     ENUM('Summary','Detail')
, min_roles  INT
, max_roles  INT );
 
-- Insert values into the list table.
INSERT INTO levels
( level, report, min_roles, max_roles )
VALUES
 ('Hollywood Star','Summary', 30, 99999)
,('Prolific Actor','Summary', 20, 29)
,('Newcommer','Summary', 1, 19)
,('Hollywood Star','Detail', 30, 99999)
,('Prolific Actor','Detail', 20, 29)
,('Regular Actor','Detail', 10, 19)
,('Actor','Detail', 5, 9)
,('Newcommer','Detail', 1, 4);

The foregoing table design uses an ENUM type because reporting parameter sets are typically fewer than 64 possibilities. If you use the table to support multiple reports, you should add a second super key column like report_type. The report_type column key would let you use the table to support a series of different report parameter lists.
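
To make the role of the report column concrete, here is a small Python sketch of the lookup the query performs with the BETWEEN operator. The rows are copied from the INSERT statement above; the level_for function is a hypothetical helper that I use only for illustration, and in the real design this filtering happens inside the SQL query.

```python
# Rows copied from the INSERT statement: (level, report, min_roles, max_roles).
LEVELS = [
    ('Hollywood Star', 'Summary', 30, 99999),
    ('Prolific Actor', 'Summary', 20, 29),
    ('Newcommer', 'Summary', 1, 19),
    ('Hollywood Star', 'Detail', 30, 99999),
    ('Prolific Actor', 'Detail', 20, 29),
    ('Regular Actor', 'Detail', 10, 19),
    ('Actor', 'Detail', 5, 9),
    ('Newcommer', 'Detail', 1, 4),
]


def level_for(report, num_roles):
    """Return the level whose range contains num_roles for the given report."""
    for level, row_report, min_roles, max_roles in LEVELS:
        # Mirrors: l.report = %s AND num_roles BETWEEN l.min_roles AND l.max_roles
        if row_report == report and min_roles <= num_roles <= max_roles:
            return level
    return None


print(level_for('Summary', 12))
print(level_for('Detail', 12))
```

The same role count lands in a different level depending on which report set you choose, which is the reason for keying the parameter sets on the report column.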

While the student used a %s inside the query, they created a runtime error when trying to pass the single bind variable into the query. The student misunderstood how to convert a report column input parameter variable into a tuple, which shows up when the student calls the Python MySQL Driver, like this:

59	cursor.execute(query, (report))

The student’s code generated the following error stack:

Traceback (most recent call last):
  File "./python-with-clause.py", line 59, in <module>
    cursor.execute(query,(report))
  File "/usr/lib/python3.7/site-packages/mysql/connector/cursor_cext.py", line 248, in execute
    prepared = self._cnx.prepare_for_mysql(params)
  File "/usr/lib/python3.7/site-packages/mysql/connector/connection_cext.py", line 632, in prepare_for_mysql
    raise ValueError("Could not process parameters")
ValueError: Could not process parameters

The ValueError should indicate to the developer that they’ve used a wrong data type in the call to the method:

cursor.execute(<class 'str'>,<class 'tuple'>)

This clearly was a misunderstanding of how to cast a single string to a tuple. A quick explanation of how Python casts a single string into a tuple can best be illustrated inside an interactive Python shell, like:

>>> # Define a variable.
>>> x = 'Detail'
>>> # An incorrect attempt to make a string a tuple.
>>> y = (x)
>>> # Check type of y after assignment.
>>> print(type(y))
<class 'str'>
>>> # tuple(x) returns a tuple, but with one element per character.
>>> y = tuple(x)
>>> # Check type of y after assignment.
>>> print(type(y))
<class 'tuple'>
>>> # The way to make a one-element tuple is a trailing comma.
>>> z = (x,)
>>> # Check type of z after assignment.
>>> print(type(z))
<class 'tuple'>

So, the fix was quite simple to line 59:

59	cursor.execute(query, (report,))
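
One caveat is worth a short sketch: a query with a single %s needs a sequence with exactly one item, and tuple() applied to a string does not produce that. The variable names below are only illustrative.

```python
report = 'Detail'

not_a_tuple = (report)         # parentheses alone do nothing: still a str
one_element = (report,)        # trailing comma: a tuple with one item
per_character = tuple(report)  # a tuple, but with one item per character

print(type(not_a_tuple).__name__)
print(one_element)
print(len(per_character))
```

Only the trailing-comma form pairs the whole string with the single placeholder in the query.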

The student started with a copy of a Python program that I provided. I fixed the argument handling and added some comments. The line 59 reference above maps to this code example.

# Import the library.
import sys
import mysql.connector
from mysql.connector import errorcode
 
# Capture argument list.
fullCmdArguments = sys.argv
 
# Assign argument list to variable.
argumentList = fullCmdArguments[1:]
 
# Define a standard report variable.
report = "Summary"
 
#  Check and process argument list.
# ============================================================
#  If exactly one argument is provided, use it as the report
#  name; otherwise keep the default "Summary" report.
# ============================================================
if (len(argumentList) == 1):
  # Override the default report with the command-line argument.
  if (isinstance(argumentList[0],str)):
    report = argumentList[0]
 
#  Attempt the query.
# ============================================================
#  Use a try-except block to manage the connection.
# ============================================================
try:
  # Open connection.
  cnx = mysql.connector.connect(user='student', password='student',
                                host='127.0.0.1',
                                database='sakila')
  # Create cursor.
  cursor = cnx.cursor()
 
  # Set the query statement.
  query = ("WITH actors AS "
           "(SELECT   a.actor_id "
           " ,        a.first_name "
           " ,        a.last_name "
           " ,        COUNT(fa.actor_id) AS num_roles "
           " FROM     actor a INNER JOIN film_actor fa "
           " ON       a.actor_id = fa.actor_id "
           " GROUP BY a.actor_id "
           " ,        a.first_name "
           " ,        a.last_name ) "
           " SELECT   a.first_name "
           " ,        a.last_name "
           " ,        l.level "
           " ,        a.num_roles "
           " FROM     actors a CROSS JOIN levels l "
           " WHERE    a.num_roles BETWEEN l.min_roles AND l.max_roles "
           " AND      l.report = %s "
           " ORDER BY a.last_name "
           " ,        a.first_name")
 
  # Execute cursor.
  cursor.execute(query,(report,))
 
  # Display the rows returned by the query.
  for (first_name, last_name, level, num_roles) in cursor:
    print('{0} {1} is a {2} with {3} films.'.format( first_name.title()
                                                   , last_name.title()
                                                   , level.title()
                                                   , num_roles))
 
  # Close cursor.
  cursor.close()
 
# ------------------------------------------------------------
# Handle exception and close connection.
except mysql.connector.Error as e:
  if e.errno == errorcode.ER_ACCESS_DENIED_ERROR:
    print("Something is wrong with your user name or password")
  elif e.errno == errorcode.ER_BAD_DB_ERROR:
    print("Database does not exist")
  else:
    print("Error code:", e.errno)        # error number
    print("SQLSTATE value:", e.sqlstate) # SQLSTATE value
    print("Error message:", e.msg)       # error message
 
# Close the connection when the try block completes.
else:
  cnx.close()
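
Note that the else clause in the listing above runs only when the try block raises no exception, so the connection is closed only after a successful query. If you also want it closed after a failure, a finally clause is one option. The sketch below is an assumption-laden illustration rather than a change to the program: DemoConnection and run_query are hypothetical stand-ins so that the example runs without a database.

```python
class DemoConnection:
    """Stand-in for a database connection (hypothetical, illustration only)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_query(fail):
    """Simulate a query and return the connection plus a status string."""
    cnx = DemoConnection()
    try:
        if fail:
            raise RuntimeError("query failed")
        status = "ok"
    except RuntimeError as e:
        status = "error: " + str(e)
    finally:
        # Runs whether or not an exception was raised.
        cnx.close()
    return cnx, status


for fail in (False, True):
    cnx, status = run_query(fail)
    print(status, cnx.closed)
```

Both calls report a closed connection, because the finally clause runs on the success path and on the error path.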


A Linux shell script like the following can run the Python program with or without a parameter, provided the shell script and the Python program share the same base name. It works without a parameter because the Python program sets a default value for the report variable.

# Switch the file extension and run the python program.
file=${0/%sh/py}
python3 "${file}" "${@}"

You call the shell script like this:

./python-with-clause.sh Detail
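
The default-value behavior can also be checked in isolation, without a database. Here is a minimal sketch of the same argument handling, where resolve_report is a hypothetical function name used only for this illustration:

```python
def resolve_report(argv, default="Summary"):
    """Return the report type from argv, or the default (hypothetical helper)."""
    # argv[0] is the program name; the remaining items are user arguments.
    arguments = argv[1:]
    # Use the argument when exactly one is given; otherwise keep the default.
    if len(arguments) == 1:
        return arguments[0]
    return default


print(resolve_report(["python-with-clause.py"]))            # Summary
print(resolve_report(["python-with-clause.py", "Detail"]))  # Detail
```

Calling the script with no parameter therefore runs the Summary report, and calling it with Detail runs the Detail report.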

As always, I hope this helps those looking for a solution.

Written by maclochlainn

November 14th, 2021 at 11:01 pm

Posted in Linux,MySQL,MySQL 8,Python,Python 3.x,sql,Unix

Tagged with MySQL Developer, MySQL Techniques
