Archive for the ‘SQL/PSM’ Category
MySQL Shell Parsing
I’ve been experimenting with the mysqlsh
since installing it last week. It’s been interesting. Overall, I’m totally impressed but I did find a problem with how it parses stored procedures.
First thought is always, is it my code? I checked the file by running it as a script file through MySQL Workbench. It ran perfectly in MySQL Workbench but failed repeatedly when run from the mysqlsh
utility. Next step, reduce the code to a small test case, retest it, and log a bug if it is replicated. My test case in a test.sql
file generates the following errors when run from the mysqlsh
utility:
MySQL localhost:33060+ ssl studentdb SQL > source test.sql Query OK, 0 rows affected (0.0003 sec) ERROR: 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CREATE PROCEDURE test ( pv_input1 CHAR(12) , pv_input2 CHAR(19)) MODIFIES SQL ' at line 2 Query OK, 0 rows affected (0.0002 sec) Query OK, 0 rows affected (0.0003 sec) ERROR: 1305: PROCEDURE studentdb.test does not exist |
The same file generates this series of successful messages when run as a script file from MySQL Workbench:
Preparing... Importing test.sql... Finished executing script Statement CREATE PROCEDURE test pv_input1 One Operation completed successfully |
For those who are curious enough to review the test case, here it is:
-- Reset the delimiter so that a semicolon can be used as a statement and block terminator. DELIMITER $$ SELECT 'CREATE PROCEDURE test' AS "Statement"; CREATE PROCEDURE test ( pv_input1 CHAR(12) , pv_input2 CHAR(19)) MODIFIES SQL DATA BEGIN SELECT CONCAT(pv_input1,', ',pv_input2) AS message; END; $$ -- Reset the standard delimiter to let the semicolon work as an execution command. DELIMITER ; -- Call the test procedure. CALL test('One','Two'); |
The reply in the bug explained the behavior difference between MySQL Workbench and the MySQL Shell (mysqlsh
) environments. MySQL Workbench uses the MySQL client, which supports multiple client statements with the CLIENT_MULTI_STATEMENTS
option. Recognizing that, the logging entry SELECT
statement should move to a position before setting the DELIMITER
, like:
-- Set a label for the log file. SELECT 'CREATE PROCEDURE test' AS "Statement"; -- Reset the delimiter so that a semicolon can be used as a statement and block terminator. DELIMITER $$ CREATE PROCEDURE test ( pv_input1 CHAR(12) , pv_input2 CHAR(19)) MODIFIES SQL DATA BEGIN SELECT CONCAT(pv_input1,', ',pv_input2) AS message; END; $$ -- Reset the standard delimiter to let the semicolon work as an execution command. DELIMITER ; -- Call the test procedure. CALL test('One','Two'); |
The new test case only submits one statement at a time. The logging query is submitted by the semicolon, and the test procedure by the double dollar ($$
) symbol set.
So, I was correct identifying a parsing behavior difference between MySQL Workbench and MySQL Shell. It appears to be a difference by design but the MySQL Shell documentation fails to explain it can’t manage multiple statements. I hope identifying this saves others time.
It’s also true that the MySQL client software supports TEE
and NOTEE
to write log files. Unfortunately, MySQL Shell (mysqlsh) doesn’t support the TEE
and NOTEE
syntax. You can only do minimal logging with the control of standard error (stderr) by using the application and AdminAPI log utilities, which are covered in Chapter 8 of the MySQL Shell 8.0 documentation.
Capture MySQL Foreign Keys
Shantanu asked a follow-up question on my Cleanup a MySQL Schema post from last month. He wanted to know if there was a way to capture foreign keys before removing them. The answer is yes, but how you do it depends on whether the primary key is based on a surrogate key using an auto incrementing sequence of a natural key using descriptive columns.
You can capture foreign keys with a simple query when they’re determined by a single column value. However, this script creates ALTER
statements that will fail when a table holds a multiple column foreign key value. The SELECT
statement would look like this when capturing all foreign key values in a MySQL Server:
1 2 3 4 5 6 7 8 9 10 11 | SELECT CONCAT('ALTER TABLE',' ',tc.table_schema,'.',tc.table_name,' ' ,'ADD CONSTRAINT',' fk_',tc.constraint_name,' ' ,'FOREIGN KEY (',kcu.column_name,')',' ' ,'REFERENCES',' ',kcu.referenced_table_schema,'.',kcu.referenced_table_name,' ' ,'(',kcu.referenced_column_name,');') AS script FROM information_schema.table_constraints tc JOIN information_schema.key_column_usage kcu ON tc.constraint_name = kcu.constraint_name AND tc.constraint_schema = kcu.constraint_schema WHERE tc.constraint_type = 'foreign key' ORDER BY tc.TABLE_NAME , kcu.column_name; |
In MySQL 8 forward, you must make a case sensitive comparison, like:
9 | WHERE tc.constraint_type = 'FOREIGN KEY' |
You would add a line in the WHERE
clause to restrict it to a schema and a second line to restrict it to a table within a schema, like this:
AND tc.table_schema = 'your_mysql_database' AND tc.table_name = 'your_table_name' |
Unfortunately, when the primary and foreign keys involve two or more columns you require a procedure and function. The function because you need to read two cursors, and the NOT FOUND
can’t be nested in the current deployment of MySQL’s SQL/PSM stored programs. In this example the storedForeignKeys
procedure finds the table’s foreign key constraints, and the columnList
function adds the column detail. The command_list
table stores the commands to restore foreign key constraints.
The command_list
table that stores the values is:
CREATE TABLE command_list ( command_list_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , sql_command VARCHAR(6) NOT NULL , sql_object VARCHAR(10) NOT NULL , sql_constraint VARCHAR(11) , sql_statement VARCHAR(768) NOT NULL); |
This is the storedForeignKeys
procedure:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 | CREATE PROCEDURE storeForeignKeys ( pv_schema_name VARCHAR(64) , pv_table_name VARCHAR(64)) BEGIN /* Declare local variables. */ DECLARE lv_schema_name VARCHAR(64); DECLARE lv_table_name VARCHAR(64); DECLARE lv_constraint_name VARCHAR(64); DECLARE sql_stmt VARCHAR(1024); /* Declare control variable for handler. */ DECLARE fetched INT DEFAULT 0; /* Declare local cursor for foreign key table, it uses null replacement because the procedure supports null parameters. When you use null parameters, you get all foreign key values. */ DECLARE foreign_key_table CURSOR FOR SELECT tc.table_schema , tc.table_name , tc.constraint_name FROM information_schema.table_constraints tc WHERE tc.table_schema = IFNULL(lv_schema_name, tc.table_schema) AND tc.table_name = IFNULL(lv_table_name, tc.table_name) AND tc.constraint_type = 'FOREIGN KEY' ORDER BY tc.table_name; /* Declare a not found record handler to close a cursor loop. */ DECLARE CONTINUE HANDLER FOR NOT FOUND SET fetched = 1; /* Assign parameter values to local variables. */ SET lv_schema_name := pv_schema_name; SET lv_table_name := pv_table_name; /* Open a local cursor. */ OPEN foreign_key_table; cursor_foreign_key_table: LOOP /* Fetch a row into the local variables. */ FETCH foreign_key_table INTO lv_schema_name , lv_table_name , lv_constraint_name; /* Catch handler for no more rows found from the fetch operation. */ IF fetched = 1 THEN LEAVE cursor_foreign_key_table; END IF; /* The nested calls to the columnList function returns the list of columns in the foreign key. Surrogate primary to foreign keys can be resolved with a simply query but natural primary to foreign key relationships require the list of columns involved in the primary and foreign key. The columnList function returns the list of foreign key columns in the dependent table and the list of referenced columns (or the primary key columns) in the independent table. */ SET sql_stmt := CONCAT('ALTER TABLE ',' ',lv_schema_name,'.',lv_table_name,' ' ,'ADD CONSTRAINT ',lv_constraint_name,' ' ,'FOREIGN KEY (',columnList(lv_schema_name,lv_table_name,lv_constraint_name)); /* Record the SQL statements. */ INSERT INTO command_list ( sql_command , sql_object , sql_constraint , sql_statement ) VALUES ('ALTER' ,'TABLE' ,'FOREIGN KEY' , sql_stmt ); END LOOP cursor_foreign_key_table; CLOSE foreign_key_table; END; $$ |
This is the columnList
function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | CREATE FUNCTION columnList ( pv_schema_name VARCHAR(64) , pv_table_name VARCHAR(64) , pv_constraint_name VARCHAR(64)) RETURNS VARCHAR(512) BEGIN /* Declare local variables. */ DECLARE lv_schema_name VARCHAR(64); DECLARE lv_table_name VARCHAR(64); DECLARE lv_constraint_name VARCHAR(64); DECLARE lv_column_count INT UNSIGNED; DECLARE lv_column_name VARCHAR(64); DECLARE lv_column_list VARCHAR(512); DECLARE lv_column_ref_list VARCHAR(64); DECLARE lv_referenced_table_schema VARCHAR(64); DECLARE lv_referenced_table_name VARCHAR(64); DECLARE lv_referenced_column_name VARCHAR(64); DECLARE lv_return_string VARCHAR(768); /* Declare control variable for handler. */ DECLARE fetched INT DEFAULT 0; /* Declare local cursor for foreign key column. */ DECLARE foreign_key_column CURSOR FOR SELECT kcu.column_name , kcu.referenced_table_schema , kcu.referenced_table_name , kcu.referenced_column_name FROM information_schema.key_column_usage kcu WHERE kcu.referenced_table_schema = lv_schema_name AND kcu.table_name = lv_table_name AND kcu.constraint_name = lv_constraint_name ORDER BY kcu.column_name; /* Declare a not found record handler to close a cursor loop. */ DECLARE CONTINUE HANDLER FOR NOT FOUND SET fetched = 1; /* Assign parameter values to local variables. */ SET lv_schema_name := pv_schema_name; SET lv_table_name := pv_table_name; SET lv_constraint_name := pv_constraint_name; /* Set the first column value. */ SET lv_column_count := 1; /* Open the nested cursor. */ OPEN foreign_key_column; cursor_foreign_key_column: LOOP /* Fetch a row into the local variables. */ FETCH foreign_key_column INTO lv_column_name , lv_referenced_table_schema , lv_referenced_table_name , lv_referenced_column_name; /* Catch handler for no more rows found from the fetch operation. */ IF fetched = 1 THEN LEAVE cursor_foreign_key_column; END IF; /* Initialize the column list or add to it. */ IF lv_column_count = 1 THEN SET lv_column_list := lv_column_name; SET lv_column_ref_list := lv_referenced_column_name; /* Increment the counter value. */ SET lv_column_count := lv_column_count + 1; ELSE SET lv_column_list := CONCAT(lv_column_list,',',lv_column_name); SET lv_column_ref_list := CONCAT(lv_column_ref_list,',',lv_referenced_column_name); END IF; END LOOP cursor_foreign_key_column; CLOSE foreign_key_column; /* Set the return string to a list of columns. */ SET lv_return_string := CONCAT(lv_column_list,')',' ' ,'REFERENCES',' ',lv_referenced_table_schema,'.',lv_referenced_table_name,' ' ,'(',lv_column_ref_list,');'); RETURN lv_return_string; END; $$ |
You can call the procedure with a schema and table name, and you’ll get the foreign keys from just that table. You can create the following parent and child tables to test how multiple column foreign keys work in the script (provided because most folks use surrogate keys):
CREATE TABLE parent ( first_name VARCHAR(20) NOT NULL DEFAULT '' , last_name VARCHAR(20) NOT NULL DEFAULT '' , PRIMARY KEY (first_name, last_name) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE child ( child_name VARCHAR(20) NOT NULL , first_name VARCHAR(20) DEFAULT NULL , last_name VARCHAR(20) DEFAULT NULL , PRIMARY KEY (child_name) , KEY fk_parent(first_name, last_name) , CONSTRAINT fk_parent FOREIGN KEY (first_name, last_name) REFERENCES parent (first_name, last_name) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; |
You call the storeForeignKeys
procedure for the child
table with this syntax:
CALL storeForeignKeys('studentdb', 'child'); |
You call the storeForeignKeys
procedure for all tables in a schema with this syntax:
CALL storeForeignKeys('studentdb', null); |
While unlikely you’ll need this, the following calls the storeForeignKeys
procedure for all tables in all schemas:
CALL storeForeignKeys(null, null); |
You can export the command sequence with the following command to a script file:
SELECT sql_statement INTO OUTFILE 'c:/Data/MySQL/apply_foreign_keys.sql' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' FROM command_list; |
While preservation of tables and foreign keys is best managed by using a tool, like MySQL Workbench, it’s always handy to have scripts to do specific tasks. I hope this helps those looking for how to preserve foreign keys. You also can find a comprehensive treatment on how to write SQL/PSM code in Chapter 14 of my Oracle Database 11g and MySQL 5.6 Developer Handbook.
SQL Injection Risks
While I tried to deflect how you perform SQL Injection attacks against a MySQL procedure, my students requested that I post examples of what to do to avoid SQL injection, and what not to do to invite attacks. The best practice to avoid SQL injection attacks is too always bind inputs to data types, and avoid providing completely dynamic WHERE
clauses.
Here’s the correct way to dynamically generate a result from a MySQL Stored Procedure:
CREATE PROCEDURE hello (IN pv_input VARCHAR(50)) BEGIN SELECT sample_id , sample_name FROM sample WHERE sample_name = pv_input; END; $$ |
A call to this hello
procedure will only return the row or rows where the pv_input
value matches the sample_name
column value. Any attempt to exploit it like the one below fails.
CALL hello('\'Harriet\' OR 1 = 1'); |
It fails because there’s no matching 'Harriet' OR 1 = 1
in the table’s sample_name
column. However, it works well when we submit 'Harriet'
by herself, without the intended SQL inject phrase “OR 1 = 1
“, as you can see:
+-----------+-------------+ | sample_id | sample_name | +-----------+-------------+ | 2 | Harriet | +-----------+-------------+ 1 row in set (0.00 sec) |
There are two approaches that you should never put in your code because they can be readily exploited unless you carefully parse the incoming string parameter. The problem in both cases is causes by concatenation rather than binding. The first example is extremely unlikely as an error but possible.
CREATE PROCEDURE hello (IN pv_input VARCHAR(50)) BEGIN SET @sql := CONCAT('SELECT sample_id, sample_name FROM sample WHERE sample_name = ',pv_input); PREPARE stmt FROM @sql; EXECUTE stmt; END; $$ |
The last hello
procedure using concatenation and a prepared statement is vulnerable to SQL injection. A call like the following would return all two rows in the sample
table.
CALL hello('\'Juliette\' OR 1 = 1'); |
It would display:
+-----------+-------------+ | sample_id | sample_name | +-----------+-------------+ | 1 | Hank | | 2 | Harriet | +-----------+-------------+ 2 rows in set (0.00 sec) |
While there’s no sample_name
value of 'Juliette'
, the “OR 1 = 1
” is true. Therefore, the SELECT
statement filters out nothing and returns all the data from the table. It’s probably clear you’d never do this type of prepared statement inside a stored procedure, but most SQL Injection attacks exploit your scripting language implementation. Unfortunately, bad coding practices can infrequently expose this type of vulnerability; and they typically occur when a junior programmers is following a bad coding example.
A solution with the WHERE
clause as part of the parameter would look like this:
CREATE PROCEDURE hello (IN pv_where VARCHAR(50)) BEGIN SELECT sample_id , sample_name FROM sample pv_where; END; $$ |
The modified call:
CALL hello('WHERE sample_name = \'Juliette\' OR 1 = 1'); |
returns all rows from the table.
A solution without the WHERE
clause as part of the parameter would look like the following but it fails during runtime and returns no rows [Updated in response to comment]. The failure has nothing to do with the comment’s dialog on the CONCAT
function, which also added nothing to the example once I tested it.
CREATE PROCEDURE hello (IN pv_where VARCHAR(50)) BEGIN SELECT sample_id , sample_name FROM sample WHERE pv_where; END; $$ |
It’s simply works only when you provide a “1 = 1” or other comparison without embedded apostrophes (‘) but fails with embedded apostrophes. That means the following statement fails:
CALL hello('sample_name = \'Juliette\' OR 1 = 1'); |
but this SQL injection statement works:
CALL hello('1 = 1'); |
returns all rows from the table.
This example, when you omit the white space also works with embedded strings or numeric operands and an operator:
CREATE PROCEDURE hello (IN pv_where VARCHAR(50)) BEGIN SELECT sample_id , sample_name FROM sample WHEREpv_where; END; $$ |
It returns all rows with a call like this:
CALL hello('sample_name = \'Juliette\' OR 1 = 1'); |
My take initially was that it might be a bug, and I logged one (Bug 68903). That’s was a dumb thing to do because WHEREpv_where
simply becomes a table alias in the query.
In conclusion, the first example is a good practice. The other two should never exist! Well, they shouldn’t exist unless you’re parsing the web form inputs vigilantly.
Hope this helps those trying to understand how to avoid SQL injection attacks. Always try to solve dynamic SQL statement problems by binding variables into statements.
MySQL Database Triggers
One of the students wanted an equivalent example to an Oracle DML trigger sample that replaces a white space in a last name with a dash for an INSERT
statement. Apparently, the MySQL trigger example in the Oracle Database 11g and MySQL 5.6 Developer Handbook was a bit long. I have to agree with that because the MySQL DML trigger demonstrated cursors and loops in the trigger code.
Triggers can be statement- or row-level actions. Although some databases let you define statement-level triggers, MySQL doesn’t support them. MySQL only supports row-level triggers. Row-level triggers support critical or non-critical behaviors. Critical behavior means the trigger observes an insert, update, or delete that must be stopped, which means it raises an error. Non-critical behavior means the trigger observes a DML statement and logs it or implements a change during the context of the DML activity.
The first example shows you non-critical behavior. It observes an attempt to enter a two-part last name, and replaces the white space with a dash (you can find help on MySQL Regular Expressions in this other post). This means the trigger ensures compliance on how names are entered in the database, which should be protected in the web form (through JQuery or JavaScript) and the database.
DELIMITER $$ DROP TRIGGER IF EXISTS contact_insert$$ CREATE TRIGGER contact_insert BEFORE INSERT ON contact FOR EACH ROW BEGIN IF new.last_name REGEXP '^.* .*$' THEN SET new.last_name := REPLACE(new.last_name,' ','-'); END IF; END; $$ DELIMITER ; |
The problem with implementing a non-critical trigger is that the database performs the work but clerks entering the data don’t learn the business rule. A critical trigger simply disallows non-conforming data entry. The next program shows a critical behavior with an UPDATE
statement row-level trigger. After all, won’t a data entry clerk update the entry with a white space after the INSERT
statement didn’t?
Yes, that was a rhetorical question. Spelling out the business rule in the UPDATE
statement row-level trigger should educate the persistent errant behavior. While letting the INSERT
statement row-level trigger manage the behavior probably saves time for most end-users who make a casual mistake.
DELIMITER $$ DROP TRIGGER IF EXISTS contact_update$$ CREATE TRIGGER contact_update BEFORE UPDATE ON contact FOR EACH ROW BEGIN IF new.last_name REGEXP '^.* .*$' THEN SIGNAL SQLSTATE '42000'; END IF; END; $$ DELIMITER ; |
Somebody wanted to know why I choose SQLSTATE
42000. That’s because it signals an error in the SQL statement, and that’s the closest existing SQLSTATE
to the actual behavior that exists. Moreover, the error identified by the critical trigger signals non-compliance with the application’s SQL standards that protects the data.
Hopefully, this helps somebody looking for a MySQL database trigger example that raises an exception. The example works with MySQL 5.5 forward because the critical trigger uses the SIGNAL
feature, which was introduced in MySQL 5.5. Another article shows you how to leverage MyISAM tables to create a logging mechanism for critical event triggers, and you click this link to the MySQL Triggers with Logging blog entry.
Placement over substance
I was stunned when a SQL query raised an ERROR 1630 (42000)
telling me the SUM
function didn’t exist in MySQL 5.5.23. The fix was simple. The opening parenthesis of the SUM
function must be on the same line as the SUM
keyword without an intervening white space. Alternatively phrased, you can’t have a line return or white space between the SUM
function name and the opening parenthesis of the call parameter list. The same rule doesn’t apply to the opening parenthesis of the FORMAT
function and it seems to me that this parsing inconsistency is problematic.
Therefore, my surprise, observation, and complaint is that all functions don’t parse the same way, using the same rules. That is, unless you use specialized SQL_MODE
settings. This assumption was borne out by Kolbe Kegel’s comment on this post, and there are 30 remaining built in functions that have specialized parsing and resolution markers.
A simplified version of the code that raises the error follows. As you’ll notice the opening parenthesis for the FORMAT
and SUM
function have intervening white space and a line return.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | SELECT t.transaction_account AS "Transaction" , LPAD(FORMAT (SUM (CASE WHEN EXTRACT(MONTH FROM transaction_date) = 1 AND EXTRACT(YEAR FROM transaction_date) = 2011 THEN CASE WHEN t.transaction_type = cl.common_lookup_type THEN t.transaction_amount ELSE t.transaction_amount * -1 END END),2),10,' ') AS "JAN" FROM TRANSACTION t CROSS JOIN common_lookup cl WHERE cl.common_lookup_table = 'TRANSACTION' AND cl.common_lookup_column = 'TRANSACTION_TYPE' AND cl.common_lookup_type = 'DEBIT' GROUP BY t.transaction_account; |
Based on the comments, the SQL_MODE is:
mysql> SELECT @@version, @@sql_mode; +-----------+----------------------------------------------------------------+ | @@version | @@sql_mode | +-----------+----------------------------------------------------------------+ | 5.5.23 | STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION | +-----------+----------------------------------------------------------------+ 1 ROW IN SET (0.00 sec) |
It raises the following error:
ERROR 1630 (42000): FUNCTION studentdb.SUM does NOT exist. CHECK the 'Function Name Parsing and Resolution' SECTION IN the Reference Manual |
Moving ONLY the opening parenthesis to the end of the SUM
keyword (or removing the line return and white space from between the SUM
keyword and opening parenthesis) prevents the error but it would be more convenient if it supported both approaches. It seems odd that an intervening line return and white space for the SUM
function raises an exception while the same intervening line return and white space doesn’t raise an exception for the FORMAT
function. It strikes me the parser should support both or reject both. Here’s the fixed code that works without enabling the IGNORE_SPACE
SQL Mode option.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | SELECT t.transaction_account AS "Transaction" , LPAD(FORMAT (SUM( CASE WHEN EXTRACT(MONTH FROM transaction_date) = 1 AND EXTRACT(YEAR FROM transaction_date) = 2011 THEN CASE WHEN t.transaction_type = cl.common_lookup_type THEN t.transaction_amount ELSE t.transaction_amount * -1 END END),2),10,' ') AS "JAN" FROM TRANSACTION t CROSS JOIN common_lookup cl WHERE cl.common_lookup_table = 'TRANSACTION' AND cl.common_lookup_column = 'TRANSACTION_TYPE' AND cl.common_lookup_type = 'DEBIT' GROUP BY t.transaction_account; |
As noted by the comments, adding the IGNORE_SPACE
to the SQL_MODE
lets both queries work without moving the open parenthesis. You can do that in a session with the following syntax (which is covered in an older post):
SET SQL_MODE=(SELECT CONCAT(@@sql_mode,',IGNORE_SPACE')); |
Hope this helps folks…
Implicit Commit Functions?
Somebody asked about the possibility of putting DML statements inside MySQL stored functions. DML statements like the INSERT
, UPDATE
, and DELETE
. When I said, “Yes, you can put DML statements inside functions.” They showed me the error they encountered, which is only raised at compilation when you put an explicit COMMIT
statement or a Data Definition Language (DDL) statement (CREATE
, ALTER
, DROP
, or RENAME
) inside a MySQL function. The actual error message displayed is:
ERROR 1422 (HY000): Explicit OR implicit commit IS NOT allowed IN stored FUNCTION OR TRIGGER. |
While an explicit COMMIT
is obvious when placed inside a function, the implicit COMMIT
statement isn’t obvious unless you know a DDL statement generates one. This means you can’t include any DDL statement inside a stored function.
The following example shows how a DDL statement creates an immediate implicit COMMIT
. It requires two sessions and Transaction Control Language (TCL) statements.
In session one, create a table, start a transaction scope, and insert one row into the table:
-- Create a table. CREATE TABLE message ( message_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , message_text VARCHAR(20)); -- Start a transaction context. BEGIN WORK; -- Insert a row into the MESSAGE table. INSERT INTO message (message_text) VALUES ('Implicit Commit?'); |
In session two, the empty set is displayed when you query the MESSAGE
table. Returning to session one, add a new column to the MESSAGE
table with this statement:
ALTER TABLE message ADD (sent DATETIME); |
The ALTER
statement automatically makes current any pending data changes, effectively committing the record from session one. You can return to the second session and query the table you get the following results:
+------------+------------------+------+ | message_id | message_text | sent | +------------+------------------+------+ | 1 | Implicit Commit? | NULL | +------------+------------------+------+ |
When you embed DML statements inside functions, they rely on automatic commit behaviors or an external transaction scope. This function compiles and returns 1 when successful and 0 when unsuccessful, which effectively controls all error conditions within the function:
CREATE FUNCTION modifies_data ( pv_message CHAR(20) ) RETURNS INT MODIFIES SQL DATA BEGIN /* Declare Boolean-like variables as FALSE. */ DECLARE lv_return_value INT DEFAULT FALSE; DECLARE lv_error_value INT DEFAULT FALSE; /* Declare a generic exit handler to reset error control variable to true. */ DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET lv_error_value := TRUE; /* Insert statement with auto commit enabled. */ INSERT INTO message (message_text) VALUES (pv_message); /* True unless the CONTINUE HANDLER disables the error control variable. */ IF lv_error_value = FALSE THEN SET lv_return_value := TRUE; END IF; /* Return local variable. */ RETURN lv_return_value; END; $$ |
With auto commit enabled, any call to the function in a query writes a row to the table. If you start a transaction and in the scope of the transaction test the function in one session and query the table in another you’ll see that transaction control can be managed outside the function. You can also manage the transaction control inside a stored procedure, which holds all the TCL commands. An example follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | CREATE PROCEDURE test_modifies_data ( pv_message CHAR(20) ) MODIFIES SQL DATA BEGIN /* Declare a control variable that manages transaction success or failure. */ DECLARE lv_success_value INT DEFAULT FALSE; /* Start a transaction context. */ START TRANSACTION; /* Set a SAVEPOINT in the transaction context. */ SAVEPOINT before_transaction; /* Call the function. */ SET lv_success_value := modifies_data(pv_message); /* Check the status of the control variable, and commit or rollback the transaction. */ IF lv_success_value = TRUE THEN COMMIT; ELSE ROLLBACK TO before_transaction; END IF; END; $$ |
Test the two with the following call to the procedure:
CALL test_modifies_data('Agent Coulson'); |
Answers my question and I hope it helps others.
Value or Reference?
In class today, we reviewed pass-by-value (IN
-only mode) parameters and pass-by-reference (INOUT
and OUT
mode) parameters for stored procedures. The analogy that finally seemed to hit home for the students was linking the modes to the story of Alice in Wonderland.
Here’s the analogy and below is the code to support it:
“A pass-by-value parameter in a procedure is like sending an immutable copy of Alice into the rabbit hole, which means she can’t shrink, grow, or learn throughout the story; whereas, a pass-by-reference parameter in a procedure is like sending Alice into the rabbit hole where she can shrink, grow, fight the Jabberwocky, and learn things that make her life better when she exits the rabbit hole – consistent with the storyline of Alice’s revisit to Wonderland.”
The example code creates a stored procedure that accepts two parameters – one pass-by-value and one pass-by-reference. Inside the procedure there’s a local variable and a reassignment of value to the pass-by-reference parameter. It’s in this wonderland
procedure (by the way don’t forget to manage the DELIMITER
value when you test it):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | CREATE PROCEDURE wonderland ( IN pv_value_param VARCHAR(20) , INOUT pv_ref_param VARCHAR(20)) BEGIN /* Declare a variable. */ DECLARE lv_value_param VARCHAR(20); /* Query the local variable and reference parameter before changing values. */ SELECT 'On Entry' AS "Where" , pv_value_param AS "Value Parameter" , pv_ref_param AS "Reference Parameter" , IFNULL(lv_value_param,' ') AS "Local Variable"; /* Assign a lowercase value parameter to a local variable. */ SET lv_value_param := LOWER(pv_value_param); /* Assign a uppercase reference parameter value to the reference parameter. */ SET pv_ref_param := UPPER(pv_ref_param); /* Query the local variable and reference parameter after changing values. */ SELECT 'On Exit ' AS "Where" , pv_value_param AS "Value Parameter" , pv_ref_param AS "Reference Parameter" , IFNULL(lv_value_param,' ') AS "Local Variable"; END; $$ |
A tester
procedure than tests how the pass-by-value and pass-by-reference modes of operation differ. It’s here:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | CREATE PROCEDURE tester ( IN pv_value_param VARCHAR(20) , IN pv_ref_param VARCHAR(20)) BEGIN /* Query the local and reference parameters. */ SELECT 'Before ' AS "Where" , pv_value_param AS "Value Parameter" , pv_ref_param AS "Reference Parameter"; /* Call the wonderland procedure that changes the pass-by-reference parameter. */ CALL wonderland(pv_value_param, pv_ref_param); /* Query the local and reference parameters. */ SELECT 'After ' AS "Where" , pv_value_param AS "Value Parameter" , pv_ref_param AS "Reference Parameter"; END; $$ |
You call the tester
program with this syntax:
CALL tester('Alice','Alice'); |
The test case returns the following values:
+----------+-----------------+---------------------+ | Where | Value Parameter | Reference Parameter | +----------+-----------------+---------------------+ | Before | Alice | Alice | +----------+-----------------+---------------------+ 1 row in set (0.00 sec) +----------+-----------------+---------------------+----------------+ | Where | Value Parameter | Reference Parameter | Local Variable | +----------+-----------------+---------------------+----------------+ | On Entry | Alice | Alice | | +----------+-----------------+---------------------+----------------+ 1 row in set (0.02 sec) +----------+-----------------+---------------------+----------------+ | Where | Value Parameter | Reference Parameter | Local Variable | +----------+-----------------+---------------------+----------------+ | On Exit | Alice | ALICE | alice | +----------+-----------------+---------------------+----------------+ 1 row in set (0.02 sec) +----------+-----------------+---------------------+ | Where | Value Parameter | Reference Parameter | +----------+-----------------+---------------------+ | After | Alice | ALICE | +----------+-----------------+---------------------+ 1 row in set (0.03 sec) |
Basically, Alice inside the pv_ref_param
parameter grows to uppercase during the trip through the wonderland
procedure, while Alice inside the pv_value_param
remains unchanged. If it didn’t help you learn a principle, maybe it gave you a laugh on how to view the travels of IN
-only and INOUT
parameters. 😉
NOTE: Line 12 in the alice
procedure is impossible with an immutable variable because the value of a call parameter to an immutable IN
-only formal parameter shouldn’t allow the call parameter value to change during the execution of the program. This means that MySQL IN
-only mode parameter values actually hold a mutable copy of the call parameter and the call parameter can be either a variable or literal value. The parameter value is discarded at the completion of procedure. This is more easily demonstrated with this assignment
procedure:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | CREATE PROCEDURE assignment ( IN pv_value_param VARCHAR(20)) BEGIN /* Query the local and reference parameters. */ SELECT 'Before ' AS "Where" , pv_value_param AS "Value Parameter"; /* Call the wonderland procedure that changes the pass-by-reference parameter. */ SET pv_value_param := UPPER(pv_value_param); /* Query the local and reference parameters. */ SELECT 'After ' AS "Where" , pv_value_param AS "Value Parameter"; END; $$ |
Therefore a call like this prints an uppercase INBOUND
string inside the program but can return nothing to the calling scope since the call parameter is a string literal.
SET @sv_session = 'inbound'; CALL assignment(@sv_session); SELECT @sv_test AS "Current Value"; |
The query displays:
+---------------+ | Current Value | +---------------+ | inbound | +---------------+ |
I hope this helps.
Oracle & MySQL Handbook
My new Oracle Database 11g & MySQL 5.6 Developer Handbook will be available at Oracle Open World 2011 (OOW2011). It’s a great book to compare and contrast approaches in Oracle and MySQL. It covers Oracle SQL & PL/SQL and MySQL SQL & SQL/PSM (Persistent Stored Modules – functions and procedures). Unfortunately, it seems like the book won’t be available on amazon.com until much later in the month, and not available from amazon.de until November. You can read about it at McGraw Hill Professional’s web site. They’ve made a special effort to get copies to OOW2011. Here’s the source code for the book because I don’t know when it’ll be on the publisher’s web site.
I’ll also be at OOW2011. They’ve scheduled me in the bookstore (probably 2nd floor of Moscone North, as usual) from 10 to 10:30 A.M. on Monday and Wednesday for a book signing. If you’re at OOW2011 and you like to stop by and say hello, I look forward to meeting you. Many folks leave comments on the posts but only a few suggest what they’d like me to write on when I’ve got a chance, and you can do that if you stop by to chat.
It’s also interesting to know how many folks use both Oracle and MySQL (any updates on that are always appreciated). Last year at the Bioinformatics and Computation Biology (ACM-BCB 2010) Conference in Niagara Falls, I found it interesting to discover how many pharmaceutical companies and national labs were using both Oracle and MySQL. They appeared consistent about using Oracle for their systems governed by legal compliance rules and MySQL for actual research.
The pharmaceutical companies also had clear barriers between the researchers and professional IT staff, specifically the DBAs. It seems that the DBAs don’t want to cede any control over installed Oracle instances, and they place barriers to research by denying additional Oracle instances when their site licenses would allow them to do so at no incremental cost. On the other hand, the DBAs are fine with letting researchers host and pilot with the MySQL Community Edition databases. This book supports those trying to figure out how to write portable SQL and how to port solutions from MySQL to Oracle and vice versa.
Hope to meet a few new folks at OOW2011. The Kindle version of the book became available 11/25/2011.
As an addendum to this original post, some folks asked for the summary of content for the new book, and the location of the errata (the errors of omission and commission in the book). Below is a summary of the book from page XVIII of the Introduction, and the errata is in the second comment to this post:
Part I: Development Components
- Chapter 1, “Architectures,” explains the Oracle 11g and MySQL 5.6 development architectures and highlights the comparative aspects of both client and server environments.
- Chapter 2, “Client Interfaces,” explains and demonstrates the basics of how you use SQL*Plus and MySQL Monitor client software.
- Chapter 3, “Security,” explains the security barriers for database servers and Data Control Language (DCL) commands that let you manage user and account privileges in the database servers.
- Chapter 4, “Transactions,” explains the nature of ACID-compliant transactions and the Two-phase Commit (2PC) process demonstrated by INSERT, UPDATE, and DELETE statements.
- Chapter 5, “Constraints,” explains the five primary database-level constraints and covers the check, not null, unique, primary key, and foreign key constraints.
Part II: SQL Development
- Chapter 6, “Creating Users and Structures,” explains how you can create users, databases, tables, sequences, and indexes.
- Chapter 7, “Modifying Users and Structures,” explains how you modify users, databases, tables, sequences, and indexes.
- Chapter 8, “Inserting Data,” explains how you insert data into tables.
- Chapter 9, “Updating Data,” explains how you update data in tables.
- Chapter 10, “Deleting Data,” explains how you delete data from tables.
- Chapter 11, “Querying Data,” explains how you query data from a single table, from a join of two or more tables, and from a join of two or more queries through set operators.
- Chapter 12, “Merging Data,” explains how you import denormalized data from external tables or source files and insert or update records in normalized tables.
Part III: Stored Program Development
- Chapter 13, “PL/SQL Basics,” explains the basics of using PL/SQL to write transactional blocks of code.
- Chapter 14, “SQL/PSM Basics,” explains the basics of using SQL/PSM to write transactional blocks of code.
- Chapter 15, “Triggers,” explains how to write database triggers in Oracle and MySQL databases.
Part IV: Appendix
- Appendix, Covers the answers to the mastery questions at the end of the chapters.
MySQL Timestamp Columns
Somebody asked how to work around an error message they got after converting one of their who-audit columns to a TIMESTAMP
column. A TIMESTAMP
column has a DEFAULT
or ON UPDATE
current timestamp. They wanted to have two TIMESTAMP
columns in the same table, with the intention of:
- Having the
created
column assign a current timestamp value on insert - Having the
updated
column assign a current timestamp value on insert and update
This is a sample table with two of the traditional four who-audit columns:
CREATE TABLE sample ( sample_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY , created TIMESTAMP NOT NULL , updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP); |
When they tried to create the table they got the following error:
ERROR 1293 (HY000): Incorrect TABLE definition; there can be ONLY one TIMESTAMP COLUMN WITH CURRENT_TIMESTAMP IN DEFAULT OR ON UPDATE clause |
While you can define a table with two columns that have a TIMESTAMP
data type, you can’t define a table with two TIMESTAMP
columns when one holds a DEFAULT
or ON UPDATE
CURRENT_TIMESTAMP
value. However, you can use a DATETIME
data type for the created
column provided it’s null allowed, like:
CREATE TABLE sample ( sample_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY , created DATETIME , updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP); |
You can then define an on insert database trigger that fires when you create a new row, like this:
CREATE TRIGGER sample_t BEFORE INSERT ON sample FOR EACH ROW BEGIN SET NEW.created = CURRENT_TIMESTAMP(); END; $$ |
An INSERT
statement would look like this:
INSERT INTO sample VALUES (NULL, NULL, NULL); |
You would see the following if you query the table:
+-----------+---------------------+---------------------+ | sample_id | created | updated | +-----------+---------------------+---------------------+ | 1 | 2011-08-11 00:26:21 | 2011-08-11 00:26:21 | +-----------+---------------------+---------------------+ |
Hope this helps other too.
Why SELECT-INTO, eh?
Somebody raised the question about writing stored functions while we were discussing scalar subqueries against COMMON_LOOKUP
tables. Common look up tables store collections of possible <OPTION>
elements. They’re basically generalized tables that contain a set of smaller tables, where the row sets make up a list of unique values.
When you write a function that must return one and only one row, the SELECT-INTO
syntax is ideal in Oracle because it automatically raises an exception when the query returns no row or two or more rows. MySQL doesn’t raise an automatic exception when a SELECT-INTO
fails to return a row but it does raise an ERROR 1722
when two or more rows are found.
Here are sample implementations written in Oracle’s PL/SQL and MySQL’s SQL/PSM languages. If you’re an Oracle developer and new to MySQL, the biggest oddity may be the delimiter, or it may be the sizing of formal parameters. Hopefully, its not the lack of formal declaration and exception blocks in SQL/PSM. For those coming from MySQL, the OR REPLACE
command or dynamic sizing of formal parameter list values are the big changes.
Oracle PL/SQL Function
The Oracle PL/SQL function is very straightforward and helped by pre-defined exceptions for both the no data found and too many rows found errors. Formal parameters inherit their physical size at run time from the calling scope program.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | -- Create or replace existing function with new implementation. CREATE OR REPLACE FUNCTION get_common_lookup_id ( pv_table VARCHAR2 , pv_column VARCHAR2 , pv_type VARCHAR2 ) RETURN NUMBER IS -- Declare a return variable. lv_return NUMBER; BEGIN -- Query data and assign it to a local variable. SELECT common_lookup_id INTO lv_return FROM common_lookup WHERE common_lookup_table = pv_table AND common_lookup_column = pv_column AND common_lookup_type = pv_type; -- Return the value found. RETURN lv_return; EXCEPTION -- Handle errors. WHEN NO_DATA_FOUND THEN RAISE; WHEN TOO_MANY_ROWS THEN RAISE; END; / |
MySQL SQL/PSM Function
The lack of an implicitly raised exception when a SELECT-INTO
structure fails to return a row adds several steps to a SQL/PSM function. You must define a custom exception and an if-block to raise the exception, which is trigger by returning a null value into the lv_return
local variable.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | /* Conditionally drop the function before trying to create it. */ DROP FUNCTION IF EXISTS get_common_lookup_id; /* Change the delimiter to write a stored procedure into the database. */ DELIMITER $$ /* Create a function that replaces a scalar subquery. */ CREATE FUNCTION get_common_lookup_id ( pv_table VARCHAR(30) , pv_column VARCHAR(30) , pv_type VARCHAR(30)) RETURNS INT UNSIGNED BEGIN /* Declare a return variable. */ DECLARE lv_return INT UNSIGNED; /* Declare a local variable for a subsequent handler. */ DECLARE no_data_found CONDITION FOR SQLSTATE '99001'; /* Query data and assign it to a local variable. */ SELECT common_lookup_id INTO lv_return FROM common_lookup WHERE common_lookup_table = pv_table AND common_lookup_column = pv_column AND common_lookup_type = pv_type; /* The local variable is only null when no rows are returned by the implicit cursor. */ IF lv_return IS NULL THEN SIGNAL no_data_found SET MESSAGE_TEXT = 'Result was no rows found.'; END IF; /* Return the value found. */ RETURN lv_return; END; $$ /* Reset the delimiter to the default to run other programs. */ DELIMITER ; |
Hope this help solve your problems.