Archive for the ‘MySQL’ Category
Oracle to MySQL dates
My students have to do their group labs in Oracle and then port them individually to MySQL. Most of the conversion resolves around dates because Oracle spoils us with their simplicity in their SQL dialect.
For example, if we wanted to add 9 days to today’s system date (June 27, 2009) we can do this in Oracle:
SQL> SELECT SYSDATE + 9 FROM dual; |
It prints
06-JUL-09 |
If we tried the equivalent in MySQL, we get a null because it treats any day value over 31 as a null. The maximum date in any month is 31, regardless of month. If you add more days than the maximum number minus your current date, you return a null. This is because adding a day never changes the month, and that yields invalid dates. In MySQL, you need to make this calculation with the adddate()
or date_add() functions because they’ll increment months and years.
By way of example, if we only added four to today’s date (June 27, 2009) and formatted the output as a date, it works
mysql> SELECT str_to_date(utc_date() + 4,'%Y%m%d'); |
We’d get this:
+--------------------------------------+ | str_to_date(utc_date() + 4,'%Y%m%d') | +--------------------------------------+ | 2009-06-31 | +--------------------------------------+ |
I always wanted an extra day in June. 😉
In short, there are three possible non-dates in February that format as dates, and one day in every thirty day month. I’ve a hunch this is a bug (I’ve logged a bug and they’ve verified it and moved it to triage).
The correct way to perform this calculation in MySQL is to use either the adddate()
or date_add()
functions. They take the same arguments. Here’s a quick example:
mysql> SELECT adddate(utc_date(),INTERVAL 9 DAY); |
It yields the correct date:
+------------------------------------+ | adddate(utc_date(),INTERVAL 9 DAY) | +------------------------------------+ | 2009-07-06 | +------------------------------------+ |
I know this answers a student email but I hope it helps a few others too.
MySQL security risk?
Somebody asked me why you must provide the IDENTIFIED BY password
clause on a grant after you’ve already created the user. This is a very good question, I wondered it myself a while back. This blog provides the what, why, and how of it.
The reason you typically must provide the IDENTIFIED BY password
clause is that typically you’ve enabled the NO_AUTO_CREATE_USER
value in your sql_mode
. That parameter is there by default from MySQL 5.0.2 forward. It’s what requires you to use the IDENTIFIED BY password
clause.
The parameter enforces a match against user
, host
, and password
columns in the user
table of the mysql
database. If you disable it, the match is on user
and host
columns in the user
table. If you make a grant to a user without specifying a host and a wildcard (%) host doesn’t exist for that user, the grant will insert a duplicate user. The same user name with a wildcard host trumps the one with a localhost
host value for access privileges, which means that user no longer needs a password to gain entry to the database.
The following steps will show and explain what happens in MySQL’s security model.
1. Sign on as the root
user, then create a new myuser
user:
mysql> CREATE USER 'myuser'@'localhost' IDENTIFIED BY 'myuser'@'localhost'; |
If you attempt to grant permissions to the myuser
without a password value, like this
mysql> GRANT ALL ON sampledb.* TO 'myuser'@'localhost'; |
You should see an outcome like the following. At least, you will when the NO_AUTO_CREATE_USER
value is set in your sql_mode
variable.
ERROR 1133 (42000): Can't find any matching row in the user table |
If it let you set it, then the NO_AUTO_CREATE_USER
value isn’t set. You can check what’s set with this query:
mysql> SELECT @@sql_mode; |
You should see this:
+----------------------------------------------------------------+ | @@sql_mode | +----------------------------------------------------------------+ | STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION | +----------------------------------------------------------------+ |
2. Reset the sql_mode
to disable the NO_AUTO_CREATE_USER
value, like this (presuming you’ve got the generic MySQL 5.1 settings):
mysql> SET sql_mode='STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION'; |
3. After disabling the restriction, you can grant privileges on a sampledb
to the myuser
user without the IDENTIFIED BY password
clause. You’d use the following syntax:
mysql> GRANT ALL ON sampledb.* TO 'myuser'@'localhost'; |
Before you sign off and sign on to test your new user, you should now query the USER
table. Here’s the query to focus on the three important columns:
mysql> SELECT USER, host, password -> FROM USER -> WHERE USER = 'myuser'; |
If you return more than one row, you made an error and most likely left off the @'localhost'
or used @'%'
. If you made that mistake, you’ll find that disabling the NO_AUTO_CREATE_USER
value, can yield multiple entries in the user
table for the same user. Worse yet, the user no longer can sign in with their password.
+--------+-----------+-------------------------------------------+ | USER | host | password | +--------+-----------+-------------------------------------------+ | myuser | % | | | myuser | localhost | *CBA73BBE5D9AF59311C3F4D7E8C20AA847F7B188 | +--------+-----------+-------------------------------------------+ |
If you were to sign off and attempt to sign on with the myuser
account password, you’d get the following error message:
ERROR 1045 (28000): Access denied FOR USER 'myuser'@'localhost' (USING password: YES) |
You can now sign on as the myuser
user only without a password. Fixing it is simple, just drop the user for any host:
mysql> DROP USER 'myuser'@'%'; |
While disabling the sql_mode
parameters makes your job of granting privileges easier, it risks compromising your database. Therefore, avoid it or be careful. If I missed something or got it wrong, let me know.
Excel date conversion
I put together a post on how to upload to MySQL from a CSV file with dates. It was more or less for my students but one of them was curious how the mega formula worked. As commented, the easier solution is to create a custom format. Oddly, Open Office does support the MySQL default format natively.
Excel doesn’t support the native MySQL date format as a default format mask, which is YYYY-MM-DD, or 2009-06-02 for June 2, 2009. That means you have to convert it from a scalar date to a string or create a custom format mask (see Dmitri’s comment below). If you just shook your head at the term scalar date, maybe a custom format mask is best. However, if you want a programming solution let me explain that Excel supports only three data types. They’re a string literal, a numeric literal, and a formula. Dates in Excel are merely formatted numbers. When the numbers are integers, the date is a date, but when the number has a fractional component, the date is really a timestamp.
Here’s a brief description of the process required to convert a date in Excel into a MySQL date format string literal in a CSV file. You need the following Excel functions:
Date Functions
- The
DAY(date)
function returns a 1 or 2 digit numeric value for the day of the month, with ranges of 1 to 28, 1 to 29, 1 to 30, or 1 to 31 dependent on the month and year. - The
MONTH(date)
function returns a 1 or 2 digit numeric value for the month of the year. - The
YEAR(date)
function returns a 4 digit numeric value for the year.
Logical Functions
- The
IF(logical_expression,truth_action,false_action)
function returns the truth action when the expression is true, and the false action when the expression isn’t true.
MySQL Server
CONCATENATE(string_1, string_2, ...)
glues strings together.LEN(numeric_value)
function returns the length of a string or number.
MySQL requires that you return an eight character string of numbers. The first four numbers must be a valid year, the fifth and sixth numbers a valid month, and the seventh and eigth numbers a valid day in the context of the year and month provided. Unfortunately, the DAY()
and MONTH()
functions may return a 1 or 2 digit value. That can’t happen in the CSV file’s string for a date, so you use the IF()
and LEN()
functions to guarantee a 2 digit return value.
Here are the examples that guarantee 2 digit day and month values, assuming that the base date is in the A1 cell. The concatenation of a "0"
(zero between two double quotes) or the ""
(two double quotes or a string null) ensures the number data types become strings.
=IF(LEN(DAY(A1))=1,CONCATENATE("0",DAY(A1)),DAY(A1)) =IF(LEN(MONTH(A1))=1,CONCATENATE("0",MONTH(A1)),MONTH(A1)) |
A zero is placed before the day or month when the logical condition is met, which means the day or month value is a single digit string. A null is place before the day or month when the logical condition isn’t met, which means the day or month value is a two digit string. There’s only one problem with these mega functions. They return a number.
The year calculation doesn’t require the explicit casting when you concatenate it with the other strings because it is implicitly cast as a string. However, it’s a better practice to include it for clarity (most folks don’t know about the implicit casting behaviors in Excel).
=CONCATENATE(YEAR(A1),"-",IF(LEN(MONTH(A1))=1,CONCATENATE("0",MONTH(A1)),MONTH(A1)),"-",IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),DAY(A1))) |
As Goodwin reported in a comment, there’s an easier way that I missed. You can simply use the TEXT function when the source column is a valid serialized date value.
=TEXT(A1,"YYYYMMDD") |
You can see the full MySQL import from CSV in the previous post. Naturally, you may want to copy and paste special the value before creating the CSV file. Also, don’t forget to delete any unused columns to the right or rows beneath because if you don’t your file won’t map to your table definition.
MySQL Upload from CSV
You have one approach to uploading CSV file data into a MySQL database. You use the LOAD
statement. There two caveats about using the LOAD
command, and they are:
- You must convert dates to the default MySQL format – YYYYMMDD, or the four digit year, two digit month, and two digit day.
- You can’t use
AUTO_INCREMENT
from the source to the loading table.
It’s untrue that you must input dates as strings and convert them in the database. However, you can do that if you like. I’d suggest you open the file in Microsoft Excel, then create a column with the correct data format. You can do that by creating a custom format, or with a formula. I figure the custom format is easiest but sometimes a formula is necessary for applications.
Excel Formula to Convert a Date to MySQL Format ↓
This is an example formula that’ll convert any supported Excel date format into a CSV-compliant MySQL date string.
The formula is large because MySQL can’t process a date when a one character month or day occurs in the string. You can find more on the logic of the Excel mega formula in this blog post.
=CONCATENATE(CONCATENATE("",YEAR(A1)),IF(LEN(MONTH(A1)),CONCATENATE("0",MONTH(A1))),IF(LEN(DAY(A1)) = 1,CONCATENATE("0",DAY(A1)),CONCATENATE("",DAY(A1)))) |
You should note that loading a date at the end of a line has problems in MySQL. The easiest fix that I’ve found is to place a comma at the end of each line. In a Microsoft world, that eliminates the need for the \r
from the LINES TERMINATED BY
clause.
While this shows a trivial amount of data, here’s a CSV file to use in the test case. Create a directory like Data
off your base logical drive or mount point. Then create a file named transaction_upload.csv and put the following data set in it.
20090102,5 20091231,5 20091128,5 20090616,5 |
Create an importing.sql
file in the C:\Data
folder with the contents below. It’s only a recommendation but generally import tables aren’t useful in between uploads, and that’s why I create the table as an in-memory table. In-memory tables are dropped when you recycle (boot) the MySQL instance.
-- Conditionally drop the table. DROP TABLE IF EXISTS transaction_upload; -- Create the new upload target table. CREATE TABLE transaction_upload ( transaction_date DATE , transaction_amount FLOAT ) ENGINE=MEMORY; -- Load the data from a file, don't forget the \n after the \r on Windows or it won't work. LOAD DATA INFILE 'c:/Data/mysql/transaction_upload.csv' INTO TABLE transaction_upload FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\r\n'; -- Select the uploaded records. SELECT * FROM transaction_upload; |
Before you connect as anything other than the root account, you’ll need to grant the global FILE
privilege to the targeted user. This is true because you’ve not qualified a LOCAL
file in the import statement. Users that have all privileges on a given database also have read-write access to LOCAL
tables, which is read write to client-side files. When the LOCAL
key word is omitted, you’re instructing a read from the server by the client tool. You must therefore grant a global permission to enable a connected user can call a server-side file.
If you want to import without granting any additional global permissions, you can rewrite the statement like this:
LOAD DATA LOCAL INFILE 'c:/Data/mysql/transaction_upload.csv' INTO TABLE transaction_upload FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\r\n'; |
If you don’t want to rewrite the statement, you’ll get the following error:
ERROR 1045 (28000): Access denied FOR USER 'student'@'localhost' (USING password: YES) Empty SET (0.01 sec) |
Grant Global File Permission ↓
This shows you how to grant a global file permissions to a restricted user. It is only necessary when you want the user to read or write server-side files.
C:\Data> mysql -uroot -pcangetin -P3306 |
Then, grant the global privilege to the user:
mysql> GRANT FILE ON *.* TO 'student'@'localhost' IDENTIFIED BY 'student'; |
Now, you can connect to the data base as the restricted user and read an external server-side file:
C:\Data> mysql -ustudent -pstudent -P3306 |
Once you’ve granted these permissions, the user is no longer truly a restricted user. I’d strongly discourage doing this if the user is accessed via web applications.
Run the script:
mysql> \. importing.sql |
Then, you should see the following:
+------------------+--------------------+ | transaction_date | transaction_amount | +------------------+--------------------+ | 2009-01-02 | 5 | | 2009-12-31 | 5 | | 2009-11-28 | 5 | | 2009-06-16 | 5 | +------------------+--------------------+ 4 rows in set (0.00 sec) |
Another quick tidbit, dates can be tricky when they’re the last column in the file. The best solution is to put a comma at the end of each string. If you put the comma at the end of each line, you can also dispense with the \r
on the Windows platform.
MySQL Stored Procedure
A friend, who’s a MySQL-only developer, asked me a question that I couldn’t resist sharing. He wanted to know how to simplify his PHP code by making a single call to the database for a set of inserts to multiple tables.
You can imagine my delight when I got to say a stored procedure. He said, “It’s not Oracle; that won’t work.” I said, “MySQL supports stored procedures too.” Then, he asked how do you do that. Since the reference manual is a bit sparse on that, here’s a quick example.
Here are the detailed steps, even though there are some other blog entries with more information on these topics.
1. Sign on as the root user, if you need more on that go here. Create a database – storeddb
:
CREATE DATABASE storeddb; |
2. Create a user and grant them permissions:
mysql> CREATE USER 'student' IDENTIFIED BY 'student'; mysql> GRANT ALL ON storeddb.* TO 'student'@'localhost' IDENTIFIED BY 'student'; |
3. Create a create_mysql_procedure.sql
file with the following contents:
-- Select the database. USE storeddb; -- Conditionally drop the objects to make this rerunnable. DROP TABLE IF EXISTS a; DROP TABLE IF EXISTS b; DROP PROCEDURE IF EXISTS double_insert; -- Create the tables. CREATE TABLE a ( a_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , a_text CHAR(12)); CREATE TABLE b ( b_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , a_id INT UNSIGNED , b_text CHAR(12)); -- Reset the execution delimiter to create a stored program. DELIMITER $$ -- Create a stored procedure. CREATE PROCEDURE double_insert ( input_a CHAR(12), input_b CHAR(12)) BEGIN START TRANSACTION; INSERT INTO a VALUES (NULL, input_a); INSERT INTO b VALUES (NULL, last_insert_id(), input_b); COMMIT; END; $$ -- Reset the delimiter to the default. DELIMITER ; -- Declare a couple local session variables. SET @text1 = 'This is one.'; SET @text2 = 'This is two.'; -- Call the local procedure. CALL double_insert(@text1,@text2); SELECT * FROM a; SELECT * FROM b; |
4. Quit the session as the root user.
mysql> QUIT; |
5. Sign on as the student
user.
C:\> mysql -ustudent -pstudent |
6. As the student
user, source the file. You have two ways to do that. One leverage csh/Tcsh
shell syntax and the other uses Bourne, Korn, or BASH
shell syntax.
6(a). The csh/Tcsh
syntax:
mysql> SOURCE create_mysql_procedure.sql |
6(b). The Bourne, Korn, or BASH syntax:
mysql> \. create_mysql_procedure.sql |
When you source it, you should see the following. Don’t worry if you see the three warnings because when you rerun the script they won’t be there. There telling you that the tables didn’t exist to be dropped the first time.
DATABASE changed Query OK, 0 ROWS affected, 1 warning (0.00 sec) Query OK, 0 ROWS affected, 1 warning (0.00 sec) Query OK, 0 ROWS affected, 1 warning (0.00 sec) Query OK, 0 ROWS affected (0.14 sec) Query OK, 0 ROWS affected (0.00 sec) Query OK, 0 ROWS affected (0.08 sec) Query OK, 0 ROWS affected (0.00 sec) Query OK, 0 ROWS affected (0.00 sec) Query OK, 0 ROWS affected (0.09 sec) +------+--------------+ | a_id | a_text | +------+--------------+ | 1 | This IS one. | +------+--------------+ 1 ROW IN SET (0.00 sec) +------+------+--------------+ | b_id | a_id | b_text | +------+------+--------------+ | 1 | 1 | This IS two. | +------+------+--------------+ 1 ROW IN SET (0.00 sec) |
A quick note to those new to automatic numbering in MySQL. You use a null
when you don’t want to write an override signature for the INSERT
statement. You can find more on SQL Automated Numbering for Oracle, MySQL, and SQL Server in this blog post.
7. Write the following PHP program, and name it call_mysql_procedure.php
. We’re going to call it from the command line, but you shouldn’t have to modify it when you call it from a browser.
<?php // Attempt to connect to your database. $c = @mysqli_connect("localhost", "student", "student", "storeddb"); if (!$c) { print "Sorry! The connection to the database failed. Please try again later."; die(); } else { // Initialize a statement in the scope of the connection. $stmt = mysqli_stmt_init($c); // Declare two variables for the test procedure call. $val1 = "Hello Hal!"; $val2 = "Hello Dave!"; // Set the call statement, like a SQL statement. $sql = "CALL double_insert(?,?)"; // Prepare the statement and bind the two strings. if (mysqli_stmt_prepare($stmt, $sql)) { mysqli_stmt_bind_param($stmt, "ss", $val1, $val2); // Execute it and print success or failure message. $success = mysqli_stmt_execute($stmt); if ($success) { print "Congrats! You've executed a MySQL stored procedure from PHP!"; } else { print "Sorry, I can't do that Dave..."; } } } ?> |
8. Run the call_mysql_procedure.php
from the command line, like this:
php call_mysql_procedure.php |
You should see the following message:
Congrats! You've executed a MySQL stored procedure from PHP! |
9. You can now connect to the MySQL database storeddb
and re-query the tables a
and b
. You should see the following, which tells you that your PHP code worked.
mysql> SELECT * FROM a; +------+--------------+ | a_id | a_text | +------+--------------+ | 1 | This IS one. | | 2 | Hello Hal! | +------+--------------+ 2 ROWS IN SET (0.00 sec) mysql> SELECT * FROM b; +------+------+--------------+ | b_id | a_id | b_text | +------+------+--------------+ | 1 | 1 | This IS two. | | 2 | 2 | Hello Dave! | +------+------+--------------+ 2 ROWS IN SET (0.00 sec) |
I guess Alice that writing PL/SQL may actually be portable to other databases? Actually, there are some severe limits on loops in MySQL’s implementation but you can move basic stored procedures as of MySQL 5.1. Officially, MySQL 5.0 supports them.
As always, I hope this helps some folks.
MySQL Information Schema
Having gone through the old DBA training at Oracle, I really appreciated finding a nice diagram of the MySQL Database Catalog. The catalog is found in the information_schema
database. If you click on the image file, it’ll take you to the site where the original Visio File is found. You can also download it from there.
What may appear as a downside of the information_schema
is that you can’t grant even select privileges to external users. You’d get the following if you tried to do so as the root
user.
mysql> GRANT SELECT ON information_schema.* TO myuser; ERROR 1044 (42000): Access denied FOR USER 'root'@'localhost' TO DATABASE 'information_schema' |
There’s a reason for this behavior. You already have select privileges by default because the information_schema
is a query only repository and open to all users. Here’s a quick example of accessing them from inside another database, which requires that you reference the tables with the owning database name.
SELECT t.table_name , c.column_name FROM information_schema.tables t JOIN information_schema.columns c ON t.table_name = c.table_name WHERE t.table_schema = 'SAMPLEDB'; |
Hope this answers a question or two.
MySQL Merge gone Awry
Sometimes it gets tiresome when people take shots at Oracle, MySQL, SQL Server 2008, or PostgreSQL. When I went to the MySQL for Database Administrators, the instructor mentioned a number of times how many Oracle people he’d get in his class. It was said almost as if all the Oracle customers were migrating to MySQL, which I don’t think is the case. If I’m wrong just post a comment.
After writing SQL since 1985 (IBM SQL/DS), I’ve come to appreciate a number of the features in the Oracle database. This probably makes sense because I worked there for over eight years, and probably would still be there if I hadn’t left to teach at University. I’m constantly amazed as I explore and look for points of commonality across SQL dialects and PL/SQL dialects. At least, I think it’s fair to call T-SQL stored programs and MySQL stored programs PL/SQL dialects.
This blog post is about the MERGE
statement and it’s close MySQL cousin, the ON DUPLICATE KEY UPDATE
statement (that only works with single row INSERT
statements). I’ve recently added a post demonstrating the REPLACE INTO
command syntax. You may also be interested in a newer blog post about importing external data inside a MySQL procedure with cursor loops. Upfront, I have to vote for Oracle’s MERGE
statement because it’s more complete as an implementation. By the way, T-SQL supports the same MERGE
syntax. They’re so exact it wasn’t worth taking the space to show the syntax for the SQL Server 2008 Express product.
Oracle’s Merge Statement
The downside of Oracle’s MERGE
statement is that it takes a LOT of typing. The upside from my perspective is that it enforces that you must use the primary key column. The same is not true with the MySQL syntax.
Here’s a quick example that you can cut and paste into your environment for Oracle Database 11g or remove the Oracle specific FROM dual
and it’ll run in SQL Server. You can also see how to leverage joins and imports with the MERGE
statement in this more recent blog post.
-- Conditionally drop the table and sequence. BEGIN FOR i IN (SELECT NULL FROM user_tables WHERE TABLE_NAME = 'SYSTEM_USER') LOOP EXECUTE IMMEDIATE 'DROP TABLE system_user CASCADE CONSTRAINTS'; END LOOP; FOR i IN (SELECT NULL FROM user_sequences WHERE sequence_name = 'SYSTEM_USER_S1') LOOP EXECUTE IMMEDIATE 'DROP SEQUENCE system_user_s1'; END LOOP; END; / -- Create the table. CREATE TABLE system_user ( system_user_id NUMBER CONSTRAINT pk_su PRIMARY KEY , system_user_name VARCHAR2(20) CONSTRAINT nn_su_1 NOT NULL , system_user_group_id NUMBER CONSTRAINT nn_su_2 NOT NULL , system_user_type NUMBER CONSTRAINT nn_su_3 NOT NULL , first_name VARCHAR2(20) , middle_name VARCHAR2(10) , last_name VARCHAR2(20) , created_by NUMBER CONSTRAINT nn_su_4 NOT NULL , creation_date DATE CONSTRAINT nn_su_5 NOT NULL , last_updated_by NUMBER CONSTRAINT nn_su_6 NOT NULL , last_update_date DATE CONSTRAINT nn_su_7 NOT NULL); -- Create the sequence with a default start value of 1. CREATE SEQUENCE system_user_s1; -- Insert new row. INSERT INTO system_user VALUES ( system_user_s1.nextval , 'SYSADMIN' , 1 , 1 , NULL , NULL , NULL , 1 , SYSDATE - 1 , 1 , SYSDATE - 1); -- Insert new or merge into existing row. MERGE INTO system_user target USING (SELECT 1 AS system_user_id , 'SYSADMIN' AS system_user_name , 1 AS system_user_group_id , 1 AS system_user_type , 'Samuel' AS first_name , 'the' AS middle_name , 'Lamanite' AS last_name , 1 AS created_by , SYSDATE AS creation_date , 1 AS last_updated_by , SYSDATE AS last_update_date FROM dual) SOURCE ON (target.system_user_id = SOURCE.system_user_id) WHEN MATCHED THEN UPDATE SET first_name = 'Samuel' , middle_name = 'the' , last_name = 'Lamanite' , last_updated_by = 1 , last_update_date = SYSDATE WHEN NOT MATCHED THEN INSERT ( target.system_user_id , target.system_user_name , target.system_user_group_id , target.system_user_type , target.first_name , target.middle_name , target.last_name , target.created_by , target.creation_date , target.last_updated_by , target.last_update_date ) VALUES ( SOURCE.system_user_id , SOURCE.system_user_name , SOURCE.system_user_group_id , SOURCE.system_user_type , SOURCE.first_name , SOURCE.middle_name , SOURCE.last_name , SOURCE.created_by , SOURCE.creation_date , SOURCE.last_updated_by , SOURCE.last_update_date ); |
MySQL On Duplicate Key Update Statement
The downside of MySQL’s ON DUPLICATE KEY UPDATE
statement is that it takes it lets you use an override signature on the INSERT
that then creates a new row when it shouldn’t. I logged a bug when I discovered that behavior earlier today but they don’t quite agree that it’s a bug. The upside is that the typing is MUCH shorter provided you remember to use the auto increment key column in the SELECT
clause.
Here’s a quick example of what not to do! The next one shows you what you should do. You can see how to tie together MySQL subroutines with an INSERT ... ON DUPLICATE KEY
statement in this more recent blog entry.
-- Conditionally drop the table. DROP TABLE IF EXISTS system_user; -- Create the table. CREATE TABLE system_user ( system_user_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , system_user_name CHAR(20) NOT NULL , system_user_group_id INT NOT NULL , system_user_type INT NOT NULL , first_name CHAR(20) , middle_name CHAR(10) , last_name CHAR(20) , created_by INT NOT NULL , creation_date DATE NOT NULL , last_updated_by INT NOT NULL , last_update_date DATE NOT NULL); -- Insert new row. INSERT INTO system_user ( system_user_name , system_user_group_id , system_user_type , created_by , creation_date , last_updated_by , last_update_date ) VALUES ('SYSADMIN' , 1 , 1 , 1 , DATE_SUB(NOW(),INTERVAL 1 DAY) , 1 , DATE_SUB(NOW(),INTERVAL 1 DAY)); -- Insert new or merge into existing row. INSERT INTO system_user ( system_user_name , system_user_group_id , system_user_type , first_name , middle_name , last_name , created_by , creation_date , last_updated_by , last_update_date ) VALUES ('SYSADMIN' , 1 , 1 ,'Samuel' ,'the' ,'Lamanite' , 1 , NOW() , 1 , NOW()) ON DUPLICATE KEY UPDATE first_name = 'Samuel' , middle_name = 'the' , last_name = 'Lamanite' , last_updated_by = 1 , last_update_date = UTC_DATE(); |
This script ends up inserting two rows when only one should be present. Why did that happen? Great question! You can fix this by adding a unique key defined for the columns that make up the natural key for the SYSTEM_USER
table. As noted by a comment below, the unique key must only include columns that are NOT NULL
constrained. Here’s the results without such a unique key:
mysql> SELECT * FROM system_user\G *************************** 1. ROW *************************** system_user_id: 1 system_user_name: SYSADMIN system_user_group_id: 1 system_user_type: 1 first_name: NULL middle_name: NULL last_name: NULL created_by: 1 creation_date: 2009-05-24 last_updated_by: 1 last_update_date: 2009-05-24 *************************** 2. ROW *************************** system_user_id: 2 system_user_name: SYSADMIN system_user_group_id: 1 system_user_type: 1 first_name: Samuel middle_name: NULL last_name: Lamanite created_by: 1 creation_date: 2009-05-25 last_updated_by: 1 last_update_date: 2009-05-25 2 ROWS IN SET (0.02 sec) |
The correct way to do this in MySQL is shown in the next example.
-- Conditionally drop table. DROP TABLE IF EXISTS system_user; -- Create table. CREATE TABLE system_user ( system_user_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , system_user_name CHAR(20) NOT NULL , system_user_group_id INT NOT NULL , system_user_type INT NOT NULL , first_name CHAR(20) , middle_name CHAR(10) , last_name CHAR(20) , created_by INT NOT NULL , creation_date DATE NOT NULL , last_updated_by INT NOT NULL , last_update_date DATE NOT NULL); -- Insert new row. INSERT INTO system_user ( system_user_name , system_user_group_id , system_user_type , created_by , creation_date , last_updated_by , last_update_date ) VALUES ('SYSADMIN' , 1 , 1 , 1 , DATE_SUB(UTC_DATE(),INTERVAL 1 DAY) , 1 , DATE_SUB(UTC_DATE(),INTERVAL 1 DAY)); -- Insert new or merge into existing row. INSERT INTO system_user VALUES ( 1 ,'SYSADMIN' , 1 , 1 ,'Samuel' ,'the' ,'Lamanite' , 1 , NOW() , 1 , NOW()) ON DUPLICATE KEY UPDATE first_name = 'Samuel' , middle_name = 'the' , last_name = 'Lamanite' , last_updated_by = 1 , last_update_date = NOW(); |
This ensures that the auto increment column values are matched. It returns what you’d expect, a single row inserted into or updated in the table.
mysql> SELECT * FROM system_user\G *************************** 1. ROW *************************** system_user_id: 1 system_user_name: SYSADMIN system_user_group_id: 1 system_user_type: 1 first_name: Samuel middle_name: NULL last_name: Lamanite created_by: 1 creation_date: 2009-05-24 last_updated_by: 1 last_update_date: 2009-05-25 1 ROW IN SET (0.00 sec) |
I hope this saves somebody from an insertion anomaly.
MySQL Installation and More
Installing MySQL is pretty straightforward, especially on a Windows operating system. However, recently I was surprised to find out that there were folks who didn’t know you could install multiple version on their Windows development machine. Here’s are the step-by-step instructions to do so. At the end of the screen shots you’ll find out how to sign-on and create your first database and user account. Naturally, it’s all at the command line because they work on any platform the same way.
1. After you download an Microsoft Software Installation (MSI) file or the zip file, you launch the installer by clicking the Next button:
2. If you want to install only one copy of the software at any given time, you can accept the default Typical radio button choice. This way, you can simply uninstall the software before you install a new version. When you want to install multiple versions on a single machine, you should check the Custom radio button. This choice lets you maintain multiple versions on your single development machine and check changes between releases. After choosing a direction, you click the Next button to continue.
3. The first thing you should do is upgrade the Developer Components. You do that by clicking on the down arrow to the right of the , which yields the context menu below. Place your mouse over the top most This feature will be installed on local hard drive. After that you can begin to customize the installation. 4. This is a step that you’ll see when you opt to do a Custom installation. It is important that you click the Change button and change the default directory for the installation. When you click the Change button it will launch a File Finder window. 5. You can put the files anywhere you choose but I’d recommend that you opt to use something like this directory path: As you’ll see in the screen shot mine is slightly different because my virtual Windows XP machine is a 64-bit operating system. That’s why you’ll see the 6. The choice of your file directory is now displayed below the caption Install to:, and you click the Next button to continue. 7. Before you click that Next button, you have one more step if you plan on having multiple instances installed on your machine. You need to change the default storage location for data files. You do that by clicking on the MySQL Server Datafiles. When you click on that, you’ll see the following default file location below the caption Install to: You click the Change button to enter another location for the database files. 8. After you click on Change button, you’ll see the File Finder. You can then enter the directory that you’d prefer for the installation. My suggestion is the following: Click on the OK button to complete the selection of a customized database file location. 9. You should now see the wizard screen again. Choose the Next button to cotinue. 10. This is the screen where you see what you’re about to do, before you do it. It should look like this if you’re following my instructions or like whatever you’ve opted to use in your installation. Click the Install button to begin the installation. 11. The installation starts and you’ll see a dialog with a creeping status bar. It should take about 2-3 minutes on many computers but it may seem longer. After the status bar, you’ll see the following marketing screen. Click the Next button to continue. 12. The second marketing dialog then shows up, just click the Next button to continue. 13. The installation is now complete. You should ensure you’ve checked the Configure the MySQL Server now box. Click the Finish button to end the installation of the file system and begin the configuration of the MySQL server. 14. The next dialog is the configuration welcome. Click the Next button to continue. 15. This is an important dialog because you must choose between a Standard or Detailed configuration. You should check the Detailed radio button When you plan to install more than one MySQL version on your computer because you’ll need to control the listening port for the server. A default MySQL Sever installation sets the listening port at 3306. Typically, you’ll want the main version to run on port 3306 and set the other versions to 3307, 3308, et cetera. 16. This dialog lets you choose how to configure performance characteristics of your MySQL Server. As you become more skilled at MySQL, you’ll find that these are configurable in a file, but for now using the interface is probably your best bet. The choice you make here sets how much memory MySQL will take while running. Typically, you should choose Developer Machine unless this is a server installation or you have a log of extra memorty to dedicate to MySQL Server (that’s the mysqld process). 17. This dialog lets you configure the Inno DB files location. You should make sure that you ensure they’ll be different if you’re installing multiple releases. You click the drop down arrow for the Installation Path box, choose one and manually edit it, like shown for MySQL 5.1. 18. This dialog governs configuration parameters about concurrency. If this is for a developer machine click the Decision Support (DSS)/OLAP radio button. It generally gives you more than enough connections for development activities. 19. This is the dialog where you enable a firewall exception and set a port. Since MySQL 5.1 is my main testing environment, port 3306 (the default) is selected. You also want to enable both check boxes. 20. This dialog lets you choose a multiple byte unicode character set or the standard 21. This dialog names your Microsoft Windows Service for MySQL. It is important that you change this to represent the release. The default for every release is MySQL. You should also uncheck the Include Bin Directory in Windows PATH, and set that manually in a command shell file (e.g., a mysql51.bat file). You can run the batch file when you want to work in this environment. If you allow this to append to the system path, make sure you know which one you’re using when working with MySQL. 22. This dialog sets the root password. My advise is that you leave unchecked the Create An Anonymous Account and the Enable root access from remote machines boxes. Make sure you use a password that you can remember, even if it is your personal machine. 23. You’re almost done. The next dialog confirms you want to configure MySQL. Click the Execute button to configure the server. After the configuration, you’ll see the following dialog. It means you’re done. 24. If you encounter a failure starting the service, it’s not infrequent in any number of MSI files. I wrote another post on how to work around it here. The following is the error screen that may occur. 25. Here are the steps to create your own database and user. It also shows you how to grant privileges to the new user on only their database. (a). You connect as the superuser by calling the client-side You can confirm your the root user by running the following query: (b). You create a database with the following syntax: (c). You create a user with the trivial password of the user’s name, which is a really bad example when you’re doing anything but writing documentation. The following creates the user and then grants the user all privileges on their database. (d). You sign-off as the root user by typing quit, like (e). You sign-on as the restricted You can confirm your the restricted user by running the following query: (f). You can’t do much except explore generic metadata at this point because you must select a database. You select the You could now create a Alternatively, you could place those three commands (minus the “mysql>” on each line) in a file and run the file from the command line. Just so there’s no misunderstanding, the file would look like this: If you named the file This completes the basic steps, I’ll put a link here for a cross platform comparative tutorial on client-side command line interfaces, like Oracle’s C:\Program Files\MySQL 5.1\MySQL Server 5.1\
(x86)
as part of the Windows Program Files directory path. It indicates that MySQL will be installed in the 32-bit executable folder. You click the OK button to accept the chosen directory and return control back to the MySQL installation program.C:\Documents and Settings\All Users\Application Data\MySQL\MySQL Server 5.1\
C:\Data\MySQL 5.1\MySQL Server 5.1\
latin1
. For those coming from an Oracle world, that’s WEISO8859-1
character set.mysql
program, like the following (assumes you set the password for the root user to cangetin). You don’t need to provide the port number if it is the default 3306
but will need to provide it for any other port that the mysqld service is listening on.C:\Data> mysql -uroot -pcangetin -P3306
mysql> SELECT CURRENT_USER();
mysql> CREATE DATABASE sampledb;
mysql> CREATE USER 'myuser' IDENTIFIED BY 'myuser';
mysql> GRANT ALL ON sampledb.* TO 'myuser'@'localhost' IDENTIFIED BY 'myuser';
mysql> QUIT;
myuser
with the following syntax:C:\Data> mysql -umyuser -pmyuser -P3306
mysql> SELECT CURRENT_USER();
sampledb
database by doing the following:mysql> USE sampledb;
sample
table that uses automatic numbering, insert values, and query the contents by using the following syntax.mysql> CREATE TABLE sample ( sample_id INT PRIMARY KEY AUTO_INCREMENT
, sample_text VARCHAR(20));
mysql> INSERT INTO sample (sample_text) VALUES ('One'),('Two'),('Three'),('Four'),('Five');
mysql> SELECT * FROM sample;
+-----------+-------------+
| sample_id | sample_text |
+-----------+-------------+
| 1 | One |
| 2 | Two |
| 3 | Three |
| 4 | Four |
| 5 | Five |
+-----------+-------------+
-- Open the database.
USE sampledb;
-- Split standard out, and write a log file to the relative directory.
TEE C:/DATA/sample.txt
-- Run code.
DROP TABLE IF EXISTS sample;
CREATE TABLE sample ( sample_id INT PRIMARY KEY AUTO_INCREMENT
, sample_text VARCHAR(20));
INSERT INTO sample (sample_text) VALUES ('One'),('Two'),('Three'),('Four'),('Five');
SELECT * FROM sample;
-- Turn off tee, close file to automatic appending.
NOTEE
sample.sql
and it is found in the C:\Data
directory, you’d run it like this:mysql> SOURCE C:/DATA/sample.sql
sqlplus
, MySQL’s mysql
, and Microsoft SQL Server 2008’s sqlcmd
.
SQL Automated Numbers
I’ve begun putting together an online database tutorial and expanded this entry and added horizontal scrolling to it. You can find the improved version of the blog post as blog page here.
Surrogate keys are interesting structures in databases. They’re essential if you want to make sure you optimize your design. They’re also very useful when you want to capture the automatic numbering value for a prior INSERT
statement and reuse the automatic numbering value as the foreign key value in a subsequent statement. It was interesting to see how they’re implemented differently across Oracle, MySQL, and SQL Server while providing the same utility.
Below is a synopsis of how you implement these in Oracle, MySQL, and SQL Server.
Oracle
The first thing to qualify is that Oracle is generally always in a transactional mode. That means you don’t need to do anything special to set this example up.
Oracle doesn’t support automated numbering in tables prior to Oracle 12c. Oracle 12c introduces identity columns, and the mechanics change. However, you can use sequences to mimic automated numbering prior to Oracle 12c and without identity columns in Oracle 12c. A sequence is a structure in the database that holds a current value, increments by a fixed value – typically 1. Sequences are available in SQL and PL/SQL scopes through two pseudo columns. The pseudo columns are .nextval
and .currval
(note the two r’s because it’s not a stray dog).
The sequence_name.nextval
call in any session places the next number from the sequence into your Personal Global Area (PGA), which is a memory context. After you’ve called the sequence into memory, you can access it again by using sequence_name.currval
. The sequence only changes when you call it again with the .nextval
pseudo column.
-- Conditionally drop data sturctures - tables and sequences. BEGIN FOR i IN (SELECT TABLE_NAME FROM user_tables WHERE TABLE_NAME IN ('ONE','TWO')) LOOP EXECUTE IMMEDIATE 'DROP TABLE '||i.table_name||' CASCADE CONSTRAINT'; END LOOP; FOR i IN (SELECT sequence_name FROM user_sequences WHERE sequence_name IN ('ONE_S1','TWO_S1')) LOOP EXECUTE IMMEDIATE 'DROP SEQUENCE '||i.sequence_name; END LOOP; END; / -- Create base table and sequence. CREATE TABLE one ( one_id INT NOT NULL CONSTRAINT pk_one PRIMARY KEY , one_text VARCHAR(10) NOT NULL ); CREATE SEQUENCE one_s1; -- Create dependent table and sequence. CREATE TABLE two ( two_id INT NOT NULL CONSTRAINT pk_two PRIMARY KEY , one_id INT NOT NULL , two_text VARCHAR(10) NOT NULL ); CREATE SEQUENCE two_s1; -- Insert rows into the tables with sequence values. INSERT INTO one VALUES (one_s1.nextval,'One!'); INSERT INTO one VALUES (one_s1.nextval,'Two!'); INSERT INTO two VALUES (two_s1.nextval, one_s1.currval,'Other Two!'); -- Display the values inserted with sequences. SELECT o.one_id , o.one_text , t.two_id , t.two_text FROM one o JOIN two t ON o.one_id = t.one_id; |
If you mimic automatic numbering with database triggers, you may not have access to the .currval
value for the second INSERT
statement. This occurs when you provide a NULL
value expecting the trigger to manage .NEXTVAL
call for you.
Transactions require that you keep the primary key value for the first table in a locally scoped variable for reuse. Then, you can pass it to the next INSERT
statement. You do that with the .CURRVAL
value.
You can make a potentially erroneous assumption that you’re the only user updating the table. Operating under that assumption, you can query the highest sequence number from the table before an insert, add one to it, and then attempt the INSERT
statement. In a multi-user system, it’s possible that somebody beats you to the finish line with their INSERT
statement. Your insert would then have a duplicate surrogate key value for the one_id
column, and fail on an ORA-00001
error for a uniqueness violation on a primary key column.
A database trigger can help you avoid a race condition. The trigger would ensure sequence values are unique but it may also introduce problems. A common Oracle trigger with a pseudo automatic numbering paradigm is represented by the following trigger (found in APEX generated code).
CREATE OR REPLACE TRIGGER one_t1 BEFORE INSERT ON one FOR EACH ROW BEGIN :NEW.one_id := one_s1.nextval; END; / |
Caution is required on this type of automated sequence trigger. There are two problems with this type of trigger.
One scenario is where you include a call to sequence_name.NEXTVAL
in your INSERT
statement. It then increments the sequence, and attempts to insert the value whereupon the trigger fires and repeats the behavior. Effectively, this type of logic creates a sequence that increments by one when you submit a null value in the values clause and by two when you submit a sequence_name.NEXTVAL
value.
Another scenario occurs when you attempt a bulk INSERT
operation on the table. The sequence call and substitution occurs on each row of the sequence.
You face another problem when you rewrite the trigger to only fire when a surrogate primary key isn’t provided, like this:
CREATE OR REPLACE TRIGGER one_t1 BEFORE INSERT ON one FOR EACH ROW WHEN (NEW.one_id IS NULL) -- Asynchronous with bulk insert operations when a value is provided by the bulk operation to the surrogate key column. BEGIN :NEW.one_id := one_s1.nextval; END; / |
This trigger design causes a problem only with bulk INSERT
statements. Effectively, the sequence remains unaltered when you provide surrogate key values as part of inserting an array of values. The next non-bulk INSERT
statement would then grab the .NEXTVAL
value, attempt to use it, and raise a unique constraint violation because the bulk operation probably already used the value from the sequence.
The fix to bulk operations requires that you lock the table, disable a trigger like this, and get the .NEXTVAL
value. Then, you assign the .NEXTVAL
value to two local variables. One of these remains unchanged while the other increments as you populate the array for the bulk insert operation. After assigning the result from the .NEXTVAL
, you drop the sequence and find the highest key value for the bulk insertion operation, add one to the highest key value, and store it in another locally stored variable. You perform the bulk insert operation and then recreate the sequence with a value one greater than the highest value in the table, which should already be in a locally scored variable. Don’t forget that you’d locked the table, so unlock it now.
You should note that database triggers run in a subshell with access only to the immediate shell that fired them. Therefore, you can’t set a bind variable in a SQL*Plus session and subsequently reference it inside the trigger body because it doesn’t have access to the variable.
MySQL
MySQL supports automatic numbering but not a default transactional mode like Oracle. You need to disable auto commit and start a transaction. You also need to assign the last automatic numbering value to a variable before using it in a subsequent INSERT
statement. You must also provide an overriding list of mandatory columns when you opt to exclude the automated numbering column value. The one thing that we should all appreciate about MySQL is their desire to stay close to and comply with ANSI standards.
-- Conditionally drop the tables. DROP TABLE IF EXISTS one; DROP TABLE IF EXISTS two; -- Create the tables with a surrogate key that automatically increments. CREATE TABLE one ( one_id INT PRIMARY KEY AUTO_INCREMENT , one_text VARCHAR(20)); CREATE TABLE two ( two_id INT PRIMARY KEY AUTO_INCREMENT , one_id INT , two_text VARCHAR(20)); -- Start transaction cycle. START TRANSACTION; -- Insert first row, transfer auto increment to memory. INSERT INTO one (one_text) VALUES ('One'); -- Assign last auto increment to local scope variable, the = works too. SET @one_fk := last_insert_id(); -- Insert second row with auto increment and local scope variable. INSERT INTO b (one_id, two_text) VALUES (@one_fk,'Two'); COMMIT; -- Display the values inserted with auto incremented values. SELECT o.one_id , o.one_text , t.two_id , t.two_text FROM one o JOIN two t ON o.one_id = t.one_id; |
SQL Server
SQL Server supports automatic numbering but they call it the identity value. There are two ways to use it but the one I’m showing is for SQL Server 2005 or newer. You can replace the older @@identity
for the SCOPE_IDENTITY()
function call but Microsoft has already removed first level support from SQL Server 2000. While they’ve not said @@identity
is deprecated, it sure appears that’s possible in a future release.
USE student; BEGIN TRAN; -- Conditionally drop tables when they exist. IF OBJECT_ID('dbo.one','U') IS NOT NULL DROP TABLE dbo.one; IF OBJECT_ID('dbo.two','U') IS NOT NULL DROP TABLE dbo.two; -- Create auto incrementing tables. CREATE TABLE one ( one_id INT NOT NULL IDENTITY(1,1) CONSTRAINT pk_one PRIMARY KEY , one_text VARCHAR(10) NOT NULL ); CREATE TABLE two ( two_id INT NOT NULL IDENTITY(1,1) CONSTRAINT pk_two PRIMARY KEY , one_id INT NOT NULL , two_text VARCHAR(10) NOT NULL ); -- Insert the values, and magically no override signature required. INSERT INTO one VALUES ('One!'); INSERT INTO one VALUES ('Two!'); INSERT INTO two VALUES (SCOPE_IDENTITY(),'Other Two!'); -- Query the results. SELECT o.one_id , o.one_text , t.two_id , t.two_text FROM one o JOIN two t ON o.one_id = t.one_id; COMMIT TRAN; |
You should note that T-SQL doesn’t require an override signature when you use an automatic numbering column. This is different, isn’t it?
While the prior example works with two tables, it doesn’t scale to a series of tables. You should consider the following assignment pattern when you’ll have multiple last identity values in a single transaction scope.
DECLARE @one_pk AS INT; SET @one_pk = SCOPE_IDENTITY(); |
As mentioned, this style is important when you’ve got a series of primary and foreign keys to map in the scope of a single transaction. Also, I’d suggest that you put all the declarations at the beginning of the transaction’s scope.
As always, I hope this helps some folks.
MySQL Explain Plan
I finally got the magic trick to find the explain plan for a query in MySQL, but chagrined to find out that you can’t explain the cost or execution of INSERT
, UPDATE
or DELETE
statements. This blog shows you how to get the execution and execution plan for a query and the error when you try to EXPLAIN
DML (Data Manipulation Language) statements.
You can see the selection type, possible indexes, chosen index, et cetera:
EXPLAIN query; |
You can see the selection type, possible indexes, chosen index, et cetera plus the query execution plan with the magic EXTENDED
key word:
EXPLAIN EXTENDED query; |
In some cases, you may need to type SHOW ERRORS
to see the execution plan. There’s no way to format it natively, like Oracle, but there may be a tool out there.
If you try to explain a DML statement, you’ll get an error message like the following:
ERROR 1064 (42000): You have an error IN your SQL syntax; CHECK the manual that corresponds TO your MySQL server version FOR the RIGHT syntax TO USE near 'UPDATE city SET District = 'Funny'' at line 1 |
The last rule of thumb on query execution is that you should avoid subqueries because they degrade processing speed.