MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

SQL Automated Numbers

with 4 comments

I’ve begun putting together an online database tutorial and expanded this entry and added horizontal scrolling to it. You can find the improved version of the blog post as blog page here.

Surrogate keys are interesting structures in databases. They’re essential if you want to make sure you optimize your design. They’re also very useful when you want to capture the automatic numbering value for a prior INSERT statement and reuse the automatic numbering value as the foreign key value in a subsequent statement. It was interesting to see how they’re implemented differently across Oracle, MySQL, and SQL Server while providing the same utility.

Below is a synopsis of how you implement these in Oracle, MySQL, and SQL Server.

Oracle

The first thing to qualify is that Oracle is generally always in a transactional mode. That means you don’t need to do anything special to set this example up.

Oracle doesn’t support automated numbering in tables prior to Oracle 12c. Oracle 12c introduces identity columns, and the mechanics change. However, you can use sequences to mimic automated numbering prior to Oracle 12c and without identity columns in Oracle 12c. A sequence is a structure in the database that holds a current value, increments by a fixed value – typically 1. Sequences are available in SQL and PL/SQL scopes through two pseudo columns. The pseudo columns are .nextval and .currval (note the two r’s because it’s not a stray dog).

The sequence_name.nextval call in any session places the next number from the sequence into your Personal Global Area (PGA), which is a memory context. After you’ve called the sequence into memory, you can access it again by using sequence_name.currval. The sequence only changes when you call it again with the .nextval pseudo column.

-- Conditionally drop data sturctures - tables and sequences.
BEGIN
  FOR i IN (SELECT TABLE_NAME
            FROM   user_tables
			WHERE  TABLE_NAME IN ('ONE','TWO')) LOOP
    EXECUTE IMMEDIATE 'DROP TABLE '||i.TABLE_NAME||' CASCADE CONSTRAINT';
  END LOOP;
  FOR i IN (SELECT sequence_name
            FROM   user_sequences
			WHERE  sequence_name IN ('ONE_S1','TWO_S1')) LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE '||i.sequence_name;
  END LOOP;
END;
/
 
-- Create base table and sequence.
CREATE TABLE one
( one_id   INT         NOT NULL CONSTRAINT pk_one PRIMARY KEY
, one_text VARCHAR(10) NOT NULL );
 
CREATE SEQUENCE one_s1;
 
-- Create dependent table and sequence.
CREATE TABLE two
( two_id   INT         NOT NULL CONSTRAINT pk_two PRIMARY KEY
, one_id   INT         NOT NULL
, two_text VARCHAR(10) NOT NULL );
 
CREATE SEQUENCE two_s1;
 
-- Insert rows into the tables with sequence values.
INSERT INTO one VALUES (one_s1.NEXTVAL,'One!');
INSERT INTO one VALUES (one_s1.NEXTVAL,'Two!');
INSERT INTO two VALUES (two_s1.NEXTVAL, one_s1.currval,'Other Two!');
 
-- Display the values inserted with sequences.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

If you mimic automatic numbering with database triggers, you may not have access to the .currval value for the second INSERT statement. This occurs when you provide a NULL value expecting the trigger to manage .NEXTVAL call for you.

Transactions require that you keep the primary key value for the first table in a locally scoped variable for reuse. Then, you can pass it to the next INSERT statement. You do that with the .CURRVAL value.

You can make a potentially erroneous assumption that you’re the only user updating the table. Operating under that assumption, you can query the highest sequence number from the table before an insert, add one to it, and then attempt the INSERT statement. In a multi-user system, it’s possible that somebody beats you to the finish line with their INSERT statement. Your insert would then have a duplicate surrogate key value for the one_id column, and fail on an ORA-00001 error for a uniqueness violation on a primary key column.

A database trigger can help you avoid a race condition. The trigger would ensure sequence values are unique but it may also introduce problems. A common Oracle trigger with a pseudo automatic numbering paradigm is represented by the following trigger (found in APEX generated code).

CREATE OR REPLACE TRIGGER one_t1 
BEFORE INSERT ON one
FOR EACH ROW
BEGIN
  :NEW.one_id := one_s1.NEXTVAL;
END;
/

Caution is required on this type of automated sequence trigger. There are two problems with this type of trigger.

One scenario is where you include a call to sequence_name.NEXTVAL in your INSERT statement. It then increments the sequence, and attempts to insert the value whereupon the trigger fires and repeats the behavior. Effectively, this type of logic creates a sequence that increments by one when you submit a null value in the values clause and by two when you submit a sequence_name.NEXTVAL value.

Another scenario occurs when you attempt a bulk INSERT operation on the table. The sequence call and substitution occurs on each row of the sequence.

You face another problem when you rewrite the trigger to only fire when a surrogate primary key isn’t provided, like this:

CREATE OR REPLACE TRIGGER one_t1 
BEFORE INSERT ON one
FOR EACH ROW
WHEN (NEW.one_id IS NULL)  -- Asynchronous with bulk insert operations when a value is provided by the bulk operation to the surrogate key column.
BEGIN
  :NEW.one_id := one_s1.NEXTVAL;
END;
/

This trigger design causes a problem only with bulk INSERT statements. Effectively, the sequence remains unaltered when you provide surrogate key values as part of inserting an array of values. The next non-bulk INSERT statement would then grab the .NEXTVAL value, attempt to use it, and raise a unique constraint violation because the bulk operation probably already used the value from the sequence.

The fix to bulk operations requires that you lock the table, disable a trigger like this, and get the .NEXTVAL value. Then, you assign the .NEXTVAL value to two local variables. One of these remains unchanged while the other increments as you populate the array for the bulk insert operation. After assigning the result from the .NEXTVAL, you drop the sequence and find the highest key value for the bulk insertion operation, add one to the highest key value, and store it in another locally stored variable. You perform the bulk insert operation and then recreate the sequence with a value one greater than the highest value in the table, which should already be in a locally scored variable. Don’t forget that you’d locked the table, so unlock it now.

You should note that database triggers run in a subshell with access only to the immediate shell that fired them. Therefore, you can’t set a bind variable in a SQL*Plus session and subsequently reference it inside the trigger body because it doesn’t have access to the variable.

MySQL

MySQL supports automatic numbering but not a default transactional mode like Oracle. You need to disable auto commit and start a transaction. You also need to assign the last automatic numbering value to a variable before using it in a subsequent INSERT statement. You must also provide an overriding list of mandatory columns when you opt to exclude the automated numbering column value. The one thing that we should all appreciate about MySQL is their desire to stay close to and comply with ANSI standards.

-- Conditionally drop the tables.
DROP TABLE IF EXISTS one;
DROP TABLE IF EXISTS two;
 
-- Create the tables with a surrogate key that automatically increments.
CREATE TABLE one ( one_id   INT PRIMARY KEY AUTO_INCREMENT
                 , one_text VARCHAR(20));
 
CREATE TABLE two ( two_id INT PRIMARY KEY AUTO_INCREMENT
                 , one_id INT
                 , two_text VARCHAR(20));
 
-- Start transaction cycle.
START TRANSACTION;
 
-- Insert first row, transfer auto increment to memory.
INSERT INTO one (one_text) VALUES ('One');
 
-- Assign last auto increment to local scope variable, the = works too.
SET @one_fk := last_insert_id();
 
-- Insert second row with auto increment and local scope variable.
INSERT INTO b (one_id, two_text) VALUES (@one_fk,'Two');
 
COMMIT;
 
-- Display the values inserted with auto incremented values.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

SQL Server

SQL Server supports automatic numbering but they call it the identity value. There are two ways to use it but the one I’m showing is for SQL Server 2005 or newer. You can replace the older @@identity for the SCOPE_IDENTITY() function call but Microsoft has already removed first level support from SQL Server 2000. While they’ve not said @@identity is deprecated, it sure appears that’s possible in a future release.

USE student;
 
BEGIN TRAN;
 
-- Conditionally drop tables when they exist.
IF OBJECT_ID('dbo.one','U') IS NOT NULL DROP TABLE dbo.one;
IF OBJECT_ID('dbo.two','U') IS NOT NULL DROP TABLE dbo.two;
 
-- Create auto incrementing tables.
CREATE TABLE one
( one_id   INT         NOT NULL IDENTITY(1,1) CONSTRAINT pk_one PRIMARY KEY
, one_text VARCHAR(10) NOT NULL );
 
CREATE TABLE two
( two_id   INT         NOT NULL IDENTITY(1,1) CONSTRAINT pk_two PRIMARY KEY
, one_id   INT         NOT NULL
, two_text VARCHAR(10) NOT NULL );
 
-- Insert the values, and magically no override signature required.
INSERT INTO one VALUES ('One!');
INSERT INTO one VALUES ('Two!');
INSERT INTO two VALUES (SCOPE_IDENTITY(),'Other Two!');
 
-- Query the results.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;
 
COMMIT TRAN;

You should note that T-SQL doesn’t require an override signature when you use an automatic numbering column. This is different, isn’t it?

While the prior example works with two tables, it doesn’t scale to a series of tables. You should consider the following assignment pattern when you’ll have multiple last identity values in a single transaction scope.

DECLARE @one_pk AS INT;
SET @one_pk = SCOPE_IDENTITY();

As mentioned, this style is important when you’ve got a series of primary and foreign keys to map in the scope of a single transaction. Also, I’d suggest that you put all the declarations at the beginning of the transaction’s scope.

As always, I hope this helps some folks.

Written by maclochlainn

May 16th, 2009 at 11:04 pm

4 Responses to 'SQL Automated Numbers'

Subscribe to comments with RSS or TrackBack to 'SQL Automated Numbers'.

  1. Actually, I have a requirement that requires: Inserting data into two tables that are related but unconstrained by a foreign key constraint. After the insert, the relationship should resolve normally through an inner join. How do I do this?

    Govardhan

    17 Oct 09 at 12:27 am

  2. I reworded your question. Is this what you want to accomplish?

    maclochlainn

    17 Oct 09 at 10:11 am

  3. Can you please give much clarity on this? Still I am in confusion why surrogate key is useful

    krishna

    6 Mar 13 at 10:42 pm

  4. Krishna,

    I wrote a whole chapter on this in my new MySQL Workbench: Data Modeling & Development book, but here’s the short version. The natural key defines unique rows in a table based on the best knowledge at the time of uniqueness. Over time, knowledge of subdomains increase and sometimes new columns must be added to the natural key. The most common example is when a designer forgets to temporarily qualify data – providing a start_date and end_date within a natural key.

    Assuming that case, and that you’ve got 100 joins between that table and others. You must then do the following:

    • Alter the table that holds the foreign key column set by adding the start_date and end_date columns.
    • Alter all the tables that hold the foreign key column set by adding the start_date and end_date columns.
    • Add the temporal data to the table that holds the primary key, based on a natural key.
    • Migrate the data from the start_date and end_date primary key columns to the foreign keys.
    • Change the 100 joins because they now involve two additional columns.

    If you were using surrogate keys (SQL automated numbers) as the primary keys, you only need to drop and recreate an index. The index would start with the surrogate key and include the other columns from the natural key. The index would give you optimal join resolution, and you would only need to do the following to add those two new natural key columns:

    • Alter the table that holds the foreign key column set by adding the start_date and end_date columns.
    • Add the temporal data to the table that holds the primary key, based on a natural key.
    • Drop and recreate the unique index with the two new columns.

    maclochlainn

    6 Mar 13 at 11:55 pm

Leave a Reply