I’ve begun putting together an online database tutorial and expanded this entry and added horizontal scrolling to it. You can find the improved version of the blog post as blog page here.
Surrogate keys are interesting structures in databases. They’re essential if you want to make sure you optimize your design. They’re also very useful when you want to capture the automatic numbering value for a prior
INSERT statement and reuse the automatic numbering value as the foreign key value in a subsequent statement. It was interesting to see how they’re implemented differently across Oracle, MySQL, and SQL Server while providing the same utility.
Below is a synopsis of how you implement these in Oracle, MySQL, and SQL Server.
The first thing to qualify is that Oracle is generally always in a transactional mode. That means you don’t need to do anything special to set this example up.
Oracle doesn’t support automated numbering in tables prior to Oracle 12c. Oracle 12c introduces identity columns, and the mechanics change. However, you can use sequences to mimic automated numbering prior to Oracle 12c and without identity columns in Oracle 12c. A sequence is a structure in the database that holds a current value, increments by a fixed value – typically 1. Sequences are available in SQL and PL/SQL scopes through two pseudo columns. The pseudo columns are
.currval (note the two r’s because it’s not a stray dog).
sequence_name.nextval call in any session places the next number from the sequence into your Personal Global Area (PGA), which is a memory context. After you’ve called the sequence into memory, you can access it again by using
sequence_name.currval. The sequence only changes when you call it again with the
.nextval pseudo column.
-- Conditionally drop data sturctures - tables and sequences. BEGIN FOR i IN (SELECT TABLE_NAME FROM user_tables WHERE TABLE_NAME IN ('ONE','TWO')) LOOP EXECUTE IMMEDIATE 'DROP TABLE '||i.TABLE_NAME||' CASCADE CONSTRAINT'; END LOOP; FOR i IN (SELECT sequence_name FROM user_sequences WHERE sequence_name IN ('ONE_S1','TWO_S1')) LOOP EXECUTE IMMEDIATE 'DROP SEQUENCE '||i.sequence_name; END LOOP; END; / -- Create base table and sequence. CREATE TABLE one ( one_id INT NOT NULL CONSTRAINT pk_one PRIMARY KEY , one_text VARCHAR(10) NOT NULL ); CREATE SEQUENCE one_s1; -- Create dependent table and sequence. CREATE TABLE two ( two_id INT NOT NULL CONSTRAINT pk_two PRIMARY KEY , one_id INT NOT NULL , two_text VARCHAR(10) NOT NULL ); CREATE SEQUENCE two_s1; -- Insert rows into the tables with sequence values. INSERT INTO one VALUES (one_s1.NEXTVAL,'One!'); INSERT INTO one VALUES (one_s1.NEXTVAL,'Two!'); INSERT INTO two VALUES (two_s1.NEXTVAL, one_s1.currval,'Other Two!'); -- Display the values inserted with sequences. SELECT o.one_id , o.one_text , t.two_id , t.two_text FROM one o JOIN two t ON o.one_id = t.one_id;
If you mimic automatic numbering with database triggers, you may not have access to the
.currval value for the second
INSERT statement. This occurs when you provide a
NULL value expecting the trigger to manage
.NEXTVAL call for you.
Transactions require that you keep the primary key value for the first table in a locally scoped variable for reuse. Then, you can pass it to the next
INSERT statement. You do that with the
You can make a potentially erroneous assumption that you’re the only user updating the table. Operating under that assumption, you can query the highest sequence number from the table before an insert, add one to it, and then attempt the
INSERT statement. In a multi-user system, it’s possible that somebody beats you to the finish line with their
INSERT statement. Your insert would then have a duplicate surrogate key value for the
one_id column, and fail on an
ORA-00001 error for a uniqueness violation on a primary key column.
A database trigger can help you avoid a race condition. The trigger would ensure sequence values are unique but it may also introduce problems. A common Oracle trigger with a pseudo automatic numbering paradigm is represented by the following trigger (found in APEX generated code).
CREATE OR REPLACE TRIGGER one_t1 BEFORE INSERT ON one FOR EACH ROW BEGIN :NEW.one_id := one_s1.NEXTVAL; END; /
Caution is required on this type of automated sequence trigger. There are two problems with this type of trigger.
One scenario is where you include a call to
sequence_name.NEXTVAL in your
INSERT statement. It then increments the sequence, and attempts to insert the value whereupon the trigger fires and repeats the behavior. Effectively, this type of logic creates a sequence that increments by one when you submit a null value in the values clause and by two when you submit a
Another scenario occurs when you attempt a bulk
INSERT operation on the table. The sequence call and substitution occurs on each row of the sequence.
You face another problem when you rewrite the trigger to only fire when a surrogate primary key isn’t provided, like this:
CREATE OR REPLACE TRIGGER one_t1 BEFORE INSERT ON one FOR EACH ROW WHEN (NEW.one_id IS NULL) -- Asynchronous with bulk insert operations when a value is provided by the bulk operation to the surrogate key column. BEGIN :NEW.one_id := one_s1.NEXTVAL; END; /
This trigger design causes a problem only with bulk
INSERT statements. Effectively, the sequence remains unaltered when you provide surrogate key values as part of inserting an array of values. The next non-bulk
INSERT statement would then grab the
.NEXTVAL value, attempt to use it, and raise a unique constraint violation because the bulk operation probably already used the value from the sequence.
The fix to bulk operations requires that you lock the table, disable a trigger like this, and get the
.NEXTVAL value. Then, you assign the
.NEXTVAL value to two local variables. One of these remains unchanged while the other increments as you populate the array for the bulk insert operation. After assigning the result from the
.NEXTVAL, you drop the sequence and find the highest key value for the bulk insertion operation, add one to the highest key value, and store it in another locally stored variable. You perform the bulk insert operation and then recreate the sequence with a value one greater than the highest value in the table, which should already be in a locally scored variable. Don’t forget that you’d locked the table, so unlock it now.
You should note that database triggers run in a subshell with access only to the immediate shell that fired them. Therefore, you can’t set a bind variable in a SQL*Plus session and subsequently reference it inside the trigger body because it doesn’t have access to the variable.
MySQL supports automatic numbering but not a default transactional mode like Oracle. You need to disable auto commit and start a transaction. You also need to assign the last automatic numbering value to a variable before using it in a subsequent
INSERT statement. You must also provide an overriding list of mandatory columns when you opt to exclude the automated numbering column value. The one thing that we should all appreciate about MySQL is their desire to stay close to and comply with ANSI standards.
-- Conditionally drop the tables. DROP TABLE IF EXISTS one; DROP TABLE IF EXISTS two; -- Create the tables with a surrogate key that automatically increments. CREATE TABLE one ( one_id INT PRIMARY KEY AUTO_INCREMENT , one_text VARCHAR(20)); CREATE TABLE two ( two_id INT PRIMARY KEY AUTO_INCREMENT , one_id INT , two_text VARCHAR(20)); -- Start transaction cycle. START TRANSACTION; -- Insert first row, transfer auto increment to memory. INSERT INTO one (one_text) VALUES ('One'); -- Assign last auto increment to local scope variable, the = works too. SET @one_fk := last_insert_id(); -- Insert second row with auto increment and local scope variable. INSERT INTO b (one_id, two_text) VALUES (@one_fk,'Two'); COMMIT; -- Display the values inserted with auto incremented values. SELECT o.one_id , o.one_text , t.two_id , t.two_text FROM one o JOIN two t ON o.one_id = t.one_id;
SQL Server supports automatic numbering but they call it the identity value. There are two ways to use it but the one I’m showing is for SQL Server 2005 or newer. You can replace the older
@@identity for the
SCOPE_IDENTITY() function call but Microsoft has already removed first level support from SQL Server 2000. While they’ve not said
@@identity is deprecated, it sure appears that’s possible in a future release.
USE student; BEGIN TRAN; -- Conditionally drop tables when they exist. IF OBJECT_ID('dbo.one','U') IS NOT NULL DROP TABLE dbo.one; IF OBJECT_ID('dbo.two','U') IS NOT NULL DROP TABLE dbo.two; -- Create auto incrementing tables. CREATE TABLE one ( one_id INT NOT NULL IDENTITY(1,1) CONSTRAINT pk_one PRIMARY KEY , one_text VARCHAR(10) NOT NULL ); CREATE TABLE two ( two_id INT NOT NULL IDENTITY(1,1) CONSTRAINT pk_two PRIMARY KEY , one_id INT NOT NULL , two_text VARCHAR(10) NOT NULL ); -- Insert the values, and magically no override signature required. INSERT INTO one VALUES ('One!'); INSERT INTO one VALUES ('Two!'); INSERT INTO two VALUES (SCOPE_IDENTITY(),'Other Two!'); -- Query the results. SELECT o.one_id , o.one_text , t.two_id , t.two_text FROM one o JOIN two t ON o.one_id = t.one_id; COMMIT TRAN;
You should note that T-SQL doesn’t require an override signature when you use an automatic numbering column. This is different, isn’t it?
While the prior example works with two tables, it doesn’t scale to a series of tables. You should consider the following assignment pattern when you’ll have multiple last identity values in a single transaction scope.
DECLARE @one_pk AS INT; SET @one_pk = SCOPE_IDENTITY();
As mentioned, this style is important when you’ve got a series of primary and foreign keys to map in the scope of a single transaction. Also, I’d suggest that you put all the declarations at the beginning of the transaction’s scope.
As always, I hope this helps some folks.