MacLochlainn's Weblog

Michael McLaughlin's Technical Blog


Archive for the ‘T-SQL’ Category

SQL Server XQuery

with 2 comments

I promised my students an example of writing xquery statements in Microsoft SQL Server. This post builds on two earlier posts. The first shows how to build a marvel table with source data, and the second shows how to create an XML Schema Collection and insert relational data into an XML structure.

You can query a sequence with xquery as follows:

DECLARE @x xml;
SET @x = N'';
SELECT @x.query('(1,2,(10,11,12,13,14,15)),-6');

It returns:

1 2 10 11 12 13 14 15 -6

You can query a sequence with an xquery FLWOR statement. FLWOR stands for: FOR, LET, WHERE, ORDER BY, and RETURN. A sample xquery with a FLWOR expression is:

DECLARE @x xml;
SET @x = N'';
SELECT @x.query('for $i in ((1,2,(10,11,12,13,14,15)),-6)
                 order by $i
                 return $i');

It returns:

-6 1 2 10 11 12 13 14 15
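For completeness, here’s a variation of my own (not in the original example) that also exercises the let and where clauses; note that SQL Server supports the let clause in SQL Server 2008 and later:

DECLARE @x xml;
SET @x = N'';
SELECT @x.query('for $i in ((1,2,(10,11,12,13,14,15)),-6)
                 let $j := $i * 2
                 where $i > 1
                 order by $i
                 return $j');

It should return:

4 20 22 24 26 28 30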

You can query the entire node tree with the following xquery statement because the /* path expression matches any top-level node:

DECLARE @x xml;
SET @x = N'<marvel>
             <avenger_name>Captain America</avenger_name>
           </marvel>';
SELECT @x.query('/*');

You can query the avenger_name elements from the marvel_xml table with the following syntax:

SELECT xml_table.query('/marvel/avenger_name')
FROM   marvel_xml;

It returns the following set of avenger_name elements:

<avenger_name>Hulk</avenger_name>
<avenger_name>Iron Man</avenger_name>
<avenger_name>Black Widow</avenger_name>
<avenger_name>Thor</avenger_name>
<avenger_name>Captain America</avenger_name>
<avenger_name>Hawkeye</avenger_name>
<avenger_name>Winter Soldier</avenger_name>
<avenger_name>Iron Patriot</avenger_name>

You can query the fourth avenger_name element from the marvel_xml table with the following xquery statement:

SELECT xml_table.query('/marvel[4]/avenger_name')
FROM   marvel_xml;

It returns the following avenger_name element:

<avenger_name>Thor</avenger_name>

You can use the value() method to verify the content of an avenger_name element:

SELECT CASE
          WHEN xml_table.value('(/marvel[4]/avenger_name)[1]','nvarchar(30)') = N'Thor' THEN
            'Exists'
       END AS "Verified"
FROM   marvel_xml
WHERE  id = 3;

The query returns the Exists string when it finds a matching avenger_name element. You have a number of other tools for querying result sets from the XML node tree; for example, the exist() method sketched below.
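Here’s a small sketch of my own (not from the original post) that uses the exist() method against the same marvel_xml table; it returns the id of any row whose XML contains a Thor avenger_name element:

SELECT id
FROM   marvel_xml
WHERE  xml_table.exist('/marvel/avenger_name[text()="Thor"]') = 1;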

I hope this helps my students and anybody interested in writing xquery-enabled queries.

Written by maclochlainn

February 19th, 2015 at 1:54 am

Functions disallow NDS

with 2 comments

My students asked if you could embed an OFFSET x ROWS FETCH NEXT y ROWS ONLY clause in a SQL Server T-SQL user-defined function. The answer is no; this isn’t Oracle (you can do that in Oracle Database 12c with an NDS statement). There’s an example in Chapter 2 of my Oracle Database 12c PL/SQL Programming book if you’re interested. I also demonstrate a different approach to SQL Server T-SQL table functions in this older post. However, an attempt to add the clause to a SQL Server T-SQL function, like this:

CREATE FUNCTION studentdb.getBatch
(@rows   AS INT
,@offset AS INT) RETURNS @output TABLE
( marvel_id    INT 
, avenger_name VARCHAR(30)
, first_name   VARCHAR(20)
, last_name    VARCHAR(20)) AS
BEGIN
  /* Insert the results into the table variable. */
  INSERT @output
  SELECT marvel_id
  ,      avenger_name
  ,      first_name
  ,      last_name
  FROM   studentdb.marvel
  OFFSET (@offset - 1) ROWS FETCH NEXT @rows ROWS ONLY;
 
  /* Return the table variable from the function. */  
  RETURN;
END;

Throws the following errors trying to compile the function:

Msg 102, Level 15, State 1, Procedure getBatch, Line 16
Incorrect syntax near '@offset'.
Msg 153, Level 15, State 2, Procedure getBatch, Line 16
Invalid usage of the option NEXT in the FETCH statement.

If you have a strong background in Oracle and can sort through the dynamic SQL syntax for T-SQL, you might try rewriting the function to use the EXEC sp_executesql @stmt command. A rewrite that attempts to use NDS (Native Dynamic SQL) would look like this:

CREATE FUNCTION studentdb.getBatch
(@rows   AS INT
,@offset AS INT) RETURNS @output TABLE
( marvel_id    INT 
, avenger_name VARCHAR(30)
, first_name   VARCHAR(20)
, last_name    VARCHAR(20)) AS
BEGIN
  DECLARE
    /* Declare a variable for a dynamic SQL statement. */
    @stmt          NVARCHAR(400);
 
    /* Assign the SQL statement to a variable. */
    SET @stmt = N'SELECT marvel_id ' +
                N',      avenger_name ' +
                N',      first_name ' +
                N',      last_name ' +
                N'FROM   studentdb.marvel ' +
                N'OFFSET ' + CAST(@offset - 1 AS NVARCHAR(10)) + N' ' +
                N'ROWS FETCH NEXT ' + CAST(@rows AS NVARCHAR(10)) + N' ROWS ONLY;';
  BEGIN
    /* Insert the results into the table variable. */
    INSERT @output
    EXEC sp_executesql @stmt; 
  END;
 
  /* Return the table variable from the function. */  
  RETURN;
END;

Throws the following exception because you can’t use dynamic dispatch inside a T-SQL function:

Msg 443, Level 16, State 14, Procedure getBatch, Line 23
Invalid use of a side-effecting operator 'INSERT EXEC' within a function.

On the other hand, you can rewrite the statement with a BETWEEN operator, which works somewhat like an OFFSET and FETCH operation. The refactored function would be written as follows:

CREATE FUNCTION studentdb.getBatch
(@rowa  AS INT
,@rowb AS INT) RETURNS @output TABLE
( marvel_id    INT 
, avenger_name VARCHAR(30)
, first_name   VARCHAR(20)
, last_name    VARCHAR(20)) AS
BEGIN
  /* Insert the results into the table variable. */
  INSERT @output
  SELECT marvel_id
  ,      avenger_name
  ,      first_name
  ,      last_name
  FROM   studentdb.marvel
  WHERE  marvel_id BETWEEN @rowa AND @rowb;
 
  /* Return the table variable from the function. */  
  RETURN;
END;

It doesn’t raise an exception. You can call the table function like this:

SELECT * FROM studentdb.getBatch(2,3);

It returns the two rows for Iron Man and Black Widow. As always, I hope this helps.

If you want to create the test case, here’s the script you need:

SELECT 'Conditionally drop studentdb.marvel table.' AS "Statement";
IF OBJECT_ID('studentdb.marvel','U') IS NOT NULL
  DROP TABLE studentdb.marvel;
 
SELECT 'Create studentdb.marvel table.' AS "Statement";
CREATE TABLE studentdb.marvel
( marvel_id     INT          NOT NULL IDENTITY(1,1) CONSTRAINT marvel_pk PRIMARY KEY
, avenger_name  VARCHAR(30)  NOT NULL
, first_name    VARCHAR(20)  NOT NULL
, last_name     VARCHAR(20)  NOT NULL);
 
/* Insert the rows. */
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Hulk','Bruce','Banner');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Iron Man','Tony','Stark');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Black Widow','Natasha','Romanoff');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Thor','Thor','Odinsson');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Captain America','Steve','Rogers');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Hawkeye','Clint','Barton');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Winter Soldier','Bucky','Barnes');
INSERT INTO studentdb.marvel (avenger_name, first_name, last_name) VALUES ('Iron Patriot','James','Rhodey');
 
/* Query the contents of the MARVEL table. */
SELECT * FROM studentdb.marvel;

Written by maclochlainn

February 14th, 2015 at 5:20 pm

Convert to SQL Server?

with 2 comments

I’m always amazed at the questions that pop up for me. For example, how do you convert an Oracle script that creates my Video Store model to a Microsoft SQL Server script? It’s not very hard, but there’s one big caveat: system_user is a reserved word in SQL Server. That means you can’t create the Access Control List (ACL) table with a system_user name. The alternative is to convert the system_user table name to database_user. That’s what I’ve done in this example.

It’s also important to note that this example uses Microsoft SQL Server’s sqlcmd in batch mode. Naturally, it presumes that you’ve created a student user with a trivial password of student, and a studentdb schema. It also presumes that you’ve granted privileges so everything works (if you need help with that, check my earlier post on how to set up a studentdb schema).

The following is an example of conditionally dropping and then creating a system_user table in an Oracle schema. It uses a CASCADE CONSTRAINTS clause to eliminate dependencies with foreign key values.

-- Conditionally drop the table and sequence.
BEGIN
  FOR i IN (SELECT NULL
            FROM   user_tables
            WHERE  table_name = 'SYSTEM_USER') LOOP
    EXECUTE IMMEDIATE 'DROP TABLE system_user CASCADE CONSTRAINTS';
  END LOOP;
  FOR i IN (SELECT NULL
            FROM   user_sequences
            WHERE  sequence_name = 'SYSTEM_USER_S1') LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE system_user_s1';
  END LOOP;
END;
/
--  Create SYSTEM_USER table.
CREATE TABLE system_user
( system_user_id            NUMBER        CONSTRAINT system_user_pk PRIMARY KEY
, system_user_name          VARCHAR2(20)  CONSTRAINT system_user_nn1 NOT NULL
, system_user_group_id      NUMBER        CONSTRAINT system_user_nn2 NOT NULL
, system_user_type          NUMBER        CONSTRAINT system_user_nn3 NOT NULL
, first_name                VARCHAR2(20)
, middle_name               VARCHAR2(20)
, last_name                 VARCHAR2(20)
, created_by                NUMBER        CONSTRAINT system_user_nn4 NOT NULL
, creation_date             DATE          CONSTRAINT system_user_nn5 NOT NULL
, last_updated_by           NUMBER        CONSTRAINT system_user_nn6 NOT NULL
, last_update_date          DATE          CONSTRAINT system_user_nn7 NOT NULL
, CONSTRAINT system_user_fk1  FOREIGN KEY (created_by)
  REFERENCES system_user (system_user_id)
, CONSTRAINT system_user_fk2  FOREIGN KEY (last_updated_by)
  REFERENCES system_user (system_user_id));
 
-- Create SYSTEM_USER_S1 sequence with a start value of 1001.
CREATE SEQUENCE system_user_s1 START WITH 1001;
-- Conditionally drop the table and sequence.
BEGIN
  FOR i IN (SELECT NULL
            FROM   user_tables
            WHERE  table_name = 'COMMON_LOOKUP') LOOP
    EXECUTE IMMEDIATE 'DROP TABLE common_lookup CASCADE CONSTRAINTS';
  END LOOP;
  FOR i IN (SELECT NULL
            FROM   user_sequences
            WHERE  sequence_name = 'COMMON_LOOKUP_S1') LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE common_lookup_s1';
  END LOOP;
END;
/
--  Create COMMON_LOOKUP table.
CREATE TABLE common_lookup
( common_lookup_id            NUMBER
, common_lookup_context       VARCHAR2(30) CONSTRAINT nn_clookup_1 NOT NULL
, common_lookup_type          VARCHAR2(30) CONSTRAINT nn_clookup_2 NOT NULL
, common_lookup_meaning       VARCHAR2(30) CONSTRAINT nn_clookup_3 NOT NULL
, created_by                  NUMBER       CONSTRAINT nn_clookup_4 NOT NULL
, creation_date               DATE         CONSTRAINT nn_clookup_5 NOT NULL
, last_updated_by             NUMBER       CONSTRAINT nn_clookup_6 NOT NULL
, last_update_date            DATE         CONSTRAINT nn_clookup_7 NOT NULL
, CONSTRAINT pk_c_lookup_1    PRIMARY KEY(common_lookup_id)
, CONSTRAINT fk_c_lookup_1    FOREIGN KEY(created_by) REFERENCES system_user(system_user_id)
, CONSTRAINT fk_c_lookup_2    FOREIGN KEY(last_updated_by) REFERENCES system_user(system_user_id));
 
-- Create a non-unique index on a single column.
CREATE INDEX common_lookup_n1
  ON common_lookup(common_lookup_context);
 
-- Create a unique index based on two columns.
CREATE UNIQUE INDEX common_lookup_u2
  ON common_lookup(common_lookup_context,common_lookup_type);
 
-- Create COMMON_LOOKUP_S1 sequence with a start value of 1001.
CREATE SEQUENCE common_lookup_s1 START WITH 1001;

You can do the same thing for a database_user table in Microsoft SQL Server with the following syntax. Unfortunately, there isn’t a CASCADE CONSTRAINTS clause that we can append in Microsoft SQL Server. The script uses a dynamic SQL statement with a Common Table Expression (CTE) to generate a list of ALTER statements that drop foreign key constraints in the schema.

/* Drop all foreign keys. */
USE studentdb;
 
/* Create a session variable to hold a command list. */
SELECT 'Create a session variable.' AS "Statement";
DECLARE @sql NVARCHAR(MAX) = N'';
 
/* Generate the command list to drop foreign key constraints. */
SELECT 'Generate dynamic SQL statements.' AS "Statement";
;WITH x AS
(SELECT N'ALTER TABLE ' +
          OBJECT_SCHEMA_NAME(parent_object_id) + N'.' +
          OBJECT_NAME(parent_object_id) + N' ' +
        N'DROP CONSTRAINT ' + name + N';' AS sqlstmt
 FROM     sys.foreign_keys)
SELECT @sql += sqlstmt FROM x;
 
/* Call the dynamically generated statements. */
SELECT 'Execute dynamic SQL statements.' AS "Statement";
EXEC sp_executesql @sql;
 
/* Conditionally drop tables. */
SELECT 'Conditionally drop studentdb.database_user table.' AS "Statement";
IF OBJECT_ID('studentdb.database_user','U') IS NOT NULL
  DROP TABLE studentdb.database_user;
 
/* Create a table with self-referencing foreign key constraints. */
SELECT 'Create studentdb.database_user table.' AS "Statement";
CREATE TABLE studentdb.database_user
( database_user_id        INT          NOT NULL IDENTITY(1,1) CONSTRAINT database_user_pk PRIMARY KEY
, database_user_name      VARCHAR(20)  NOT NULL
, database_user_group_id  INT          NOT NULL
, database_user_type      INT          NOT NULL
, first_name              VARCHAR(20)
, middle_name             VARCHAR(20)
, last_name               VARCHAR(20)
, created_by              INT          NOT NULL
, creation_date           DATE         NOT NULL
, last_updated_by         INT          NOT NULL
, last_update_date        DATE         NOT NULL
, CONSTRAINT database_user_fk1 FOREIGN KEY (created_by)
  REFERENCES studentdb.database_user (database_user_id)
, CONSTRAINT database_user_fk2 FOREIGN KEY (last_updated_by)
  REFERENCES studentdb.database_user (database_user_id));
 
/* Conditionally drop common_lookup table. */
SELECT 'Conditionally drop studentdb.common_lookup table.' AS "Statement";
IF OBJECT_ID('studentdb.common_lookup','U') IS NOT NULL
  DROP TABLE studentdb.common_lookup;
 
/* Create a table with external referencing foreign key constraints. */  
SELECT 'Create studentdb.common_lookup table.' AS "Statement";
CREATE TABLE studentdb.common_lookup
( common_lookup_id            INT          NOT NULL  IDENTITY(1,1) CONSTRAINT common_lookup_pk PRIMARY KEY
, common_lookup_context       VARCHAR(30)  CONSTRAINT nn_clookup_1 NOT NULL
, common_lookup_type          VARCHAR(30)  CONSTRAINT nn_clookup_2 NOT NULL
, common_lookup_meaning       VARCHAR(30)  CONSTRAINT nn_clookup_3 NOT NULL
, created_by                  INT          CONSTRAINT nn_clookup_4 NOT NULL
, creation_date               DATE         CONSTRAINT nn_clookup_5 NOT NULL
, last_updated_by             INT          CONSTRAINT nn_clookup_6 NOT NULL
, last_update_date            DATE         CONSTRAINT nn_clookup_7 NOT NULL
, CONSTRAINT common_lookup_fk1 FOREIGN KEY(created_by)
  REFERENCES studentdb.database_user (database_user_id)
, CONSTRAINT common_lookup_fk2 FOREIGN KEY(last_updated_by)
  REFERENCES studentdb.database_user (database_user_id));

You can run it from a file by calling the sqlcmd utility. You’ll need to know several things to run it. First, you need to know your database instance. You can capture that from a query against the data dictionary or catalog. Just run the following from inside the Microsoft SQL Server Management Studio (SSMS):

SELECT @@SERVERNAME;

In my case, it shows the following, which is the machine’s hostname, a backslash, and SQLEXPRESS:

MCLAUGHLINSQL\SQLEXPRESS

The script uses sqltest.sql as a file name, and you can call it from the Windows shell environment like this:

sqlcmd -S MCLAUGHLINSQL\SQLEXPRESS -U student -P student -i C:\Data\MicrosoftSQL\sqltest.sql -o C:\Data\Microsoft\sqltest.out

As always, I hope this helps.

Written by maclochlainn

January 15th, 2015 at 7:18 pm

A T-SQL Table Function

with 3 comments

I had an interesting conversation about table functions in Oracle’s PL/SQL; and the fact that they’re not available in MySQL. When I explained they’re available in Microsoft T-SQL User-Defined Functions (UDFs), my students wanted a small example. One of them said they’d tried to do it but couldn’t get it to work because they found the Microsoft web pages difficult to read and use. Specifically, they didn’t like the sparseness of this one on how to create a function.

Here’s a quick definition of a UDF table function that runs in the studentdb schema (created in this post for migrating SQL Server into a MySQL database). The following getConquistador function takes a single string, which filters the result set of the query that serves as the function’s return value. You should note that this is an implementation of Microsoft’s Common Language Infrastructure (CLI).

CREATE FUNCTION studentdb.getConquistador
(@nationality AS VARCHAR(30))
RETURNS TABLE
RETURN SELECT * FROM studentdb.conquistador WHERE nationality = @nationality;

Unlike Oracle SQL, where you need to use the TABLE function to read the content of a table result from a function, you don’t need anything other than the function call in the FROM clause of a T-SQL query. Here’s an example of calling the table function:

SELECT * FROM studentdb.getConquistador('German');

The query produces these results when run from the sqlcmd command-line interface:

conquistador_id conquistador          actual_name          nationality
--------------- --------------------- -------------------- ------------
             11 Nicolas de Federman   Nikolaus Federmann   German
             13 Jorge de la Espira    George von Speyer    German
 
(2 rows affected)
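If you don’t have the conquistador table from that earlier migration post, a minimal stand-in that matches the result set above would look like this (the column sizes are my assumptions):

CREATE TABLE studentdb.conquistador
( conquistador_id INT          NOT NULL PRIMARY KEY
, conquistador    VARCHAR(30)  NOT NULL
, actual_name     VARCHAR(30)  NOT NULL
, nationality     VARCHAR(30)  NOT NULL);
 
INSERT INTO studentdb.conquistador VALUES (11,'Nicolas de Federman','Nikolaus Federmann','German');
INSERT INTO studentdb.conquistador VALUES (13,'Jorge de la Espira','George von Speyer','German');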

However, you can also query just the columns of interest without any specialized syntax, like this:

1> USE studentdb;
2> SELECT conquistador AS "Conquistador"
3> ,      actual_name AS "Name"
4> FROM   studentdb.getConquistador('German');
5> GO

This produces the following two-column result set:

Conquistador          Name
--------------------- --------------------
Nicolas de Federman   Nikolaus Federmann
Jorge de la Espira    George von Speyer
 
(2 rows affected)

Hope this helps those interested in T-SQL UDFs.

Written by maclochlainn

February 12th, 2013 at 1:16 am

Why Stored Programs?

with 2 comments

Why should you use stored programs? Great question; here’s my little insight into a situation that I heard about in a large organization.

A very large organization is having a technology argument. In some ways, like politics, half-truths drive this type of discussion. This company has hundreds of databases, split roughly evenly between SQL Server and Oracle. The argument (half-truth) states that using T-SQL or PL/SQL yields “spaghetti” code!

It seems like an old argument from my perspective. After all, I’ve been working with T-SQL and PL/SQL for a long time. Spaghetti code exists in every language when unskilled programmers solve problems, but the point here is one of software architecture and an attempt to malign stored programming in general. Let’s examine the merit of the argument against stored programs.

First of all, the argument against stored programs is simply not true. SQL DML statements, like INSERT, UPDATE, and DELETE, maintain ACID-compliant interactions with a single table in a database. Unfortunately, the same statements create anomalies (errors) in a poorly designed database.

Stored programs provide the ability to perform ACID compliant interactions across a series of tables in a database. They may also hide database design errors and protect the data from corruption. The same can’t be said for Java or C# developers. Java and C# developers frequently fail to see database design errors or they overlook them as inconsequential. This type of behavior results in corrupt data.

Migrating key logic outside the database typically raises cost, errors, and overall application complexity. If you’re asking why, that’s great. Here are my thoughts on why:

  1. Making a Java or C# programmer responsible for managing the transaction scope across multiple tables in a database is not trivial. It requires a Java programmer who has truly mastered SQL. As a rule, it means a programmer writes many more lines of logic in their code because they don’t understand how to use SQL. It often moves joins out of the database, where they would considerably outperform external-language operations.
  2. Identifying bottlenecks and poor usage of data becomes much more complex for DBAs because small queries that avoid joins don’t appear problematic inside the database. DBAs don’t look at the execution or scope of transactions running outside of the database and you generally are left with anecdotal customer complaints about the inefficiency of the application. Therefore, you have diminished accountability.
  3. Developing a library of stored procedures (and functions) ensures the integrity of transaction management. It also provides a series of published interfaces to developers writing the application logic. The published interface provides a modular interface, and lets developers focus on delivering quality applications without worrying about the database design. It lowers costs and increases quality by focusing developers on their strengths rather than trying to make them generalists. That having been said, it should never mask a poorly designed database!
  4. Service level agreements are critical metrics in any organization because they compel efficiency. If you mix the logic of the database and the application layer together, you can’t hold the development team responsible for the interface or batch processing metrics because they’ll always “blame” the database. Likewise, you can’t hold the database team responsible for performance when their metrics will only show trivial DML statement processing. Moreover, the DBA team will always show you that it’s not their fault because they’ve got metrics!
  5. Removing transaction controls from the database server generally means you increase the analysis and design costs. That’s because few developers have deep understanding of a non-database programming language and the database. Likewise, input from DBAs is marginalized because the solution that makes sense is disallowed by design fiat. Systems designed in this type of disparate way often evolve into extremely awkward application models.

Interestingly, the effective use of T-SQL or PL/SQL often identifies, isolates, and manages issues in poorly designed database models. That’s because they focus on the integrity of transactions across tables and leverage native database features. They also act like CSS files, effectively avoiding the use of inline style or embedded SQL and transaction control statements.

Let’s face this fact: any person who makes a claim like the “spaghetti” code argument in the original context is poorly informed. They’re typically trying to sidestep blame for an existing bad application design or drive a change of platform without cost justification.

My take on this argument is twofold. Technologists in the organization may want to dump what they have and play with something else; or business and IT management may want to sidestep the wrath of angry users by blaming their failure on technology instead of on how they didn’t design, manage, or deliver it.

Oh, wait … isn’t that last paragraph the reason for the existence of pre-packaged software? 😉 Don’t hesitate to chime in; after all, it’s just my off-the-cuff opinion.

Written by maclochlainn

October 6th, 2012 at 3:48 pm

SQL Automated Numbers

with 4 comments

I’ve begun putting together an online database tutorial, expanded this entry, and added horizontal scrolling to it. You can find the improved version of this blog post as a blog page here.

Surrogate keys are interesting structures in databases. They’re essential if you want to make sure you optimize your design. They’re also very useful when you want to capture the automatic numbering value from a prior INSERT statement and reuse it as the foreign key value in a subsequent statement. It was interesting to see how they’re implemented differently across Oracle, MySQL, and SQL Server while providing the same utility.

Below is a synopsis of how you implement these in Oracle, MySQL, and SQL Server.

Oracle

The first thing to qualify is that Oracle always runs in a transactional mode. That means you don’t need to do anything special to set this example up.

Oracle doesn’t support automated numbering in tables prior to Oracle 12c. Oracle 12c introduces identity columns, and the mechanics change. However, you can use sequences to mimic automated numbering prior to Oracle 12c, and without identity columns in Oracle 12c. A sequence is a structure in the database that holds a current value and increments by a fixed value, typically 1. Sequences are available in SQL and PL/SQL scopes through two pseudo columns. The pseudo columns are .nextval and .currval (note the two r’s because it’s not a stray dog).

The sequence_name.nextval call in any session places the next number from the sequence into your Personal Global Area (PGA), which is a memory context. After you’ve called the sequence into memory, you can access it again by using sequence_name.currval. The sequence only changes when you call it again with the .nextval pseudo column.

-- Conditionally drop data structures - tables and sequences.
BEGIN
  FOR i IN (SELECT table_name
            FROM   user_tables
            WHERE  table_name IN ('ONE','TWO')) LOOP
    EXECUTE IMMEDIATE 'DROP TABLE '||i.table_name||' CASCADE CONSTRAINTS';
  END LOOP;
  FOR i IN (SELECT sequence_name
            FROM   user_sequences
            WHERE  sequence_name IN ('ONE_S1','TWO_S1')) LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE '||i.sequence_name;
  END LOOP;
END;
/
 
-- Create base table and sequence.
CREATE TABLE one
( one_id   INT         NOT NULL CONSTRAINT pk_one PRIMARY KEY
, one_text VARCHAR(10) NOT NULL );
 
CREATE SEQUENCE one_s1;
 
-- Create dependent table and sequence.
CREATE TABLE two
( two_id   INT         NOT NULL CONSTRAINT pk_two PRIMARY KEY
, one_id   INT         NOT NULL
, two_text VARCHAR(10) NOT NULL );
 
CREATE SEQUENCE two_s1;
 
-- Insert rows into the tables with sequence values.
INSERT INTO one VALUES (one_s1.nextval,'One!');
INSERT INTO one VALUES (one_s1.nextval,'Two!');
INSERT INTO two VALUES (two_s1.nextval, one_s1.currval,'Other Two!');
 
-- Display the values inserted with sequences.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

If you mimic automatic numbering with database triggers, you may not have access to the .currval value for the second INSERT statement. This occurs when you provide a NULL value and expect the trigger to manage the .nextval call for you.

Transactions require that you keep the primary key value from the first table in a locally scoped variable for reuse. Then, you can pass it to the next INSERT statement. You do that with the .currval value, as the sketch below shows.
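Here’s a minimal PL/SQL sketch of that pattern, assuming the one and two tables and sequences created above:

DECLARE
  /* Hold the surrogate key from the first INSERT for reuse. */
  lv_one_id  one.one_id%TYPE;
BEGIN
  INSERT INTO one VALUES (one_s1.nextval,'One!');
 
  /* Capture the current sequence value in a locally scoped variable.
     (Oracle 11g and later also allow: lv_one_id := one_s1.currval;) */
  SELECT one_s1.currval INTO lv_one_id FROM dual;
 
  INSERT INTO two VALUES (two_s1.nextval, lv_one_id,'Other Two!');
  COMMIT;
END;
/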

You can make a potentially erroneous assumption that you’re the only user updating the table. Operating under that assumption, you can query the highest sequence number from the table before an insert, add one to it, and then attempt the INSERT statement. In a multi-user system, it’s possible that somebody beats you to the finish line with their INSERT statement. Your insert would then have a duplicate surrogate key value for the one_id column, and fail on an ORA-00001 error for a uniqueness violation on a primary key column.

A database trigger can help you avoid a race condition. The trigger ensures sequence values are unique, but it may also introduce problems. A common Oracle trigger that implements a pseudo automatic numbering paradigm looks like the following (you’ll find this pattern in APEX-generated code).

CREATE OR REPLACE TRIGGER one_t1 
BEFORE INSERT ON one
FOR EACH ROW
BEGIN
  :NEW.one_id := one_s1.nextval;
END;
/

Caution is required with this type of automated sequence trigger. There are two problems with it.

One scenario occurs when you include a call to sequence_name.NEXTVAL in your INSERT statement. The statement increments the sequence and attempts to insert the value, whereupon the trigger fires and repeats the behavior. Effectively, this type of logic makes the sequence increment by one when you submit a null value in the VALUES clause and by two when you submit a sequence_name.NEXTVAL value.
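A short sketch of that scenario, assuming the one table, one_s1 sequence, and one_t1 trigger above:

-- The trigger overwrites whatever arrives in the one_id column, so the
-- explicit one_s1.nextval call below burns an extra sequence value.
INSERT INTO one VALUES (one_s1.nextval,'Skips one');  -- consumes two values
INSERT INTO one VALUES (NULL,'Uses one');             -- consumes one value
SELECT one_s1.currval FROM dual;                      -- shows the gap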

Another scenario occurs when you attempt a bulk INSERT operation on the table. The sequence call and substitution occur on each row of the bulk insert.

You face another problem when you rewrite the trigger to only fire when a surrogate primary key isn’t provided, like this:

CREATE OR REPLACE TRIGGER one_t1 
BEFORE INSERT ON one
FOR EACH ROW
WHEN (NEW.one_id IS NULL)  -- Asynchronous with bulk insert operations when a value is provided by the bulk operation to the surrogate key column.
BEGIN
  :NEW.one_id := one_s1.nextval;
END;
/

This trigger design causes a problem only with bulk INSERT statements. Effectively, the sequence remains unaltered when you provide surrogate key values as part of inserting an array of values. The next non-bulk INSERT statement would then grab the .NEXTVAL value, attempt to use it, and raise a unique constraint violation because the bulk operation probably already used the value from the sequence.

The fix for bulk operations requires that you lock the table, disable a trigger like this, and get the .NEXTVAL value. Then, you assign the .NEXTVAL value to two local variables: one remains unchanged while the other increments as you populate the array for the bulk insert operation. After assigning the .NEXTVAL result, you drop the sequence, find the highest key value for the bulk insert operation, add one to it, and store it in another locally scoped variable. You perform the bulk insert operation and then recreate the sequence with a value one greater than the highest value in the table, which should already be in a locally scoped variable. Don’t forget that you’d locked the table, so unlock it now. A rough sketch follows.
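Here’s a rough, simplified sketch of that workaround (my own simplification, slightly reordered), assuming the one table, one_s1 sequence, and one_t1 trigger from the earlier examples. Remember that each DDL statement issues an implicit commit:

DECLARE
  TYPE one_tab IS TABLE OF one%ROWTYPE;
  lv_rows     one_tab := one_tab();
  lv_base_id  one.one_id%TYPE;
  lv_max_id   one.one_id%TYPE;
BEGIN
  /* Disable the trigger so it can't overwrite the manually assigned keys. */
  EXECUTE IMMEDIATE 'ALTER TRIGGER one_t1 DISABLE';
 
  /* Serialize access while the keys are assigned manually. */
  LOCK TABLE one IN EXCLUSIVE MODE;
 
  /* Capture a starting key from the sequence, then assign ascending keys. */
  SELECT one_s1.nextval INTO lv_base_id FROM dual;
  FOR i IN 1 .. 3 LOOP
    lv_rows.EXTEND;
    lv_rows(i).one_id   := lv_base_id + i - 1;
    lv_rows(i).one_text := 'Bulk '||i;
  END LOOP;
 
  /* Perform the bulk insert and release the table lock. */
  FORALL i IN 1 .. lv_rows.COUNT
    INSERT INTO one VALUES lv_rows(i);
  COMMIT;
 
  /* Recreate the sequence one past the highest key now in the table. */
  SELECT MAX(one_id) INTO lv_max_id FROM one;
  EXECUTE IMMEDIATE 'DROP SEQUENCE one_s1';
  EXECUTE IMMEDIATE 'CREATE SEQUENCE one_s1 START WITH '||TO_CHAR(lv_max_id + 1);
 
  /* Re-enable the trigger for normal single-row inserts. */
  EXECUTE IMMEDIATE 'ALTER TRIGGER one_t1 ENABLE';
END;
/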

You should note that database triggers run in a subshell with access only to the immediate shell that fired them. Therefore, you can’t set a bind variable in a SQL*Plus session and subsequently reference it inside the trigger body because it doesn’t have access to the variable.

MySQL

MySQL supports automatic numbering but not a default transactional mode like Oracle. You need to disable auto commit and start a transaction. You also need to assign the last automatic numbering value to a variable before using it in a subsequent INSERT statement. You must also provide an overriding list of mandatory columns when you opt to exclude the automated numbering column value. The one thing that we should all appreciate about MySQL is their desire to stay close to and comply with ANSI standards.

-- Conditionally drop the tables.
DROP TABLE IF EXISTS one;
DROP TABLE IF EXISTS two;
 
-- Create the tables with a surrogate key that automatically increments.
CREATE TABLE one ( one_id   INT PRIMARY KEY AUTO_INCREMENT
                 , one_text VARCHAR(20));
 
CREATE TABLE two ( two_id INT PRIMARY KEY AUTO_INCREMENT
                 , one_id INT
                 , two_text VARCHAR(20));
 
-- Start transaction cycle.
START TRANSACTION;
 
-- Insert first row, transfer auto increment to memory.
INSERT INTO one (one_text) VALUES ('One');
 
-- Assign last auto increment to local scope variable, the = works too.
SET @one_fk := last_insert_id();
 
-- Insert second row with auto increment and local scope variable.
INSERT INTO two (one_id, two_text) VALUES (@one_fk,'Two');
 
COMMIT;
 
-- Display the values inserted with auto incremented values.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

SQL Server

SQL Server supports automatic numbering, but they call it the identity value. There are two ways to use it, and the one I’m showing works in SQL Server 2005 or newer. You can replace the older @@identity with the SCOPE_IDENTITY() function call, but Microsoft has already removed first-level support for SQL Server 2000. While they’ve not said @@identity is deprecated, it sure appears that’s possible in a future release.

USE student;
 
BEGIN TRAN;
 
-- Conditionally drop tables when they exist.
IF OBJECT_ID('dbo.one','U') IS NOT NULL DROP TABLE dbo.one;
IF OBJECT_ID('dbo.two','U') IS NOT NULL DROP TABLE dbo.two;
 
-- Create auto incrementing tables.
CREATE TABLE one
( one_id   INT         NOT NULL IDENTITY(1,1) CONSTRAINT pk_one PRIMARY KEY
, one_text VARCHAR(10) NOT NULL );
 
CREATE TABLE two
( two_id   INT         NOT NULL IDENTITY(1,1) CONSTRAINT pk_two PRIMARY KEY
, one_id   INT         NOT NULL
, two_text VARCHAR(10) NOT NULL );
 
-- Insert the values, and magically no override signature required.
INSERT INTO one VALUES ('One!');
INSERT INTO one VALUES ('Two!');
INSERT INTO two VALUES (SCOPE_IDENTITY(),'Other Two!');
 
-- Query the results.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;
 
COMMIT TRAN;

You should note that T-SQL doesn’t require an override signature when you use an automatic numbering column. This is different, isn’t it?

While the prior example works with two tables, it doesn’t scale to a series of tables. You should consider the following assignment pattern when you’ll have multiple last identity values in a single transaction scope.

DECLARE @one_pk AS INT;
SET @one_pk = SCOPE_IDENTITY();

As mentioned, this style is important when you’ve got a series of primary and foreign keys to map in the scope of a single transaction. Also, I’d suggest that you put all the declarations at the beginning of the transaction’s scope.
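Here’s a short sketch of that pattern, reusing the one and two tables from the script above (the variable names and inserted values are just illustrative):

BEGIN TRAN;
 
/* Declare all key-holding variables at the top of the transaction scope. */
DECLARE @one_pk AS INT;
DECLARE @two_pk AS INT;
 
/* Capture each identity value immediately after its INSERT statement. */
INSERT INTO one VALUES ('Three!');
SET @one_pk = SCOPE_IDENTITY();
 
INSERT INTO two VALUES (@one_pk,'Three too!');
SET @two_pk = SCOPE_IDENTITY();
 
COMMIT TRAN;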

As always, I hope this helps some folks.

Written by maclochlainn

May 16th, 2009 at 11:04 pm

T-SQL Hierarchical Query

with 5 comments

Playing around with Microsoft SQL Server 2008 Express edition, I’ve sorted through a bunch of tidbits. One that I thought was interesting is how to perform a recursive or hierarchical query. This post describes how you can perform the magic.

The official name of the WITH clause in Oracle’s lexicon (otherwise known as Oraclese) is a subquery factoring clause. You can find more on that in this earlier blog post. Microsoft has a different name for the WITH clause. They call it a Common Table Expression or CTE.

You perform recursive queries in Microsoft SQL Server 2008 by leveraging CTEs. I’ve modified the setup code from that earlier blog post to run in SQL Server 2008. You’ll find it at the bottom of this blog post.

Unless you want to write your own C# (.NET is the politically correct lingo) equivalent to Oracle’s SQL*Plus, you’ll need to run this script in SQL Server Management Studio. Actually, you can use Microsoft SQL Server 2008’s command-line utility, which is called sqlcmd.exe, but it is much less robust than SQL*Plus. In Management Studio, you click File, then Open, then File… to load the file for execution, and then click the Execute button. You need to be careful you don’t click the Debug button, which is the green arrow to the right of the Execute button.

[Screenshot: the Execute and Debug buttons in SQL Server Management Studio 2008]

This is the magic query in the illustration. You can also find it in the source code. At the end of the day, I’m hard pressed to understand why they’d use a UNION ALL to support recursion.

[Screenshot: the recursive CTE query; it also appears as the first query in the setup script below]

The top-most CTE, or subquery factoring clause, simply joins the ORGANIZATION_NAME column to the ORG_PARENT_ID and ORG_CHILD_ID columns to provide a single working source. The second CTE performs the recursion: its top (anchor) query sets the starting row, and its second query recursively navigates the tree. After all children are found, the first query moves to the next element in the table and recursively searches for its children.

You should note that the CTE references itself from inside the second query. Then, the external query (the non-CTE query) returns the results by querying the same CTE.

This logic behaves more like a nested loop, and actually fails to move down branches of the tree like a recursive program. Otherwise line 19 would be line 14 in the output. You could write another CTE to fix this shortfall, thereby mirroring a true recursive behavior, or you can write a stored procedure.

The illustrated query outputs the following hierarchical relationship, which navigates down the hierarchical tree:

[Screenshot: the result set navigating down the hierarchical tree]

You can also go up any branch of the tree by changing some of the logic. You’ll find the query to navigate up the tree as the second query in the setup script at the end of the blog. It renders the following output:

[Screenshot: the result set navigating up the tree from the leaf node]

The blog will be updated if I discover the equivalent of the LEVEL pseudo column in Oracle’s hierarchical query semantics. If you know it, please share it with everybody.
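In the meantime, a common workaround (my own sketch, not part of the original script) is to carry a depth counter in the recursive CTE, which behaves much like Oracle’s LEVEL:

WITH org_name AS
 (SELECT   os.org_parent_id AS org_parent_id
  ,        o1.organization_name AS org_parent_name
  ,        os.org_child_id AS org_child_id
  ,        o2.organization_name AS org_child_name
  FROM     dbo.organization o1 RIGHT JOIN dbo.org_structure os
  ON       o1.organization_id = os.org_parent_id RIGHT JOIN dbo.organization o2
  ON       o2.organization_id = os.org_child_id)
, jn AS
 (SELECT   org_parent_id, org_parent_name
  ,        org_child_id, org_child_name
  ,        1 AS lvl                          -- mimics Oracle's LEVEL
  FROM     org_name
  WHERE    org_parent_id = 1
  UNION ALL
  SELECT   c.org_parent_id, c.org_parent_name
  ,        c.org_child_id, c.org_child_name
  ,        p.lvl + 1                         -- one deeper per recursion
  FROM     jn AS p JOIN org_name AS c
  ON       c.org_parent_id = p.org_child_id)
SELECT   jn.lvl, jn.org_parent_id, jn.org_parent_name
,        jn.org_child_id, jn.org_child_name
FROM     jn
ORDER BY jn.lvl, jn.org_parent_id;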

Setup Script

Microsoft SQL Server 2008 Join Script

USE student;
 
BEGIN TRAN;
 
-- Conditionally drop tables when they exist.
IF OBJECT_ID('dbo.ORGANIZATION','U') IS NOT NULL DROP TABLE dbo.ORGANIZATION;
IF OBJECT_ID('dbo.ORG_STRUCTURE','U') IS NOT NULL DROP TABLE dbo.ORG_STRUCTURE;
 
-- Create the organization table. 
CREATE TABLE ORGANIZATION
( organization_id INT
, organization_name VARCHAR(10));
 
-- Seed the organizations. 
INSERT INTO dbo.ORGANIZATION VALUES
 (1,'One'), (2,'Two'), (3,'Three'), (4,'Four'), (5,'Five')
,(6,'Six'), (7,'Seven'), (8,'Eight'), (9,'Nine'), (10,'Ten')
,(11,'Eleven'), (12,'Twelve'), (13,'Thirteen'), (14,'Fourteen'), (15,'Fifteen')
,(16,'Sixteen'), (17,'Seventeen'), (18,'Eighteen'), (19,'Nineteen'), (20,'Twenty');
 
-- Create the organization structure table that holds the recursive key. 
CREATE TABLE org_structure
( org_structure_id INT
, org_parent_id INT
, org_child_id INT );
 
-- Seed the organization structures. 
INSERT INTO org_structure VALUES
 ( 1, 0, 1),( 1, 1, 2),( 1, 1, 3),( 1, 1, 4),( 1, 2, 5)
,( 1, 2, 6),( 1, 3, 7),( 1, 3, 8),( 1, 4, 9),( 1, 4,10)
,( 1, 5,11),( 1, 5,12),( 1, 6,13),( 1, 6,14),( 1, 7,15)
,( 1, 8,16),( 1, 8,17),( 1, 9,18),( 1, 9,19),( 1,14,20);
 
COMMIT TRAN;
 
-- Navigating down the tree from the root node.
WITH org_name AS
 (SELECT   os.org_parent_id AS org_parent_id
  ,        o1.organization_name AS org_parent_name
  ,        os.org_child_id AS org_child_id
  ,        o2.organization_name AS org_child_name
  FROM     dbo.organization o1 RIGHT JOIN dbo.org_structure os 
  ON       o1.organization_id = os.org_parent_id RIGHT JOIN dbo.organization o2
  ON       o2.organization_id = os.org_child_id)
, jn AS
 (SELECT   org_parent_id, org_parent_name
  ,        org_child_id, org_child_name
  FROM     org_name
  WHERE    org_parent_id = 1  
  UNION ALL
  SELECT   c.org_parent_id, c.org_parent_name
  ,        c.org_child_id, c.org_child_name
  FROM     jn AS p JOIN org_name AS c
  ON       c.org_parent_id = p.org_child_id)
SELECT   jn.org_parent_id, jn.org_parent_name
,        jn.org_child_id, jn.org_child_name
FROM     jn
ORDER BY 1;
 
-- Navigating up the tree from the 20th leaf-node child.
WITH org_name AS
 (SELECT   os.org_parent_id AS org_parent_id
  ,        o1.organization_name AS org_parent_name
  ,        os.org_child_id AS org_child_id
  ,        o2.organization_name AS org_child_name
  FROM     dbo.organization o1 RIGHT JOIN dbo.org_structure os 
  ON       o1.organization_id = os.org_parent_id RIGHT JOIN dbo.organization o2
  ON       o2.organization_id = os.org_child_id)
, jn AS
 (SELECT   org_parent_id, org_parent_name
  ,        org_child_id, org_child_name
  FROM     org_name
  WHERE    org_child_id = 20  
  UNION ALL
  SELECT   c.org_parent_id, c.org_parent_name
  ,        c.org_child_id, c.org_child_name
  FROM     jn AS p JOIN org_name AS c
  ON       c.org_child_id = p.org_parent_id)
SELECT   jn.org_parent_id, jn.org_parent_name
,        jn.org_child_id, jn.org_child_name
FROM     jn
ORDER BY 1 DESC;

Written by maclochlainn

April 3rd, 2009 at 8:28 pm