MySQL Self-Join
I’m switching to MySQL and leveraging Alan Beaulieu’s Learning SQL as a supporting reference for my Database Design and Development course. While reviewing Alan’s Chapter 5: Querying Multiple Tables, I found his coverage of using self-joins minimal.
In fact, he adds a prequel_film_id
column to the film
table in the sakila
database and then a single row to demonstrate a minimal self-join query. I wanted to show them how to view a series of rows interconnected by a self-join, like the following:
SELECT f.title AS film , fp.title AS prequel FROM film f LEFT JOIN film fp ON f.prequel_id = fp.film_id WHERE f.series_name = 'Harry Potter' ORDER BY f.series_number; |
It returns the following result set:
+----------------------------------------------+----------------------------------------------+ | film | prequel | +----------------------------------------------+----------------------------------------------+ | Harry Potter and the Chamber of Secrets | Harry Potter and the Sorcerer's Stone | | Harry Potter and the Prisoner of Azkaban | Harry Potter and the Chamber of Secrets | | Harry Potter and the Goblet of Fire | Harry Potter and the Prisoner of Azkaban | | Harry Potter and the Order of the Phoenix | Harry Potter and the Goblet of Fire | | Harry Potter and the Half Blood Prince | Harry Potter and the Order of the Phoenix | | Harry Potter and the Deathly Hallows: Part 1 | Harry Potter and the Half Blood Prince | | Harry Potter and the Deathly Hallows: Part 2 | Harry Potter and the Deathly Hallows: Part 1 | +----------------------------------------------+----------------------------------------------+ 7 rows in set (0.00 sec) |
Then, I thought about what questions the students might ask. For example, why doesn’t the query return the first film that doesn’t have a prequel. So, I took the self-join to the next level to display the first film having no prequel, like this:
SELECT f.title AS film , IFNULL( CASE WHEN NOT f.film_id = fp.film_id AND f.prequel_id = fp.film_id THEN fp.title END,'None') AS prequel FROM film f LEFT JOIN film fp ON f.prequel_id = fp.film_id WHERE f.series_name = 'Harry Potter' ORDER BY f.series_number; |
The CASE
operator in the SELECT
-list filters the result set by eliminating rows erroneously returned. Without the CASE
filter, the query would return the original Harry Potter and the Sorcerer’s Stone film matched agains a NULL
and all of the other sequels. The CASE
operator effectively limits the result set for the LEFT JOIN
to only the following data:
+----------------------------------------------+----------------------------------------------+ | film | prequel | +----------------------------------------------+----------------------------------------------+ | Harry Potter and the Sorcerer's Stone | NULL | +----------------------------------------------+----------------------------------------------+ |
The IFNULL()
built-in function lets you replace the NULL
value returned as the prequel’s title
value. The IFNULL()
function substitutes a 'None'
string literal for a NULL
value. The query returns the following result set:
+----------------------------------------------+----------------------------------------------+ | film | prequel | +----------------------------------------------+----------------------------------------------+ | Harry Potter and the Sorcerer's Stone | None | | Harry Potter and the Chamber of Secrets | Harry Potter and the Sorcerer's Stone | | Harry Potter and the Prisoner of Azkaban | Harry Potter and the Chamber of Secrets | | Harry Potter and the Goblet of Fire | Harry Potter and the Prisoner of Azkaban | | Harry Potter and the Order of the Phoenix | Harry Potter and the Goblet of Fire | | Harry Potter and the Half Blood Prince | Harry Potter and the Order of the Phoenix | | Harry Potter and the Deathly Hallows: Part 1 | Harry Potter and the Half Blood Prince | | Harry Potter and the Deathly Hallows: Part 2 | Harry Potter and the Deathly Hallows: Part 1 | +----------------------------------------------+----------------------------------------------+ 8 rows in set (0.01 sec) |
Alan’s modification of the sakila.film
table had the following two related design flaws:
- It didn’t provide a way to guarantee the ordering of films with prequels because relational databases don’t guarantee ordered result sets unless you use an
ORDER BY
clause, which typically requires a column to order. - It didn’t provide a way to isolate a series of films.
I modified the film
table differently by adding the series_name
, series_number
, and prequel_id
columns. The series_name
column lets you group results and the series_number
column lets you order by a preserved sequence that you store as part of the data The prequel_id
column lets you connect to the prequel film, much like the backward portion of a doubly linked list.
The new sakila.film
table is:
+----------------------+---------------------------------------------------------------------+------+-----+-------------------+-----------------------------------------------+ | Field | Type | Null | Key | Default | Extra | +----------------------+---------------------------------------------------------------------+------+-----+-------------------+-----------------------------------------------+ | film_id | smallint unsigned | NO | PRI | NULL | auto_increment | | title | varchar(255) | NO | MUL | NULL | | | description | text | YES | | NULL | | | release_year | year | YES | | NULL | | | language_id | tinyint unsigned | NO | MUL | NULL | | | original_language_id | tinyint unsigned | YES | MUL | NULL | | | rental_duration | tinyint unsigned | NO | | 3 | | | rental_rate | decimal(4,2) | NO | | 4.99 | | | length | smallint unsigned | YES | | NULL | | | replacement_cost | decimal(5,2) | NO | | 19.99 | | | rating | enum('G','PG','PG-13','R','NC-17') | YES | | G | | | special_features | set('Trailers','Commentaries','Deleted Scenes','Behind the Scenes') | YES | | NULL | | | last_update | timestamp | NO | | CURRENT_TIMESTAMP | DEFAULT_GENERATED on update CURRENT_TIMESTAMP | | series_name | varchar(20) | YES | | NULL | | | series_number | int unsigned | YES | | NULL | | | prequel | int unsigned | YES | | NULL | | +----------------------+---------------------------------------------------------------------+------+-----+-------------------+-----------------------------------------------+ 16 rows in set (0.21 sec) |
After adding the three new columns, I inserted eight rows for the original Harry Potter films. You can use the following script in the MySQL client (mysql
) to add the columns and insert the data to test the preceding queries:
-- Use sakila database. USE sakila; -- Add a prequel_id column to the sakila.film table. ALTER TABLE film ADD (series_name varchar(20)), ADD (series_number int unsigned), ADD (prequel_id int unsigned); -- Set primary to foreign key local variable. SET @sv_film_id = 0; -- Insert Harry Potter films in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Sorcerer''s Stone' ,'A film about a young boy who on his eleventh birthday discovers, he is the orphaned boy of two powerful wizards and has unique magical powers.' , 2001 , 1 , NULL , 3 , 0.99 , 152 , 19.99 ,'PG' ,'Trailers' ,'2001-11-04' ,'Harry Potter' , 1 , NULL ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 2nd film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Chamber of Secrets' ,'A film where Harry returning to Hogwarts, still famous and a hero, when strange things start to happen ... people are turning to stone and no-one knows what, or who, is doing it.' , 2002 , 1 , NULL , 3 , 0.99 , 160 , 19.99 ,'PG' ,'Trailers' ,'2002-11-15' ,'Harry Potter' , 2 , @sv_film_id ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 3rd film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Prisoner of Azkaban' ,'A film where Harry, Ron, and Hermione return for their third year at Hogwarts and are forced to face escaped prisoner, Sirius Black.' , 2004 , 1 , NULL , 3 , 0.99 , 141 , 19.99 ,'PG' ,'Trailers' ,'2004-06-04' ,'Harry Potter' , 3 , @sv_film_id ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 4th film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Goblet of Fire' ,'A film where where Harry Potter''s name emerges from the Goblet of Fire, and he becomes a competitor in a grueling battle for glory among three wizarding schools - the Triwizard Tournament.' , 2005 , 1 , NULL , 3 , 0.99 , 157 , 19.99 ,'PG' ,'Trailers' ,'2005-11-18' ,'Harry Potter' , 4 , @sv_film_id ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 5th film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Order of the Phoenix' ,'A film where Lord Voldemort has returned, but the Ministry of Magic is doing everything it can to keep the wizarding world from knowing the truth.' , 2007 , 1 , NULL , 3 , 0.99 , 138 , 19.99 ,'PG-13' ,'Trailers' ,'2007-07-12' ,'Harry Potter' , 5 , @sv_film_id ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 6th film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Half Blood Prince' ,'A film where Voldemort is tightening his grip on Hogwarts and it is no longer the safe haven it once was. Harry and Dumbledore work to find the key to unlock the Dark Lord''s defenses.' , 2009 , 1 , NULL , 3 , 0.99 , 153 , 19.99 ,'PG' ,'Trailers' ,'2009-07-15' ,'Harry Potter' , 6 , @sv_film_id ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 7th film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Deathly Hallows: Part 1' ,'A film where Harry, Ron and Hermione set out on their perilous mission to track down and destroy the Horcruxes - the keys to Voldemort''s immortality.' , 2010 , 1 , NULL , 3 , 0.99 , 146 , 19.99 ,'PG-13' ,'Trailers' ,'2010-11-19' ,'Harry Potter' , 7 , @sv_film_id ); -- Assign the last generated primary key value to the local variable. SET @sv_film_id := last_insert_id(); -- Insert 8th film in sakila.film table with classic values clause. INSERT INTO film ( title , description , release_year , language_id , original_language_id , rental_duration , rental_rate , length , replacement_cost , rating , special_features , last_update , series_name , series_number , prequel_id ) VALUES ('Harry Potter and the Deathly Hallows: Part 2' ,'A film where Harry, Ron and Hermione set out on their perilous mission to track down and destroy the Horcruxes - the keys to Voldemort''s immortality.' , 2011 , 1 , NULL , 3 , 0.99 , 130 , 19.99 ,'PG-13' ,'Trailers' ,'2011-07-15' ,'Harry Potter' , 8 , @sv_film_id ); |
You can put the following commands into a SQL script file to revert the sakila.film
table to its base configuration:
DELETE FROM film WHERE film_id > 1000; ALTER TABLE film DROP COLUMN series_name; ALTER TABLE film DROP COLUMN series_number; ALTER TABLE film DROP COLUMN prequel_id; ALTER TABLE film AUTO_INCREMENT = 1000; |
As always, I hope this helps those looking for how to solve a new problem.