Archive for the ‘cartesian products’ tag
You may like the standard syntax of listing tables as comma-delimited lists, rather than using an inner, left, right, full, natural, and cross join syntax. I do! The former method eliminates matching result sets and generally is less tedious.
The natural join is my least favorite of these because it can create problems long after the query is written. The problem is an outcome of its design. A natural join matches rows from one table to another by using all matching column names. This query makes a simplifying assumption: Nobody includes a column of the same name in two tables where they may be joined that isn’t a foreign key.
The following is quick example that creates two tables. The
VEHICLE table uses the
VEHICLE_ID column as a primary key. The
WORK_ORDER table uses the
VEHICLE_ID column as a foreign key.
CREATE TABLE vehicle ( vehicle_id NUMBER , vin VARCHAR2(20)); CREATE TABLE work_order ( work_order_id NUMBER , vehicle_id NUMBER); INSERT INTO vehicle VALUES (1,'VIN_NUMBER1'); INSERT INTO vehicle VALUES (2,'VIN_NUMBER2'); INSERT INTO work_order VALUES (1, 1); INSERT INTO work_order VALUES (2, 2); INSERT INTO work_order VALUES (3, 1);
The following queries resolve the relationship by using an explicit join:
SELECT * FROM vehicle v, work_order wo WHERE v.vehicle_id = wo.vehicle_id; SELECT * FROM vehicle v JOIN work_order wo ON v.vehicle_id = wo.vehicle_id; SELECT * FROM vehicle v JOIN work_order wo USING(vehicle_id);
A natural query also resolves the relationship by implicitly finding the columns that share column names, as follows:
SELECT * FROM vehicle v NATURAL JOIN work_order wo;
All of these return a three row result set. The unnatural part of the natural query arrives during maintenance programming when somebody adds a
VIN column name to the
WORK_ORDER table (let’s say to simplify the ETL to the data warehouse). The code using a natural join isn’t invalidated but its runtime behavior is altered because it now joins both the
VIN columns. Continuing the example, you’d change the table by:
ALTER TABLE work_order ADD (vin VARCHAR2(20));
The explicit queries still work after the change. The implicit (natural) query now returns no rows because the
VIN column contains all null values, and it is automatically added to the implicit join statement. After updating the
VIN column, the query resolves when the data matches but not when the
VIN column contains different data. What if the
VIN column in
WORK_ORDER only contains the make and model portion of the vehicle identification number while the
VIN column in
VEHICLE contains the complete unique vehicle identifier. You would receive no rows selected from the natural query.
It appears the best thing to do, is avoid natural joins.
I subsequently wrote a post about SQL join semantics. You can find it here.