MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

GROUP BY Quirk

without comments

It’s always interesting to see how others teach SQL courses. It can be revealing as to whether they understand SQL or only understand a dialect of SQL. In this case, one of my old students was taking a graduate course in SQL and the teacher was using MySQL. The teacher made an issue of using ANSI SQL:1999 or SQL3 and asked the following question, which I suspect is a quiz bank question from a textbook:

“How would you get all students’ names and for each student the number of courses that the
student has registered for?”

They referenced the MySQL 5.7 documentation for the GROUP BY and SQL:1999 as if MySQL implemented the ANSI SQL:1999 specification defined the standard. I didn’t know whether to laugh or cry because they were referring to MySQL 5.7 when we’re all using MySQL 8 and anybody who’s worked in more than MySQL knows that the behavior for a GROUP BY in MySQL can work without listing the necessary non-aggregated columns in the SELECT-list.

For example, their working solution, which is from the instructor and the author of their MySQL textbook the correct perspective of ANSI:1999 behavior. It doesn’t matter that their solution is actually based on ANSI:1992 not ANSI:1999 because it will only succeed because of a quirk of MySQL:

SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid
GROUP BY a.studentid;

While it works in MySQL, it doesn’t work because it conforms to an ANSI standard. It works in MySQL, notwithstanding that standard because it violates the standard.

In Oracle, PostgreSQL, and SQL Server, it raises an exception. For example, Oracle raises the following exception:

SELECT   a.studentname
         *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

The correct way to write the GROUP BY is:

SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid INNER JOIN courses c
ON       b.courseid = c.courseid
GROUP BY a.studentname;

Then, it would return:

Student Name                   Course IDs
------------------------------ ----------
Montgomery Scott                        1
Leonard McCoy                           2
James Tiberus Kirk                      3

For reference, here’s a complete test case for MySQL:

/* Drop table conditionally. */
DROP TABLE IF EXISTS students;
 
/* Create table. */
CREATE TABLE students
( studentID    int unsigned primary key auto_increment
, studentName  varchar(30));
 
/* Drop table conditionally. */
DROP TABLE IF EXISTS courses;
 
/* Create table. */
CREATE TABLE courses
( courseid    int unsigned primary key auto_increment
, coursename  varchar(40));
 
/* Drop table conditionally. */
DROP TABLE IF EXISTS registeredcourses;
 
/* Create table. */
CREATE TABLE registeredcourses
( courseid    int unsigned
, studentid   int unsigned );
 
/* Insert into students. */
INSERT INTO students
( studentName )
VALUES
 ('James Tiberus Kirk')
,('Leonard McCoy')
,('Montgomery Scott');
 
/* Insert into courses. */
INSERT INTO courses
( coursename )
VALUES
 ('English Literature')
,('Physics')
,('English Composition')
,('Botany')
,('Mechanical Engineering');
 
/* Insert into registeredcourses. */
INSERT INTO registeredcourses
( studentid
, courseid )
VALUES
 (1,1)
,(1,3)
,(1,4)
,(2,2)
,(2,5)
,(3,4); 
 
/* Check global sql_mode to ensure only_full_group_by is set. */
SELECT @@GLOBAL.SQL_MODE;
 
/* Query with a column not found in the SELECT-list. */
SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid
GROUP BY a.studentid;
 
/* Query consistent with ANSI SQL:1992 */
SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid INNER JOIN courses c
ON       b.courseid = c.courseid
GROUP BY a.studentname;

and, another complete test case for Oracle:

/* Drop tabhe unconditionallly. */
DROP TABLE students;
 
/* Create table. */
CREATE TABLE students
( studentID    NUMBER PRIMARY KEY
, studentName  VARCHAR(30));
 
/* Drop table unconditionally. */
DROP TABLE courses;
 
/* Create table. */
CREATE TABLE courses
( courseid    NUMBER PRIMARY KEY
, coursename  VARCHAR(40));
 
/* Drop table unconditionally. */
DROP TABLE registeredcourses;
 
/* Create table. */
CREATE TABLE registeredcourses
( courseid    NUMBER
, studentid   NUMBER );
 
/* Insert values in student. */
INSERT INTO students ( studentid, studentName ) VALUES (1,'James Tiberus Kirk');
INSERT INTO students ( studentid, studentName ) VALUES (2,'Leonard McCoy');
INSERT INTO students ( studentid, studentName ) VALUES (3,'Montgomery Scott');
 
/* Insert values in courses. */
INSERT INTO courses ( courseid, coursename ) VALUES (1,'English Literature');
INSERT INTO courses ( courseid, coursename ) VALUES (2,'Physics');
INSERT INTO courses ( courseid, coursename ) VALUES (3,'English Composition');
INSERT INTO courses ( courseid, coursename ) VALUES (4,'Botany');
INSERT INTO courses ( courseid, coursename ) VALUES (5,'Mechanical Engineering');
 
/* Insert values into registeredcourses. */
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (1,1);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (1,3);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (1,4);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (2,2);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (2,5);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (3,4); 
 
/* Non-ANSI SQL GROUP BY statement. */
SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid
GROUP BY a.studentid;
 
/* ANSI SQL GROUP BY statement. */
SELECT   a.studentname AS "Student Name"
,        COUNT(b.courseid) AS "Course IDs"
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid INNER JOIN courses c
ON       b.courseid = c.courseid
GROUP BY a.studentname;

I hope this helps those learning the correct way to write SQL.

Written by maclochlainn

January 12th, 2023 at 11:30 pm