MacLochlainns Weblog

Michael McLaughlin's Technical Blog

4 – GROUP BY Clause

`GROUP BY` Clause

Learning Outcomes

• Learn how to group non-aggregated and aggregated columns.
• Learn how to use ordinary columns in the `GROUP BY` clause.
• Learn how to use formula columns in the `GROUP BY` clause.

Lesson Materials

As explained in the `SELECT` statement page, the minimum `SELECT` statement has a `GROUP BY` clause e. The `GROUP BY` clause is a comma delimited set of literals, column names, calculations of literals and column values that match those non-aggregated values in the `SELECT`-list. The `GROUP BY` clause differs from the `SELECT`-list because the comma delimited values can’t have aliases.

`SELECT`-list

The `SELECT`-list may include non-aggregated and aggregated columns. The non-aggregated columns may be literals, calculations of literals and columns. Aggregated columns use functions like these:

• AVG
• COUNT
• MAX
• MIN
• SUM
• STD

Aggregating functions add the values together or apply mathematical formula across a series of rows. They typically reduce the number of rows to a smaller set of rows. The `GROUP BY` clause determines how the aggregating functions will reduce the number of rows.

Alternatively, you can have a `SELECT`-list that contains one or a comma delimited set of aggregating functions without any non-aggregating elements and it will return one row. You get one row because the aggregating functions work across the entire set of rows returned and return an aggregated result or results in the single row for all of the rows returned by the query.

`SELECT`-list with only aggregated functions

The most common approach to a `SELECT` statement that uses only one or more aggregated functions is a query that counts the number of rows in a table. This type of query doesn’t require a `GROUP BY` clause occurs when you want to count the number of rows in a table or result set from two or more tables. You can query the number of rows in the cereal table with this `SELECT` statement:

```SELECT COUNT(*) FROM cereal;```

It returns the following:

``` COUNT(*) ---------- 70```

`SELECT`-list with a `GROUP BY` clause

A `SELECT` statement that has a combination of non-aggregated and aggregated function columns requires that you add a `GROUP BY` clause. You should note that its possible a `GROUP BY` may allow a query to work while failing to reduce the number of rows returned.

A query fails to reduce the number of rows when the list of non-aggregated columns are unique. A unique set of non-aggregated columns can’t be reduced by an aggregating function because the reduction is set by the non-aggregated columns.

The following query uses a non-aggregated set of columns that are unique, which means that the aggregating function will not reduce the number of rows returned by the query. The query builds on concepts introduced in the `SELECT` clause web page and it uses the same `cereal` table:

```1 2 3 4 5 6 7 8 9 COL brand FORMAT A30 HEADING "Brand" COL ounces FORMAT A10 HEADING "Ounces" COL avg_per_ounce FORMAT A10 HEADING "Average|Cost|Ounces" SELECT brand , LPAD(TO_CHAR(ounces,'00.00'),10,' ') AS ounces , LPAD(TO_CHAR(AVG(suggested_retail/ounces),'\$0.00'),10,' ') AS avg_per_ounce FROM cereal GROUP BY brand , LPAD(TO_CHAR(ounces,'00.00'),10,' ');```

It returns the following sixty-eight rows:

``` Average Cost Brand Ounces Ounces ------------------------------ ---------- ---------- Strawberry Toast Crunch 18.25 \$0.27 Lucky Charm 16.00 \$0.25 Cinnamon Vanilla Lucky Charm 11.50 \$0.22 ... Blueberry Rice Chex 12.00 \$0.21 Blueberry Rice Chex 18.00 \$0.22 Raisin Brand Crunch 38.25 \$0.12 Cheerios 12.00 \$0.23```

The next example removes one of the non-aggregated columns from the query, which makes the non-aggregated `brand` column non-unique and enables the aggregating function to reduce the number of rows:

```1 2 3 4 5 6 COL brand FORMAT A30 HEADING "Brand" COL avg_per_ounce FORMAT A10 HEADING "Average|Cost|Ounces" SELECT brand , LPAD(TO_CHAR(AVG(suggested_retail/ounces),'\$0.00'),10,' ') AS avg_per_ounce FROM cereal GROUP BY brand;```

It returns the following thirty-six rows from the original sixty-eight rows:

``` Average Cost Brand Ounces ------------------------------ ---------- Blueberry Toast Crunch \$0.27 Apple Jacks \$0.16 Honey Nut Cheerios \$0.21 ... Chocolate Frosted Flakes \$0.20 Trix \$0.24 Rice Chex \$0.22 Blueberry Rice Chex \$0.22```

You have options that allow you to filter aggregated sets with the `HAVING` clause.

Written by maclochlainn

December 24th, 2017 at 10:27 am