MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

MySQL Query from JSON

with one comment

One of my students asked how you could get JSON data out in tabular format. I said they should look at Øystein Grøvlen’s JSON_TABLE – Best of Both Worlds blog post from 2018. Unfortunately, the student wanted another example with the Video Store model that we use in class.

For clarity, all path definitions start with a $ followed by other selectors:

  • A period followed by a name, such as $.website
  • [N] where N is the position in a zero-indexed array
  • The .[*] wildcard evaluates all members of an object
  • The [*] wildcard evaluates all members of an array
  • The prefix and suffix wildcard, **, evaluates to all paths that begin with the named prefix and end with the named suffix

So, here’s a quick supplement to what’s already there. It assumes you created an example table based on my prior blog post that looks like this:

+----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | struct                                                                                                                                                                                          |
+----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | {"contact": [{"last_name": "Winn", "first_name": "Randi"}, {"last_name": "Winn", "first_name": "Brian"}], "account_number": "US00001"}                                                          |
|  2 | {"contact": [{"last_name": "Vizquel", "first_name": "Oscar"}, {"last_name": "Vizquel", "first_name": "Doreen"}], "account_number": "US00002"}                                                   |
|  3 | {"contact": [{"last_name": "Sweeney", "first_name": "Meaghan"}, {"last_name": "Sweeney", "first_name": "Matthew"}, {"last_name": "Sweeney", "first_name": "Ian"}], "account_number": "US00003"} |
|  4 | {"contact": [{"last_name": "Clinton", "first_name": "Goeffrey"}], "account_number": "US00004"}                                                                                                  |
|  5 | {"contact": [{"last_name": "Moss", "first_name": "Wendy"}], "account_number": "US00005"}                                                                                                        |
|  6 | {"contact": [{"last_name": "Gretelz", "first_name": "Simon"}], "account_number": "US00006"}                                                                                                     |
|  7 | {"contact": [{"last_name": "Royal", "first_name": "Elizabeth"}], "account_number": "US00007"}                                                                                                   |
|  8 | {"contact": [{"last_name": "Smith", "first_name": "Brian"}], "account_number": "US00008"}                                                                                                       |
|  9 | {"contact": [{"last_name": "Potter", "first_name": "Harry"}, {"last_name": "Potter", "first_name": "Ginny"}, {"last_name": "Potter", "first_name": "Lily"}], "account_number": "US00011"}       |
+----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.01 sec)

You can query the account_number key value like this:

SELECT id
,      JSON_EXTRACT(struct, "$.account_number") AS account_no
FROM   example;

It returns:

+----+------------+
| id | account_no |
+----+------------+
|  1 | "US00001"  |
|  2 | "US00002"  |
|  3 | "US00003"  |
|  4 | "US00004"  |
|  5 | "US00005"  |
|  6 | "US00006"  |
|  7 | "US00007"  |
|  8 | "US00008"  |
|  9 | "US00011"  |
+----+------------+
9 rows in set (0.00 sec)

You use the JSON_TABLE function to get the embedded array elements of first and last name, like:

SELECT id
,      contact.*
FROM   example CROSS JOIN
       JSON_TABLE(
           struct
         ,"$.contact[*]"
         COLUMNS(
             lname JSON PATH "$.last_name"
           , fname JSON PATH "$.first_name")) AS contact;

It returns:

+----+-----------+-------------+
| id | lname     | fname       |
+----+-----------+-------------+
|  1 | "Winn"    | "Randi"     |
|  1 | "Winn"    | "Brian"     |
|  2 | "Vizquel" | "Oscar"     |
|  2 | "Vizquel" | "Doreen"    |
|  3 | "Sweeney" | "Meaghan"   |
|  3 | "Sweeney" | "Matthew"   |
|  3 | "Sweeney" | "Ian"       |
|  4 | "Clinton" | "Goeffrey"  |
|  5 | "Moss"    | "Wendy"     |
|  6 | "Gretelz" | "Simon"     |
|  7 | "Royal"   | "Elizabeth" |
|  8 | "Smith"   | "Brian"     |
|  9 | "Potter"  | "Harry"     |
|  9 | "Potter"  | "Ginny"     |
|  9 | "Potter"  | "Lily"      |
+----+-----------+-------------+
15 rows in set (0.00 sec)

You can combine both approaches, as shown below.

SELECT id
,      JSON_EXTRACT(struct, "$.account_number") AS account_no
,      contact.*
FROM   example CROSS JOIN
       JSON_TABLE(
           struct
         ,"$.contact[*]"
         COLUMNS(
             lname JSON PATH "$.last_name"
           , fname JSON PATH "$.first_name")) AS contact;

It returns:

+----+------------+-----------+-------------+
| id | account_no | lname     | fname       |
+----+------------+-----------+-------------+
|  1 | "US00001"  | "Winn"    | "Randi"     |
|  1 | "US00001"  | "Winn"    | "Brian"     |
|  2 | "US00002"  | "Vizquel" | "Oscar"     |
|  2 | "US00002"  | "Vizquel" | "Doreen"    |
|  3 | "US00003"  | "Sweeney" | "Meaghan"   |
|  3 | "US00003"  | "Sweeney" | "Matthew"   |
|  3 | "US00003"  | "Sweeney" | "Ian"       |
|  4 | "US00004"  | "Clinton" | "Goeffrey"  |
|  5 | "US00005"  | "Moss"    | "Wendy"     |
|  6 | "US00006"  | "Gretelz" | "Simon"     |
|  7 | "US00007"  | "Royal"   | "Elizabeth" |
|  8 | "US00008"  | "Smith"   | "Brian"     |
|  9 | "US00011"  | "Potter"  | "Harry"     |
|  9 | "US00011"  | "Potter"  | "Ginny"     |
|  9 | "US00011"  | "Potter"  | "Lily"      |
+----+------------+-----------+-------------+
15 rows in set (0.00 sec)

Lastly, if you want to get rid of the enclosing double quotes you can do the following:

WITH raw AS
(SELECT id
 ,      JSON_EXTRACT(struct, "$.account_number") AS account_no
 ,      contact.*
 FROM   example CROSS JOIN
        JSON_TABLE(
            struct
          ,"$.contact[*]"
          COLUMNS(
              lname JSON PATH "$.last_name"
            , fname JSON PATH "$.first_name")) AS contact)
SELECT  id
,       REGEXP_REPLACE(account_no,'"','') AS account_no
,       REGEXP_REPLACE(lname,'"','') AS lname
,       REGEXP_REPLACE(fname,'"','') AS fname
FROM    raw;

It’s also possible to use the JSON_UNQUOTE function to cleanup the double quotes. I hope this helps those extracting JSON data into tabular result sets.

Written by maclochlainn

June 26th, 2022 at 12:38 am