MySQL Query from JSON
One of my students asked how you could get JSON data out in tabular format. I said they should look at Øystein Grøvlen’s JSON_TABLE – Best of Both Worlds blog post from 2018. Unfortunately, the student wanted another example with the Video Store model that we use in class.
For clarity, all path definitions start with a $ followed by other selectors:
- A period followed by a name, such as $.website
- [N] where N is the position in a zero-indexed array
- The .[*] wildcard evaluates all members of an object
- The [*] wildcard evaluates all members of an array
- The prefix and suffix wildcard, **, evaluates to all paths that begin with the named prefix and end with the named suffix
So, here’s a quick supplement to what’s already there. It assumes you created an example table based on my prior blog post that looks like this:
+----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | id | struct | +----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 1 | {"contact": [{"last_name": "Winn", "first_name": "Randi"}, {"last_name": "Winn", "first_name": "Brian"}], "account_number": "US00001"} | | 2 | {"contact": [{"last_name": "Vizquel", "first_name": "Oscar"}, {"last_name": "Vizquel", "first_name": "Doreen"}], "account_number": "US00002"} | | 3 | {"contact": [{"last_name": "Sweeney", "first_name": "Meaghan"}, {"last_name": "Sweeney", "first_name": "Matthew"}, {"last_name": "Sweeney", "first_name": "Ian"}], "account_number": "US00003"} | | 4 | {"contact": [{"last_name": "Clinton", "first_name": "Goeffrey"}], "account_number": "US00004"} | | 5 | {"contact": [{"last_name": "Moss", "first_name": "Wendy"}], "account_number": "US00005"} | | 6 | {"contact": [{"last_name": "Gretelz", "first_name": "Simon"}], "account_number": "US00006"} | | 7 | {"contact": [{"last_name": "Royal", "first_name": "Elizabeth"}], "account_number": "US00007"} | | 8 | {"contact": [{"last_name": "Smith", "first_name": "Brian"}], "account_number": "US00008"} | | 9 | {"contact": [{"last_name": "Potter", "first_name": "Harry"}, {"last_name": "Potter", "first_name": "Ginny"}, {"last_name": "Potter", "first_name": "Lily"}], "account_number": "US00011"} | +----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 9 rows in set (0.01 sec) |
You can query the account_number key value like this:
SELECT id , JSON_EXTRACT(struct, "$.account_number") AS account_no FROM example; |
It returns:
+----+------------+ | id | account_no | +----+------------+ | 1 | "US00001" | | 2 | "US00002" | | 3 | "US00003" | | 4 | "US00004" | | 5 | "US00005" | | 6 | "US00006" | | 7 | "US00007" | | 8 | "US00008" | | 9 | "US00011" | +----+------------+ 9 rows in set (0.00 sec) |
You use the JSON_TABLE function to get the embedded array elements of first and last name, like:
SELECT id , contact.* FROM example CROSS JOIN JSON_TABLE( struct ,"$.contact[*]" COLUMNS( lname JSON PATH "$.last_name" , fname JSON PATH "$.first_name")) AS contact; |
It returns:
+----+-----------+-------------+ | id | lname | fname | +----+-----------+-------------+ | 1 | "Winn" | "Randi" | | 1 | "Winn" | "Brian" | | 2 | "Vizquel" | "Oscar" | | 2 | "Vizquel" | "Doreen" | | 3 | "Sweeney" | "Meaghan" | | 3 | "Sweeney" | "Matthew" | | 3 | "Sweeney" | "Ian" | | 4 | "Clinton" | "Goeffrey" | | 5 | "Moss" | "Wendy" | | 6 | "Gretelz" | "Simon" | | 7 | "Royal" | "Elizabeth" | | 8 | "Smith" | "Brian" | | 9 | "Potter" | "Harry" | | 9 | "Potter" | "Ginny" | | 9 | "Potter" | "Lily" | +----+-----------+-------------+ 15 rows in set (0.00 sec) |
You can combine both approaches, as shown below.
SELECT id , JSON_EXTRACT(struct, "$.account_number") AS account_no , contact.* FROM example CROSS JOIN JSON_TABLE( struct ,"$.contact[*]" COLUMNS( lname JSON PATH "$.last_name" , fname JSON PATH "$.first_name")) AS contact; |
It returns:
+----+------------+-----------+-------------+ | id | account_no | lname | fname | +----+------------+-----------+-------------+ | 1 | "US00001" | "Winn" | "Randi" | | 1 | "US00001" | "Winn" | "Brian" | | 2 | "US00002" | "Vizquel" | "Oscar" | | 2 | "US00002" | "Vizquel" | "Doreen" | | 3 | "US00003" | "Sweeney" | "Meaghan" | | 3 | "US00003" | "Sweeney" | "Matthew" | | 3 | "US00003" | "Sweeney" | "Ian" | | 4 | "US00004" | "Clinton" | "Goeffrey" | | 5 | "US00005" | "Moss" | "Wendy" | | 6 | "US00006" | "Gretelz" | "Simon" | | 7 | "US00007" | "Royal" | "Elizabeth" | | 8 | "US00008" | "Smith" | "Brian" | | 9 | "US00011" | "Potter" | "Harry" | | 9 | "US00011" | "Potter" | "Ginny" | | 9 | "US00011" | "Potter" | "Lily" | +----+------------+-----------+-------------+ 15 rows in set (0.00 sec) |
Lastly, if you want to get rid of the enclosing double quotes you can do the following:
WITH raw AS (SELECT id , JSON_EXTRACT(struct, "$.account_number") AS account_no , contact.* FROM example CROSS JOIN JSON_TABLE( struct ,"$.contact[*]" COLUMNS( lname JSON PATH "$.last_name" , fname JSON PATH "$.first_name")) AS contact) SELECT id , REGEXP_REPLACE(account_no,'"','') AS account_no , REGEXP_REPLACE(lname,'"','') AS lname , REGEXP_REPLACE(fname,'"','') AS fname FROM raw; |
It’s also possible to use the JSON_UNQUOTE function to cleanup the double quotes. I hope this helps those extracting JSON data into tabular result sets.