Archive for the ‘Python’ Category
AlmaLinux MySQL+Python
After installing and configuring MySQL 8.0.30, I installed the Python connector. During that process on AlmaLinux, there were several changes since I last installed the Python’s mysql module. Here are the step-by-step instructions after installing and configuring MySQL Server (blog for those steps).
Using the MySQL Connector/Python X DevAPI Reference, you must install the pip utility before you install the library. You install the pip library as a sudoer user with the following command:
sudo yum install -y pip |
Then, using the pip utility as a sudoer user install the mysql-connector-python module with the following command:
sudo pip install mysql-connector-python |
Please note that this type of library installation can cause problems when you maintain multiple testing environments. If you plan to maintain multiple testing environments, please install this library in a virtual environment.
You create a bare-bones Python connection test program, like connect.py below:
# Import the library. import mysql.connector from mysql.connector import errorcode try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='studentdb') # Print the value. print("Database connection resolved.") # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print(e) # Close the connection when the try block completes. else: cnx.close() |
You test the connect.py program with this command:
python connect.py |
It’ll return the following provided you provisioned the student user and studentdb database:
Database connection resolved. |
If you’d like to run it without calling the python executable, you can add the following as the first line:
#/usr/bin/python |
and, change the default file permissions from
-rw-rw-r--. 1 student student 717 Oct 30 13:57 connect.py |
with the chmod command to
-rwxr-xr-x. 1 student student 717 Oct 30 13:57 connect.py |
These instructions should set you up to develop Python code against your AlmaLinux MySQL studentdb database. You use this syntax, assuming a default $PATH environment variable that excludes the present working directory.
./connect.py |
As always, I hope this helps those trying to get a complete solution.
Quick Python Bit #1
A student asked for a Python example that would find a string starting with a vowel and prepend an “An” or starting with a non-vowel and prepend an “A”. As a rule, this one seems to make sense to most beginning programmers.
#/usr/bin/python # Import the regular expression library. import re # Create a list of words to check. haystack = ['anything','something','nothing'] # Create a regular expression for a leading vowel. needle = "^[AEIOUaeiou].*$" # Read through the haystacks and use the right article. for i in range(0,len(haystack),1): if re.search(needle,haystack[i],re.IGNORECASE): print("An",haystack[i]+".") else: print("A",haystack[i]+".") |
However, this loop is much cleaner:
#/usr/bin/python # Import the regular expression library. import re # Create a list of words to check. haystack = ['anything','something','nothing'] # Create a regular expression for a leading vowel. needle = "^[AEIOUaeiou].*$" # Read through the haystacks and use the right article. for i in haystack: if re.search(needle,i,re.IGNORECASE): print("An",i+".") else: print("A",i+".") |
Both programs print the following:
An anything. A something. A nothing. |
As always, happy to help write another example.
MySQL JSON Tricks
Are they really tricks or simply basic techniques combined to create a solution. Before writing these mechanics for using native MySQL to create a compound JSON object, let me point out that the easiest way to get one is to use the MySQL Node.js library, as shown recently in my “Is SQL Programming” blog post.
Moving data from a relational model output to a JSON structure isn’t as simple as a delimited list of columns in a SQL query. Let’s look at it in stages based on the MySQL Server 12.18.2 Functions that create JSON values.
Here’s how you return single row as a JSON object, which is quite straightforward:
SELECT JSON_OBJECT('first_name',c.first_name,'last_name',c.last_name) AS json_result FROM contact c WHERE first_name = 'Harry' AND last_name = 'Potter'; |
It returns:
+------------------------------------------------+ | json_result | +------------------------------------------------+ | {"last_name": "Potter", "first_name": "Harry"} | +------------------------------------------------+ 1 row in set (0.00 sec) |
With a GROUP_CONCAT function, let’s capture a JSON array of all three Potter family members:
SELECT CONCAT('[' , GROUP_CONCAT( JSON_OBJECT('first_name',first_name ,'last_name',last_name ) SEPARATOR ',') ,']') AS json_result FROM contact c WHERE c.last_name = 'Potter'; |
It returns an array of JSON objects:
+-----------------------------------------------------------------------------------------------------------------------------------------------+ | [{"last_name": "Potter", "first_name": "Harry"},{"last_name": "Potter", "first_name": "Ginny"},{"last_name": "Potter", "first_name": "Lily"}] | +-----------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.01 sec) |
Next, let’s put a 1:many relationship between the member and contact table into a JSON structure with a single account number and an array of contact. It requires a second call to the JSON_OBJECT function and the addition of a GROUP BY clause in the query.
SELECT JSON_OBJECT( 'account_number', account_number ,'contact', CONCAT('[' , GROUP_CONCAT( JSON_OBJECT('first_name',first_name ,'last_name',last_name ) SEPARATOR ',') ,']') ) AS json_result FROM member m INNER JOIN contact c ON m.member_id = c.member_id WHERE c.last_name = 'Potter' GROUP BY m.account_number; |
It returns the following string with an annoying set of backslashes. It also inverts the column order, which appears unavoidable but it shouldn’t matter because the order of name-value pairs in JSON is immaterial.
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | {"contact": "[{\"last_name\": \"Potter\", \"first_name\": \"Harry\"},{\"last_name\": \"Potter\", \"first_name\": \"Ginny\"},{\"last_name\": \"Potter\", \"first_name\": \"Lily\"}]", "account_number": "US00011"} | +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) |
The following quick little Python code cleans up the JSON string by removing the backslashes and extraneous quotes around the array of contacts.
# Import the library. import mysql.connector from mysql.connector import errorcode try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='studentdb') # Create cursor. cursor = cnx.cursor() # Set the query statement. query = ("SELECT JSON_OBJECT( " "'account_number', m.account_number " ",'contact', CONCAT('[' " " , GROUP_CONCAT( " " JSON_OBJECT('first_name', c.first_name " " ,'last_name', c.last_name ) SEPARATOR ',') " " ,']')) AS json_result " "FROM contact c INNER JOIN member m " "ON c.member_id = m.member_id " "WHERE c.last_name = %s " "GROUP BY account_number") # Execute cursor. cursor.execute(query,["Potter"]) # Display the column returned by the query stripped of backslashes and # extraneous quotes. for (row) in cursor: for column in range(len(row)): print(row[column].replace("\\","").replace("\"[","[").replace("]\"","]")) # Close cursor. cursor.close() # ------------------------------------------------------------ # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print("Error code:", e.errno) # error number print("SQLSTATE value:", e.sqlstate) # SQLSTATE value print("Error message:", e.msg) # error message # Close the connection when the try block completes. else: cnx.close() |
It returns:
{"contact": [{"last_name": "Potter", "first_name": "Harry"},{"last_name": "Potter", "first_name": "Ginny"},{"last_name": "Potter", "first_name": "Lily"}], "account_number": "US00011"} |
I hope this helps exhibit less well known MySQL syntax. Check this post to see how to insert a result set without Python as an intermediary.
Install Python on Windows
A number of my students want to run the databases and development environments on Windows rather than Linux. Some of the students have various problems configuring a virtual machine or Docker environment on their laptops.
Installing Python on Windows is quite straightforward. You simply open a Windows Command Line Interface (CLI) with Administrator privileges and type python at the command line. It will launch a dialog that lets you download and install Python.
After the installation, you are returned to the Windows CLI where you can type python to launch the Python IDE. You’ll learn that it’s an older version when it opens.
Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> |
After installing Python, I started to install drivers with the Python package-management utility – PIP. I loaded the psycopg2 driver first and discovered that Windows installed an older version of PIP, as qualified when I installed the Python library:
pip install psycopg2 Collecting psycopg2 Downloading psycopg2-2.9.3-cp310-cp310-win_amd64.whl (1.2 MB) |████████████████████████████████| 1.2 MB 1.6 MB/s Installing collected packages: psycopg2 Successfully installed psycopg2-2.9.3 WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available. You should consider upgrading via the 'C:\Users\mclaughlinm\AppData\Local\Microsoft\WindowsApps\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\python.exe -m pip install --upgrade pip' command. |
So, I immediately installed the new version of PIP and discovered that I have to manually reset the %PATH% environment variable.
python -m pip install --upgrade pip Requirement already satisfied: pip in c:\program files\windowsapps\pythonsoftwarefoundation.python.3.10_3.10.752.0_x64__qbz5n2kfra8p0\lib\site-packages (21.2.4) Collecting pip Downloading pip-22.0.3-py3-none-any.whl (2.1 MB) |████████████████████████████████| 2.1 MB 656 kB/s Installing collected packages: pip WARNING: The scripts pip.exe, pip3.10.exe and pip3.exe are installed in 'C:\Users\mclaughlinm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\Scripts' which is not on PATH. Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. Successfully installed pip-22.0.3 |
The default location for the PIP.exe and PIP3.exe are found in the:
C:\Users\mclaughlinm\AppData\Local\Microsoft\WindowsApps |
Rather than put the newer directory in the System Environment %PATH% variable, I created a batch file that lets me set it only when I need it.
SET PATH=C:\Users\mclaughlinm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\Scripts;%PATH% |
After installing the psycopg2 driver, I tested the new_hire.sql and new_hire.py files from this earlier Linux post successfully. I hope this helps those looking to work with Python on Windows.
Python on PostgreSQL
The ODBC library you use when connecting Python to PostgreSQL is the psycopg2 Python library. This blog post will show use how to use it in Python and install it on your Fedora Linux installation. It leverages a videodb database that I show you how to build in this earlier post on configuring PostgreSQL 14.
You would import psycopg2 as follows in your Python code:
import psycopg2 |
Unfortunately, that only works on Linux servers when you’ve installed the library. That library isn’t installed with generic Python libraries. You get the following error when the psycopg2 library isn’t installed on your server.
Traceback (most recent call last): File "python_new_hire.sql", line 1, in <module> import psycopg2 ModuleNotFoundError: No module named 'psycopg2' |
You can install it on Fedora Linux with the following command:
yum install python3-psycopg2 |
It will install:
==================================================================================== Package Architecture Version Repository Size ==================================================================================== Installing: python3-psycopg2 x86_64 2.7.7-1.fc30 fedora 160 k Transaction Summary ==================================================================================== Install 1 Package Total download size: 160 k Installed size: 593 k Is this ok [y/N]: y Downloading Packages: python3-psycopg2-2.7.7-1.fc30.x86_64.rpm 364 kB/s | 160 kB 00:00 ------------------------------------------------------------------------------------ Total 167 kB/s | 160 kB 00:00 Running transaction check Transaction check succeeded. Running transaction test Transaction test succeeded. Running transaction Preparing : 1/1 Installing : python3-psycopg2-2.7.7-1.fc30.x86_64 1/1 Running scriptlet: python3-psycopg2-2.7.7-1.fc30.x86_64 1/1 Verifying : python3-psycopg2-2.7.7-1.fc30.x86_64 1/1 Installed: python3-psycopg2-2.7.7-1.fc30.x86_64 Complete! |
Here’s a quick test case that you can run in PostgreSQL and Python to test all the pieces. The first SQL script creates a new_hire table and inserts two rows, and the Python program queries data from the new_hire table.
The new_hire.sql file creates the new_hire table and inserts two rows:
-- Environment settings for the script. SET SESSION "videodb.table_name" = 'new_hire'; SET CLIENT_MIN_MESSAGES TO ERROR; -- Verify table name. SELECT current_setting('videodb.table_name'); -- ------------------------------------------------------------------ -- Conditionally drop table. -- ------------------------------------------------------------------ DROP TABLE IF EXISTS new_hire CASCADE; -- ------------------------------------------------------------------ -- Create table. -- ------------------------------------------------------------------- CREATE TABLE new_hire ( new_hire_id SERIAL , first_name VARCHAR(20) NOT NULL , middle_name VARCHAR(20) , last_name VARCHAR(20) NOT NULL , hire_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP , PRIMARY KEY (new_hire_id)); -- Alter the sequence by restarting it at 1001. ALTER SEQUENCE new_hire_new_hire_id_seq RESTART WITH 1001; -- Display the table organization. SELECT tc.table_catalog || '.' || tc.constraint_name AS constraint_name , tc.table_catalog || '.' || tc.table_name AS table_name , kcu.column_name , ccu.table_catalog || '.' || ccu.table_name AS foreign_table_name , ccu.column_name AS foreign_column_name FROM information_schema.table_constraints AS tc JOIN information_schema.key_column_usage AS kcu ON tc.constraint_name = kcu.constraint_name AND tc.table_schema = kcu.table_schema JOIN information_schema.constraint_column_usage AS ccu ON ccu.constraint_name = tc.constraint_name AND ccu.table_schema = tc.table_schema WHERE tc.constraint_type = 'FOREIGN KEY' AND tc.table_name = current_setting('videodb.table_name') ORDER BY 1; SELECT c1.table_name , c1.ordinal_position , c1.column_name , CASE WHEN c1.is_nullable = 'NO' AND c2.column_name IS NOT NULL THEN 'PRIMARY KEY' WHEN c1.is_nullable = 'NO' AND c2.column_name IS NULL THEN 'NOT NULL' END AS is_nullable , CASE WHEN data_type = 'character varying' THEN data_type||'('||character_maximum_length||')' WHEN data_type = 'numeric' THEN CASE WHEN numeric_scale != 0 AND numeric_scale IS NOT NULL THEN data_type||'('||numeric_precision||','||numeric_scale||')' ELSE data_type||'('||numeric_precision||')' END ELSE data_type END AS data_type FROM information_schema.columns c1 LEFT JOIN (SELECT trim(regexp_matches(column_default,current_setting('videodb.table_name'))::text,'{}')||'_id' column_name FROM information_schema.columns) c2 ON c1.column_name = c2.column_name WHERE c1.table_name = current_setting('videodb.table_name') ORDER BY c1.ordinal_position; -- Display primary key and unique constraints. SELECT constraint_name , lower(constraint_type) AS constraint_type FROM information_schema.table_constraints WHERE table_name = current_setting('videodb.table_name') AND constraint_type IN ('PRIMARY KEY','UNIQUE'); -- Insert two test records. INSERT INTO new_hire ( first_name, middle_name, last_name, hire_date ) VALUES ('Malcolm','Jacob','Lewis','2018-2-14') ,('Henry',null,'Chabot','1990-07-31'); |
You can put it into a local directory, connect as the student user to a videodb database, and run the following command (or any database you’ve created).
\i new_hire.sql |
The new_hire.py file creates the new_hire table and inserts two rows:
# Import the PostgreSQL connector library. import psycopg2 try: # Open a connection to the database. connection = psycopg2.connect( user="student" , password="student" , port="5432" , dbname="videodb") # Open a cursor. cursor = connection.cursor() # Assign a static query. query = "SELECT new_hire_id, first_name, last_name " \ "FROM new_hire" # Parse and execute the query. cursor.execute(query) # Fetch all rows from a table. records = cursor.fetchall() # Read through and print the rows as tuples. for row in range(0, len(records)): print(records[row]) except (Exception, psycopg2.Error) as error : print("Error while fetching data from PostgreSQL", error) finally: # Close the database connection. if (connection): cursor.close() connection.close() |
You run it from the command line, like:
python3 ./new_hire.py |
It should print:
(1001, 'Malcolm', 'Lewis') (1002, 'Henry', 'Chabot') |
As always, I hope this helps those trying to sort out how to connect Python to PostgreSQL.
PostgreSQL CLI Error
Problems get reported to me all the time on installations for my students, this one was interesting. They got an error complaining about a missing libpq.so.5 library.
psql: /usr/pgsql-11/lib/libpq.so.5: no version information available (required by psql) psql: /usr/pgsql-11/lib/libpq.so.5: no version information available (required by psql) could not change directory to "/root": Permission denied psql (11.7, server 11.8) Type "help" for help. postgres=# |
It appeared as a mismatch of libraries but it’s not that. For reference, this was a Fedora instance. I ran the rpm utility:
rpm -qa | grep postgres |
It returned:
postgresql11-libs-11.8-1PGDG.f30.x86_64 postgresql-11.7-2.fc30.x86_64 postgresql-server-11.7-2.fc30.x86_64 |
Then, I had them run the rpm utility again looking for the Python driver for PostgreSQL:
rpm -qa | grep psycopg2 |
It returned:
python3-psycopg2-2.7.7-1.fc30.x86_64 |
Then, it was easy to explain. The Python psycopg2 library uses both PostgreSQL 11.7 and 11.8 dependent libraries and the libpq.so.5 library is missing version information. You must ignore the error, which is really only a warning message, when you want to work on Fedora, PostgreSQL 11, and Python 3.
MySQL WITH Clause
When I went over my example of using the WITH clause to solve how to use a series of literal values in data sets, some students got it right away and some didn’t. The original post showed how to solve a problem where one value in the data set is returned in the SELECT-list and two values are used as the minimum and maximum values with a BETWEEN operator. It used three approaches with literal values:
- A list of Python dictionaries that require you to filter the return set from the database through a range loop and if statement that mimics a SQL BETWEEN operator.
- A WITH clause that accepts the literals as bind variables to filter the query results inside the query.
- A table design that holds the literals values that an analyst might use for reporting.
It was the last example that required elaboration. I explained you might build a web form that uses a table, and the table could allow a data analyst to enter parameter sets. That way the analyst could submit a flag value to use one or another set of values. I threw out the idea on the whiteboard of introducing a report column to the prior post’s level table. The student went off to try it.
Two problems occurred. The first was in the design of the new table and the second was how to properly use the MySQL Python driver.
Below is a formal table design that supports this extension of the first blog post as a list of parameter values. It uses a report column as a super key to return a set of possible values. One value will show in the SELECT-list and the other two values deploy as the minimum and maximum values in a BETWEEN operator. It is seeded with two sets of values. One of the report possibilities is Summary level with three possibilities and the other is the Detail level with five possibilities.
-- Conditionally drop the levels table. DROP TABLE IF EXISTS levels; -- Create the levels list. CREATE TABLE levels ( level VARCHAR(16) , report ENUM('Summary','Detail') , min_roles INT , max_roles INT ); -- Insert values into the list table. INSERT INTO levels ( level, report, min_roles, max_roles ) VALUES ('Hollywood Star','Summary', 30, 99999) ,('Prolific Actor','Summary', 20, 29) ,('Newcommer','Summary', 1, 19) ,('Hollywood Star','Detail', 30, 99999) ,('Prolific Actor','Detail', 20, 29) ,('Regular Actor','Detail', 10, 19) ,('Actor','Detail', 5, 9) ,('Newcommer','Detail', 1, 4); |
The foregoing table design uses an ENUM type because reporting parameter sets are typically fewer than 64 possibilities. If you use the table to support multiple reports, you should add a second super key column like report_type. The report_type column key would let you use the table to support a series of different report parameter lists.
While the student used a %s inside the query, they created a runtime error when trying to pass the single bind variable into the query. The student misunderstood how to convert a report column input parameter variable into a tuple, which shows up when the student calls the Python MySQL Driver, like this:
59 | cursor.execute(query, (report)) |
The student’s code generated the following error stack:
Traceback (most recent call last): File "./python-with-clause.py", line 59, in <module> cursor.execute(query,(report)) File "/usr/lib/python3.7/site-packages/mysql/connector/cursor_cext.py", line 248, in execute prepared = self._cnx.prepare_for_mysql(params) File "/usr/lib/python3.7/site-packages/mysql/connector/connection_cext.py", line 632, in prepare_for_mysql raise ValueError("Could not process parameters") ValueError: Could not process parameters |
The ValueError should indicate to the developer that they’ve used a wrong data type in the call to the method:
cursor.execute(<class 'str'>,<class 'tuple'>) |
This clearly was a misunderstanding of how to cast a single string to a tuple. A quick explanation of how Python casts a single string into a tuple can best be illustrated inside an interactive Python shell, like:
>>> # Define a variable. >>> x = 'Detail' >>> # An incorrect attempt to make a string a tuple. >>> y = (x) >>> # Check type of y after assignment. >>> print(type(y)) <class 'str'> >>> # A correct attempt to make a string a tuple. >>> y = tuple(x) >>> # Check type of y after assignment. >>> print(type(y)) <class 'tuple'> >>> # An alternative to make a string a tuple. >>> z = (x,) >>> # Check type of z after assignment. >>> print(type(z)) <class 'tuple'> |
So, the fix was quite simple to line 59:
59 | cursor.execute(query, (report,)) |
The student started with a copy of a Python program that I provided. I fixed the argument handling and added some comments. The line 59 reference above maps to this code example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 | # Import the library. import sys import mysql.connector from mysql.connector import errorcode # Capture argument list. fullCmdArguments = sys.argv # Assign argument list to variable. argumentList = fullCmdArguments[1:] # Define a standard report variable. report = "Summary" # Check and process argument list. # ============================================================ # If there are less than two arguments provide default values. # Else enumerate and convert strings to dates. # ============================================================ if (len(argumentList) == 1): # Set a default start date. if (isinstance(report,str)): report = argumentList[0] # Attempt the query. # ============================================================ # Use a try-catch block to manage the connection. # ============================================================ try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='sakila') # Create cursor. cursor = cnx.cursor() # Set the query statement. query = ("WITH actors AS " "(SELECT a.actor_id " " , a.first_name " " , a.last_name " " , COUNT(fa.actor_id) AS num_roles " " FROM actor a INNER JOIN film_actor fa " " ON a.actor_id = fa.actor_id " " GROUP BY a.actor_id " " , a.first_name " " , a.last_name ) " " SELECT a.first_name " " , a.last_name " " , l.level " " , a.num_roles " " FROM actors a CROSS JOIN levels l " " WHERE a.num_roles BETWEEN l.min_roles AND l.max_roles " " AND l.report = %s " " ORDER BY a.last_name " " , a.first_name") # Execute cursor. cursor.execute(query,(report,)) # Display the rows returned by the query. for (first_name, last_name, level, num_roles) in cursor: print('{0} {1} is a {2} with {3} films.'.format( first_name.title() , last_name.title() , level.title() , num_roles)) # Close cursor. cursor.close() # ------------------------------------------------------------ # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print("Error code:", e.errno) # error number print("SQLSTATE value:", e.sqlstate) # SQLSTATE value print("Error message:", e.msg) # error message # Close the connection when the try block completes. else: cnx.close() |
A Linux shell program like the following (provided the name of the shell script and Python program are the same) can run the Python program with or without a parameter. It works without a parameter because it sets a default value for the report variable.
# Switch the file extension and run the python program. file=${0/%sh/py} python3 ${file} "${@}" |
You call the shell script like this:
./python-with-clause.sh Detail |
As always, I hope this helps those looking for a solution.
MySQL with CTEs
As an example for my class on the usefulness of Common Table Expressions (CTEs), I created three examples with Python. They extend an exercise in Chapter 9 on subqueries from Learning SQL by Alan Beaulieu. All of the examples work with the sakila sample database.
These bullets describe the examples:
- Uses local variables and a range for loop and if statement that uses the variables to evaluate and add an element to the derived table (or query result set) from MySQL.
- Uses a CTE with substitution variables from the Python program, which eliminates the need to evaluate and add an element to the query result set because the query does that.
- Uses a table to hold the variables necessary to evaluate and add the element to the query result set.
This is the first Python program:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | # Import the library. import sys import mysql.connector from mysql.connector import errorcode # Declare a list of tuples. dict = [{'level':'Hollywood Star','min_roles':30,'max_roles':99999} ,{'level':'Prolific Actor','min_roles':20,'max_roles':29} ,{'level':'Newcomer','min_roles':1,'max_roles':19}] # Attempt the query. # ============================================================ # Use a try-catch block to manage the connection. # ============================================================ try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='sakila') # Create cursor. cursor = cnx.cursor() # Set the query statement. query = ("SELECT a.actor_id " ", a.first_name " ", a.last_name " ", COUNT(fa.actor_id) AS films " "FROM actor a INNER JOIN film_actor fa " "ON a.actor_id = fa.actor_id " "GROUP BY a.actor_id " ", a.first_name " ", a.last_name " "ORDER BY a.last_name " ", a.first_name") # Execute cursor. cursor.execute(query) # Display the rows returned by the query. for (actor_id, first_name, last_name, films) in cursor: for i in range(len(dict)): if films >= dict[i]["min_roles"] and films <= dict[i]["max_roles"]: print('{0} {1} is a {2} with {3} films.'.format( first_name.title() , last_name.title() , dict[i]["level"] , films)) # Close cursor. cursor.close() # ------------------------------------------------------------ # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print("Error code:", e.errno) # error number print("SQLSTATE value:", e.sqlstate) # SQLSTATE value print("Error message:", e.msg) # error message # Close the connection when the try block completes. else: cnx.close() |
The Python dictionary on lines 7 thru 9 and range for loop and if statement on lines 41 and 42 can be eliminated by putting the literal values in a Common Table Expression (CTE). That’s because a CROSS JOIN matches all rows in the CTE against the base table before filtering them.
The match of all rows in the CTE against the base table effectively replaces the range for loop in the original code. The WHERE clause replaces the if statement in the original code.
Another optimization for readability of the final query puts the grouped query into a CTE as well. That way the final query simply demonstrates the filtering process.
This is the second Python program, and it converts the Python dictionary to a list of lists and assigns the lists to param tuple:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 | # Import the library. import sys import mysql.connector from mysql.connector import errorcode # Declare a list of lists. list = [['Hollywood Star',30,99999] ,['Prolific Actor',20,29] ,['Newcomer',1,19]] # Declare a tuple of the set of lists. param = (list[0] + list[1] + list[2]) # Attempt the query. # ============================================================ # Use a try-catch block to manage the connection. # ============================================================ try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='sakila') # Create cursor. cursor = cnx.cursor() # Set the query statement. query = ("WITH actors AS " "(SELECT a.actor_id " " , a.first_name " " , a.last_name " " , COUNT(fa.actor_id) AS num_roles " " FROM actor a INNER JOIN film_actor fa " " ON a.actor_id = fa.actor_id " " GROUP BY a.actor_id " " , a.first_name " " , a.last_name ) " " , levels AS " "(SELECT %s AS level " " , %s AS min_roles " " , %s AS max_roles " " UNION ALL " " SELECT %s AS level " " , %s AS min_roles " " , %s AS max_roles " " UNION ALL " " SELECT %s AS level " " , %s AS min_roles " " , %s AS max_roles) " " SELECT a.first_name " " , a.last_name " " , l.level " " , a.num_roles " " FROM actors a CROSS JOIN levels l " " WHERE a.num_roles BETWEEN l.min_roles AND l.max_roles " " ORDER BY a.last_name " " , a.first_name") # Execute cursor. cursor.execute(query, param) # Display the rows returned by the query. for (first_name, last_name, level, num_roles) in cursor: print('{0} {1} is a {2} with {3} films.'.format( first_name.title() , last_name.title() , level.title() , num_roles)) # Close cursor. cursor.close() # ------------------------------------------------------------ # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print("Error code:", e.errno) # error number print("SQLSTATE value:", e.sqlstate) # SQLSTATE value print("Error message:", e.msg) # error message # Close the connection when the try block completes. else: cnx.close() |
This is the third Python program requires some SQL setup. You should run this script inside the sakila database first. It basically takes the variables out of the code and stores them in a table. This is more likely what you would do to ensure maintainability of ever changing range values like these if you built a solution like this in a real application. It leaves the aggregation process inside a CTE and simplifies the final query.
-- Conditionally drop the levels table. DROP TABLE IF EXISTS levels; -- Create the levels list. CREATE TABLE levels ( level VARCHAR(16) , min_roles INT , max_roles INT ); -- Insert values into the list table. INSERT INTO levels ( level, min_roles, max_roles ) VALUES ('Hollywood Star', 30, 99999) ,('Prolific Actor', 20, 29) ,('Newcommer',1,19); |
After seeding the data in the levels table, you can test the query natively in MySQL, like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | -- Query the data. WITH actors AS (SELECT a.actor_id , a.first_name , a.last_name , COUNT(*) AS num_roles FROM actor a INNER JOIN film_actor fa ON a.actor_id = fa.actor_id GROUP BY actor_id) SELECT a.first_name , a.last_name , l.level , a.num_roles FROM actors a CROSS JOIN levels l WHERE a.num_roles BETWEEN l.min_roles AND l.max_roles ORDER BY a.last_name , a.first_name; |
There’s also a syntax that makes this type of query appear to be an INNER JOIN when it’s actually a filtered CROSS JOIN. If you adopt that syntax, you would rewrite lines 14 and 15:
14 15 | FROM actors a INNER JOIN levels l WHERE a.num_roles BETWEEN l.min_roles AND l.max_roles; |
Then, you can run this version without the second CTE element:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | # Import the library. import sys import mysql.connector from mysql.connector import errorcode # Attempt the query. # ============================================================ # Use a try-catch block to manage the connection. # ============================================================ try: # Open connection. cnx = mysql.connector.connect(user='student', password='student', host='127.0.0.1', database='sakila') # Create cursor. cursor = cnx.cursor() # Set the query statement. query = ("WITH actors AS " "(SELECT a.actor_id " " , a.first_name " " , a.last_name " " , COUNT(fa.actor_id) AS num_roles " " FROM actor a INNER JOIN film_actor fa " " ON a.actor_id = fa.actor_id " " GROUP BY a.actor_id " " , a.first_name " " , a.last_name ) " " SELECT a.first_name " " , a.last_name " " , l.level " " , a.num_roles " " FROM actors a CROSS JOIN levels l " " WHERE a.num_roles BETWEEN l.min_roles AND l.max_roles " " ORDER BY a.last_name " " , a.first_name") # Execute cursor. cursor.execute(query) # Display the rows returned by the query. for (first_name, last_name, level, num_roles) in cursor: print('{0} {1} is a {2} with {3} films.'.format( first_name.title() , last_name.title() , level.title() , num_roles)) # Close cursor. cursor.close() # ------------------------------------------------------------ # Handle exception and close connection. except mysql.connector.Error as e: if e.errno == errorcode.ER_ACCESS_DENIED_ERROR: print("Something is wrong with your user name or password") elif e.errno == errorcode.ER_BAD_DB_ERROR: print("Database does not exist") else: print("Error code:", e.errno) # error number print("SQLSTATE value:", e.sqlstate) # SQLSTATE value print("Error message:", e.msg) # error message # Close the connection when the try block completes. else: cnx.close() |
As always, I hope this helps those trying to understand how CTEs can solve problems that would otherwise be coded in external imperative languages like Python.
Installing PL/Python Extension
While PL/Python is an untrusted language inside PostgreSQL, I was installing it to test some of its features. First, we check to see if PL/Python is installed by attempting to create a PL/Python function:
CREATE FUNCTION pima(a integer, b integer) RETURNS integer AS $$ if a > b: return a return b $$ LANGUAGE plpython3u; |
It likely should raise an error like this because PL/Python is an untrusted language. It’s untrusted because it runs with root
privileges rather than a restricted user’s privilege. More or less, with PL/Python you can access the entire database. This makes PL/Python more of a threat than tool beyond experimentation in a test database.
ERROR: language "plpython3u" does not exist HINT: Use CREATE EXTENSION to load the language into the database. |
A quick query as the postgres user tells you whether or not PL/Python is installed:
SELECT * FROM pg_language; |
It returns the following:
lanname | lanowner | lanispl | lanpltrusted | lanplcallfoid | laninline | lanvalidator | lanacl ----------+----------+---------+--------------+---------------+-----------+--------------+-------- internal | 10 | f | f | 0 | 0 | 2246 | c | 10 | f | f | 0 | 0 | 2247 | sql | 10 | f | t | 0 | 0 | 2248 | plpgsql | 10 | t | t | 14088 | 14089 | 14090 | (4 rows) |
I attempted to add PL/Python with the following command:
CREATE EXTENSION plpython3u; |
It raised the following error:
ERROR: could not open extension control file "/usr/pgsql-11/share/extension/plpython3u.control": No such file or directory |
The error basically appeared to occur because there’s a missing Python 3 package from what I could find on the web. I installed the missing postgresql-plpython3
package as the root
superuser, which you also could install as a member of the sudoer list. The following shows how to install it as a sudoer member:
sudo yum install -y postgresql-plpython3 |
You should see something close to the following console output:
Last metadata expiration check: 0:43:53 ago on Fri 06 Nov 2020 10:42:28 AM MST. Dependencies resolved. ============================================================================================================ Package Architecture Version Repository Size ============================================================================================================ Installing: postgresql-plpython3 x86_64 11.7-2.fc30 updates 86 k Installing dependencies: postgresql-server x86_64 11.7-2.fc30 updates 5.3 M Transaction Summary ============================================================================================================ Install 2 Packages Total download size: 5.3 M Installed size: 23 M Downloading Packages: (1/2): postgresql-plpython3-11.7-2.fc30.x86_64.rpm 218 kB/s | 86 kB 00:00 (2/2): postgresql-server-11.7-2.fc30.x86_64.rpm 3.0 MB/s | 5.3 MB 00:01 ------------------------------------------------------------------------------------------------------------ Total 2.1 MB/s | 5.3 MB 00:02 Running transaction check Transaction check succeeded. Running transaction test Transaction test succeeded. Running transaction Preparing : 1/1 Running scriptlet: postgresql-server-11.7-2.fc30.x86_64 1/2 Installing : postgresql-server-11.7-2.fc30.x86_64 1/2 warning: /var/lib/pgsql/.bash_profile created as /var/lib/pgsql/.bash_profile.rpmnew Running scriptlet: postgresql-server-11.7-2.fc30.x86_64 1/2 Installing : postgresql-plpython3-11.7-2.fc30.x86_64 2/2 Running scriptlet: postgresql-plpython3-11.7-2.fc30.x86_64 2/2 Verifying : postgresql-plpython3-11.7-2.fc30.x86_64 1/2 Verifying : postgresql-server-11.7-2.fc30.x86_64 2/2 Installed: postgresql-plpython3-11.7-2.fc30.x86_64 postgresql-server-11.7-2.fc30.x86_64 Complete! |
While I thought this might fix the problem, it didn’t and raised the following error:
ERROR: could not open extension control file "/usr/pgsql-11/share/extension/plpython3u.control": No such file or directory |
I found the plpython3u.control
file on GitHub and put the following plpython3u.control
file, after comparing it against the plpgsql.control
file, into the /usr/pgsql-11/share/extension
directory. At this point, I began wondering why it’s looking in the /usr/pgsql-11/share/extension
directory instead of a /usr/plpython3u/share/extension
directory (does not exist).
# plpython3u extension comment = 'PL/Python3U untrusted procedural language' default_version = '1.0' module_pathname = '$libdir/plpython3' relocatable = false schema = pg_catalog superuser = true |
I retried creating the plpython3u
extension:
CREATE EXTENSION plpython3u; |
It raised the following error:
ERROR: extension "plpython3u" has no installation script nor update path for version "1.0" |
At this point, I could find no further help on the Internet. I did notice that there were these two *.sql
files in the /usr/pgsql-11/share/extension
directory:
- plpgsql–1.0.sql
- plpgsql–unpackaged–1.0.sql
I found this plpython3u--1.0.sql
file on GitHub:
/* src/pl/plpython/plpython3u--1.0.sql */ CREATE FUNCTION plpython3_call_handler() RETURNS language_handler LANGUAGE c AS 'MODULE_PATHNAME'; CREATE FUNCTION plpython3_inline_handler(internal) RETURNS void STRICT LANGUAGE c AS 'MODULE_PATHNAME'; CREATE FUNCTION plpython3_validator(oid) RETURNS void STRICT LANGUAGE c AS 'MODULE_PATHNAME'; CREATE LANGUAGE plpython3u HANDLER plpython3_call_handler INLINE plpython3_inline_handler VALIDATOR plpython3_validator; COMMENT ON LANGUAGE plpython3u IS 'PL/Python3U untrusted procedural language'; |
I retried creating the plpython3u
extension, as a member of the sudoer list and got a new error:
ERROR: permission denied to create extension "plpython3u" HINT: Must be superuser to create this extension. |
I retried creating the plpython3u
extension as the postgres
user, who is the owning user:
CREATE EXTENSION plpython3u; |
It raised the following error:
ERROR: could not access file "$libdir/plpython3": No such file or directory |
While I intend to finish this, that’s as far as I got. It appears from some of the things I’ve read I need to recompile or configure items that would destabilize what I have working at the moment. Finishing this will need to wait for me to build another test environment from scratch. If you catch this post and know the remaining steps, I invite you to add them in the comments.
Quick Python Flask
A quick example of installing and running a sample hello.py
file with the Flask application with the Command Line Interface (CLI) documentation on the Flask website. The first thing you need to do is install the flask module with the pip3
utility on Fedora Linux (or most other platforms):
pip3 install flask --user student |
You should see a successful log like this for student
user:
Requirement already satisfied: flask in /usr/local/lib64/python3.7/site-packages (1.1.2) Collecting student Downloading https://files.pythonhosted.org/packages/b5/af/be416c18e4fe63a582e06fb0d47bf059bd0f4f413e5a6cfe893747ebaf79/Student-0.0.1-py3-none-any.whl Requirement already satisfied: click>=5.1 in /usr/lib/python3.7/site-packages (from flask) (7.1.1) Requirement already satisfied: itsdangerous>=0.24 in /usr/local/lib/python3.7/site-packages (from flask) (1.1.0) Requirement already satisfied: Werkzeug>=0.15 in /usr/local/lib/python3.7/site-packages (from flask) (1.0.1) Requirement already satisfied: Jinja2>=2.10.1 in /usr/lib/python3.7/site-packages (from flask) (2.10.1) Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib64/python3.7/site-packages (from Jinja2>=2.10.1->flask) (1.1.1) Installing collected packages: student Successfully installed student-0.0.1 |
The smallest footprint hello.py
program is:
# Import statement. from flask import Flask # Application defintion. app = Flask(__name__) # Basic URI rotuing. @app.route("/") # Define a hello function. def hello(): return "Hello World!" # Run the program. if __name__ == "__main__": app.run() |
You can run the program from the command line with two commands:
export FLASK_APP=hello.py flask run |
It will show you the following in the terminal session:
* Serving Flask app "hello.py" * Environment: production WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Debug mode: off * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) |
Type the following URL in a local browser:
http://localhost:5000/ |
Flask will add the following to the console log:
127.0.0.1 - - [26/Oct/2020 00:37:49] "GET / HTTP/1.1" 200 - |
You can jazz the hello.py
program up with a an argument list, like this modified example based on this comparison article on Python frameworks:
# Import libraries. from flask import Flask, escape, request # Define the application. app = Flask(__name__) # Define a base URI route and function. @app.route('/') def index(): return "Index Page" # Define an application URI route and function. @app.route("/hello") def hello(): name = request.args.get("name","Simon") return f'Hello {escape(name)}!' # Define an about URI route and function. @app.route("/about") def about(): return "About Page" # Run the file. if __name__ == "__main__": app.run() |
It prints 'Hello Simon!'
in a web page. If you try to stop your Flask server with a Ctrl+Z
instead of a Ctrl+C
, the next time you go to start it you will receive an error message like:
OSError: [Errno 98] Address already in use |
You’ll need to find the Linux process ID and kill the process with prejudice. You can find the process with the following Linux command:
sudo netstat -nlp | grep 5000 [sudo] password for student: |
It prompts you for your sudoer password, and then returns a line like:
tcp 0 0 127.0.0.1:5000 0.0.0.0:* LISTEN 76802/python3 |
You kill the process with the following Linux command:
kill -9 76802 |
As always, I hope this helps those looking for the missing pieces.