MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Data Engineer?

without comments

Students often ask me about data engineering. I try to explain some of the aspects, and how the tasks can be organized but I never laid out all the titles. I really like this illustration (click on image for larger size) from the Gartner Group because it does that. You can download the full “What Are the Essential Roles for Data and Analytics” paper here).

An excerpt from Gartner’s paper:

Data Engineer

Data engineering is the practice of making the appropriate data available to various data consumers (including data scientists, data and business analysts, citizen integrators, and line-of-business users). It is a discipline that involves collaboration across business and IT units. This key discipline requires skilled data engineers to support both IT and business teams.

Data engineers are primarily responsible for building, managing and operationalizing data pipelines in support of key D&A use cases. They are also primarily responsible for leading the tedious (and often complex) task of:

  • Curating datasets and data pipelines created by nontechnical users (e.g., through self-service data preparation tools), data scientists or even IT resources.
  • Operationalizing data delivery for production-level deployments.

I hope the summary is helpful and Gartner’s paper interesting.

Written by maclochlainn

February 7th, 2023 at 1:07 pm

Updating SQL_MODE

without comments

This is an update for MySQL 8 Stored PSM to add the ONLY_FULL_GROUP_BY mode to the global SQL_MODE variable when it’s not set during a session. Here’s the code:

/* Drop procedure conditionally on whether it exists already. */
DROP PROCEDURE IF EXISTS set_full_group_by;
 
/* Reset delimter to allow semicolons to terminate statements. */
DELIMITER $$
 
/* Create a procedure to verify and set connection parameter. */
CREATE PROCEDURE set_full_group_by()
  LANGUAGE SQL
  NOT DETERMINISTIC
  SQL SECURITY DEFINER
  COMMENT 'Set connection parameter when not set.'
BEGIN
 
  /* Check whether full group by is set in the connection and
     if unset, set it in the scope of the connection. */
  IF EXISTS
    (SELECT TRUE
     WHERE NOT REGEXP_LIKE(@@SESSION.SQL_MODE,'ONLY_FULL_GROUP_BY'))
  THEN
    SET @@GLOBAL.SQL_MODE := CONCAT(@@SESSION.sql_mode,',ONLY_FULL_GROUP_BY');
  END IF;
END;
$$
 
/* Reset the default delimiter. */
DELIMITER ;

You can call the set_full_group_by procedure with the CALL command:

CALL set_full_group_by();

You can see the SQL_MODE variable with the following query:

SELECT @@GLOBAL.SQL_MODE;

It’ll return:

+---------------------------------------------------------------+
| @@GLOBAL.SQL_MODE                                             |
+---------------------------------------------------------------+
| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
+---------------------------------------------------------------+
1 row in set (0.00 sec)

As always, I hope this helps those looking to solve this type of problem.

Written by maclochlainn

January 13th, 2023 at 12:06 am

GROUP BY Quirk

without comments

It’s always interesting to see how others teach SQL courses. It can be revealing as to whether they understand SQL or only understand a dialect of SQL. In this case, one of my old students was taking a graduate course in SQL and the teacher was using MySQL. The teacher made an issue of using ANSI SQL:1999 or SQL3 and asked the following question, which I suspect is a quiz bank question from a textbook:

“How would you get all students’ names and for each student the number of courses that the
student has registered for?”

They referenced the MySQL 5.7 documentation for the GROUP BY and SQL:1999 as if MySQL implemented the ANSI SQL:1999 specification defined the standard. I didn’t know whether to laugh or cry because they were referring to MySQL 5.7 when we’re all using MySQL 8 and anybody who’s worked in more than MySQL knows that the behavior for a GROUP BY in MySQL can work without listing the necessary non-aggregated columns in the SELECT-list.

For example, their working solution, which is from the instructor and the author of their MySQL textbook the correct perspective of ANSI:1999 behavior. It doesn’t matter that their solution is actually based on ANSI:1992 not ANSI:1999 because it will only succeed because of a quirk of MySQL:

SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid
GROUP BY a.studentid;

While it works in MySQL, it doesn’t work because it conforms to an ANSI standard. It works in MySQL, notwithstanding that standard because it violates the standard.

In Oracle, PostgreSQL, and SQL Server, it raises an exception. For example, Oracle raises the following exception:

SELECT   a.studentname
         *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

The correct way to write the GROUP BY is:

SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid INNER JOIN courses c
ON       b.courseid = c.courseid
GROUP BY a.studentname;

Then, it would return:

Student Name                   Course IDs
------------------------------ ----------
Montgomery Scott                        1
Leonard McCoy                           2
James Tiberus Kirk                      3

For reference, here’s a complete test case for MySQL:

/* Drop table conditionally. */
DROP TABLE IF EXISTS students;
 
/* Create table. */
CREATE TABLE students
( studentID    int unsigned primary key auto_increment
, studentName  varchar(30));
 
/* Drop table conditionally. */
DROP TABLE IF EXISTS courses;
 
/* Create table. */
CREATE TABLE courses
( courseid    int unsigned primary key auto_increment
, coursename  varchar(40));
 
/* Drop table conditionally. */
DROP TABLE IF EXISTS registeredcourses;
 
/* Create table. */
CREATE TABLE registeredcourses
( courseid    int unsigned
, studentid   int unsigned );
 
/* Insert into students. */
INSERT INTO students
( studentName )
VALUES
 ('James Tiberus Kirk')
,('Leonard McCoy')
,('Montgomery Scott');
 
/* Insert into courses. */
INSERT INTO courses
( coursename )
VALUES
 ('English Literature')
,('Physics')
,('English Composition')
,('Botany')
,('Mechanical Engineering');
 
/* Insert into registeredcourses. */
INSERT INTO registeredcourses
( studentid
, courseid )
VALUES
 (1,1)
,(1,3)
,(1,4)
,(2,2)
,(2,5)
,(3,4); 
 
/* Check global sql_mode to ensure only_full_group_by is set. */
SELECT @@GLOBAL.SQL_MODE;
 
/* Query with a column not found in the SELECT-list. */
SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid
GROUP BY a.studentid;
 
/* Query consistent with ANSI SQL:1992 */
SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid INNER JOIN courses c
ON       b.courseid = c.courseid
GROUP BY a.studentname;

and, another complete test case for Oracle:

/* Drop tabhe unconditionallly. */
DROP TABLE students;
 
/* Create table. */
CREATE TABLE students
( studentID    NUMBER PRIMARY KEY
, studentName  VARCHAR(30));
 
/* Drop table unconditionally. */
DROP TABLE courses;
 
/* Create table. */
CREATE TABLE courses
( courseid    NUMBER PRIMARY KEY
, coursename  VARCHAR(40));
 
/* Drop table unconditionally. */
DROP TABLE registeredcourses;
 
/* Create table. */
CREATE TABLE registeredcourses
( courseid    NUMBER
, studentid   NUMBER );
 
/* Insert values in student. */
INSERT INTO students ( studentid, studentName ) VALUES (1,'James Tiberus Kirk');
INSERT INTO students ( studentid, studentName ) VALUES (2,'Leonard McCoy');
INSERT INTO students ( studentid, studentName ) VALUES (3,'Montgomery Scott');
 
/* Insert values in courses. */
INSERT INTO courses ( courseid, coursename ) VALUES (1,'English Literature');
INSERT INTO courses ( courseid, coursename ) VALUES (2,'Physics');
INSERT INTO courses ( courseid, coursename ) VALUES (3,'English Composition');
INSERT INTO courses ( courseid, coursename ) VALUES (4,'Botany');
INSERT INTO courses ( courseid, coursename ) VALUES (5,'Mechanical Engineering');
 
/* Insert values into registeredcourses. */
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (1,1);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (1,3);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (1,4);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (2,2);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (2,5);
INSERT INTO registeredcourses ( studentid, courseid ) VALUES (3,4); 
 
/* Non-ANSI SQL GROUP BY statement. */
SELECT   a.studentname
,        COUNT(b.courseid)
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid
GROUP BY a.studentid;
 
/* ANSI SQL GROUP BY statement. */
SELECT   a.studentname AS "Student Name"
,        COUNT(b.courseid) AS "Course IDs"
FROM     students a INNER JOIN registeredcourses b
ON       a.studentid = b.studentid INNER JOIN courses c
ON       b.courseid = c.courseid
GROUP BY a.studentname;

I hope this helps those learning the correct way to write SQL.

Written by maclochlainn

January 12th, 2023 at 11:30 pm

TDE on PostgreSQL

without comments

The scope of Transparent Data Encryption (TDE) in PostgreSQL only applies to columns. It does not encrypt other aspects of the database, like table-level and database-level encryption; and those who deploy PostgreSQL may need to implement additional security measures to protect these database components.

You need to know two key elements before exploring TDE in PostgreSQL: Scheme inside a database and extensions. Unlike many databases, PostgreSQL schemas are not synonymous with a database. You may have multiple scheme (or, alternatively schemas) inside any PostgreSQL database.

Creating an extension is a one time event. Therefore, it’s easier to show you that first. You create a pgcrypto extension with the following command:

CREATE EXTENSION pgcrypto;

The public schema is the one most users deploy but for the purpose of hiding our AES encryption key this example creates a hidden schema. Unless you change the default find setting the hidden schema is not visible when connecting to the database.

You create the hidden schema with the following idimpotent (re-runnable) set of commands:

/* Drop dependent objects before dropping the schema. */
DROP TABLE IF EXISTS hidden.aes_key;
DROP FUNCTION IF EXISTS hidden.get_aes_key;
 
/*
 *  Drop function with cascade to remove the
 *  film_character_t trigger at same time.
 */
DROP FUNCTION IF EXISTS hidden.film_character_dml_f CASCADE;
 
/* Drop the schema conditionally. */
DROP SCHEMA IF EXISTS hidden;
 
/* Create the schema. */
CREATE SCHEMA hidden;

Next, we need to create a aes_key table and get_aes_key function in the hidden schema. The table will store the AES encryption key and the function lets us create an AES encryption key.

/* Create an aes encryption key table. */
CREATE TABLE hidden.aes_key
( aes_key  text );
 
/* Create a hidden function to build an AES encryption key. */
CREATE OR REPLACE
  FUNCTION hidden.get_aes_key() RETURNS text AS
  $$
  BEGIN
    RETURN gen_random_bytes(16)::text;
  END;
  $$
  LANGUAGE plpgsql;

After creating the public get_key() function, you insert a single row to the aes_key table by prefacing it with the hidden schema name, like this:

/* Insert the AES encryption key into a table. */ 
INSERT INTO hidden.aes_key
( aes_key )
VALUES
( hidden.get_aes_key());

Having built the plumbing for our AES encryption key, let’s show you how to encrypt and decrypt string values. This example lets you create an idimpotent film_character table in the public schema, like:

/* Drop the table conditionally. */  
DROP TABLE IF EXISTS film_character;
 
/* Create the demonstration table for encrypting and decrypting strings. */ 
CREATE TABLE film_character
( character_id    serial PRIMARY KEY
, plain_text      text
, encrypted_text  bytea );

After creating the AES encryption key table, function, and inserting a row of data, you need to create a public get_key() function, like:

/* Create a public function to retrieve the AES encryption key. */
CREATE OR REPLACE
  FUNCTION get_key() RETURNS text AS
  $$
  DECLARE
    retval  text;
  BEGIN
    SELECT aes_key INTO retval FROM hidden.aes_key;
    RETURN retval;
  END;
  $$
  LANGUAGE plpgsql;

The following INSERT statement write a plain text column and encrypted text column into the film_character table. The get_key() function hides how the pgp_sym_encrypt function encrypts the string.

/* Insert plain and encrypted text into a table. */
INSERT INTO film_character
( plain_text
, encrypted_text )
VALUES
('Severus Snape'
, pgp_sym_encrypt('Slytherin',get_key()));

The following query displays the plain and encrypted text stored in a row of the film_character table.

/* Query plain and encrypted text from a table. */
SELECT character_id
,      plain_text
,      encrypted_text
FROM   film_character;

It displays:

 character_id |  plain_text   | encrypted_text
--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------
            1 | Severus Snape | \xc30d04070302fa1c4eebd90204cc7bd23901f1d4fa91b2455c3ef2987a305aebe01a4d94f9ebb467d6cb7a3846342ccd09cb55ac5e82a71cbaef93728fbeb4aaa9bf71b6fb93457758d1
(1 row)

Last, the following query displays the plain and decrypted text with the pgp_sym_decrypt function in a query:

/* Query the plain and decrypted text from a table. */
SELECT character_id
,      plain_text
,      pgp_sym_decrypt(encrypted_text,get_key()) AS encrypted_text
FROM   film_character;

The query returns the plain and decrypted values:

 character_id |  plain_text   | encrypted_text
--------------+---------------+-----------------
            1 | Severus Snape | Slytherin
(1 row)

However, this approach exposes the method for encrypting the encrypted_text column’s string value. You can hide this by creating a film_character_dml_f function in the hidden schema and a film_character_t trigger in the public schema, like:

/* Create trigger function for insert or update. */
CREATE FUNCTION hidden.film_character_dml_f()
  RETURNS trigger AS
$$
DECLARE
  /* Declare local variable. */
  unencrypted_input  VARCHAR(30);
BEGIN
  unencrypted_input := new.encrypted_text::text;
  /* Encrypt the column. */
  new.encrypted_text := pgp_sym_encrypt(unencrypted_input,get_key());
 
  /* Return new record type. */
  RETURN NEW;
END;
$$
LANGUAGE plpgsql;
 
CREATE TRIGGER film_character_t
  BEFORE INSERT OR UPDATE ON film_character
  FOR EACH ROW EXECUTE FUNCTION hidden.film_character_dml_f();

Now, you can insert the plain text data in an INSERT statement and the encryption occurs without disclosing how it happens. Here’s a sample statement:

INSERT INTO film_character
( plain_text
, encrypted_text )
VALUES
('Harry Potter'
,'Gryffindor');

A query of the table shows you that both rows have an encrypted value in the encrypted_text column.

/* Query plain and encrypted text from a table. */
SELECT character_id
,      plain_text
,      encrypted_text
FROM   film_character;

Displayed like:

 character_id |  plain_text   |                                                                                   encrypted_text
--------------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            1 | Severus Snape | \xc30d040703026716034f140d83e76cd23a01f99168afebe50d760b85c69373e3947c74473115a939843887db8e102cd0b2524378f4d684e0ba91c20afc436a056cd983fc47794eef7d4904
            2 | Harry Potter  | \xc30d040703020d8cc71d1f84e1ef6fd24701fd308f669e28a6135beac130fc51a6ccb5cef3c5005f4f557207fe5c84c4aedbb5b098dc9a882a9b7d801c61e34cd90517b4628b5a18b96b3fc61663b48391146b8c0fa2a858
(2 rows)

As always, I hope this code complete solution helps those trying to work with this technical stack.

Written by maclochlainn

January 8th, 2023 at 10:23 am

DML Event Management

without comments

Data Manipulation Language (DML)

DML statements add data to, change data in, and remove data from tables. This section examines four DML statements—the INSERT, UPDATE, DELETE, and MERGE statements—and builds on concepts of data transactions. The INSERT statement adds new data, the UPDATE statement changes data, the DELETE statement removes data from the database, and the MERGE statement either adds new data or changes existing data.

Any INSERT, UPDATE, MERGE, or DELETE SQL statement that adds, updates, or deletes rows in a table locks rows in a table and hides the information until the change is committed or undone (that is, rolled back). This is the nature of ACID-compliant SQL statements. Locks prevent other sessions from making a change while a current session is working with the data. Locks also restrict other sessions from seeing any changes until they’re made permanent. The database keeps two copies of rows that are undergoing change. One copy of the rows with pending changes is visible to the current session, while the other displays committed changes only.

ACID Compliant Transactions

ACID compliance relies on a two-phase commit (2PC) protocol and ensures that the current session is the only one that can see new inserts, updated column values, and the absence of deleted rows. Other sessions run by the same or different users can’t see the changes until you commit them.

ACID Compliant INSERT Statements

The INSERT statement adds rows to existing tables and uses a 2PC protocol to implement ACID- compliant guarantees. The SQL INSERT statement is a DML statement that adds one or more rows to a table. Oracle supports a VALUES clause when adding a single-row, and support a subquery when adding one to many rows.

The figure below shows a flow chart depicting an INSERT statement. The process of adding one or more rows to a table occurs during the first phase of an INSERT statement. Adding the rows exhibits both atomic and consistent properties. Atomic means all or nothing: it adds one or more rows and succeeds, or it doesn’t add any rows and fails. Consistent means that the addition of rows is guaranteed whether the database engine adds them sequentially or concurrently in threads.

Concurrent behaviors happen when the database parallelizes DML statements. This is similar to the concept of threads as lightweight processes that work under the direction of a single process. The parallel actions of a single SQL statement delegate and manage work sent to separate threads. Oracle supports all ACID properties and implements threaded execution as parallel operations. All tables support parallelization.

After adding the rows to a table, the isolation property prevents any other session from seeing the new rows—that means another session started by the same user or by another user with access to the same table. The atomic, consistent, and isolation properties occur in the first phase of any INSERT statement. The durable property is exclusively part of the second phase of an INSERT statement, and rows become durable when the COMMIT statement ratifies the insertion of the new data.

ACID Compliant UPDATE Statements

An UPDATE statement changes column values in one-to-many rows. With a WHERE clause, you update only rows of interest, but if you forget the WHERE clause, an UPDATE statement would run against all rows in a table. Although you can update any column in a row, it’s generally bad practice to update a primary or foreign key column because you can break referential integrity. You should only update non-key data in tables—that is, the data that doesn’t make a row unique within a table.

Changes to column values are atomic when they work. For scalability reasons, the database implementation of updates to many rows is often concurrent, in threads through parallelization. This process can span multiple process threads and uses a transaction paradigm that coordinates changes across the threads. The entire UPDATE statement fails when any one thread fails.

Similar to the INSERT statement, UPDATE statement changes to column values are also hidden until they are made permanent with the application of the isolation property. The changes are hidden from other sessions, including sessions begun by the same database user.

It’s possible that another session might attempt to lock or change data in a modified but uncommitted row. When this happens, the second DML statement encounters a lock and goes into a wait state until the row becomes available for changes. If you neglected to set a timeout value for the wait state, such as this clause, the FOR UPDATE clause waits until the target rows are unlocked:

WAIT n

As the figure below shows, actual updates are first-phase commit elements. While an UPDATE statement changes data, it changes only the current session values until it is made permanent by a COMMIT statement. Like the INSERT statement, the atomic, consistent, and isolation properties of an UPDATE statement occur during the first phase of a 2PC process. Changes to column values are atomic when they work. Any column changes are hidden from other sessions until the UPDATE statement is made permanent by a COMMIT or ROLLBACK statement, which is an example of the isolation property.

Any changes to column values can be modified by an ON UPDATE trigger before a COMMIT statement. ON UPDATE triggers run inside the first phase of the 2PC process. A COMMIT or ROLLBACK statement ends the transaction scope of the UPDATE statement.

The Oracle database engine can dispatch changes to many threads when an UPDATE statement works against many rows. UPDATE statements are consistent when these changes work in a single thread-of-control or across multiple threads with the same results.

As with the INSERT statement, the atomic, consistent, and isolation properties occur during the first phase of any UPDATE statement, and the COMMIT statement is the sole activity of the second phase. Column value changes become durable only with the execution of a COMMIT statement.

ACID Compliant DELETE Statements

A DELETE statement removes rows from a table. Like an UPDATE statement, the absence of a WHERE clause in a DELETE statement deletes all rows in a table. Deleted rows remain visible outside of the transaction scope where it has been removed. However, any attempts to UPDATE those deleted rows are held in a pending status until they are committed or rolled back.

You delete rows when they’re no longer useful. Deleting rows can be problematic when rows in another table have a dependency on the deleted rows. Consider, for example, a customer table that contains a list of cell phone contacts and an address table that contains the addresses for some but not all of the contacts. If you delete a row from the customer table that still has related rows in the address table, those address table rows are now orphaned and useless.

As a rule, you delete data from the most dependent table to the least dependent table, which is the opposite of the insertion process. Basically, you delete the child record before you delete the parent record. The parent record holds the primary key value, and the child record holds the foreign key value. You drop the foreign key value, which is a copy of the primary key, before you drop the primary key record. For example, you would insert a row in the customer table before you insert a row in the address table, and you delete rows from the address table before you delete rows in the customer table.

The figure below shows the logic behind a DELETE statement. Like the INSERT and UPDATE statements, acid, consistency, and isolation properties of the ACID-compliant transaction are managed during the first phase of a 2PC. The durability property is managed by the COMMIT or ROLLBACK statement.

There’s no discussion or diagrams for the MERGE statement because it does either an INSERT or UPDATE statement based on it’s internal logic. That means a MERGE statement is ACID compliant like an INSERT or UPDATE statement.

Written by maclochlainn

January 1st, 2023 at 8:05 pm

AlmaLinux Flask

with one comment

This post shows how to install and test Flask with Python on AlmaLinux. You install the flask Python libraries with the following commands as the student user. The student user is in the sudoer group.

pip3 install flask_sslify --user student

It produces the following log file:

You can use the following hello.py test program

# Import libraries.
from flask import Flask, escape, request
from markupsafe import escape
 
# Define the application.
app = Flask(__name__)
 
# Define a base URI route and function.
@app.route('/')
def index():
  return "Hello World!"
 
# Define an application URI route and function.
@app.route("/hello")
def hello():
  name = request.args.get("name","Simon")
  return f'Hello {escape(name)}!'
 
# Define an about URI route and function.
@app.route("/about")
def about():
  return "About Page."
 
# Define an <username> variable rule for a route.
@app.route("/user/<string:username>")
def show_user_profile(username):
  return 'User [%s].' % escape(username)
 
# Define an <username> variable rule for a route.
@app.route("/year/<int:year>")
def show_post(year):
  return 'Year [%d].' % year
 
# Run the file.
if __name__ == "__main__":
  app.run()

You can start the Flask server with the following two commands in a separate shell session. This allows you to monitor activities and writes an activity log:

export FLASK_APP=hello.py
flask run

It also writes a compiled version of the hello.py program to the __pycache__ directory. If you make changes to the base file, you must delete the cached version in the __pycache__ directory.

You can test it by typing any of the following URL in a browser:

  1. The index page without a routing label in the URL:

    http://localhost:5000/hello

    It’ll print the following:

    Hello World!
  2. The index page with a hello routing label in the URL:

    http://localhost:5000/hello

    It’ll print the following:

    Hello Simon!
  3. The index page with a about routing label in the URL:

    http://localhost:5000/about

    It’ll print the following:

    About Page.
  4. The index page with a user routing label in the URL:

    http://localhost:5000/user/Somebody

    It’ll print the following:

    User [Somebody].
  5. The index page with a year routing label in the URL:

    http://localhost:5000/year/1986

    It’ll print the following:

    Year [1986].

The activity log shows the following:

 * Serving Flask app 'hello.py'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
127.0.0.1 - - [29/Dec/2022 19:33:12] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [29/Dec/2022 19:33:21] "GET /hello HTTP/1.1" 200 -
127.0.0.1 - - [29/Dec/2022 19:33:27] "GET /about HTTP/1.1" 200 -
127.0.0.1 - - [29/Dec/2022 19:33:48] "GET /year/1986 HTTP/1.1" 200 -
127.0.0.1 - - [29/Dec/2022 19:34:09] "GET /user/Somebody HTTP/1.1" 200 -

If you stop the process with the Ctrl+Z, the process will not stop but not the listener process. As a sudoer user, you can find the open listener process with the following command:

sudo netstat -nlp | grep 5000
[sudo] password for student:

It will return something like this:

tcp        0      0 127.0.0.1:5000          0.0.0.0:*               LISTEN      143363/python3

You kill the process without prejudice by using the following command:

kill -9 143363

As always, I hope this helps those who are looking for step-by-step instructions.

Written by maclochlainn

December 29th, 2022 at 4:35 pm

Posted in flask,Python,Python 3.x

MySQL PNG Files

with one comment

LAMP (Linux, Apache, MySQL, Perl/PHP/Python) Architecture is very flexible. All the components can be positioned on the same server or different servers. The servers are divided into two types. The types are known as the Application or database tiers. Generally, the application tier holds the Apache Server, any Apache Modules, and local copies of Server Side Includes (SSI) programs.

In many development environments, you also deploy the client to the same machine. This means a single machine runs the database server, the application server, and the browser. The lab for this section assumes these configurations.

Before you test an installation, you should make sure that you’ve started the database and Apache server. In an Oracle LAMP configuration (known as an OLAP – Oracle, Linux, Apache, Perl/PHP/Python), you must start both the Oracle Listener and database. MySQL starts the listener when you start the database. You must also start the Apache Server. The Apache Server also starts an Apache Listener, which listens for incoming HTTP/HTTPS requests. It listens on Port 80 unless you override that setting in the httpd.conf file.

The URI reaches the server and is redirected to an Apache Module based on configuration information found in the httpd.conf file. Spawned or child processes of the Apache Module then read programs into memory from the file system and run them. If you’ve uploaded a file the locally stored program can move it from a secure cache location to another local area for processing. The started programs can run independently or include other files as libraries, and they can communicate to the database server.

Working though PHP test cases against the MySQL database for my AlmaLinux installation and configuration, I discovered that the php-gd library weren’t installed by default. I had to add it to get my PHP programs to upload and display PNG files.

The log file for applying the php-gd packages:

The balance of this page demonstrates how to upload, store, and manage Text (Character Large Data Streams) and BLOBs (Binary Large Objects). It provides MySQL equivalent instructions to those for manaing LOBs in an Oracle database. As covered in Chapter 8 in my Oracle Database 11g PL/SQL Programming book.

Before you begin these steps, you should have already installed Zend Server Community Edition. If you haven’t done so, please click here for instructions.

If you find any problems, please let me know. I’ll be happy to fix them.

Written by maclochlainn

December 28th, 2022 at 10:59 pm

AlmaLinux Install & Configuration

without comments

This is a collection of blog posts for installing and configuring AlmaLinux with the Oracle, PostgreSQL, MySQL databases and several programming languages. Sample programs show how to connect PHP and Python to the MySQL database.

I used Oracle Database 11g XE in this instance to keep the footprint as small as possible. It required a few tricks and discovering the missing library that caused folks grief eleven years ago. I build another with a current Oracle Database XE after the new year.

If you see something that I missed or you’d like me to add, let me know. As time allows, I’ll try to do that. Naturally, the post will get updates as things are added later.

AlmaLinux+SQLDeveloper

with one comment

This post makes the assumption that you’ve installed the current version of Java and the Java SDK. If you haven’t, you can find instructions on my blog. You will also need to have an installation of the Oracle database running on your server or a remote service, or cloud micro service.

The remainder of this post deals with downloading, installing, and configuring Oracle’s SQL Developer for AlmaLinux 9, which is the GNU version of Red Hat Enterprise 9.

  1. Go to Oracle’s download site and download the sqldeveloper RPM file. You will need to provide your Oracle credentials to download SQL Developer. It will download to your sudoer account’s Download directory. In this example the sudoer user is the student user.

    You should see the following web page and click on the Download link, provided you’re installing on Linux it’ll look like the next image.

    Then, you need to accept the license and click the Download button. Oracle will prompt you for your credentials if you’re not logged in on the web page already.

  2. Next, you need to navigate to the Downloads directory and install the sqldeveloper RPM. Assuming your sudoer user is student, you can get to the Downloads directory with the following command.

    cd ~student/Downloads

    Assuming, you downloaded the SQL Developer package, you can use the following command to install any downloaded version of sqldeveloper package.

    sudo rpm -Uvh `ls sqldeveloper*.rpm`

    The log file for this is:

  3. Click on Activities link in the upper left corner and then the clustered nine dots to view applications. Choose the SQL Developer icon and double click and you should see the following dialog if you’re a first time user. Unless you’re upgrading, click the No button to proceed.

  4. The first official screen after checking whether you need to transfer existing settings wants to know whether or allow or disallow user tracking. Click the OK button if you don’t mind Oracle tracking your use, or click the Allow automated usage reporting to Oracle checkbox to disallow Oracle tracking your use before you click the OK button.

  5. The next screen lets you set up a TNS names file or use an existing file. I clicked on the XE existing database to continue.

    This is a password prompt for the TNS name resolution of XE, which should point to the Oracle Database 11g Express Edition. (I’d use a more current version but I couldn’t resist using the smaller footprint of the pre-containerized Oracle databases.)

    Replace XE with the name of a sandboxed user, like student, and the password for the student user before you click the OK button. (If you don’t know what a sandboxed user is, you should. It’s a user with limited access to a database of the same name in the context of an Oracle database. A non-sandboxed user has global privileges like the system user.)

  6. The next screen lets you enter SQL statements agains the student database. You can click the X button in the top right corner to close the application.

You’ve now installed SQL Developer. However, sometimes I want to start SQL Developer from the command-line interface (CLI) but you’ll get a bunch of warnings and unnecessary Java non-critical errors. So, I create an alias to avoid the extraneous noise. I create the sqldeveloper alias in the .bashrc file for it. You can create a sqldeveloper alias by adding the following line to your .bashrc file:

The unnecessary noise when you don’t create a sqldeveloper alias.

You create the alias like this:

alias  sqldeveloper="sqldeveloper 2>/dev/null &"

That completes the instructions. Good luck with SQL Developer. It’s a great tool.

Written by maclochlainn

December 21st, 2022 at 2:12 pm

AlmaLinux+Java+MySQL

with 2 comments

AlmaLinux generally has Java installed. You can check whether java is installed with this command:

which -a java

It should return:

/usr/bin/java

Then, you can check the Java version with this command:

java -version

For AlmaLinux 9, it should return:

openjdk version "11.0.17" 2022-10-18 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.17.0.8-2.el9_0) (build 11.0.17+8-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.17.0.8-2.el9_0) (build 11.0.17+8-LTS, mixed mode, sharing)

Next, you check whether javac is installed. You can use the which command to determine whether it is installed. Generally, its not installed and you use this command to

sudo dnf search jdk | egrep -- '-17'

It should return:

Last metadata expiration check: 0:11:17 ago on Mon 19 Dec 2022 11:32:48 PM EST.
java-17-openjdk.x86_64 : OpenJDK 17 Runtime Environment
java-17-openjdk-demo.x86_64 : OpenJDK 17 Demos
java-17-openjdk-devel.x86_64 : OpenJDK 17 Development Environment
java-17-openjdk-headless.x86_64 : OpenJDK 17 Headless Runtime Environment
java-17-openjdk-javadoc.x86_64 : OpenJDK 17 API documentation
java-17-openjdk-javadoc-zip.x86_64 : OpenJDK 17 API documentation compressed in a single archive
java-17-openjdk-jmods.x86_64 : JMods for OpenJDK 17
java-17-openjdk-src.x86_64 : OpenJDK 17 Source Bundle
java-17-openjdk-static-libs.x86_64 : OpenJDK 17 libraries for static linking

Now, you can install the Java JDK with the following dnf command:

sudo dnf install java-17-openjdk java-17-openjdk-devel

The log file for this is:

After installing javac, you can verify it with the which command, and check the version with the javac utility.

which -a javac

It should return:

/usr/bin/javac

Then, you can check the Java version with this command:

java -version

For AlmaLinux 9, it should return:

javac 17.0.5

Now, you need to download and install the jdk with the dnf utility. You download the jdk package with the wget utility.

wget https://download.oracle.com/java/17/latest/jdk-17_linux-x64_bin.rpm

It will generate the following console output:

--2022-12-20 00:12:34--  https://download.oracle.com/java/17/latest/jdk-17_linux-x64_bin.rpm
Resolving download.oracle.com (download.oracle.com)... 23.192.208.88
Connecting to download.oracle.com (download.oracle.com)|23.192.208.88|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 163101336 (156M) [application/x-redhat-package-manager]
Saving to: ‘jdk-17_linux-x64_bin.rpm’
 
jdk-17_linux-x64_bi 100%[===================>] 155.54M  11.1MB/s    in 17s     
 
2022-12-20 00:12:51 (9.01 MB/s) - ‘jdk-17_linux-x64_bin.rpm’ saved [163101336/163101336]

You install the jdk package with the dnf utility.

sudo rpm -Uvh jdk-17_linux-x64_bin.rpm

The log file for this is:

After installing everything, you need to set the default Java. You can discover the available Java versions with the following command:

sudo alternatives --config java

It should return a list and the ability to select one by using the “Selection” number. I recommend you chose #2.

There are 3 programs which provide 'java'.
 
  Selection    Command
-----------------------------------------------
   1           java-11-openjdk.x86_64 (/usr/lib/jvm/java-11-openjdk-11.0.17.0.8-2.el9_0.x86_64/bin/java)
   2           java-17-openjdk.x86_64 (/usr/lib/jvm/java-17-openjdk-17.0.5.0.8-2.el9_0.x86_64/bin/java)
*+ 3           /usr/java/jdk-17.0.5/bin/java
 
Enter to keep the current selection[+], or type selection number: 2

Then, you can check the Java version with this command:

java -version

Now, it should return:

openjdk version "17.0.5" 2022-10-18 LTS
OpenJDK Runtime Environment (Red_Hat-17.0.5.0.8-2.el9_0) (build 17.0.5+8-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-17.0.5.0.8-2.el9_0) (build 17.0.5+8-LTS, mixed mode, sharing)

You should define the $JAVA_HOME environment variable in the /etc/profile file. Sometimes, it is also appropriate to include it in your .bashrc file.

export set JAVA_HOME="/usr/lib/jvm/java-17-openjdk-17.0.5.0.8-2.el9_0.x86_64/bin"

You should also add it to your $PATH variable in your .bashrc file, like:

export set PATH=$PATH:$JAVA_HOME

Now, let’s write the basic “Hello World” program in Java:

// Define the Java program.
public class HelloWorld {
  public static void main(String[] args) {
    System.out.println("Hello World!");
  }
}

Compile it with:

javac HelloWorld.java

Run it with the java command:

java HelloWorld

It prints:

Hello World!

Let’s download the MySQL/J Connector from the MySQL website, where we con select the operating system and its version, as shown in the image.

After you download the mysql-connector-j RPM package, you can install it with the following dnf command:

sudo dnf install -y mysql-connector-j-8.0.31-1.el9.noarch.rpm

The log file for this is:

You should add the $CLASSPATH to your .bashrc file. Add the following line to your .bashrc file and resource your .bashrc file to test the JDBC driver.

export set CLASSPATH="/usr/share/java/mysql-connector-j.jar:."

Let’s write a slightly larger Java program that tests connectivity to the MySQL database, like:

// Import classes.
import java.sql.*;
 
/* You can't include the following on Linux without raising an exception. */
// import com.mysql.jdbc.Driver;
 
public class MySQLConnector {
  public MySQLConnector() {
    /* Declare variables that require explicit assignments because
       they're addressed in the finally block. */
    Connection conn = null;
    Statement stmt = null;
    ResultSet rset = null;
 
    /* Declare other variables. */
    String url;
    String username = "student";
    String password = "student";
    String database = "studentdb";
    String hostname = "localhost";
    String port = "3306";
    String sql;
 
    /* Attempt a connection. */
    try {
      // Set URL.
      url = "jdbc:mysql://" + hostname + ":" + port + "/" + database;
 
      // Create instance of MySQLDriver.
      Class.forName ("com.mysql.cj.jdbc.Driver").newInstance();
      conn = DriverManager.getConnection (url, username, password);
 
      // Query the version of the database.
      sql = "SELECT version()";
      stmt = conn.createStatement();
      rset = stmt.executeQuery(sql);
 
      System.out.println ("Database connection established");
 
      // Read row returns for one column.
      while (rset.next()) {
        System.out.println("MySQLDriver Version [" + rset.getString(1) + "]"); }
 
    }
    catch (SQLException e) {
      System.err.println ("Cannot connect to database server:");
      System.out.println(e.getMessage());
    }
    catch (ClassNotFoundException e) {
      System.err.println ("Cannot find MySQL driver class:");
      System.out.println(e.getMessage());
    }
    catch (InstantiationException e) {
      System.err.println ("Cannot instantiate class:");
      System.out.println(e.getMessage());
    }
    catch (IllegalAccessException e) {
      System.err.println ("Illegal access exception:");
      System.out.println(e.getMessage());
    }
    finally {
      if (conn != null) {
        try {
          rset.close();
          stmt.close();
          conn.close();
          System.out.println ("Database connection terminated");
        }
        catch (Exception e) { /* ignore close errors */ }
      }
    }
  }
  /* Unit test. */
  public static void main(String args[]) {
    new MySQLDriver();
  }
}

Just a note, there’s a deprecated behavior introduced in MySQL 8.0.24, which produces the following error message when compiled looking for deprecation:

javac -Xlint:deprecation  MySQLDriver.java

It produces the following error message, which I hope to sort and update later in the week:

MySQLDriver.java:30: warning: [deprecation] newInstance() in Class has been deprecated
      Class.forName ("com.mysql.cj.jdbc.Driver").newInstance();
                                                ^
  where T is a type-variable:
    T extends Object declared in class Class
1 warning

The code does compile because the deprecation is only a warning. When you run the program, like:

java MySQLDriver

It returns, confirming a connection to the MySQL database:

Database connection established
MySQLDriver Version [8.0.30]
Database connection terminated

As always, I hope this helps those trying to sort out new behaviors.

Written by maclochlainn

December 19th, 2022 at 9:51 pm