Archive for the ‘Microsoft Windows 10’ Category
MySQL @SQL_MODE
Installing MySQL Workbench 8 on Windows, we discovered that the default configuration no longer sets ONLY_FULL_GROUP_BY as part of the default SQL_MODE parameter value. While I’ve written a stored function to set the SQL_MODE parameter value for a session, some students didn’t understand that such a call is only valid in the scope of a connection to the database server. They felt the function didn’t work because they didn’t understand the difference between connecting to the MySQL CLI and clicking the lightening bolt in MySQL Workbench.
So, here are the instructions to reset the default SQL_MODE parameter value for Windows. You need to edit the setting in the my.ini file, which is in the C:\ProgramData\MySQL\MySQL Server 8.0 directory. The default installation will have the following:
# Set the SQL mode to strict sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION" |
You need to change it to the following in an editor with Administrative privileges:
# Set the SQL mode to strict sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY" |
Then, you need to connect to the services by launching services.msc from the command prompt. In the list of services find MYSQL80 service and restart it. You can verify it by connecting to the MySQL 8.0.* server and running the following SQL query:
SELECT @@SQL_MODE: |
That’s how you convert Windows to use only traditional group by behaviors in SQL. As always, I hope this helps those looking for a solution.
Node.js MySQL Error
While I blogged about how to setup Node.js and MySQL almost two years ago, it was interesting when a student ran into a problem. The student said they’d configured the environment but were unable to use Node.js to access MySQL.
The error is caused by this import statement:
const mysql = require('mysql') |
The student got the following error, which simply says that they hadn’t installed the Node.js package for MySQL driver.
internal/modules/cjs/loader.js:638 throw err; ^ Error: Cannot find module 'mysql' at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15) at Function.Module._load (internal/modules/cjs/loader.js:562:25) at Module.require (internal/modules/cjs/loader.js:692:17) at require (internal/modules/cjs/helpers.js:25:18) at Object.<anonymous> (/home/student/Data/cit325/oracle-s/lib/Oracle12cPLSQLCode/Introduction/query.js:4:15) at Module._compile (internal/modules/cjs/loader.js:778:30) at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10) at Module.load (internal/modules/cjs/loader.js:653:32) at tryModuleLoad (internal/modules/cjs/loader.js:593:12) at Function.Module._load (internal/modules/cjs/loader.js:585:3) |
I explained they could fix the problem with the following two Node.js Package Manager (NPM) commands:
npm init --y npm install --save mysql |
The student was able to retest the code with success. The issue was simply that the Node.js couldn’t find the NPM MySQL module.
MySQL Partitioned Tables
MySQL Partitioned Tables
Learning Outcomes
- Learn about List Partitioning.
- Learn about Range Partitioning.
- Learn about Columns Partitioning.
- Learn about Hash Partitioning.
- Learn about Key Partitioning.
- Learn about Subpartitioning.
Lesson Material
MySQL supports partitioning of tables. It supports range, list, hash, and key partitioning. Range partitioning lets you partition based on column values that fall within given ranges. List partitioning lets you partition based on columns matching one of a set of discrete values. Hash partitioning lets you partition based on the return value from a user-defined expression (the result from a stored SQL/PSM function). Key partitioning performs like hash partitioning, but it lets a user select one or more columns from the set of columns in a table; a hash manages the selection process for you. A hash is a method of organizing keys to types of data, and hashes speed access to read and change data in tables.
Each of the following subsections discusses one of the supported forms of partitioning in MySQL. Naturally, there are differences between other databases and MySQL’s implementation.
List Partitioning
A MySQL list partition works by identifying a column that contains an integer value, the franchise_number in the following example. Partitioning clauses follow the list of columns and constraints and require a partitioning key to be in the primary key or indexed.
The following list partition works with literal numeric values. MySQL uses the IN keyword for list partitions. Note that there’s no primary key designated and an index is on the auto-incrementing surrogate key column. A complete example is provided to avoid confusion on how to index the partitioning key:
CREATE TABLE franchise ( franchise_id INT UNSIGNED AUTO_INCREMENT , franchise_number INT UNSIGNED , franchise_name VARCHAR(20) , city VARCHAR(20) , state VARCHAR(20) , index idx (franchise_id)) PARTITION BY LIST(franchise_number) ( PARTITION offshore VALUES IN (49,50) , PARTITION west VALUES IN (34,45,48) , PARTITION desert VALUES IN (46,47) , PARTITION rockies VALUES IN (38,41,42,44)); |
The inclusion of a PRIMARY KEY constraint on the franchise_id column would trigger an ERROR 1503 when the partitioning key isn’t the primary key. The reason for the error message is that a primary key implicitly creates a unique index, and that index would conflict with the partitioning by list instruction. The use of a non-unique idx index on the franchise_id column is required when you want to partition on a non-primary key column.
Range Partitioning
Range partitioning works only with an integer value or an expression that resolves to an integer against the primary key column. The limitation of the integer drives the necessity of choosing an integer column for range partitioning. You can’t define a range-partitioned table with a PRIMARY KEY constraint unless the primary key becomes your partitioning key, like
the one below.
CREATE TABLE ordering ( ordering_id INT UNSIGNED AUTO_INCREMENT , item_id INT UNSIGNED , rental_amount DECIMAL(15,2) , rental_date DATE , index idx (ordering_id)) PARTITION BY RANGE(item_id) ( PARTITION jan2011 VALUES LESS THAN (10000) , PARTITION feb2011 VALUES LESS THAN (20000) , PARTITION mar2011 VALUES LESS THAN (30000)); |
Range partitioning is best suited to large tables that you want to break into smaller pieces based on the integer column. You can also use stored functions that return integers as the partitioning key instead of the numeric literals shown. Few other options are available in MySQL.
Columns Partitioning
Columns partitioning is a new variant of range and list partitioning. It is included in MySQL 5.5 and forward. Both range and list partitioning work on an integer-based column (using TINYINT, SMALLINT, MEDIUMINT, INT [alias INTEGER], and BIGINT). Columns partitioning extends those models by expanding the possible data types for the partitioning column to include CHAR, VARCHAR, BINARY, and VARBINARY string data types, and DATE, DATETIME, or TIMESTAMP data types. You still can’t use other number data types such as DECIMAL and FLOAT. The TIMESTAMP data type is also available only in range partitions with the caveat that you use a UNIX_TIMESTAMP function, according to MySQL Bug 42849.
Hash Partitioning
Hash partitions ensure an even distribution of rows across a predetermined number of partitions. It is probably the easiest way to partition a table quickly to test the result of partitioning on a large table. You should base hash partitions on a surrogate or natural primary key.
The following provides a modified example of the ordering table:
CREATE TABLE ordering ( ordering_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT , item_id INT UNSIGNED , rental_amount DECIMAL(15,2) , rental_date DATE) PARTITION BY HASH(ordering_id) PARTITIONS 8; |
This is the partitioning type that benefits from a PRIMARY KEY constraint because it automatically creates a unique index that can be used by the hash. A non-unique index such as the list partitioning example doesn’t work for a hash partition.
Key Partitioning
Key partitioning is valuable because you can partition on columns that aren’t integers. It performs along the line of hash partitioning, except the MySQL Server uses its own hashing expression.
CREATE TABLE orders_list ( order_list_id INT UNSIGNED AUTO_INCREMENT , customer_surname VARCHAR(30) , store_id INT UNSIGNED , salesperson_id INT UNSIGNED , order_date DATE , index idx (order_list_id)) PARTITION BY KEY (order_date) PARTITIONS 8; |
This is the only alternative when you want to partition by date ranges. Like the hash partition, it’s easy to deploy. The only consideration is the number of slices that you want to make of the data in the table.
Subpartitioning
The concept of subpartitioning is also known as composite partitioning. You can subpartition range or list partitions with a hash, linear hash, or linear key.
A slight change to the previously created ordering table is required to demonstrate composite partitioning: we’ll add a store_id column to the table definition. The following is an example of a range partition subpartitioned by a hash:
CREATE TABLE ordering INT UNSIGNED INT UNSIGNED DATE ( ordering_id INT UNSIGNED AUTO_INCREMENT , item_id INT UNSIGNED , store_id INT UNSIGNED , rental_amount DECIMAL(15,2) , rental_date DATE , index idx (ordering_id)) PARTITION BY RANGE(item_id) SUBPARTITION BY HASH(store_id) SUBPARTITIONS 4 ( PARTITION jan2011 VALUES LESS THAN (10000) , PARTITION feb2011 VALUES LESS THAN (20000) , PARTITION mar2011 VALUES LESS THAN (30000)); |
Composite partitioning is non-trivial and might require some experimentation to achieve optimal results. Plan on making a few tests of different scenarios before you deploy a solution.
Logging Table Function
It is interesting when somebody remembers a presentation from 10 years ago. They asked if it was possible in PL/pgSQL to write an autonomous procedure to log data when calling a table view function. The answer is two fold. PL/pgSQL doesn’t support autonomous functions or procedures like the Oracle database but it doesn’t need to because unless you invoke a transaction it auto commits writes.
Logging table functions are important for security auditing and compliance management against laws, like SOX, HIPAA, and FERPA. All too many systems lack the basic ability to audit who queries records without raising an error and blocking the access. That means the bad actor or actress gains the ability to probe the system for weaknesses before determining an attack vector. It’s often better to capture the unauthorized access and take direct action to protect both the the data and systems.
While the example lets an unauthorized person access the information in the first version of the student_query, it blocks access by reporting no rows returned in the latter. Both versions of the query log the data and thereby collect the evidence necessary to act against the hack.
This blog post shows you how to write it and test it. Follow the following steps:
- Create the necessary tables and data to work with a logging PL/pgSQL table view function:
/* Conditionally drop and create table. */ DROP TABLE IF EXISTS student; CREATE TABLE student ( student_id SERIAL , first_name VARCHAR(20) , last_name VARCHAR(20) , hogwarts_house VARCHAR(10)); /* Conditionally drop and create table. */ DROP TABLE IF EXISTS logger; CREATE TABLE logger ( logger_id SERIAL , app_user VARCHAR(30) , queried_student VARCHAR(30) , query_time TIMESTAMP ); /* Insert one record into table. */ INSERT INTO student ( first_name, last_name, hogwarts_house ) VALUES ( 'Harry', 'Potter', 'Gryffindor' ) ,( 'Hermione', 'Granger', 'Gryffindor' ) ,( 'Ronald', 'Weasily', 'Gryffindor' ) ,( 'Draco', 'Malfoy', 'Slytherin' ) ,( 'Vincent', 'Crabbe', 'Slytherin' ) ,( 'Susan', 'Bones', 'Hufflepuff' ) ,( 'Hannah', 'Abbott', 'Hufflepuff' ) ,( 'Luna', 'Lovegood', 'Ravenclaw' ) ,( 'Cho', 'Chang', 'Ravenclaw' ) ,( 'Gilderoy', 'Lockhart', 'Ravenclaw' );
- While not necessary if you’re very familiar with PL/pgSQL, it may be helpful to review:
- The SET command that lets you assign a value to a session-level variable, which you can later use in a PL/pgSQL block.
- The SELECT-INTO statement in a DO-block.
Here’s a test script that demonstrates both:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
/* Set a session-level variable. */ SET credential.app_user = 'Draco Malfoy'; /* Secure the value from a session-level variable. */ SELECT current_setting('credential.app_user'); /* DO $$ DECLARE input VARCHAR(30) := 'Hermione'; output VARCHAR(30); BEGIN /* Sample for partial name construction of full name. */ SELECT CONCAT(s.first_name, ' ', s.last_name) AS student_name INTO output FROM student s WHERE CONCAT(s.first_name, ' ', s.last_name) LIKE '%'||input||'%'; /* Show result of local assignment via a query. */ RAISE NOTICE '[%][%]', current_setting('credential.app_user'), output; END; $$;
There’s an important parsing trick to this sample program. It uses the LIKE operator rather than the SIMILAR TO operator because the parser fails to recognize the SIMILAR TO operator.
The DO-block returns the following output:
NOTICE: [Draco Malfoy][Hermione Granger]
- This creates the student_query logging table function, which takes a partial portion of a students first and last name to return the student information. While the example only returns the name and the Hogwarts House it lays a foundation for a more complete solution.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
CREATE OR REPLACE FUNCTION student_query (partial_name VARCHAR) RETURNS TABLE ( first_naem VARCHAR(20) , last_name VARCHAR(20) , hogwarts_house VARCHAR(10) ) AS $$ DECLARE queried VARCHAR; by_whome VARCHAR; BEGIN /* Query separately because embedding in insert statement fails. */ SELECT CONCAT(s.first_name, ' ', s.last_name) AS student_name FROM student s INTO queried WHERE CONCAT(s.first_name, ' ', s.last_name) LIKE '%'||partial_name||'%'; /* Log the query with the credentials of the user. */ INSERT INTO logger ( app_user , queried_student , query_time ) VALUES ( current_setting('credential.app_user') , queried , NOW()); /* Return the result set without disclosing the query was recorded. */ RETURN QUERY SELECT s.first_name , s.last_name , s.hogwarts_house FROM student s WHERE CONCAT(s.first_name, ' ', s.last_name) LIKE '%'||partial_name||'%'; END; $$ LANGUAGE plpgsql;
- You can test the function by calling it, like this:
SELECT * FROM student_query('Hermione');
It displays:
first_naem | last_name | hogwarts_house ------------+-----------+---------------- Hermione | Granger | Gryffindor (1 row)
You can check the logging table and discover who looked up another student’s records.
SELECT * FROM logger;
It displays:
logger_id | app_user | queried_student | query_time -----------+--------------+------------------+---------------------------- 1 | Draco Malfoy | Hermione Granger | 2022-05-29 22:51:50.398987 (1 row)
- Assuming you’ve built an authorized_user function that returns a Boolean, you can add a call to it in the WHERE clause. For simplicity, let’s implement the function to deny all users, like:
1 2 3 4 5 6 7 8 9 10
CREATE OR REPLACE FUNCTION authorized_user (user_name VARCHAR) RETURNS BOOLEAN AS $$ DECLARE lv_retval BOOLEAN := FALSE; BEGIN RETURN lv_retval; END; $$ LANGUAGE plpgsql;
You can now replace the query on lines 28 through 32 with the new one below. The added clause on line 33 denies access to unauthorized users because there aren’t any.
28 29 30 31 32 33
SELECT s.first_name , s.last_name , s.hogwarts_house FROM student s WHERE CONCAT(s.first_name, ' ', s.last_name) LIKE '%'||partial_name||'%' AND authorized_user(current_setting('credential.app_user'));
While it returns:
first_naem | last_name | hogwarts_house ------------+-----------+---------------- (0 rows)
The logger table shows two entries. One for the query that returned a value and one for the version that didn’t.
logger_id | app_user | queried_student | query_time -----------+--------------+------------------+---------------------------- 1 | Draco Malfoy | Hermione Granger | 2022-05-29 23:23:39.82063 2 | Draco Malfoy | Hermione Granger | 2022-05-29 23:23:40.736945 (2 rows)
In both cases the bad actor Draco Malfoy’s unauthorized access is captured and he was denied any information without alerting him to the security precaution in a logging table function.
As always, I hope this helps those looking for this type of solution.
PL/SQL Table Function
An Oracle example was requested as a comparison against the quick tutorial I wrote on how to do this in PostgreSQL’s PL/pgSQL. Unfortunately, there are many more moving parts to deliver this type of solution in Oracle’s PL/SQL.
The functions is same and simple. It returns the list of conquistadors that were originally German. It does that by filtering on the lang column in the table. For example, you use ‘de‘ for German. The additional moving parts are the required User-Defined Types (UDTs); one is a record structure and the other is a list (or Oracle parlance table).
The drops are unconditional and as such will trigger errors the first time they’re run but including PL/SQL blocks to make them conditional would have made the code much larger. It’s already larger because Oracle doesn’t support comma-delimited lists in the VALUES clause.
I’ll stage this with the same conquistador table used in the last post. Then, connect to the psql shell and run the following script file:
/* Drop the conquistador table. */ DROP TABLE conquistador; /* Create the conquistador table. */ CREATE TABLE conquistador ( conquistador_id NUMBER , conquistador VARCHAR(30) , actual_name VARCHAR(30) , nationality VARCHAR(30) , lang VARCHAR(2)); /* Drop the conquistador sequence. */ DROP SEQUENCE conquistador_seq; /* Create the conquistador_seq with a 1001 start value. */ CREATE SEQUENCE conquistador_seq START WITH 1001; /* Insert 9 rows into the table. */ INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Juan de Fuca','Ioánnis Fokás','Greek','el'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Nicolás de Federmán','Nikolaus Federmann','German','de'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Sebastián Caboto','Sebastiano Caboto','Venetian','it'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Jorge de la Espira','Georg von Speyer','German','de'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Eusebio Francisco Kino','Eusebius Franz Kühn','Italian','it'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Wenceslao Linck','Wenceslaus Linck','Bohemian','cs'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Fernando Consag','Ferdinand Konšcak','Croatian','sr'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Américo Vespucio','Amerigo Vespucci','Italian','it'); INSERT INTO conquistador ( conquistador_id, conquistador, actual_name, nationality, lang ) VALUES (conquistador_seq.NEXTVAL,'Alejo García','Aleixo Garcia','Portuguese','pt'); |
While unnecessary in PL/pgSQL, you must create User-Defined Types (UDTs) to write a table function. You must also create a local procedure to avoid allocating memory before assigning values to the list. These are the UDTs required:
/* Drop the dependency before the dependent type. */ DROP TYPE conquistador_table; DROP TYPE conquistador_struct; /* Create the UDT for a record structure accessible in SQL. */ CREATE OR REPLACE TYPE conquistador_struct IS OBJECT ( conquistador VARCHAR(30) , actual_name VARCHAR(30) , nationality VARCHAR(30)); / /* Create a list of the UDT. */ CREATE OR REPLACE TYPE conquistador_table IS TABLE OF conquistador_struct; / |
Drop any existing function or procedure of the same name before you try to build it. Oracle’s OR REPLACE fails when you try to use it for a function when there is already a procedure using the same name, and vice versa.
/* Drop the function to avoid any conflict with a procedure of the same name. */ DROP FUNCTION getConquistador; |
Now, you can build another script file to create the getConquistador function, like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | /* Create the function. */ CREATE OR REPLACE FUNCTION getConquistador (pv_lang IN VARCHAR) RETURN conquistador_table IS /* Declare a return variable. */ lv_retval CONQUISTADOR_TABLE := conquistador_table(); /* Declare a dynamic cursor. */ CURSOR get_conquistador ( cv_lang VARCHAR2 ) IS SELECT c.conquistador , c.actual_name , c.nationality FROM conquistador c WHERE c.lang = cv_lang; /* Local procedure to add to the song. */ PROCEDURE ADD ( pv_input CONQUISTADOR_STRUCT ) IS BEGIN lv_retval.EXTEND; lv_retval(lv_retval.COUNT) := pv_input; END ADD; BEGIN /* Read through the cursor and assign to the UDT table. */ FOR i IN get_conquistador(pv_lang) LOOP add(conquistador_struct( i.conquistador , i.actual_name , i.nationality )); END LOOP; /* Return collection. */ RETURN lv_retval; END; / |
While there is some white space for readability, the Oracle version is basically twice as long as the PL/pgSQL version. It also requires you to add UDTs to the data dictionary to make it work. PL/pgSQL actually doesn’t let you add references to type definitions and requires you use enumerated descriptions with column definitions.
Then, you can test it with the following syntax. The TABLE function is required to convert the list to a SQL consumable result set:
COL conquistador FORMAT A21 COL actual_name FORMAT A21 COL nationality FORMAT A12 SELECT * FROM TABLE(getConquistador('de')); |
It will return the following:
CONQUISTADOR ACTUAL_NAME NATIONALITY --------------------- --------------------- ------------ Nicolás de Federmán Nikolaus Federmann German Jorge de la Espira Georg von Speyer German 2 rows selected. |
As always, I hope this helps with a technique that’s useful.
PostgreSQL Table Function
A quick tutorial on how to write a PL/pgSQL Table function. The functions is simple. It returns the list of conquistadors that were originally German. It does that by filtering on the lang column in the table. For example, you use ‘de‘ for German.
I’ll stage this with the same conquistador table used in the last post. Don’t forget to use the chcp command to the Active Console Code Page to 4-byte Unicode before you run the script file, like:
chcp 65001 |
Then, connect to the psql shell and run the following script file:
/* Conditionally drop the conquistador table. */ DROP TABLE IF EXISTS conquistador; /* Create the conquistador table. */ CREATE TABLE conquistador ( conquistador_id SERIAL , conquistador VARCHAR(30) , actual_name VARCHAR(30) , nationality VARCHAR(30) , lang VARCHAR(2)); /* Insert some conquistadors into the table. */ INSERT INTO conquistador ( conquistador , actual_name , nationality , lang ) VALUES ('Juan de Fuca','Ioánnis Fokás','Greek','el') ,('Nicolás de Federmán','Nikolaus Federmann','German','de') ,('Sebastián Caboto','Sebastiano Caboto','Venetian','it') ,('Jorge de la Espira','Georg von Speyer','German','de') ,('Eusebio Francisco Kino','Eusebius Franz Kühn','Italian','it') ,('Wenceslao Linck','Wenceslaus Linck','Bohemian','cs') ,('Fernando Consag','Ferdinand Konšcak','Croatian','sr') ,('Américo Vespucio','Amerigo Vespucci','Italian','it') ,('Alejo García','Aleixo Garcia','Portuguese','pt'); |
Now, you can build another script file to create the getConquistador function, like:
/* Drop the funciton conditionally. */ DROP FUNCTION IF EXISTS getConquistador; |
Create the getConquistador function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | CREATE FUNCTION getConquistador (IN lang_in VARCHAR(2)) RETURNS TABLE ( conquistador VARCHAR(30) , actual_name VARCHAR(30) , nationality VARCHAR(30)) AS $$ BEGIN RETURN QUERY SELECT c.conquistador , c.actual_name , c.nationality FROM conquistador c WHERE c.lang = lang_in; END; $$ LANGUAGE plpgsql; |
Then, you can test it like:
SELECT * FROM getConquistador('de'); |
It will return the following:
conquistador | actual_name | nationality -----------------------+--------------------+------------- Nicolás de Federmán | Nikolaus Federmann | German Jorge de la Espira | Georg von Speyer | German (2 rows) |
As always, I hope this helps with a technique that’s useful.
PostgreSQL Unicode
It seems unavoidable to use Windows. Each time I’m compelled to run tests on the platform I find new errors. For example, they don’t use 4-byte unicode and as a result when you want to use Unicode in PostgreSQL there’s a mismatch.
For example, change the Active Console Code Page with the chcp (change code page) to match the one PostgreSQL uses, like:
chip 1252 |
It lets you avoid this warning message:
Password for user postgres: psql (14.1) WARNING: Console code page (437) differs from Windows code page (1252) 8-bit characters might not work correctly. See psql reference page "Notes for Windows users" for details. Type "help" for help. postgres=# |
However, it won’t avoid display issues with real Unicode values. For example, let’s use a small international table like the following:
/* Conditionally drop the conquistador table. */ DROP TABLE IF EXISTS conquistador; /* Create the conquistador table. */ CREATE TABLE conquistador ( conquistador_id SERIAL , conquistador VARCHAR(30) , actual_name VARCHAR(30) , nationality VARCHAR(30) , lang VARCHAR(2)); /* Insert some conquistadors into the table. */ INSERT INTO conquistador ( conquistador , actual_name , nationality , lang ) VALUES ('Juan de Fuca','Ioánnis Fokás','Greek','el') ,('Nicolás de Federmán','Nikolaus Federmann','German','de') ,('Sebastián Caboto','Sebastiano Caboto','Venetian','it') ,('Jorge de la Espira','Georg von Speyer','German','de') ,('Eusebio Francisco Kino','Eusebius Franz Kühn','Italian','it') ,('Wenceslao Linck','Wenceslaus Linck','Bohemian','cs') ,('Fernando Consag','Ferdinand Konšcak','Croatian','sr') ,('Américo Vespucio','Amerigo Vespucci','Italian','it') ,('Alejo García','Aleixo Garcia','Portuguese','pt'); /* Query the values from the conquistador table. */ SELECT * FROM conquistador; |
When you call the script to load it, like:
\i testScript.sql |
It’ll display the following, which you can check against the strings in the VALUES clause above. There are encoding issues on lines 1, 2, 3, 5, 7, and 8 below.
conquistador_id | conquistador | actual_name | nationality | lang -----------------+------------------------+----------------------+-------------+------ 1 | Juan de Fuca | Ioánnis Fokás | Greek | el 2 | Nicolás de Federmán | Nikolaus Federmann | German | de 3 | Sebastián Caboto | Sebastiano Caboto | Venetian | it 4 | Jorge de la Espira | Georg von Speyer | German | de 5 | Eusebio Francisco Kino | Eusebius Franz Kühn | Italian | it 6 | Wenceslao Linck | Wenceslaus Linck | Bohemian | cs 7 | Fernando Consag | Ferdinand KonÅ¡cak | Croatian | sr 8 | Américo Vespucio | Amerigo Vespucci | Italian | it 9 | Alejo GarcÃa | Aleixo Garcia | Portuguese | pt (9 rows) |
If you’re like me, it was annoying. The problem is that the native 2-byte Unicode of Microsoft sends values into PostgreSQL that are invalid. Those codes are read back with unintended values from other character encoding sets.
While you can’t set Windows generic encoding to 65001 without causing the system problems, you can set Active Console Code Page value in the scope of a Command-Line session before running the script.
The chcp command lets you set it to 4-byte Unicode, like:
chcp 65001 |
Now, rerun the script and PostgreSQL will display the correct character encoding set with some spacing irregularities. However, that’s not what’s important when you call table from another programming language through the ODBC-layer. The data will be returned in a 4-byte Unicode encoding stream.
conquistador_id | conquistador | actual_name | nationality | lang -----------------+------------------------+----------------------+-------------+------ 1 | Juan de Fuca | Ioánnis Fokás | Greek | el 2 | Nicolás de Federmán | Nikolaus Federmann | German | de 3 | Sebastián Caboto | Sebastiano Caboto | Venetian | it 4 | Jorge de la Espira | Georg von Speyer | German | de 5 | Eusebio Francisco Kino | Eusebius Franz Kühn | Italian | it 6 | Wenceslao Linck | Wenceslaus Linck | Bohemian | cs 7 | Fernando Consag | Ferdinand Konšcak | Croatian | sr 8 | Américo Vespucio | Amerigo Vespucci | Italian | it 9 | Alejo García | Aleixo Garcia | Portuguese | pt (9 rows) |
A similar error to what I encountered testing MySQL Workbench’s ability to export SQL Server databases 10 years ago. I thought giving a solution to get coerce correct 4-byte Unicode data insertion may help those who also may be surprised by the behavior.
PL/pgSQL List to Struct
This blog post addresses how to convert a list of values into a structure (in C/C++ its a struct, in Java its an ArrayList, and PL/pgSQL it’s an array of a type). The cast_strings function converts a list of strings into a record data structure. It calls the verify_date function to identify a DATE data type and uses regular expressions to identify numbers and strings.
You need to build the struct type below first.
CREATE TYPE struct AS ( xnumber DECIMAL , xdate DATE , xstring VARCHAR(100)); |
The cast_strings function is defined below:
CREATE FUNCTION cast_strings ( pv_list VARCHAR(10)[] ) RETURNS struct AS $$ DECLARE /* Declare a UDT and initialize an empty struct variable. */ lv_retval STRUCT := (null, null, null); BEGIN /* Loop through list of values to find only the numbers. */ FOR i IN 1..ARRAY_LENGTH(pv_list,1) LOOP /* Order if statements by evaluation. */ CASE /* Check for a value with only digits. */ WHEN lv_retval.xnumber IS NULL AND REGEXP_MATCH(pv_list[i],'^[0-9]+$') IS NOT NULL THEN lv_retval.xnumber := pv_list[i]; /* Check for a valid date. */ WHEN lv_retval.xdate IS NULL AND verify_date(pv_list[i]) IS NOT NULL THEN lv_retval.xdate := pv_list[i]; /* Check for a string with characters, whitespace, and digits. */ WHEN lv_retval.xstring IS NULL AND REGEXP_MATCH(pv_list[i],'^[A-Za-z 0-9]+$') IS NOT NULL THEN lv_retval.xstring := pv_list[i]; ELSE NULL; END CASE; END LOOP; /* Print the results. */ RETURN lv_retval; END; $$ LANGUAGE plpgsql; |
There are two test cases for the cast_strings function. One uses a DO-block and the other a query.
- The first use-case checks with a DO-block:
DO $$ DECLARE lv_list VARCHAR(11)[] := ARRAY['86','1944-04-25','Happy']; lv_struct STRUCT; BEGIN /* Pass the array of strings and return a record type. */ lv_struct := cast_strings(lv_list); /* Print the elements returned. */ RAISE NOTICE '[%]', lv_struct.xnumber; RAISE NOTICE '[%]', lv_struct.xdate; RAISE NOTICE '[%]', lv_struct.xstring; END; $$;
It should return:
psql:verify_pg.SQL:263: NOTICE: [86] psql:verify_pg.SQL:263: NOTICE: [1944-04-25] psql:verify_pg.SQL:263: NOTICE: [Happy]
The program returns a structure with values converted into their appropriate data type.
- The second use-case checks with a query:
WITH get_struct AS (SELECT cast_strings(ARRAY['99','2015-06-14','Agent 99']) AS mystruct) SELECT (mystruct).xnumber , (mystruct).xdate , (mystruct).xstring FROM get_struct;
It should return:
xnumber | xdate | xstring ---------+------------+---------- 99 | 2015-06-14 | Agent 99 (1 row)
The query defines a call to the cast_strings function with a valid set of values and then displays the elements of the returned structure.
As always, I hope this helps those looking for how to solve this type of problem. Just a quick reminder that this was written and tested in PostgreSQL 14.
Oracle DSN Security
Oracle disallows entry of a password value when configuring the ODBC’s Windows Data Source Name (DSN) configurations. As you can see from the dialog’s options:
So, I check the Oracle ODBC’s property list with the following PowerShell command:
Get-Item -Path Registry::HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\Oracle | Select-Object |
It returned:
Oracle Driver : C:\app\mclaughlinm\product\18.0.0\dbhomeXE\BIN\SQORA32.DLL DisableRULEHint : T Attributes : W SQLTranslateErrors : F LobPrefetchSize : 8192 AggregateSQLType : FLOAT MaxTokenSize : 8192 FetchBufferSize : 64000 NumericSetting : NLS ForceWCHAR : F FailoverDelay : 10 FailoverRetryCount : 10 MetadataIdDefault : F BindAsFLOAT : F BindAsDATE : F CloseCursor : F EXECSchemaOpt : EXECSyntax : F Application Attributes : T QueryTimeout : T CacheBufferSize : 20 StatementCache : F ResultSets : T MaxLargeData : 0 UseOCIDescribeAny : F Failover : T Lobs : T DisableMTS : T DisableDPM : F BatchAutocommitMode : IfAllSuccessful Description : Oracle ODBC ServerName : xe Password : UserID : c##student DSN : Oracle |
Then, I used this PowerShell command to set the Password property:
Set-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\Oracle -Name "Password" -Value 'student' |
After setting the Password property’s value, I queried it with the following PowerShell command:
Get-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\Oracle | Select-Object -Property "Password" |
It returns:
Password : student |
After manually setting the Oracle ODBC DSN’s password value you can now connect without providing a password at runtime. It also means anybody who hacks the Windows environment can access the password through trivial PowerShell command.
I hope this alerts readers to a potential security risk when you use Oracle DSNs.
MySQL Windows DSN
Almost a Ripley’s Believe It or Not. An prior data science student told me that his new IT department setup a Windows component that let him connect his Excel Spreadsheets to their production MySQL database without a password. Intrigued, I asked if it was a MySQL Connector/ODBC Data Source Configuration, or DSN (Data Source Name)?
He wasn’t sure, so I asked him to connect to PowerShell and run the following command:
Get-Item -Path Registry::HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\MySQL |
It returned something like this (substituting output from one of my test systems):
Hive: HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI Name Property ---- -------- MySQL Driver : C:\Program Files\MySQL\Connector ODBC 8.0\myodbc8w.dll DESCRIPTION : MySQL ODBC Connector SERVER : localhost UID : student PWD : student DATABASE : studentdb PORT : 3306 |
The student was stunned and concerned he was compromising his employer’s system security. I suggested he share the information with his IT department so they could provide a different approach for his access to the production database. His IT department immediately agreed. Unfortunately, he’s bummed he can’t simply access the data through Excel.
I told him they were welcome to use the MySQL Connect Dialog PowerShell solution that I wrote. It creates a minimal MySQL DSN and requires a manual password entry through the PowerShell Dialog box. I also suggested that they look into the PowerShell Excel Module.
I also suggested they develop a query only copy of the production database, or shift access to a data warehouse. Needless to say, it wasn’t a large corporation.
As always, I hope this helps others.