MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Archive for April, 2022

Selective Aggregation

without comments

Selective Aggregation

Learning Outcomes

  • Learn how to combine CASE operators and aggregation functions.
  • Learn how to selective aggregate values.
  • Learn how to use SQL to format report output.

Selective aggregation is the combination of the CASE operator and aggregation functions. Any aggregation function adds, sums, or averages the numbers that it finds; and when you embed the results of a CASE operator inside an aggregation function you get a selective result. The selectivity is determined by the WHEN clause of a CASE operator, which is more or less like an IF statement in an imperative programming language.

The prototype for selective aggregation is illustrated with a SUM function below:

SELECT   SUM(CASE
               WHEN left_operand = right_operand THEN result
               WHEN left_operand > right_operand THEN result
               WHEN left_operand IN (SET OF comma-delimited VALUES) THEN result
               WHEN left_operand IN (query OF results) THEN result
               ELSE alt_result
             END) AS selective_aggregate
FROM     some_table;

A small example let’s you see how selective aggregation works. You create a PAYMENT table and PAYMENT_S sequence for this example, as follows:

-- Create a PAYMENT table.
CREATE TABLE payment
( payment_id     NUMBER
, payment_date   DATE	      CONSTRAINT nn_payment_1 NOT NULL
, payment_amount NUMBER(20,2) CONSTRAINT nn_payment_2 NOT NULL
, CONSTRAINT pk_payment PRIMARY KEY (payment_id));
 
-- Create a PAYMENT_S sequence.
CREATE SEQUENCE payment_s;

After you create the table and sequence, you should insert some data. You can match the values below or choose your own values. You should just insert values for a bunch of rows.

After inserting 10,000 rows, you can get an unformatted total with the following query:

-- Query total amount.
SELECT   SUM(payment_amount) AS payment_total
FROM     payment;

It outputs the following:

PAYMENT_TOTAL
-------------
   5011091.75

You can nest the result inside the TO_CHAR function to format the output, like

-- Query total formatted amount.
SELECT   TO_CHAR(SUM(payment_amount),'999,999,999.00') AS payment_total
FROM     payment;

It outputs the following:

PAYMENT_TOTAL
---------------
   5,011,091.75

Somebody may suggest that you use a PIVOT function to rotate the data into a summary by month but the PIVOT function has limits. The pivoting key must be numeric and the column values will use only those numeric values.

-- Pivoted summaries by numeric monthly value.
SELECT   *
FROM    (SELECT EXTRACT(MONTH FROM payment_date) payment_month
         ,      payment_amount
         FROM   payment)
         PIVOT (SUM(payment_amount) FOR payment_month IN
                 (1,2,3,4,5,6,7,8,9,10,11,12));

It outputs the following:

	 1	    2	       3	  4	     5		6	   7	      8 	 9	   10	      11	 12
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
 245896.55  430552.36  443742.63  457860.27  470467.18	466370.71  415158.28  439898.72  458998.09  461378.56  474499.22  246269.18

You can use selective aggregation to get the results by a character label, like

SELECT   SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 1
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "JAN"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 2
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "FEB"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 3
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "MAR"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) IN (1,2,3)
             AND  EXTRACT(YEAR FROM payment_date) = 2019 THEN payment_amount
           END) AS "1FQ"
,        SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 4
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END) AS "APR"
FROM     payment;

It outputs the following:

       JAN	  FEB	     MAR	1FQ	   APR
---------- ---------- ---------- ---------- ----------
 245896.55  430552.36  443742.63 1120191.54  457860.27

You can format the output with a combination of the TO_CHAR and LPAD functions. The TO_CHAR allows you to add a formatting mask, complete with commas and two mandatory digits to the right of the decimal point. The reformatted query looks like

COL JAN FORMAT A13 HEADING "Jan"
COL FEB FORMAT A13 HEADING "Feb"
COL MAR FORMAT A13 HEADING "Mar"
COL 1FQ FORMAT A13 HEADING "1FQ"
COL APR FORMAT A13 HEADING "Apr"
SELECT   LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 1
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "JAN"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 2
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "FEB"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 3
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "MAR"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) IN (1,2,3)
             AND  EXTRACT(YEAR FROM payment_date) = 2019 THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "1FQ"
,        LPAD(TO_CHAR(SUM(
           CASE
             WHEN EXTRACT(MONTH FROM payment_date) = 4
             AND  EXTRACT(YEAR FROM payment_date) = 2019  THEN payment_amount
           END),'9,999,999.00'),13,' ') AS "APR"
FROM     payment;

It displays the formatted output:

Jan	      Feb	    Mar 	  1FQ		Apr
------------- ------------- ------------- ------------- -------------
   245,896.55	 430,552.36    443,742.63  1,120,191.54    457,860.27

INSERT Statement

without comments

INSERT Statement

Learning Outcomes

  • Learn how to use positional- and named-notation in INSERT statements.
  • Learn how to use the VALUES clause in INSERT statements.
  • Learn how to use subqueries in INSERT statements.

The INSERT statement lets you enter data into tables and views in two ways: via an INSERT statement with a VALUES clause and via an INSERT statement with a query. The VALUES clause takes a list of literal values (strings, numbers, and dates represented as strings), expression values (return values from functions), or variable values.

Query values are results from SELECT statements that are subqueries (covered earlier in this appendix). INSERT statements work with scalar, single-row, and multiple-row subqueries. The list of columns in the VALUES clause or SELECT clause of a query (a SELECT list) must map to the positional list of columns that defines the table. That list is found in the data dictionary or catalog. Alternatively to the list of columns from the data catalog, you can provide a named list of those columns. The named list overrides the positional (or default) order from the data catalog and must provide at least all mandatory columns in the table definition. Mandatory columns are those that are not null constrained.

Oracle databases differ from other databases in how they implement the INSERT statement. Oracle doesn’t support multiple-row inserts with a VALUES clause. Oracle does support default and override signatures as qualified in the ANSI SQL standards. Oracle also provides a multiple- table INSERT statement. This section covers how you enter data with an INSERT statement that is based on a VALUES clause or a subquery result statement. It also covers multiple-table INSERT statements.

The INSERT statement has one significant limitation: its default signature. The default signature is the list of columns that defines the table in the data catalog. The list is defined by the position and data type of columns. The CREATE statement defines the initial default signature, and the ALTER statement can change the number, data types, or ordering of columns in the default signature.

The default prototype for an INSERT statement allows for an optional column list that overrides the default list of columns. When you provide the column list you choose to implement named-notation, which is the right way to do it. Relying on the insertion order of the columns is a bad idea. An INSERT statement without a list of column names is a position-notation statement. Position-notation is bad because somebody can alter that order and previously written INSERT statements will break or put data in the wrong columns.

Like methods in OOPLs, an INSERT statement without the optional column list constructs an instance (or row) of the table using the default constructor. The override constructor for a row is defined by any INSERT statement when you provide an optional column list. That’s because it overrides the default constructor.

The generic prototype for an INSERT statement is confusing when it tries to capture both the VALUES clause and the result set from a query. Therefore, I’ve opted to provide two generic prototypes.

Insert by value

The first uses the VALUES clause:

INSERT
INTO table_name
[( column1, column2, column3, ...)] VALUES
( value1, value2, value3, ...);

Notice that the prototype for an INSERT statement with the result set from a query doesn’t use the VALUES clause at all. A parsing error occurs when the VALUES clause and query both occur in an INSERT statement.

The second prototype uses a query and excludes the VALUES clause. The subquery may return one to many rows of data. The operative rule is that all columns in the query return the same number of rows of data, because query results should be rectangles—rectangles made up of one to many rows of columns.

Insert by subquery

Here’s the prototype for an INSERT statement that uses a subquery:

INSERT
INTO table_name
[( column1, column2, column3, ...)]
( SELECT value1, value2, value3, ... FROM table_name WHERE ...);

A query, or SELECT statement, returns a SELECT list. The SELECT list is the list of columns, and it’s evaluated by position and data type. The SELECT list must match the definition of the table or the override signature provided.

Default signatures present a risk of data corruption through insertion anomalies, which occur when you enter bad data in tables. Mistakes transposing or misplacing values can occur more frequently with a default signature, because the underlying table structure can change. As a best practice, always use named notation by providing the optional list of values; this should help you avoid putting the right data in the wrong place.

The following subsections provide examples that use the default and override syntax for INSERT statements in Oracle databases. The subsections also cover multiple-table INSERT statements and a RETURNING INTO clause, which is an extension of the ANSI SQL standard. Oracle uses the RETURNING INTO clause to manage large objects, to return autogenerated identity column values, and to support some of the features of Oracle’s dynamic SQL. Note that Oracle also supports a bulk INSERT statement, which requires knowledge of PL/SQL.

Written by maclochlainn

April 5th, 2022 at 1:23 pm

MySQL CSV Output

without comments

Saturday, I posted how to use Microsoft ODBC DSN to connect to MySQL. Somebody didn’t like the fact that the PowerShell program failed to write a *.csv file to disk because the program used the Write-Host command to write to the content of the query to the console.

I thought that approach was a better as an example. However, it appears that it wasn’t because not everybody knows simple redirection. The original program can transfer the console output to a file, like:

powershell .\MySQLODBC.ps1 > output.csv

So, the first thing you need to do is add a parameter list, like:

2
3
4
param (
  [Parameter(Mandatory)][string]$fileName
)

Anyway, it’s trivial to demonstrate how to modify the PowerShell program to write to a disk. You should also create a virtual PowerShell drive before writing the file. That’s because you can change the physical directory anytime you want with minimal changes to rest of your code’s file references.

You can create a PowerShell virtual drive with the following command:

7
8
New-PSDrive -Name test -PSProvider FileSystem -Description 'Test area' `
            -Root C:\Data\cit225\mysql\test

but, it will write the following to console:

Name           Used (GB)     Free (GB) Provider      Root                                                                                 CurrentLocation
----           ---------     --------- --------      ----                                                                                 ---------------
test                0.00         28.74 FileSystem    C:\Data\cit225\mysql\test

You can suppress the console output with Microsoft’s version of redirection to the void (> /dev/null), which pipes (|) the standard out (stdout) to Out-Null, like:

7
8
New-PSDrive -Name test -PSProvider FileSystem -Description 'Test area' `
            -Root C:\Data\cit225\mysql\test | Out-Null

Since the program may run before an output file has been created, or after its been created and removed, you need to check whether the file exists before attempting to remove it. PowerShell provides the Test-Path command to check for the existence of a file and the Remove-Item command to remove a file, like:

11
12
if (Test-Path test:$fileName) {
  Remove-Item -Path test:$fileName }

Then, you simply replace the Write-Host call in the other program with the Add-Content command:

Add-Content -Value $output -Path test:$fileName

Now, the PowerShell script file writes the MySQL query’s output to an output.csv file. You can call the MySQLContact.ps1 script file with the following syntax:

powershell MySQLContact.ps1 output.csv

In case these changes don’t make sense outside the scope of the full script, here is the rewritten script:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
# Define parameter list for mandatory file name.
param (
  [Parameter(Mandatory)][string]$fileName
)
 
# Define a PowerShell Virtual Drive.
New-PSDrive -Name test -PSProvider FileSystem -Description 'Test area' `
            -Root C:\Data\cit225\mysql\test | Out-Null
 
# Remove the file only when it exists.
if (Test-Path test:$fileName) {
  Remove-Item -Path test:$fileName }
 
# Define a ODBC DSN connection string.
$ConnectionString = 'DSN=MySQLODBC2'
 
# Define a MySQL Command Object for a non-query.
$Connection = New-Object System.Data.Odbc.OdbcConnection;
$Connection.ConnectionString = $ConnectionString
 
# Attempt connection.
try {
  $Connection.Open()
 
  # Create a SQL command.
  $Command = $Connection.CreateCommand();
  $Command.CommandText = "SELECT last_name " + 
                         ",      first_name " +
                         "FROM   contact " +
                         "ORDER BY 1, 2";
 
  # Attempt to read SQL command.
  try {
    $row = $Command.ExecuteReader();
 
    # Read while records are found.
    while ($row.Read()) {
      # Initialize output for each row.
      $output = ""
 
      # Navigate across all columns (only two in this example).
      for ($column = 0; $column -lt $row.FieldCount; $column += 1) {
        # Mechanic for comma-delimit between last and first name.  
        if ($output.length -eq 0) { 
          $output += $row[$column] }
        else {
          $output += ", " + $row[$column] }
      }
        # Write the output from the database to a file.
        Add-Content -Value $output -Path test:$fileName
    }
 
  } catch {
    Write-Error "Message: $($_.Exception.Message)"
    Write-Error "StackTrace: $($_.Exception.StackTrace)"
    Write-Error "LoaderExceptions: $($_.Exception.LoaderExceptions)"
  } finally {
    # Close the reader.
    $row.Close() }
 
} catch {
  Write-Error "Message: $($_.Exception.Message)"
  Write-Error "StackTrace: $($_.Exception.StackTrace)"
  Write-Error "LoaderExceptions: $($_.Exception.LoaderExceptions)"
} finally {
  $Connection.Close() }

While I understand you might want to go to this level of effort if you where building a formal cmdlet, I’m not convinced its worth the effort in an ordinary PowerShell script. However, I don’t like to leave a question unanswered.

Written by maclochlainn

April 4th, 2022 at 12:45 am

MySQL ODBC DSN

with one comment

This post explains and demonstrates how to install, configure, and use the MySQL’s ODBC libraries and a DSN (Data Source Name) to connect your Microsoft PowerShell programs to a locally or remotely installed MySQL database. After you’ve installed the MySQL ODBC library, use Windows search field to find the ODBC Data Sources dialog and run it as administrator.

There are four steps to setup, test, and save your ODBC Data Source Name (DSN) for MySQL. You can click on the images on the right to launch them in a more readable format or simply read the instructions.

MySQL ODBC Setup Steps

  1. Click the SystemDSN tab to see he view which is exactly like the User DSN tab. Click the Add button to start the workflow.

  1. The Create New Data Source dialog requires you select the MySQL ODBC Driver(UNICODE) option from the list and click the Finish button to proceed.

  1. The MySQL Unicode ODBC Driver Setup dialog should complete the prompts as follows below. If you opt for localhost as the server value and you have a DCHP IP address, make sure you’ve configured your hosts file in the C:\Windows\System32\drivers\etc directory. You should enter the following two lines in the hosts file:

    127.0.0.1  localhost
    ::1        localhost

    These are the string values you should enter in the MySQL Unicode ODBC Driver Setup dialog:

    Data Source: MySQLODBC
    Database:    studentdb
    Server:      localhost
    User Name:   student
    Description: MySQL ODBC Connector
    Port:        3306
    Password:    student

    After you complete the entry, click the Test button.

  1. The Connection Test dialog should return a “Connection successful” message. Click the OK button to continue, and then click the OK button in the next two screens.

After you have created the System MySQL ODBC Setup, it’s time to build a PowerShell Cmdlet (or, Commandlet). Some documentation and blog notes incorrectly suggest you need to write a connection string with a UID and password, like:

$ConnectionString = 'DSN=MySQLODBC;Uid=student;Pwd=student'

You can do that if you leave the UID and password fields empty in the MySQL ODBC Setup but it’s recommended to enter them there to avoid putting them in your PowerShell script file.

The UID and password are unnecessary in the connection string when they’re in MySQL ODBC DSN. You can use a connection string like the following when the UID and password are in the DSN:

$ConnectionString = 'DSN=MySQLODBC'

You can create a MySQLCursor.ps1 Cmdlet like the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
# Define a ODBC DSN connection string.
$ConnectionString = 'DSN=MySQLODBC'
 
# Define a MySQL Command Object for a non-query.
$Connection = New-Object System.Data.Odbc.OdbcConnection;
$Connection.ConnectionString = $ConnectionString
 
# Attempt connection.
try {
  $Connection.Open()
 
  # Create a SQL command.
  $Command = $Connection.CreateCommand();
  $Command.CommandText = "SELECT database();";
 
  # Attempt to read SQL command.
  try {
    $Reader = $Command.ExecuteReader();
 
    # Read while records are found.
    while ($Reader.Read()) {
      Write-Host "Current Database [" $Reader[0] "]"}
 
  } catch {
    Write-Host "Message: $($_.Exception.Message)"
    Write-Host "StackTrace: $($_.Exception.StackTrace)"
    Write-Host "LoaderExceptions: $($_.Exception.LoaderExceptions)"
  } finally {
    # Close the reader.
    $Reader.Close() }
 
} catch {
  Write-Host "Message: $($_.Exception.Message)"
  Write-Host "StackTrace: $($_.Exception.StackTrace)"
  Write-Host "LoaderExceptions: $($_.Exception.LoaderExceptions)"
} finally {
  $Connection.Close() }

Line 14 assigns a SQL query that returns a single row with one column as the CommandText of a Command object. Line 22 reads the zero position of a row or record set with only one column.

You call the MySQLCursor.ps1 Cmdlet with the following syntax:

powershell .\MySQLCursor.ps1

It returns:

Current Database [ studentdb ]

A more realistic way to write a query would return multiple rows with a set of two or more columns. The following program queries a table with multiple rows of two columns, but the program logic can manage any number of columns.

# Define a ODBC DSN connection string.
$ConnectionString = 'DSN=MySQLODBC'
 
# Define a MySQL Command Object for a non-query.
$Connection = New-Object System.Data.Odbc.OdbcConnection;
$Connection.ConnectionString = $ConnectionString
 
# Attempt connection.
try {
  $Connection.Open()
 
  # Create a SQL command.
  $Command = $Connection.CreateCommand();
  $Command.CommandText = "SELECT last_name, first_name FROM contact ORDER BY 1, 2";
 
  # Attempt to read SQL command.
  try {
    $row = $Command.ExecuteReader();
 
    # Read while records are found.
    while ($row.Read()) {
      # Initialize output for each row.
      $output = ""
 
      # Navigate across all columns (only two in this example).
      for ($column = 0; $column -lt $row.FieldCount; $column += 1) {
        # Mechanic for comma-delimit between last and first name.  
        if ($output.length -eq 0) { 
          $output += $row[$column] }
        else {
          $output += ", " + $row[$column] }
        }
        # Write the output from the database.
        Write-Host $output
      }
 
  } catch {
    Write-Host "Message: $($_.Exception.Message)"
    Write-Host "StackTrace: $($_.Exception.StackTrace)"
    Write-Host "LoaderExceptions: $($_.Exception.LoaderExceptions)"
  } finally {
    # Close the reader.
    $row.Close() }
 
} catch {
  Write-Host "Message: $($_.Exception.Message)"
  Write-Host "StackTrace: $($_.Exception.StackTrace)"
  Write-Host "LoaderExceptions: $($_.Exception.LoaderExceptions)"
} finally {
  $Connection.Close() }

You call the MySQLContact.ps1 Cmdlet with the following syntax:

powershell .\MySQLContact.ps1

It returns an ordered set of comma-separated values, like

Clinton, Goeffrey
Gretelz, Simon
Moss, Wendy
Potter, Ginny
Potter, Harry
Potter, Lily
Royal, Elizabeth
Smith, Brian
Sweeney, Ian
Sweeney, Matthew
Sweeney, Meaghan
Vizquel, Doreen
Vizquel, Oscar
Winn, Brian
Winn, Randi

As always, I hope this helps those looking for a complete concrete example of how to make Microsoft Powershell connect and query results from a MySQL database.

Written by maclochlainn

April 2nd, 2022 at 7:56 pm