MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

MySQL Workbench Keys

without comments

As I teach students how to create tables in MySQL Workbench, it’s always important to review the meaning of the checkbox keys. Then, I need to remind them that every table requires a natural key from our prior discussion on normalization. I explain that a natural key is a compound candidate key (made up of two or more column values), and that it naturally defines uniqueness for each row in a table.

Then, we discuss surrogate keys, which are typically ID column keys. I explain that surrogate keys are driven by sequences in the database. While a number of databases disclose the name of sequences, MySQL treats the sequence as an attribute of the table. In Object-Oriented Analysis and Design (OOAD), that makes the sequence a member of the table by composition rather than aggregation. Surrogate keys are also unique in the table but should never be used to determine uniqueness like the natural key. Surrogate keys are also candidate keys, like a VIN number uniquely identifies a vehicle.

In a well designed table you always have two candidate keys: One describes the unique row and the other assigns a number to it. While you can perform joins by using either candidate key, you always should use the surrogate key for joins statements. This means you elect, or choose, the surrogate candidate key as the primary key. Then, you build a unique index for the natural key, which lets you query any unique row with human decipherable words.

The column attribute table for MySQL Workbench is:

Key Meaning
PK Designates a primary key column.
NN Designates a not-null column constraint.
UQ Designates a column contains a unique value for every row.
BIN Designates a VARCHAR data type column so that its values are stored in a case-sensitive fashion. You can’t apply this constraint to other data types.
UN Designates a column contains an unsigned numeric data type. The possible values are 0 to the maximum number of the data type, like integer, float, or double. The value 0 isn’t possible when you also select the PK and AI check boxes, which ensures the column automatically increments to the maximum value of the column.
ZF Designates a zero fill populates zeros in front of any number data type until all space is consumed, which acts like a left pad function with zeros.
AI Designates AUTO_INCREMENT and should only be checked for a surrogate primary key value.

All surrogate key columns should check the PK, NN, UN, and AI checkboxes. The default behavior checks only the PK and NN checkboxes and leaves the UN and AI boxes unchecked. You should also click the UN checkbox with the AI checkbox for all surrogate key columns. The AI checkbox enables AUTO_INCREMENT behavior. The UN checkbox ensure you have the maximum number of integers before you would migrate the table to a double precision number. More or less, this is what I wrote in MySQL Workbench Data Modeling & Development as the primary product guide in 2013, and what you find in the MySQL Workbench Manual 8.1.10.2 Columns Tab section.

Active tables grow quickly and using a signed int means you run out of rows more quickly. This is an important design consideration because using a unsigned int adds a maintenance task later. The maintenance task will require changing the data type of all dependent foreign key columns before changing the primary key column’s data type. Assuming you’re design uses referential integrity constraints, implemented as a foreign keys, you will need to:

  • Remove any foreign key constraints before changing the referenced primary key and dependent foreign key column’s data types.
  • Change the primary and foreign key column’s data types.
  • Add back foreign key constraints after changing the referenced primary key and dependent foreign key column’s data types.

While fixing a less optimal design is a relatively simple scripting exercise for most data engineers, you can avoid this maintenance task. Implement all surrogate primary key columns and foreign key columns with the signed int as their initial data type.

The following small ERD displays a multi-language lookup table, which is preferable to a monolinquistic enum data type.:

A design uses a lookup table when there are known lists of selections to make. There are known lists that occur in most if not all business applications. Maintaining that list of values is an application setup task and requires the development team to build an entry and update form to input and maintain the lists.

While some MySQL examples demonstrate these types of lists by using the MySQL enum data type. However, the MySQL enum type doesn’t support multilingual implementations, isn’t readily portable to other relational database, and has a number of limitations.

A lookup table is the better solution to using an enum data type. It typically follows this pattern:

  • Identify the target table and column where a list is useful. Use the table_name and column_name columns as a super key to identify the location where the list belongs.
  • Identify a unique type identifier for the list. Store the unique type value in the type column of the lookup table.
  • Use a lang column to enable multilingual lists.

The combination of the table_name, column_name, type, and lang let you identify unique sets. You can find a monolingual implementation in these two older blog posts:

The column view of the lookup table shows the appropriate design checkboxes:

While most foreign keys use copies of surrogate keys, there are instances when you copy the natural key value from another table rather than the surrogate key. This is done when your application will frequently query the dependent lookup table without a join to the lang table, which means the foreign key value should be a human friendly foreign key value that works as a super key.

A super key is a column or set of columns that uniquely identifies a rows in the scope of a relation. For this example, the lang column identifies rows that belong to a language in a multilingual data model. Belonging to a language is the relation between the lookup and language table. It is also a key when filtering rows with a specific lang value from the lookup table.

You navigate to the foreign key tab to create a lookup_fk foreign key constraint, like:

With this type of foreign key constraint, you copy the lang value from the language table when inserting the lookup table values. Then, your HTML forms can use the lookup table’s meaning column in any of the supported languages, like:

SELECT lookup_id
,      type
,      meaning
FROM   lookup
WHERE  table_name = 'some_table_name'
AND    column_name = 'some_column_name'
AND    lang = 'some_lang_name';

The type column value isn’t used in the WHERE clause to filter the data set because it is unique within the relation of the table_name, column_name, and lang column values. It is always non-unique when you exclude the lang column value, and potentially non-unique for another combination of the table_name and column_name column values.

As a rule, most foreign key references are to the lookup table’s surrogate primary key because the meaning column’s value is too long to copy into the referencing table or subject to change in the base or translated languages. Small values, with intrinsic meaning, are stored in a code column in many implementations, like the lang column. Those typically follow the same implementation rule as the lang column and are copied into the referencing table.

If I’ve left questions, let me know. Other wise, I hope this helps qualify a best design practice.

Written by maclochlainn

October 16th, 2022 at 5:53 pm