MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Site Admin

Archive for the ‘cqlsh’ Category

Cassandra on Fedora 30

without comments

The first thing to do with Fedora 30 is to check what part of Apache Cassandra is installed. You can use the following rpm command to determine that:

rpm -qa | grep cassandra

My Fedora 30 returned the following values:

cassandra-java-libs-3.11.1-12.fc30.x86_64
cassandra-python2-cqlshlib-3.11.1-12.fc30.x86_64
cassandra-3.11.1-12.fc30.x86_64
python2-cassandra-driver-3.18.0-1.fc30.x86_64

Notably missing from the list of rpm list is the cassandra-server package. You install cassandra-server with the def utility:

dnf install -y cassandra-server

You should get an installation log like the following for the cassandra-server package:

Fedora Magazine has a great Get Started with Apache Cassandra on Fedora article on all the steps required to setup clusters. This article only covers creating and enabling the Cassandra service, and setting up a single node Cassandra instance.

You start Cassandra with the following command as the root user:

systemctl start cassandra

You enable Cassandra with the following command as the root user:

systemctl enable cassandra

It creates the following symlink:

Created symlink /etc/systemd/system/multi-user.target.wants/cassandra.service → /usr/lib/systemd/system/cassandra.service.

You can connect to the Test cluster with the following command:

cqlsh

You should see the following:

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.

You can see the options by typing the help command:

Documented shell commands:
===========================
CAPTURE  CLS          COPY  DESCRIBE  EXPAND  LOGIN   SERIAL  SOURCE   UNICODE
CLEAR    CONSISTENCY  DESC  EXIT      HELP    PAGING  SHOW    TRACING
 
CQL help topics:
================
AGGREGATES               CREATE_KEYSPACE           DROP_TRIGGER      TEXT     
ALTER_KEYSPACE           CREATE_MATERIALIZED_VIEW  DROP_TYPE         TIME     
ALTER_MATERIALIZED_VIEW  CREATE_ROLE               DROP_USER         TIMESTAMP
ALTER_TABLE              CREATE_TABLE              FUNCTIONS         TRUNCATE 
ALTER_TYPE               CREATE_TRIGGER            GRANT             TYPES    
ALTER_USER               CREATE_TYPE               INSERT            UPDATE   
APPLY                    CREATE_USER               INSERT_JSON       USE      
ASCII                    DATE                      INT               UUID     
BATCH                    DELETE                    JSON            
BEGIN                    DROP_AGGREGATE            KEYWORDS        
BLOB                     DROP_COLUMNFAMILY         LIST_PERMISSIONS
BOOLEAN                  DROP_FUNCTION             LIST_ROLES      
COUNTER                  DROP_INDEX                LIST_USERS      
CREATE_AGGREGATE         DROP_KEYSPACE             PERMISSIONS     
CREATE_COLUMNFAMILY      DROP_MATERIALIZED_VIEW    REVOKE          
CREATE_FUNCTION          DROP_ROLE                 SELECT          
CREATE_INDEX             DROP_TABLE                SELECT_JSON

Here’s my script that creates Cassandra keyspace, which is more or less a database. You use the USE command to connect to the keyspace or database, like you would in MySQL. You do not have sequences in Cassandra because they’re not a good fit for a distributed architecture. Cassandra does not support a native procedural extension like relational databases. You must create User-defined functions (UDFs) by embedding the logic in Java.

This script does the following:

  • Creates a keyspace
  • Uses the keyspace
  • Conditionally drops tables and functions
  • Creates two tables
  • Inserts data into the two tables
  • Queries data from the tables

I also included a call to a UDF inside a query in two of the examples. One of the queries demonstrates how to return a JSON structure from a query. To simplify things and provide clarification of the scripts behaviors, the details are outlined below.

  • The first segment of the script creates the keyspace, changes the scope to use the keyspace, conditionally drop tables, create tables, and insert values into the tables:

    /* Create a keyspace in Cassandra, which is like a database
       in MySQL or a schema in Oracle. */
    CREATE KEYSPACE IF NOT EXISTS student
      WITH REPLICATION = {
         'class':'SimpleStrategy'
        ,'replication_factor': 1 }
      AND DURABLE_WRITES = true;
     
    /* Use the keyspace or connect to the database. */
    USE student;
     
    /* Drop the member table from the student keyspace. */
    DROP TABLE IF EXISTS member;
     
    /* Create a member table in the student keyspace. */
    CREATE TABLE member
    ( member_number       VARCHAR
    , member_type         VARCHAR
    , credit_card_number  VARCHAR
    , credit_card_type    VARCHAR
    , PRIMARY KEY ( member_number ));
     
    /* Conditionally drop the contact table from the student keyspace. */
    DROP TABLE IF EXISTS contact;
     
    /* Create a contact table in the student keyspace. */
    CREATE TABLE contact
    ( contact_number      VARCHAR
    , contact_type        VARCHAR
    , first_name          VARCHAR
    , middle_name         VARCHAR
    , last_name           VARCHAR
    , member_number       VARCHAR
    , PRIMARY KEY ( contact_number ));
     
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12345','GROUP','2222-4444-5555-6666','VISA');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00001','FAMILY','Barry', NULL,'Allen','SFO-12345');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00002','FAMILY','Iris', NULL,'West-Allen','SFO-12345');
     
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12346','GROUP','3333-8888-9999-2222','VISA');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00003','FAMILY','Caitlin','Marie','Snow','SFO-12346');

    The following queries the member table:

    /* Select all columns from the member table. */
    SELECT * FROM member;

    It returns the following:

     member_number | credit_card_number  | credit_card_type | member_type
    ---------------+---------------------+------------------+-------------
         SFO-12345 | 2222-4444-5555-6666 |             VISA |       GROUP
         SFO-12346 | 3333-8888-9999-2222 |             VISA |       GROUP
  • Create a concatenate User-defined function (UDF) for Cassandra. The first step requires you to edit the cassandra.yaml file, which you find in the /etc/cassandra/default.conf directory. There is a single parameter that you need to edit, and it is the enable_user_defined_functions parameter. By default the parameter is set to false, and you need to enable it to create UDFs.

    If you open the cassandra.yaml file as the root user, you should find the parameter on line 987, like:

    983
    984
    985
    986
    987
    
    # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    # INFO level
    # UDFs (user defined functions) are disabled by default.
    # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    enable_user_defined_functions: false

    After you make the edit, the cassandra.yaml file should look like this:

    983
    984
    985
    986
    987
    
    # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    # INFO level
    # UDFs (user defined functions) are disabled by default.
    # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    enable_user_defined_functions: true

    After you make the change, you can create your own UDF. The following UDF formats the first, middle, and last name so there’s only one whitespace between the first and last name when there middle name value is null.

    This type of function must use a CALLED ON NULL INPUT clause in lieu of a RETURNS NULL ON NULL INPUT clause. The latter would force the function to return a null value if any one of the parameters were null.

    /* Drop the concatenate function because a replace disallows changing a
       RETURNS NULL ON NULL INPUT with a CALLED ON NULL INPUT without raising
       an "89: InvalidRequest" exception. */
    DROP FUNCTION concatenate;
     
    /* Create a user-defined function to concatenate names. */
    CREATE OR REPLACE FUNCTION concatenate (first_name VARCHAR, middle_name VARCHAR, last_name VARCHAR)
    CALLED ON NULL INPUT
    RETURNS VARCHAR
    LANGUAGE java
    AS $$
      /* Concatenate first and last names when middle name is null, and
         first, middle, and last names when middle name is not null. */
      String name;
     
      /* Check for null middle name. */
      if (middle_name == null) {
        name = first_name + " " + last_name; }
      else {
        name = first_name + " " + middle_name + " " + last_name; }
     
      return name;
    $$;
  • Query the values from the contact table with the UDF function in the SELECT-list:

    /* Query the contact information. */
    SELECT member_number
    ,      contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     member_number | contact_number | contact_type | full_name
    ---------------+----------------+--------------+--------------------
         SFO-12345 |      CUS_00001 |       FAMILY |        Barry Allen
         SFO-12345 |      CUS_00002 |       FAMILY |    Iris West-Allen
         SFO-12346 |      CUS_00003 |       FAMILY | Caitlin Marie Snow

    Query the values from the contact table with a JSON format:

    /* Query the contact information and return in a JSON format. */
    SELECT JSON
           contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     [json]
    -------------------------------------------------------------------------------------------------
    {"contact_number": "CUS_00001", "contact_type": "FAMILY", "full_name": "Barry Allen"}
    {"contact_number": "CUS_00002", "contact_type": "FAMILY", "full_name": "Iris West-Allen"}
    {"contact_number": "CUS_00003", "contact_type": "FAMILY", "full_name": "Caitlin Marie Snow"}

Written by maclochlainn

September 12th, 2019 at 1:17 am

Cassandra on Fedora 27

without comments

The last time that I installed Cassandra was on a version of Fedora 20. So, I new the first thing to check was the installation of Java. You can check the Java installation with two statements on a fresh installation of Fedora 27. You need to check the Java runtime and then the Java Software Development Kit before installing, starting, and using Cassandra.

Installing Prerequisites

You check the Java runtime with this command:

java -version

It should return:

openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-b10)
OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)

You check the Java Software Development Kit (JSDK) with this command:

javac -version

It should return:

javac 1.8.0_171

After verifying the Java and JSDK installation, you can install the Cassandra packages with the following yum command as the root user or a user with sudoer privileges:

yum install -y *cassandra*

You should see a successful installation log like:

Starting Cassandra

After you install Cassandra, you can start it as any sudoer user with the following syntax:

sudo cassandra -R

Using Cassandra

You can connect to the Cassandra server with the cqlsh client software. You use the following syntax:

cqlsh

You should see the Cassandra version information, and then you can type help at the cqlsh> prompt to see the available commands:

Connected TO Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
USE HELP FOR help.
cqlsh> help
 
Documented shell commands:
===========================
CAPTURE  CLS          COPY  DESCRIBE  EXPAND  LOGIN   SERIAL  SOURCE   UNICODE
CLEAR    CONSISTENCY  DESC  EXIT      HELP    PAGING  SHOW    TRACING
 
CQL help topics:
================
AGGREGATES               CREATE_KEYSPACE           DROP_TRIGGER      TEXT     
ALTER_KEYSPACE           CREATE_MATERIALIZED_VIEW  DROP_TYPE         TIME     
ALTER_MATERIALIZED_VIEW  CREATE_ROLE               DROP_USER         TIMESTAMP
ALTER_TABLE              CREATE_TABLE              FUNCTIONS         TRUNCATE 
ALTER_TYPE               CREATE_TRIGGER            GRANT             TYPES    
ALTER_USER               CREATE_TYPE               INSERT            UPDATE   
APPLY                    CREATE_USER               INSERT_JSON       USE      
ASCII                    DATE                      INT               UUID     
BATCH                    DELETE                    JSON            
BEGIN                    DROP_AGGREGATE            KEYWORDS        
BLOB                     DROP_COLUMNFAMILY         LIST_PERMISSIONS
BOOLEAN                  DROP_FUNCTION             LIST_ROLES      
COUNTER                  DROP_INDEX                LIST_USERS      
CREATE_AGGREGATE         DROP_KEYSPACE             PERMISSIONS     
CREATE_COLUMNFAMILY      DROP_MATERIALIZED_VIEW    REVOKE          
CREATE_FUNCTION          DROP_ROLE                 SELECT          
CREATE_INDEX             DROP_TABLE                SELECT_JSON

Here’s my script that creates Cassandra keyspace, which is more or less a database. You use the USE command to connect to the keyspace or database, like you would in MySQL. You do not have sequences in Cassandra because they’re not a good fit for a distributed architecture. Cassandra does not support a native procedural extension like relational databases. You must create User-defined functions (UDFs) by embedding the logic in Java.

This script does the following:

  • Creates a keyspace
  • Uses the keyspace
  • Conditionally drops tables and functions
  • Creates two tables
  • Inserts data into the two tables
  • Queries data from the tables

I also included a call to a UDF inside a query in two of the examples. One of the queries demonstrates how to return a JSON structure from a query. To simplify things and provide clarification of the scripts behaviors, the details are outlined below.

  • The first segment of the script creates the keyspace, changes the scope to use the keyspace, conditionally drop tables, create tables, and insert values into the tables:

    /* Create a keyspace in Cassandra, which is like a database
       in MySQL or a schema in Oracle. */
    CREATE KEYSPACE IF NOT EXISTS student
      WITH REPLICATION = {
         'class':'SimpleStrategy'
        ,'replication_factor': 1 }
      AND DURABLE_WRITES = true;
     
    /* Use the keyspace or connect to the database. */
    USE student;
     
    /* Drop the member table from the student keyspace. */
    DROP TABLE IF EXISTS member;
     
    /* Create a member table in the student keyspace. */
    CREATE TABLE member
    ( member_number       VARCHAR
    , member_type         VARCHAR
    , credit_card_number  VARCHAR
    , credit_card_type    VARCHAR
    , PRIMARY KEY ( member_number ));
     
    /* Conditionally drop the contact table from the student keyspace. */
    DROP TABLE IF EXISTS contact;
     
    /* Create a contact table in the student keyspace. */
    CREATE TABLE contact
    ( contact_number      VARCHAR
    , contact_type        VARCHAR
    , first_name          VARCHAR
    , middle_name         VARCHAR
    , last_name           VARCHAR
    , member_number       VARCHAR
    , PRIMARY KEY ( contact_number ));
     
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12345','GROUP','2222-4444-5555-6666','VISA');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00001','FAMILY','Barry', NULL,'Allen','SFO-12345');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00002','FAMILY','Iris', NULL,'West-Allen','SFO-12345');
     
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12346','GROUP','3333-8888-9999-2222','VISA');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00003','FAMILY','Caitlin','Marie','Snow','SFO-12346');
  • The following queries the member table:

    /* Select all columns from the member table. */
    SELECT * FROM member;

    It returns the following:

     member_number | credit_card_number  | credit_card_type | member_type
    ---------------+---------------------+------------------+-------------
         SFO-12345 | 2222-4444-5555-6666 |             VISA |       GROUP
         SFO-12346 | 3333-8888-9999-2222 |             VISA |       GROUP
  • Create a concatenate User-defined function (UDF) for Cassandra. The first step requires you to edit the cassandra.yaml file, which you find in the /etc/cassandra/default.conf directory. There is a single parameter that you need to edit, and it is the enable_user_defined_functions parameter. By default the parameter is set to false, and you need to enable it to create UDFs.

    After you make the edit, the cassandra.yaml file should look like this:

    1089
    1090
    1091
    1092
    1093
    
    # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    # INFO level
    # UDFs (user defined functions) are disabled by default.
    # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    enable_user_defined_functions: true

    After you make the change, you can create your own UDF. The following UDF formats the first, middle, and last name so there’s only one whitespace between the first and last name when there middle name value is null.

    This type of function must use a CALLED ON NULL INPUT clause in lieu of a RETURNS NULL ON NULL INPUT clause. The latter would force the function to return a null value if any one of the parameters were null.

    /* Drop the concatenate function because a replace disallows changing a
       RETURNS NULL ON NULL INPUT with a CALLED ON NULL INPUT without raising
       an "89: InvalidRequest" exception. */
    DROP FUNCTION concatenate;
     
    /* Create a user-defined function to concatenate names. */
    CREATE OR REPLACE FUNCTION concatenate (first_name VARCHAR, middle_name VARCHAR, last_name VARCHAR)
    CALLED ON NULL INPUT
    RETURNS VARCHAR
    LANGUAGE java
    AS $$
      /* Concatenate first and last names when middle name is null, and
         first, middle, and last names when middle name is not null. */
      String name;
     
      /* Check for null middle name. */
      if (middle_name == null) {
        name = first_name + " " + last_name; }
      else {
        name = first_name + " " + middle_name + " " + last_name; }
     
      return name;
    $$;
  • Query the values from the contact table with the UDF function in the SELECT-list:

    /* Query the contact information. */
    SELECT member_number
    ,      contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     member_number | contact_number | contact_type | full_name
    ---------------+----------------+--------------+--------------------
         SFO-12345 |      CUS_00001 |       FAMILY |        Barry Allen
         SFO-12345 |      CUS_00002 |       FAMILY |    Iris West-Allen
         SFO-12346 |      CUS_00003 |       FAMILY | Caitlin Marie Snow
  • Query the values from the contact table with a JSON format:

    /* Query the contact information and return in a JSON format. */
    SELECT JSON
           contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     [json]
    -------------------------------------------------------------------------------------------------
    {"contact_number": "CUS_00001", "contact_type": "FAMILY", "full_name": "Barry Allen"}
    {"contact_number": "CUS_00002", "contact_type": "FAMILY", "full_name": "Iris West-Allen"}
    {"contact_number": "CUS_00003", "contact_type": "FAMILY", "full_name": "Caitlin Marie Snow"}

You can call the script from a relative directory inside cqlsh, like this:

SOURCE 'cstudent.cql'

At the end of the day, the concept of adding and removing nodes is attractive. Though, the lack of normal relational mechanics and narrowly supported set of CQL semantics leaves me with open questions. For example, is clustering without a coordinator really valuable enough to settle for eventual, or tunable, consistency with such a narrowly scoped query language?

As always, I hope this helps those looking for a quick how-to on Cassandra.

Written by maclochlainn

May 17th, 2018 at 12:38 am

Cassandra Query Language

without comments

After installing Cassandra and reading Cassandra The Definitive Guide, it struck me that I should learn a bit more about the Cassandra Query Language (CQL). So, after I setup a single-node environment and created a .bashcassandra environment file to connect as a student user to the Cassandra instance:

# Add the Java and JRE paths to the $PATH environments.
export set PATH=$PATH:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.45.x86_64:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.45.x86_64/jre
 
# Add the $JAVA_HOME and $JRE_HOME environment variables.
export set JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.45.x86_64/
export set JRE_HOME=/usr

Having started Cassandra as the cassandra user, I connected to the Cassandra Query Language Shell (cqlsh) to learn how to write CQL. You can find the basic structure of the Cassandra Query Language (CQL) on the Apache Cassandra website. I also discovered that CQL by itself can’t let you join tables without using Apache SparkSQL. Apache SparkSQL adds the ability to perform CQL joins in Cassandra, and became available in 2015.

I also learned you can’t use a CREATE OR REPLACE command when you change certain aspects of User-Defined Functions (UDFs). You actually need to drop any UDF before you change RETURNS NULL ON NULL INPUT clause to a CALLED ON NULL INPUT clause or vice versa. You can’t embed Java that connects to database without using the cassandra-java-driver-2.0.2 driver.

You connect to the cqlsh like this:

cqlsh

Here’s my script that creates Cassandra keyspace, which is more or less a database. You use the USE command to connect to the keyspace or database, like you would in MySQL. You do not have sequences in Cassandra because they’re not a good fit for a distributed architecture. Cassandra does not support a native procedural extension like relational databases. You must create User-defined functions (UDFs) by embedding the logic in Java.

This script does the following:

  • Creates a keyspace
  • Uses the keyspace
  • Conditionally drops tables and functions
  • Creates two tables
  • Inserts data into the two tables
  • Queries data from the tables

I also included a call to a UDF inside a query in two of the examples. One of the queries demonstrates how to return a JSON structure from a query. To simplify things and provide clarification of the scripts behaviors, the details are outlined below.

  • The first segment of the script creates the keyspace, changes the scope to use the keyspace, conditionally drop tables, create tables, and insert values into the tables:

    /* Create a keyspace in Cassandra, which is like a database
       in MySQL or a schema in Oracle. */
    CREATE KEYSPACE IF NOT EXISTS student
      WITH REPLICATION = {
         'class':'SimpleStrategy'
        ,'replication_factor': 1 }
      AND DURABLE_WRITES = true;
     
    /* Use the keyspace or connect to the database. */
    USE student;
     
    /* Drop the member table from the student keyspace. */
    DROP TABLE IF EXISTS member;
     
    /* Create a member table in the student keyspace. */
    CREATE TABLE member
    ( member_number       VARCHAR
    , member_type         VARCHAR
    , credit_card_number  VARCHAR
    , credit_card_type    VARCHAR
    , PRIMARY KEY ( member_number ));
     
    /* Conditionally drop the contact table from the student keyspace. */
    DROP TABLE IF EXISTS contact;
     
    /* Create a contact table in the student keyspace. */
    CREATE TABLE contact
    ( contact_number      VARCHAR
    , contact_type        VARCHAR
    , first_name          VARCHAR
    , middle_name         VARCHAR
    , last_name           VARCHAR
    , member_number       VARCHAR
    , PRIMARY KEY ( contact_number ));
     
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12345','GROUP','2222-4444-5555-6666','VISA');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00001','FAMILY','Barry', NULL,'Allen','SFO-12345');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00002','FAMILY','Iris', NULL,'West-Allen','SFO-12345');
     
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12346','GROUP','3333-8888-9999-2222','VISA');
     
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00003','FAMILY','Caitlin','Marie','Snow','SFO-12346');
  • The following queries the member table:

    /* Select all columns from the member table. */
    SELECT * FROM member;

    It returns the following:

     member_number | credit_card_number  | credit_card_type | member_type
    ---------------+---------------------+------------------+-------------
         SFO-12345 | 2222-4444-5555-6666 |             VISA |       GROUP
         SFO-12346 | 3333-8888-9999-2222 |             VISA |       GROUP
  • Create a concatenate User-defined function (UDF) for Cassandra. The first step requires you to edit the cassandra.yaml file, which you find in the /etc/cassandra/default.conf directory. There is a single parameter that you need to edit, and it is the enable_user_defined_functions parameter. By default the parameter is set to false, and you need to enable it to create UDFs.

    After you make the edit, the cassandra.yaml file should look like this:

    1089
    1090
    1091
    1092
    1093
    
    # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    # INFO level
    # UDFs (user defined functions) are disabled by default.
    # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    enable_user_defined_functions: true

    After you make the change, you can create your own UDF. The following UDF formats the first, middle, and last name so there’s only one whitespace between the first and last name when there middle name value is null.

    This type of function must use a CALLED ON NULL INPUT clause in lieu of a RETURNS NULL ON NULL INPUT clause. The latter would force the function to return a null value if any one of the parameters were null.

    /* Drop the concatenate function because a replace disallows changing a
       RETURNS NULL ON NULL INPUT with a CALLED ON NULL INPUT without raising
       an "89: InvalidRequest" exception. */
    DROP FUNCTION concatenate;
     
    /* Create a user-defined function to concatenate names. */
    CREATE OR REPLACE FUNCTION concatenate (first_name VARCHAR, middle_name VARCHAR, last_name VARCHAR)
    CALLED ON NULL INPUT
    RETURNS VARCHAR
    LANGUAGE java
    AS $$
      /* Concatenate first and last names when middle name is null, and
         first, middle, and last names when middle name is not null. */
      String name;
     
      /* Check for null middle name. */
      if (middle_name == null) {
        name = first_name + " " + last_name; }
      else {
        name = first_name + " " + middle_name + " " + last_name; }
     
      return name;
    $$;
  • Query the values from the contact table with the UDF function in the SELECT-list:

    /* Query the contact information. */
    SELECT member_number
    ,      contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     member_number | contact_number | contact_type | full_name
    ---------------+----------------+--------------+--------------------
         SFO-12345 |      CUS_00001 |       FAMILY |        Barry Allen
         SFO-12345 |      CUS_00002 |       FAMILY |    Iris West-Allen
         SFO-12346 |      CUS_00003 |       FAMILY | Caitlin Marie Snow
  • Query the values from the contact table with a JSON format:

    /* Query the contact information and return in a JSON format. */
    SELECT JSON
           contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     [json]
    -------------------------------------------------------------------------------------------------
    {"contact_number": "CUS_00001", "contact_type": "FAMILY", "full_name": "Barry Allen"}
    {"contact_number": "CUS_00002", "contact_type": "FAMILY", "full_name": "Iris West-Allen"}
    {"contact_number": "CUS_00003", "contact_type": "FAMILY", "full_name": "Caitlin Marie Snow"}

You can call the script from a relative directory inside cqlsh, like this:

SOURCE 'cstudent.cql'

At the end of the day, the concept of adding and removing nodes is attractive. Though, the lack of normal relational mechanics and narrowly supported set of CQL semantics leaves me with open questions. For example, is clustering without a coordinator really valuable enough to settle for eventual, or tunable, consistency with such a narrowly scoped query language?

As always, I hope this helps those looking for a quick how-to on Cassandra.

Written by maclochlainn

July 30th, 2017 at 12:33 am