MacLochlainns Weblog

Michael McLaughlin's Technical Blog

Archive for the ‘CQL’ Category

Cassandra Query Language

After installing Cassandra and reading Cassandra: The Definitive Guide, it struck me that I should learn a bit more about the Cassandra Query Language (CQL). So, I set up a single-node environment and created a .bashcassandra environment file to connect as a student user to the Cassandra instance:

# Add the Java and JRE paths to the $PATH environment variable.
export PATH=$PATH:/usr/lib/jvm/java-1.8.0-openjdk-
# Add the $JAVA_HOME and $JRE_HOME environment variables.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-
export JRE_HOME=/usr
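
Before relying on an environment file like this, it can help to confirm that $JAVA_HOME really points at a JVM. Here’s a minimal sketch of such a check. It isn’t part of my original setup: the check_java_home function name is hypothetical, and the only assumption is that a usable Java home contains an executable bin/java file.

```shell
# Hypothetical helper: report whether a directory looks like a usable Java home.
check_java_home() {
  java_dir="$1"
  if [ -n "$java_dir" ] && [ -x "$java_dir/bin/java" ]; then
    echo "OK: $java_dir/bin/java is executable"
  else
    echo "WARN: no executable bin/java under '$java_dir'" >&2
    return 1
  fi
}

# Example call after sourcing the environment file:
# check_java_home "$JAVA_HOME"
```

A non-zero return value is a hint that the path in the environment file needs another look before you start cqlsh.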

Having started Cassandra as the cassandra user, I connected to the Cassandra Query Language Shell (cqlsh) to learn how to write CQL. You can find the basic structure of CQL on the Apache Cassandra website. I also discovered that CQL by itself doesn’t let you join tables; you need something like Apache SparkSQL for that. Apache SparkSQL adds the ability to join Cassandra tables, and became available in 2015.

I also learned you can’t use a CREATE OR REPLACE command when you change certain aspects of User-Defined Functions (UDFs). You actually need to drop any UDF before you change a RETURNS NULL ON NULL INPUT clause to a CALLED ON NULL INPUT clause or vice versa. You can’t embed Java that connects to the database without using the cassandra-java-driver-2.0.2 driver.

You connect to the cqlsh like this:


Here’s my script that creates a Cassandra keyspace, which is more or less a database. You use the USE command to connect to the keyspace or database, like you would in MySQL. You do not have sequences in Cassandra because they’re not a good fit for a distributed architecture. Cassandra does not support a native procedural extension like relational databases do. You must create user-defined functions (UDFs) by embedding the logic in Java.

This script does the following:

  • Creates a keyspace
  • Uses the keyspace
  • Conditionally drops tables and functions
  • Creates two tables
  • Inserts data into the two tables
  • Queries data from the tables

I also included a call to a UDF inside a query in two of the examples. One of the queries demonstrates how to return a JSON structure from a query. To simplify things and clarify the script’s behavior, the details are outlined below.

  • The first segment of the script creates the keyspace, changes the scope to use the keyspace, conditionally drops tables, creates tables, and inserts values into the tables:

    /* Create a keyspace in Cassandra, which is like a database
       in MySQL or a schema in Oracle. */
    CREATE KEYSPACE IF NOT EXISTS student
      WITH REPLICATION = {
         'class': 'SimpleStrategy'
        ,'replication_factor': 1 }
      AND DURABLE_WRITES = true;
    /* Use the keyspace or connect to the database. */
    USE student;
    /* Conditionally drop the member table from the student keyspace. */
    DROP TABLE IF EXISTS member;
    /* Create a member table in the student keyspace. */
    CREATE TABLE member
    ( member_number       VARCHAR
    , member_type         VARCHAR
    , credit_card_number  VARCHAR
    , credit_card_type    VARCHAR
    , PRIMARY KEY ( member_number ));
    /* Conditionally drop the contact table from the student keyspace. */
    DROP TABLE IF EXISTS contact;
    /* Create a contact table in the student keyspace. */
    CREATE TABLE contact
    ( contact_number      VARCHAR
    , contact_type        VARCHAR
    , first_name          VARCHAR
    , middle_name         VARCHAR
    , last_name           VARCHAR
    , member_number       VARCHAR
    , PRIMARY KEY ( contact_number ));
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12345','GROUP','2222-4444-5555-6666','VISA');
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00001','FAMILY','Barry', NULL,'Allen','SFO-12345');
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00002','FAMILY','Iris', NULL,'West-Allen','SFO-12345');
    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12346','GROUP','3333-8888-9999-2222','VISA');
    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00003','FAMILY','Caitlin','Marie','Snow','SFO-12346');
  • The following queries the member table:

    /* Select all columns from the member table. */
    SELECT * FROM member;

    It returns the following:

     member_number | credit_card_number  | credit_card_type | member_type
    ---------------+---------------------+------------------+-------------
         SFO-12345 | 2222-4444-5555-6666 |             VISA |       GROUP
         SFO-12346 | 3333-8888-9999-2222 |             VISA |       GROUP
  • Create a concatenate User-defined function (UDF) for Cassandra. The first step requires you to edit the cassandra.yaml file, which you find in the /etc/cassandra/default.conf directory. There is a single parameter that you need to edit, and it is the enable_user_defined_functions parameter. By default the parameter is set to false, and you need to enable it to create UDFs.

    After you make the edit, the cassandra.yaml file should look like this:

    # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    # INFO level
    # UDFs (user defined functions) are disabled by default.
    # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    enable_user_defined_functions: true

    After you make the change, you can create your own UDF. The following UDF formats the first, middle, and last name so there’s only one whitespace between the first and last name when the middle name value is null.

    This type of function must use a CALLED ON NULL INPUT clause in lieu of a RETURNS NULL ON NULL INPUT clause. The latter would force the function to return a null value if any one of the parameters were null.

    /* Drop the concatenate function because a replace disallows changing a
       RETURNS NULL ON NULL INPUT clause to a CALLED ON NULL INPUT clause,
       or vice versa, and raises an "89: InvalidRequest" exception. */
    DROP FUNCTION IF EXISTS concatenate;
    /* Create a user-defined function to concatenate names. */
    CREATE OR REPLACE FUNCTION concatenate (first_name VARCHAR, middle_name VARCHAR, last_name VARCHAR)
    CALLED ON NULL INPUT
    RETURNS VARCHAR
    LANGUAGE java
    AS $$
      /* Concatenate first and last names when middle name is null, and
         first, middle, and last names when middle name is not null. */
      String name;
      /* Check for null middle name. */
      if (middle_name == null) {
        name = first_name + " " + last_name; }
      else {
        name = first_name + " " + middle_name + " " + last_name; }
      return name;
    $$;
  • Query the values from the contact table with the UDF in the SELECT-list:

    /* Query the contact information. */
    SELECT member_number
    ,      contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     member_number | contact_number | contact_type | full_name
    ---------------+----------------+--------------+--------------------
         SFO-12345 |      CUS_00001 |       FAMILY |        Barry Allen
         SFO-12345 |      CUS_00002 |       FAMILY |    Iris West-Allen
         SFO-12346 |      CUS_00003 |       FAMILY | Caitlin Marie Snow
  • Query the values from the contact table with a JSON format:

    /* Query the contact information and return in a JSON format. */
    SELECT JSON contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

    {"contact_number": "CUS_00001", "contact_type": "FAMILY", "full_name": "Barry Allen"}
    {"contact_number": "CUS_00002", "contact_type": "FAMILY", "full_name": "Iris West-Allen"}
    {"contact_number": "CUS_00003", "contact_type": "FAMILY", "full_name": "Caitlin Marie Snow"}
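
Going back to the cassandra.yaml edit in the UDF step above: if you’d rather script that change than open the file by hand, sed can flip the flag. This is only a sketch. The enable_udfs function name is hypothetical, it assumes the parameter starts at the beginning of a line as it does in the excerpt, and it leaves a backup copy with a .bak suffix. Cassandra reads cassandra.yaml at startup, so restart the node after the change.

```shell
# Hypothetical helper: set enable_user_defined_functions to true in a
# cassandra.yaml file, keeping a backup copy with a .bak suffix.
enable_udfs() {
  yaml_file="$1"
  if [ ! -f "$yaml_file" ]; then
    echo "No such file: $yaml_file" >&2
    return 1
  fi
  # Only the exact parameter is rewritten; other lines are left untouched.
  sed -i.bak 's/^enable_user_defined_functions:.*/enable_user_defined_functions: true/' "$yaml_file"
}

# Example call as the root user, using the directory named in this post:
# enable_udfs /etc/cassandra/default.conf/cassandra.yaml
```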

You can call the script from a relative directory inside cqlsh, like this:

SOURCE 'cstudent.cql'
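
If you want to run the same script without opening an interactive session, cqlsh also accepts a -f option that executes a file and then exits. The wrapper below is a rough sketch rather than something from my setup: run_cql_script is a hypothetical name, and it assumes cqlsh is on your $PATH and can reach the node with its default connection settings.

```shell
# Hypothetical helper: run a CQL script non-interactively with cqlsh -f.
run_cql_script() {
  cql_script="$1"
  if [ ! -r "$cql_script" ]; then
    echo "Cannot read CQL script: $cql_script" >&2
    return 1
  fi
  if ! command -v cqlsh >/dev/null 2>&1; then
    echo "cqlsh was not found on the PATH" >&2
    return 1
  fi
  cqlsh -f "$cql_script"
}

# Example call from the directory that holds the script:
# run_cql_script cstudent.cql
```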

At the end of the day, the concept of adding and removing nodes is attractive. However, the lack of normal relational mechanics and the narrowly supported set of CQL semantics leave me with open questions. For example, is clustering without a coordinator really valuable enough to settle for eventual, or tunable, consistency with such a narrowly scoped query language?

As always, I hope this helps those looking for a quick how-to on Cassandra.

Written by maclochlainn

July 30th, 2017 at 12:33 am

Install Cassandra on Fedora

It was quite interesting to discover that DataStax no longer provides the DataStax Community version of Apache Cassandra or the DataStax Distribution of Apache Cassandra. Needless to say, I was quite disappointed, because it makes getting started more difficult for beginning developers and gives folks less opportunity to learn how to use Cassandra.

I spent a good hour sorting through what was available and then figuring out the real requirements to install Apache Cassandra 3.11. These are the instructions.

Install Java and JRE as Prerequisites

If you don’t have the JRE installed, you should download it from Oracle’s website and install it. After you download the latest version of the JRE package (jre-8u141-linux-x64.rpm), you should use the rpm utility to install it, like the following example:

rpm -ivh /home/student/Downloads/jre-8*.rpm

It should generate the following installation report:

Preparing...                          ################################# [100%]
	package jre1.8.0_141-1.8.0_141-fcs.x86_64 is already installed
sh-4.2# rpm -qa jre
sh-4.2# rpm -qf jre
error: file /jre: No such file or directory
sh-4.2# rpm -qa | grep jre
sh-4.2# rpm -qa | grep jre | rpm -qi
rpm: no arguments given for query
sh-4.2# rpm -qi `rpm -qa | grep jre`
Name        : jre1.8.0_141
Version     : 1.8.0_141
Release     : fcs
Architecture: x86_64
Install Date: Mon 24 Jul 2017 11:09:58 PM PDT
Group       : Development/Tools
Size        : 139460427
License     :
Signature   : (none)
Source RPM  : jre1.8.0_141-1.8.0_141-fcs.src.rpm
Build Date  : Wed 12 Jul 2017 04:47:52 AM PDT
Build Host  : jdk7-lin2-amd64
Relocations : /usr/java 
Packager    : Java Software <>
Vendor      : Oracle Corporation
URL         : URL_REF
Summary     : Java Platform Standard Edition Runtime Environment
Description :
The Java Platform Standard Edition Runtime Environment (JRE) contains
everything necessary to run applets and applications designed for the
Java platform. This includes the Java virtual machine, plus the Java
platform classes and supporting files.
The JRE is freely redistributable, per the terms of the included license.

Confirm Java and JRE Installation

You can check the currently installed versions of Java and the JRE by using the alternatives utility with the --config option and a keyword, such as java or javac.

sh-4.2# alternatives --config java

It should generate the following list when you check for the java command:

There are 3 programs which provide 'java'.
  Selection    Command
*  1           /usr/lib/jvm/java-1.7.0-openjdk-
 + 2           /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
   3           /usr/java/jre1.8.0_141/bin/java
Enter to keep the current selection[+], or type selection number:

It should generate the following list when you check for the javac command:

There are 2 programs which provide 'javac'.
  Selection    Command
*  1           /usr/lib/jvm/java-1.7.0-openjdk-
 + 2           /usr/lib/jvm/java-1.8.0-openjdk.x86_64/bin/javac
Enter to keep the current selection[+], or type selection number:
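
The alternatives listing tells you what is registered, but not necessarily which binary your shell will actually run. Here’s a small sketch for checking that. The resolve_command name is a hypothetical helper, and it assumes GNU readlink for the -f option, which follows the chain of symbolic links that the alternatives system manages.

```shell
# Hypothetical helper: print the real file behind a command on the PATH,
# following any symbolic links along the way.
resolve_command() {
  cmd_path=$(command -v "$1") || { echo "$1 is not on the PATH" >&2; return 1; }
  readlink -f "$cmd_path"
}

# Example calls:
# resolve_command java
# resolve_command javac
```

If the printed path isn’t under the Java 1.8 directory you selected, rerun alternatives --config and pick the right entry.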

If you have more than one Java or JRE installed on your OS, select the one you want as the designated alternative. After that, you should create a configuration file for the root user. You should include the following to set your $PATH, $JAVA_HOME, and $JRE_HOME environment variables:

# Add the Java and JRE paths to the $PATH environment variable.
export PATH=$PATH:/usr/lib/jvm/java-1.8.0-openjdk-
# Add the $JAVA_HOME and $JRE_HOME environment variables.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-
export JRE_HOME=/usr

Install Apache Cassandra

The yum utility is the best way to install Apache Cassandra. However, you will need to configure the /etc/yum.repos.d/cassandra.repo file before you attempt to install Cassandra 3.11 from the Apache organization, like this:

name=Apache Cassandra

After you’ve added the necessary yum configuration file and ensured you’re using both Java 1.8 and JRE 1.8, you can install Apache Cassandra with the following yum command as the root user or as a sudoer member with the sudo command:

yum install -y cassandra

If successful, you should see the following output:

Loaded plugins: langpacks, refresh-packagekit
cassandra/signature                                         |  819 B  00:00     
cassandra/signature                                         | 2.9 kB  00:00 !!! 
mysql-connectors-community                                  | 2.5 kB  00:00     
mysql-tools-community                                       | 2.5 kB  00:00     
mysql56-community                                           | 2.5 kB  00:00 [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
updates/20/x86_64/metalink                                  | 2.6 kB  00:00     
Resolving Dependencies
--> Running transaction check
---> Package cassandra.noarch 0:3.11.0-1 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
 Package            Arch            Version            Repository          Size
 cassandra          noarch          3.11.0-1           cassandra           28 M
Transaction Summary
Install  1 Package
Total download size: 28 M
Installed size: 37 M
Downloading packages:
warning: /var/cache/yum/x86_64/20/cassandra/packages/cassandra-3.11.0-1.noarch.rpm: Header V4 RSA/SHA256 Signature, key ID fe4b2bda: NOKEY
Public key for cassandra-3.11.0-1.noarch.rpm is not installed
cassandra-3.11.0-1.noarch.rpm                               |  28 MB  00:07     
Retrieving key from
Importing GPG key 0xF2833C93:
 Userid     : "Eric Evans <>"
 Fingerprint: cec8 6bb4 a0ba 9d0f 9039 7cae f835 8fa2 f283 3c93
 From       :
Importing GPG key 0x8D77295D:
 Userid     : "Eric Evans <>"
 Fingerprint: c496 5ee9 e301 5d19 2ccc f2b6 f758 ce31 8d77 295d
 From       :
Importing GPG key 0x2B5C1B00:
 Userid     : "Sylvain Lebresne (pcmanus) <>"
 Fingerprint: 5aed 1bf3 78e9 a19d ade1 bcb3 4bd7 36a8 2b5c 1b00
 From       :
Importing GPG key 0x0353B12C:
 Userid     : "T Jake Luciani <>"
 Fingerprint: 514a 2ad6 31a5 7a16 dd00 47ec 749d 6eec 0353 b12c
 From       :
Importing GPG key 0xFE4B2BDA:
 Userid     : "Michael Shuler <>"
 Fingerprint: a26e 528b 271f 19b9 e5d8 e19e a278 b781 fe4b 2bda
 From       :
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction (shutdown inhibited)
Warning: RPMDB altered outside of yum.
  Installing : cassandra-3.11.0-1.noarch                                    1/1 
  Verifying  : cassandra-3.11.0-1.noarch                                    1/1 
  cassandra.noarch 0:3.11.0-1                                                   

Starting Cassandra

You should start Cassandra as the cassandra user. Before starting Cassandra, you need to create a .bashrc file for the cassandra user because one isn’t created by default since you can’t log on to the Linux OS as the cassandra user. The home directory for the cassandra user is /var/lib/cassandra and the owner of that directory is the root user.

As the root user, create the following .bashrc file for the cassandra user:

# Wrap cqlsh with rlwrap to edit prior lines with the
# up, down, left and right keys.
cqlsh()
{
  if [ "$RLWRAP" = "0" ]; then
    # Use command to bypass this function and avoid calling itself.
    command cqlsh "$@"
  else
    rlwrap cqlsh "$@"
  fi
}
# Set vi as a command line editor.
set -o vi
# Add the Java and JRE paths to the $PATH environment variable.
export PATH=$PATH:/usr/lib/jvm/java-1.8.0-openjdk-
# Add the $JAVA_HOME and $JRE_HOME environment variables.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-
export JRE_HOME=/usr

You should start Cassandra in the background, like this:


Using Cassandra

As the student user in my developer Fedora instance, you should be able to connect using the following:


You will see the following:

Connected to Test Cluster at
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> HELP
Documented shell commands:
CQL help topics:
AGGREGATES               CREATE_KEYSPACE           DROP_TRIGGER      TEXT     
ALTER_TABLE              CREATE_TABLE              FUNCTIONS         TRUNCATE 
ALTER_TYPE               CREATE_TRIGGER            GRANT             TYPES    
ALTER_USER               CREATE_TYPE               INSERT            UPDATE   
APPLY                    CREATE_USER               INSERT_JSON       USE      
ASCII                    DATE                      INT               UUID     
BATCH                    DELETE                    JSON            
BEGIN                    DROP_AGGREGATE            KEYWORDS        
BOOLEAN                  DROP_FUNCTION             LIST_ROLES      
COUNTER                  DROP_INDEX                LIST_USERS      
CREATE_FUNCTION          DROP_ROLE                 SELECT          
CREATE_INDEX             DROP_TABLE                SELECT_JSON

Written by maclochlainn

July 25th, 2017 at 9:23 pm