Cassandra on Fedora 30
The first thing to do on Fedora 30 is to check which Apache Cassandra packages are installed. You can use the following rpm command to determine that:
    rpm -qa | grep cassandra
My Fedora 30 returned the following values:
    cassandra-java-libs-3.11.1-12.fc30.x86_64
    cassandra-python2-cqlshlib-3.11.1-12.fc30.x86_64
    cassandra-3.11.1-12.fc30.x86_64
    python2-cassandra-driver-3.18.0-1.fc30.x86_64
Notably missing from the rpm list is the cassandra-server package. You install cassandra-server with the dnf utility:
    dnf install -y cassandra-server
You should get an installation log like the following for the cassandra-server package:
    Last metadata expiration check: 0:26:07 ago on Wed 11 Sep 2019 09:10:08 PM MDT.
    Package cassandra-3.11.1-12.fc30.x86_64 is already installed.
    Dependencies resolved.
    ===========================================================================================================================
     Package               Architecture     Version                        Repository     Size
    ===========================================================================================================================
    Installing:
     cassandra-server      x86_64           3.11.1-12.fc30                 fedora         180 k
    Installing dependencies:
     sigar                 x86_64           1.6.5-0.20.git58097d9.fc27     fedora          76 k

    Transaction Summary
    ===========================================================================================================================
    Install  2 Packages

    Total download size: 255 k
    Installed size: 738 k
    Is this ok [y/N]: y
    Downloading Packages:
    (1/2): sigar-1.6.5-0.20.git58097d9.fc27.x86_64.rpm       131 kB/s |  76 kB     00:00
    (2/2): cassandra-server-3.11.1-12.fc30.x86_64.rpm        233 kB/s | 180 kB     00:00
    ---------------------------------------------------------------------------------------------------------------------------
    Total                                                    116 kB/s | 255 kB     00:02
    Running transaction check
    Transaction check succeeded.
    Running transaction test
    Transaction test succeeded.
    Running transaction
      Preparing        :                                                1/1
      Installing       : sigar-1.6.5-0.20.git58097d9.fc27.x86_64        1/2
      Running scriptlet: sigar-1.6.5-0.20.git58097d9.fc27.x86_64        1/2
      Running scriptlet: cassandra-server-3.11.1-12.fc30.x86_64         2/2
      Installing       : cassandra-server-3.11.1-12.fc30.x86_64         2/2
      Running scriptlet: cassandra-server-3.11.1-12.fc30.x86_64         2/2
      Verifying        : cassandra-server-3.11.1-12.fc30.x86_64         1/2
      Verifying        : sigar-1.6.5-0.20.git58097d9.fc27.x86_64        2/2

    Installed:
      cassandra-server-3.11.1-12.fc30.x86_64
      sigar-1.6.5-0.20.git58097d9.fc27.x86_64

    Complete!
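If you want to script this step, a small wrapper can skip the install when the package is already present. The following is only a sketch: ensure_installed is a name made up for illustration, and it assumes that rpm -q exits with a nonzero status for a package that is not installed and that you run it as the root user.

```shell
#!/bin/sh
# ensure_installed is a hypothetical helper, not part of rpm or dnf.
# It installs a package with dnf only when rpm reports it as missing.
ensure_installed() {
  pkg="$1"
  if rpm -q "$pkg" >/dev/null 2>&1; then
    echo "$pkg is already installed"
  else
    dnf install -y "$pkg"
  fi
}

# Example call (run as root); commented out so that sourcing
# this sketch does not change anything on its own:
# ensure_installed cassandra-server
```

Wrapping the check in a function keeps the install idempotent, so you can rerun the same setup script without prompting another transaction.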
Fedora Magazine has a great "Get Started with Apache Cassandra on Fedora" article on all the steps required to set up clusters. This article only covers creating and enabling the Cassandra service, and setting up a single-node Cassandra instance.
You start Cassandra with the following command as the root user:
    systemctl start cassandra
You enable Cassandra, so that it starts at boot, with the following command as the root user:
    systemctl enable cassandra
It creates the following symlink:
    Created symlink /etc/systemd/system/multi-user.target.wants/cassandra.service → /usr/lib/systemd/system/cassandra.service.
You can connect to the Test Cluster with the following command:
    cqlsh
You should see the following:
    Connected to Test Cluster at 127.0.0.1:9042.
    [cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
    Use HELP for help.
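Cassandra can take a little while after systemctl start before it accepts connections, so cqlsh may fail if you run it right away. A minimal sketch of a retry loop follows. The wait_for_cassandra name is hypothetical, the 30 attempts and 2-second pause are arbitrary assumptions, and it relies on the cqlsh -e option to run a single statement against the system.local table.

```shell
#!/bin/sh
# wait_for_cassandra is a hypothetical helper. It retries a trivial query
# until cqlsh succeeds or the attempts run out.
# $1 = number of attempts (assumed default: 30)
wait_for_cassandra() {
  tries="${1:-30}"
  while [ "$tries" -gt 0 ]; do
    if cqlsh -e 'SELECT release_version FROM system.local;' >/dev/null 2>&1; then
      return 0    # the node answered a query
    fi
    tries=$((tries - 1))
    sleep 2       # assumed pause between attempts
  done
  return 1        # the node never answered
}

# Example call, commented out so that sourcing this sketch has no side effects:
# wait_for_cassandra && cqlsh
```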
You can see the options by typing the help command:
    Documented shell commands:
    ===========================
    CAPTURE  CLS          COPY  DESCRIBE  EXPAND  LOGIN   SERIAL  SOURCE   UNICODE
    CLEAR    CONSISTENCY  DESC  EXIT      HELP    PAGING  SHOW    TRACING

    CQL help topics:
    ================
    AGGREGATES               CREATE_KEYSPACE           DROP_TRIGGER      TEXT
    ALTER_KEYSPACE           CREATE_MATERIALIZED_VIEW  DROP_TYPE         TIME
    ALTER_MATERIALIZED_VIEW  CREATE_ROLE               DROP_USER         TIMESTAMP
    ALTER_TABLE              CREATE_TABLE              FUNCTIONS         TRUNCATE
    ALTER_TYPE               CREATE_TRIGGER            GRANT             TYPES
    ALTER_USER               CREATE_TYPE               INSERT            UPDATE
    APPLY                    CREATE_USER               INSERT_JSON       USE
    ASCII                    DATE                      INT               UUID
    BATCH                    DELETE                    JSON
    BEGIN                    DROP_AGGREGATE            KEYWORDS
    BLOB                     DROP_COLUMNFAMILY         LIST_PERMISSIONS
    BOOLEAN                  DROP_FUNCTION             LIST_ROLES
    COUNTER                  DROP_INDEX                LIST_USERS
    CREATE_AGGREGATE         DROP_KEYSPACE             PERMISSIONS
    CREATE_COLUMNFAMILY      DROP_MATERIALIZED_VIEW    REVOKE
    CREATE_FUNCTION          DROP_ROLE                 SELECT
    CREATE_INDEX             DROP_TABLE                SELECT_JSON
Here’s my script that creates a Cassandra keyspace, which is more or less a database. You use the USE command to connect to the keyspace or database, like you would in MySQL. You do not have sequences in Cassandra because they’re not a good fit for a distributed architecture. Unlike most relational databases, Cassandra does not support a native procedural extension. Instead, you create user-defined functions (UDFs) by embedding the logic in Java.
This script does the following:
- Creates a keyspace
- Uses the keyspace
- Conditionally drops tables and functions
- Creates two tables
- Inserts data into the two tables
- Queries data from the tables
I also included a call to a UDF inside a query in two of the examples. One of the queries demonstrates how to return a JSON structure from a query. To clarify the script’s behavior, the details are outlined below.
- The first segment of the script creates the keyspace, changes the scope to use the keyspace, conditionally drops the tables, creates the tables, and inserts values into the tables:
    /* Create a keyspace in Cassandra, which is like a database
       in MySQL or a schema in Oracle. */
    CREATE KEYSPACE IF NOT EXISTS student
      WITH REPLICATION = {
         'class':'SimpleStrategy'
        ,'replication_factor': 1 }
      AND DURABLE_WRITES = true;

    /* Use the keyspace or connect to the database. */
    USE student;

    /* Drop the member table from the student keyspace. */
    DROP TABLE IF EXISTS member;

    /* Create a member table in the student keyspace. */
    CREATE TABLE member
    ( member_number       VARCHAR
    , member_type         VARCHAR
    , credit_card_number  VARCHAR
    , credit_card_type    VARCHAR
    , PRIMARY KEY ( member_number ));

    /* Conditionally drop the contact table from the student keyspace. */
    DROP TABLE IF EXISTS contact;

    /* Create a contact table in the student keyspace. */
    CREATE TABLE contact
    ( contact_number      VARCHAR
    , contact_type        VARCHAR
    , first_name          VARCHAR
    , middle_name         VARCHAR
    , last_name           VARCHAR
    , member_number       VARCHAR
    , PRIMARY KEY ( contact_number ));

    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12345','GROUP','2222-4444-5555-6666','VISA');

    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00001','FAMILY','Barry', NULL,'Allen','SFO-12345');

    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00002','FAMILY','Iris', NULL,'West-Allen','SFO-12345');

    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12346','GROUP','3333-8888-9999-2222','VISA');

    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00003','FAMILY','Caitlin','Marie','Snow','SFO-12346');
The following queries the member table:
    /* Select all columns from the member table. */
    SELECT * FROM member;
It returns the following:
     member_number | credit_card_number  | credit_card_type | member_type
    ---------------+---------------------+------------------+-------------
         SFO-12345 | 2222-4444-5555-6666 |             VISA |       GROUP
         SFO-12346 | 3333-8888-9999-2222 |             VISA |       GROUP
- Create a concatenate user-defined function (UDF) for Cassandra. The first step requires you to edit the cassandra.yaml file, which you find in the /etc/cassandra/default.conf directory. There is a single parameter that you need to edit, and it is the enable_user_defined_functions parameter. By default the parameter is set to false, and you need to enable it to create UDFs.
If you open the cassandra.yaml file as the root user, you should find the parameter on line 987, like:
    983  # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    984  # INFO level
    985  # UDFs (user defined functions) are disabled by default.
    986  # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    987  enable_user_defined_functions: false
After you make the edit, the cassandra.yaml file should look like this:
    983  # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    984  # INFO level
    985  # UDFs (user defined functions) are disabled by default.
    986  # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    987  enable_user_defined_functions: true
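You can also make the change from the command line rather than in an editor. This is a sketch under a few assumptions: enable_udfs is a hypothetical helper name, the parameter starts in the first column of its line as in the listing above, and keeping a .bak copy beside the original is a precaution of this sketch rather than anything Cassandra requires.

```shell
#!/bin/sh
# enable_udfs is a hypothetical helper. It flips the
# enable_user_defined_functions parameter from false to true.
# $1 = path to cassandra.yaml
enable_udfs() {
  conf="$1"
  # Keep a backup copy, then rewrite the live file from that backup.
  cp -p "$conf" "$conf.bak" || return 1
  sed 's/^enable_user_defined_functions:[[:space:]]*false/enable_user_defined_functions: true/' \
    "$conf.bak" > "$conf"
}

# Example call (run as root); commented out so that sourcing
# this sketch does not touch the real file:
# enable_udfs /etc/cassandra/default.conf/cassandra.yaml
```

Cassandra reads cassandra.yaml at startup, so the new value only applies after you restart the service, for example with systemctl restart cassandra.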
After you make the change and restart the Cassandra service so that it rereads cassandra.yaml, you can create your own UDF. The following UDF formats the first, middle, and last name so there’s only one whitespace between the first and last name when the middle name value is null.
This type of function must use a CALLED ON NULL INPUT clause in lieu of a RETURNS NULL ON NULL INPUT clause. The latter would force the function to return a null value if any one of the parameters were null.
    /* Drop the concatenate function because a replace disallows changing a
       RETURNS NULL ON NULL INPUT with a CALLED ON NULL INPUT without raising
       an "89: InvalidRequest" exception. The IF EXISTS clause keeps the
       first run of the script from failing when the function is absent. */
    DROP FUNCTION IF EXISTS concatenate;

    /* Create a user-defined function to concatenate names. */
    CREATE OR REPLACE FUNCTION concatenate (first_name VARCHAR, middle_name VARCHAR, last_name VARCHAR)
    CALLED ON NULL INPUT
    RETURNS VARCHAR
    LANGUAGE java
    AS $$
      /* Concatenate first and last names when middle name is null, and
         first, middle, and last names when middle name is not null. */
      String name;

      /* Check for null middle name. */
      if (middle_name == null) {
        name = first_name + " " + last_name;
      }
      else {
        name = first_name + " " + middle_name + " " + last_name;
      }

      return name;
    $$;
Query the values from the contact table with the UDF in the SELECT-list:
    /* Query the contact information. */
    SELECT member_number
    ,      contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;
It returns the following:
     member_number | contact_number | contact_type | full_name
    ---------------+----------------+--------------+--------------------
         SFO-12345 |      CUS_00001 |       FAMILY |        Barry Allen
         SFO-12345 |      CUS_00002 |       FAMILY |    Iris West-Allen
         SFO-12346 |      CUS_00003 |       FAMILY | Caitlin Marie Snow
Query the values from the contact table with a JSON format:
    /* Query the contact information and return in a JSON format. */
    SELECT JSON
           contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;
It returns the following:
     [json]
    -------------------------------------------------------------------------------------------------
     {"contact_number": "CUS_00001", "contact_type": "FAMILY", "full_name": "Barry Allen"}
     {"contact_number": "CUS_00002", "contact_type": "FAMILY", "full_name": "Iris West-Allen"}
     {"contact_number": "CUS_00003", "contact_type": "FAMILY", "full_name": "Caitlin Marie Snow"}
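Because each row comes back as a self-contained JSON document, it is easy to hand the result to other tools. The sketch below strips the header, rule, and padding that cqlsh prints around the rows. The extract_json_rows name is hypothetical, and it assumes each JSON row is printed on a single line that starts with an opening brace after optional whitespace.

```shell
#!/bin/sh
# extract_json_rows is a hypothetical helper. It reads cqlsh output on
# standard input and prints only the lines that hold a JSON object,
# with the surrounding whitespace removed.
extract_json_rows() {
  sed -n 's/^[[:space:]]*\({.*}\)[[:space:]]*$/\1/p'
}

# Example call, commented out because it needs a running node:
# cqlsh -e "SELECT JSON contact_number, contact_type FROM student.contact;" | extract_json_rows
```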