Awesome
Working with big data databases in Delphi – Cassandra, Couchbase and MongoDB (Part 1 of 3)
This is the first part of a three-part series on working with big data databases directly from Delphi. In the first part we focus on a basic class framework for working with Cassandra along with an example application.
Part 1 focuses on Cassandra, Part 2 focuses on Couchbase and Part 3 focuses on MongoDB.
For more information about us, our support and services visit the Grijjy homepage or the Grijjy developers blog.
The source code and related example repository is hosted on GitHub at https://github.com/grijjy/DelphiCassandra.
Introduction to Cassandra
Apache Cassandra is an open-source database system that was designed to easily scale outwards in a distributed model which is common in cloud computing environments today. Originally developed by Facebook to solve traditional performance issues related to relational databases, Cassandra uses a NoSQL approach and flexible model focused on delivering performance to reads, writes and queries.
Today Cassandra is widely used in many Internet enabled services and private organizations where scale, performance and ease of administration which is critical. Cassandra is able to store a dynamic number of not just rows, but columns and tables which are indexed by keys which makes it highly regarded in solving problems that relate to dynamic and content without a fixed schema.
This is by no means an exhaustive look at the benefits of Cassandra. If you are truly interested there are wealth of resources online on the how to use Cassandra and to what purpose it suits best.
Delphi and Cassandra
To date there are very few examples of leveraging Cassandra within a Delphi application. One of the few examples shows the older API model (before the Cassandra CQL) using Thrift. For our purposes here we are discussing the latest API model as of 2017 using the CQL.
Cassandra also has an extensive and exhaustive API library. While we have created a header translation for most of the library (from the C library interface) we are demonstrating a basic Delphi framework only for the common CRUD related operations.
The examples here is for Delphi on Windows using a Cassandra remote database running on Linux. We use Ubuntu 16.04 LTS for our examples.
DataStax Cassandra Driver
This framework relies on the DataStax C/C++ Driver to communicate with Cassandra. It wraps the interface so it can be consumed by Delphi. You can find out more information here, http://datastax.github.io/cpp-driver/ including a prepackaged installer for Windows binaries.
Installing Cassandra Server
The official Cassandra installation instructions focus on Debian, but since we prefer Ubuntu we are only going to discuss those steps here. Installing Cassandra involves 3 main steps:
- Install Ubuntu 16.04 LTS.
- Install Java 8
- Install Cassandra
To install Java 8:
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java8-installer
To install Cassandra 3.9:
echo "deb http://www.apache.org/dist/cassandra/debian 39x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.list
Note: Cassandra 3.9 is the current version, so replace the 39x above with whatever is the latest edition.
Add the public key for Cassandra:
gpg --keyserver pgp.mit.edu --recv-keys 749D6EEC0353B12C
gpg --export --armor 749D6EEC0353B12C | sudo apt-key add -
Update and install:
sudo apt-get update
sudo apt-get install cassandra
Other tips
On Ubuntu if you want Cassandra to always start when the system restarts, then type:
systemctl start cassandra
systemctl enable cassandra
To see if your single Cassandra node is operating, you can use the nodetool command:
nodetool status
Note: The UN letters that appear before the IP address of the node, means the node is UP (U) and the node in normal (N).
Note: If you are operating a single node cluster, then you must modify the /etc/cassandra/cassandra.yaml and set the rpc_address to 0.0.0.0 and the broadcast_rpc_address to the IP of the server itself.
Hello World Example
To connect to a Cassandra cluster you provide the contact points. In our example this is a single server's IP address, but it could be multiple nodes in a cluster or variety of other supported Cassandra related syntax.
Cassandra := TgoCassandra.Create;
Cassandra.ContactPoints := '1.2.3.4';
if Cassandra.Connect then
begin
// Connected
end
Note: Before you can perform any basic database operation, you must create both a Cassandra keyspace and a table.
To create a keyspace:
var
Statement: TgoCassStatement;
begin
Statement := TgoCassStatement.Create(
'CREATE KEYSPACE keyspace_test WITH REPLICATION = { ''class'' : ''SimpleStrategy'', ''replication_factor'' : 1 };');
try
if Cassandra.Execute(Statement) then
begin
// Keyspace created
end;
finally
Statement.Free;
end;
end;
To create a table:
var
Statement: TgoCassStatement;
begin
Statement := TgoCassStatement.Create(
'CREATE TABLE keyspace_test.table_test (' +
'user_id uuid,' +
'from_user text,' +
'time timestamp,' +
'PRIMARY KEY (from_user, time)' +
')' +
' WITH CLUSTERING ORDER BY (time ASC);');
try
if Cassandra.Execute(Statement) then
begin
// Table created
end;
finally
Statement.Free;
end;
end;
Cassandra Uuids
Cassandra uses unique identifiers in a specialized format. This is exposed through the TgoCassUuidGen class.
CassUuidGen := TgoCassUuidGen.Create;
Inserting
To insert data you construct a CassStatement related to the keyspace and the table and bind the values to the statement. The values can be simple data types, uuids or timestamps, as an example.
var
Statement: TgoCassStatement;
begin
CassUuidGen := TgoCassUuidGen.Create;
Statement := TgoCassStatement.Create('INSERT INTO keyspace_test.table_test (user_id, from_user, time) VALUES (?, ?, ?);', 3);
try
Statement.Bind(0, CassUuidGen.New);
Statement.Bind(1, 'user4');
Statement.Bind(2, gODateTimeToMillisecondsSinceEpoch(TTimeZone.Local.ToUniversalTime(Now), True));
if Cassandra.Execute(Statement) then
begin
// Insert success
end;
finally
Statement.Free;
end;
end;
Query a single row
To query a single row you construct a statement and upon success, examine the first row.
var
Statement: TgoCassStatement;
QueryFuture: TgoCassFuture;
CassResult: TgoCassResult;
Row: TgoCassRow;
begin
Statement := TgoCassStatement.Create('SELECT * FROM keyspace_test.table_test WHERE from_user = ''user4'';');
try
if Cassandra.Execute(Statement, QueryFuture) then
begin
// Query success
CassResult := TgoCassResult.Create(QueryFuture);
try
if CassResult.Success then
begin
// At least one result was returned
Row := CassResult.FirstRow;
if Row <> nil then
begin
Writeln('user_id = ' + FCassUuidGen.AsString(Row.GetUuid('user_id')));
Writeln('from_user = ' + Row.GetString('from_user'));
Writeln('time = ' + Row.GetInt64('time').ToString);
end;
end;
finally
CassResult.Free;
end;
end
else
Writeln('Query Failure = ' + Cassandra.LastErrorDesc);
finally
Statement.Free;
end;
end;
Query multiple rows
To query a multiple rows you construct a statement and upon success, iterate the rows.
var
Statement: TgoCassStatement;
QueryFuture: TgoCassFuture;
CassResult: TgoCassResult;
Row: TgoCassRow;
Iterator: TgoCassIterator;
begin
Statement := TgoCassStatement.Create('SELECT * FROM keyspace_test.table_test;');
try
if Cassandra.Execute(Statement, QueryFuture) then
begin
// Query success
CassResult := TgoCassResult.Create(QueryFuture);
try
if CassResult.Success then
begin
Iterator := CassResult.Iterator;
while Iterator.Next do
begin
Row := Iterator.GetRow;
Writeln('user_id = ' + FCassUuidGen.AsString(Row.GetUuid('user_id')));
Writeln('from_user = ' + Row.GetString('from_user'));
Writeln('time = ' + Row.GetInt64('time').ToString);
end;
end;
finally
CassResult.Free;
end;
end
else
Writeln('Query Failure = ' + Cassandra.LastErrorDesc);
finally
Statement.Free;
end;
end;
Handling errors
If a method fails the the Cassandra.LastErrorDesc
and Cassandra.LastErrorCode
will be updated with error related information.
License
TgoCassandra and DelphiCassandra is licensed under the Simplified BSD License. See License.txt for details.