Home

Awesome

README

CRAN_Status_Badge

ClickHouse (https://clickhouse.com/) is an open-source, high performance columnar OLAP (online analytical processing of queries) database management system for real-time analytics using SQL. This DBI backend relies on the ‘ClickHouse’ HTTP interface and support HTTPS protocol.

This package has been developed as an alternative to the excellent RClickhouse to provide secured connection through SSL using HTTPS (unfortunately SSL connection is not yet supported by RClickhouse).

The ClickHouseHTTP R package is licensed under GPL-3.

Installation

From CRAN

install.packages("ClickHouseHTTP")

Dependencies

The following R packages available on CRAN are required:

And those are suggested:

From github

devtools::install_github("patzaw/ClickHouseHTTP")

Documentation

Usage

Connection

library(DBI)
## HTTP connection
con <- dbConnect(
   ClickHouseHTTP::ClickHouseHTTP(), host="localhost",
   port=8123
)
## HTTPS connection (without ssl peer verification)
con <- dbConnect(
   ClickHouseHTTP::ClickHouseHTTP(), host="localhost",
   port=8443, https=TRUE, ssl_verifypeer=FALSE
)

Write a table in the database

library(dplyr)
data("mtcars")
mtcars <- as_tibble(mtcars, rownames="car")
dbWriteTable(con, "mtcars", mtcars)

Query the database

carsFromDB <- dbReadTable(con, "mtcars")
dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110")

By default, ClickHouseHTTP relies on the Apache Arrow format provided by ClickHouse. However, as described in the documentation, the following types are not supported in the current implementation of this format: TIME32, FIXED_SIZE_BINARY, JSON, UUID, ENUM. The format argument of the dbGetQuery() function can be used to rely on the TabSeparatedWithNamesAndTypes format.

selCars <- dbGetQuery(
   con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110",
   format="TabSeparatedWithNamesAndTypes"
)
## Identifying the original ClickHouse data types
attr(selCars, "type")

Using alternative databases stored in ClickHouse

dbSendQuery(con, "CREATE DATABASE swiss")
dbSendQuery(con, "USE swiss")

The chosen database is used until the session expires. It can also be chosen when connecting using the dbname argument of the dbConnect() function.

The example below shows that spaces in column names are supported. It also shows the support of R list using the Array ClickHouse type.

data("swiss")
swiss <- as_tibble(swiss, rownames="province")
swiss <- mutate(swiss, "pr letters"=strsplit(province, ""))
dbWriteTable(
   con, "swiss", swiss,
   engine="MergeTree() ORDER BY (Fertility, province)"
)
swissFromDB <- dbReadTable(con, "swiss")

A table from another database can also be accessed as following:

dbReadTable(con, SQL("default.mtcars"))

Setting up a ClickHouse database using docker

Configuration

CH_HOME=~/Documents/Projects/Test_CH
mkdir -p $CH_HOME
mkdir -p ${CH_HOME}/data
mkdir -p ${CH_HOME}/conf
mkdir -p ${CH_HOME}/log

Configuration files are shared in the supp/ClickHouse-Conf-Files folder in this repository.

cp supp/ClickHouse-Conf-Files/users.xml ${CH_HOME}/conf/
cp supp/ClickHouse-Conf-Files/config.xml ${CH_HOME}/conf/

SSL certificate

openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout ${CH_HOME}/conf/server.key -out ${CH_HOME}/conf/server.crt
openssl dhparam -out ${CH_HOME}/conf/dhparam.pem 4096

Container

The following ports are supported in the shared config.xml file:

chmod -R a+rwx ${CH_HOME}
docker run -d --name Test_CH \
    --ulimit nofile=262144:262144 \
    --volume ${CH_HOME}/data:/var/lib/clickhouse \
    --publish=9000:9000 \
    --publish=9440:9440 \
    --publish=8123:8123 \
    --publish=8443:8443 \
    --volume ${CH_HOME}/conf/users.xml:/etc/clickhouse-server/users.xml \
    --volume ${CH_HOME}/conf/config.xml:/etc/clickhouse-server/config.xml \
    --volume ${CH_HOME}/conf/server.crt:/etc/clickhouse-server/server.crt \
    --volume ${CH_HOME}/conf/server.key:/etc/clickhouse-server/server.key \
    --volume ${CH_HOME}/conf/dhparam.pem:/etc/clickhouse-server/dhparam.pem \
    --volume ${CH_HOME}/log:/var/log/clickhouse-server \
    clickhouse/clickhouse-server:22.2.3.5

Alternatives

Acknowledgments

This work was entirely supported by UCB Pharma (Early Solutions department).