HexTuples

Status: draft

Version: 0.3.0

HexTuples is a simple datamodel for dealing with linked data. This document describes both the model and concepts of HexTuples and its (currently only) serialization format: HexTuples-NDJSON. The format is easy to parse, supports streaming parsing, and is designed to perform well in JS contexts.

Concepts

HexTuple

A single HexTuple is an atomic piece of data, similar to an RDF Triple (also known as Statements or Quads). A HexTuple contains a small piece of information. HexTuples consist of six fields: subject, predicate, value, datatype, language and graph.

Let's encode the following sentence in HexTuples:

Tim Berners-Lee, the director of W3C, was born in London on the 8th of June, 1955.

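| Subject | Predicate  | Value           | DataType       | Language | Graph |
|---------|------------|-----------------|----------------|----------|-------|
| Tim     | birthPlace | London          |                |          |       |
| Tim     | birthDate  | 1955-06-08      | xsd:date       |          |       |
| Tim     | jobTitle   | Director of W3C | rdf:langString | en-US    |       |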
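As a minimal sketch, the six fields can be modelled as a fixed-length tuple of strings. The type name `HexTuple` and the `rows` constant are illustrative only, and the sketch assumes (as in the NDJSON examples later in this document) that an unused field is written as an empty string.

```typescript
// Illustrative type: [subject, predicate, value, datatype, language, graph].
type HexTuple = [string, string, string, string, string, string];

// The three rows of the table above; unused fields are assumed to be "".
const rows: HexTuple[] = [
  ["Tim", "birthPlace", "London", "", "", ""],
  ["Tim", "birthDate", "1955-06-08", "xsd:date", "", ""],
  ["Tim", "jobTitle", "Director of W3C", "rdf:langString", "en-US", ""],
];

// Destructuring gives each positional field a readable name.
for (const [subject, predicate, value, datatype, language] of rows) {
  console.log(subject, predicate, value, datatype || "(none)", language || "(none)");
}
```

A fixed-length array keeps every statement the same shape, which is what lets one model cover both literal and non-literal values.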

URI

URI stands for Uniform Resource Identifier, specified in RFC 3986. The best known type of URI is the URL. Although it is currently best practice to use mostly HTTPS URLs as URIs, HexTuples works with any type of URI.

Subject

Predicate

Value

Datatype

Language

Relation to RDF

The HexTuples datamodel closely resembles the RDF Data Model, which is the de facto standard for linked data. RDF statements are often called Triples, because they consist of a subject, predicate and object. The object field is either a single URI (in Named Nodes), or a combination of three fields (in Literals): value, datatype and language. This means that a single Triple can actually consist of five fields: the subject, predicate, value, datatype and language. A Quad statement also has a graph, which brings the total to six fields, hence the name: HexTuples. Instead of making a distinction between Literal statements and NamedNode statements (which have two different models), HexTuples uses a single model that describes both. Having a single model for all statements makes it easier to serialize, query and store data.
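The flattening described above can be sketched as follows. The `NamedNode`, `Literal` and `Term` types and the `toHexTuple` function are hypothetical stand-ins written for this illustration, not the API of any RDF library. The sketch assumes the `globalId` datatype marker for named-node values, as used in the examples elsewhere in this document.

```typescript
// Minimal stand-ins for RDF object terms (illustrative, not a library API).
type NamedNode = { termType: "NamedNode"; value: string };
type Literal = { termType: "Literal"; value: string; datatype: string; language: string };
type Term = NamedNode | Literal;

type HexTuple = [string, string, string, string, string, string];

// Flatten a statement into six string fields. A named-node object is marked
// with the "globalId" datatype; a literal contributes value, datatype and language.
function toHexTuple(subject: string, predicate: string, object: Term, graph = ""): HexTuple {
  if (object.termType === "NamedNode") {
    return [subject, predicate, object.value, "globalId", "", graph];
  }
  return [subject, predicate, object.value, object.datatype, object.language, graph];
}

const tim = "https://www.w3.org/People/Berners-Lee/";

const birthPlace = toHexTuple(tim, "http://schema.org/birthPlace", {
  termType: "NamedNode",
  value: "http://dbpedia.org/resource/London",
});

const birthDate = toHexTuple(tim, "http://schema.org/birthDate", {
  termType: "Literal",
  value: "1955-06-08",
  datatype: "http://www.w3.org/2001/XMLSchema#date",
  language: "",
});

console.log(birthPlace, birthDate);
```

Both calls return the same six-field shape, so downstream code does not need to branch on the kind of object until it actually interprets the datatype field.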

HexTuples-NDJSON

This document serves as a work-in-progress draft specification.

HexTuples-NDJSON is an NDJSON (Newline Delimited JSON) based HexTuples / RDF serialization format. It is designed to support streaming parsing and provide great performance in a JS context (i.e. the browser).
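Because every statement sits on its own line, a parser can emit statements as soon as a newline arrives, without waiting for the whole document. The following is a rough sketch of that idea; `createLineParser` and its `push`/`end` methods are hypothetical names chosen for this example, and the sketch assumes each line is a JSON array of six strings.

```typescript
type HexTuple = [string, string, string, string, string, string];

// Hypothetical helper: feed it text chunks as they arrive. It calls `onTuple`
// for every complete line and buffers any trailing partial line.
function createLineParser(onTuple: (tuple: HexTuple) => void) {
  let buffer = "";

  const parseLine = (line: string): void => {
    if (line.trim() === "") return; // skip blank lines
    onTuple(JSON.parse(line) as HexTuple);
  };

  return {
    push(chunk: string): void {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // the last piece may be an incomplete line
      lines.forEach(parseLine);
    },
    end(): void {
      parseLine(buffer); // flush a final line that has no trailing newline
      buffer = "";
    },
  };
}

const seen: HexTuple[] = [];
const parser = createLineParser((tuple) => seen.push(tuple));

// The second statement is split across two chunks on purpose.
parser.push('["s", "p1", "v1", "http://www.w3.org/2001/XMLSchema#string", "", ""]\n["s", "p2", ');
parser.push('"http://example.com/o", "globalId", "", ""]');
parser.end();

console.log(seen.length);
```

In a browser the chunks would typically come from a decoded response stream; here they are plain strings to keep the sketch self-contained.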

Example

English:

Tim Berners-Lee was born in London on the 8th of June, 1955.

Turtle / N-Triples:

<https://www.w3.org/People/Berners-Lee/> <http://schema.org/birthDate> "1955-06-08"^^<http://www.w3.org/2001/XMLSchema#date>.
<https://www.w3.org/People/Berners-Lee/> <http://schema.org/birthPlace> <http://dbpedia.org/resource/London>.

Expressed in HexTuples:

["https://www.w3.org/People/Berners-Lee/", "http://schema.org/birthDate", "1955-06-08", "http://www.w3.org/2001/XMLSchema#date", "", ""]
["https://www.w3.org/People/Berners-Lee/", "http://schema.org/birthPlace", "http://dbpedia.org/resource/London", "globalId", "", ""]
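To illustrate how little machinery the format needs, here is a sketch that validates and round-trips the two lines above. The helper names `isHexTuple`, `parse` and `serialize` are made up for this example. Writing a trailing newline after the last line is an assumption of the sketch, not something this draft states.

```typescript
type HexTuple = [string, string, string, string, string, string];

// Type guard: a parsed line must be an array of exactly six strings.
function isHexTuple(x: unknown): x is HexTuple {
  return Array.isArray(x) && x.length === 6 && x.every((field) => typeof field === "string");
}

// One JSON array per line; blank lines are ignored.
function parse(ndjson: string): HexTuple[] {
  return ndjson
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const parsed: unknown = JSON.parse(line);
      if (!isHexTuple(parsed)) throw new Error(`Not a HexTuple: ${line}`);
      return parsed;
    });
}

// Assumption: every line, including the last, ends with a newline.
function serialize(tuples: HexTuple[]): string {
  return tuples.map((tuple) => JSON.stringify(tuple) + "\n").join("");
}

const doc = [
  '["https://www.w3.org/People/Berners-Lee/", "http://schema.org/birthDate", "1955-06-08", "http://www.w3.org/2001/XMLSchema#date", "", ""]',
  '["https://www.w3.org/People/Berners-Lee/", "http://schema.org/birthPlace", "http://dbpedia.org/resource/London", "globalId", "", ""]',
].join("\n");

const tuples = parse(doc);
const roundTripped = parse(serialize(tuples));

console.log(tuples.length, roundTripped.length);
```

Note that `JSON.stringify` omits the spaces after commas, so the serialized text is not byte-identical to the example even though it parses to the same tuples.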

Implementations

Ontola TypeScript HexTuples Parser

This TypeScript code should give you some idea of how to write a parser for HexTuples.

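// literal, namedNode, blankNode, quad, defaultGraph and the SomeTerm type are
// assumed to be provided by an RDF data factory that is already in scope.

// Build the object term from the value, datatype and language fields.
const object = (value: string, datatype: string, language: string): SomeTerm => {
  if (language) {
    // A non-empty language field marks a language-tagged string.
    return literal(value, language);
  } else if (datatype === 'globalId') {
    // 'globalId' marks a URI (named node).
    return namedNode(value);
  } else if (datatype === 'localId') {
    // 'localId' marks a blank node.
    return blankNode(value);
  }

  // Anything else is a literal typed by the datatype URI.
  return literal(value, namedNode(datatype));
};

// Convert one parsed line (an array of six strings) into a quad.
const lineToQuad = (h: string[]) => quad(
  h[0].startsWith('_:') ? blankNode(h[0]) : namedNode(h[0]),
  namedNode(h[1]),
  object(h[2], h[3], h[4]),
  // An empty graph field falls back to the default graph.
  h[5] ? namedNode(h[5]) : defaultGraph(),
);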

Python RDFLib

An RDF format conversion tool using RDFLib that can convert from/to HexTuples is online at https://tools.dev.kurrawong.ai/convert.

Motivation for HexTuples-NDJSON

HexTuples was designed by Thom van Kalkeren (CTO of Ontola) because he noticed that parsing and serialization were unnecessarily costly in our full-RDF stack, even when using the relatively performant N-Quads format.