Sparqlify SPARQL->SQL rewriter

Introduction

Sparqlify is a scalable SPARQL-SQL rewriter whose development began in April 2011 in the course of the LinkedGeoData project.

This system's features/traits are:

Functions

SPARQL-to-SQL function mappings are specified in the file functions.xml.

<details> <summary>Standard SPARQL functions</summary>

| SPARQL function | SQL Definition |
|---|---|
| boolean strstarts(string, string) | strpos($1$, $2$) = 1 |

TODO
</details> <details> <summary>Spatial Function Extensions</summary>

| SPARQL function | SQL Definition |
|---|---|

TODO
</details>
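
To illustrate how the `$1$`, `$2$` placeholders in the strstarts entry above can be read, here is a minimal Python sketch that expands a SQL definition by substituting already-translated SQL argument expressions. The helper name `expand_sql_template` and the sample arguments are hypothetical; this is not Sparqlify's actual implementation, only an illustration of the placeholder convention.

```python
import re


def expand_sql_template(template, args):
    """Replace each $N$ placeholder with the N-th (1-based) SQL argument."""

    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(args):
            raise ValueError(f"template references missing argument ${index}$")
        return args[index - 1]

    return re.sub(r"\$(\d+)\$", substitute, template)


# SQL definition of strstarts as listed above
STRSTARTS_SQL = "strpos($1$, $2$) = 1"

# Hypothetical arguments: a quoted column name and a string literal
print(expand_sql_template(STRSTARTS_SQL, ['"name"', "'Al'"]))
# prints: strpos("name", 'Al') = 1
```

Under this reading, a SPARQL filter using strstarts would end up as a `strpos(...) = 1` condition in the generated SQL.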

Supported SPARQL language features

Debian packages

Sparqlify Debian packages can be obtained by the following means:

Public repositories

After setting up any of the repositories below, you can install sparqlify with apt using

Linked Data Stack (this is what you want)

Sparqlify is distributed via the Linked Data Stack, which offers many tools developed by various contributors of the Semantic Web community.

# !!! Replace stable with nightly or testing as needed !!!

# Download the repository package
wget http://stack.linkeddata.org/ldstable-repository.deb

# Install the repository package
sudo dpkg -i ldstable-repository.deb

# Update the repository database
sudo apt-get update

Bleeding Edge (Not recommended for production)

For the latest development version (built on every commit), perform the following steps:

Import the public key with:

wget -qO - http://cstadler.aksw.org/repos/apt/conf/packages.precise.gpg.key  | sudo apt-key add -

Add the repository

echo 'deb http://cstadler.aksw.org/repos/apt precise main contrib non-free' | sudo tee -a /etc/apt/sources.list.d/cstadler.aksw.org.list

Note that this also works with distributions other than "precise" (Ubuntu 12.04), such as Ubuntu 14.04 or 16.04.

Building

Building the repository creates the JAR files providing the sparqlify-* tool suite.

One of the plugins requires the xjc command (for compiling an XML schema to Java classes), which is no longer part of the JDK. The following package provides it:

sudo apt install jaxb

Debian package

Building Debian packages from this repo relies on the Debian Maven Plugin, which requires a Debian-compatible environment. If such an environment is present, the rest is simple:

# Install all shell scripts necessary for creating deb packages
sudo apt-get install devscripts

# Execute the following from the `<repository-root>/sparqlify-core` folder:
mvn clean install deb:package

# Upon successful completion, the Debian package is located under `<repository-root>/sparqlify-core/target`
# Install using `dpkg`
sudo dpkg -i sparqlify_<version>.deb

# Uninstall using dpkg or apt:
sudo dpkg -r sparqlify
sudo apt-get remove sparqlify

Assembly based

Another way to build the project is to run the following commands at <repository-root>:

mvn clean install

cd sparqlify-cli
mvn assembly:assembly

This will generate a single stand-alone jar containing all necessary dependencies. Afterwards, the shell scripts under sparqlify-core/bin should work.

Tool suite

If Sparqlify was installed from the debian package, the following commands are available system-wide:

These tools write their output (such as RDF data in the N-TRIPLES format) to STDOUT. Log output goes to STDERR.
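
Because data and log output arrive on separate streams, a calling program can capture them independently. The following Python sketch shows the idea with a stand-in child process; the stand-in and the helper name `run_and_split` are hypothetical and only mimic a tool that prints N-Triples to STDOUT and log lines to STDERR.

```python
import subprocess
import sys

# Hypothetical stand-in for a sparqlify-* tool:
# one N-Triples line on STDOUT, one log line on STDERR.
CHILD_CODE = (
    "import sys\n"
    "print('<http://ex.org/s> <http://ex.org/p> <http://ex.org/o> .')\n"
    "print('INFO: mapping loaded', file=sys.stderr)\n"
)


def run_and_split(command):
    """Run a command and return (stdout, stderr) as separate strings."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout, completed.stderr


data, log = run_and_split([sys.executable, "-c", CHILD_CODE])
print("data:", data.strip())
print("log:", log.strip())
```

To use this with a real tool, replace the stand-in command list with the tool invocation you already run on the command line.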

sparqlify

Usage: sparqlify [options]

Options are:

Example

The following command will start the Sparqlify HTTP server on the default port.

sparqlify -h localhost -u postgres -p secret -d mydb -m mydb-mappings.sml -n 1000 -t 30

Agents can now access the SPARQL endpoint at http://localhost:7531/sparql
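
As a rough sketch of what such an agent could look like, the following Python snippet builds a SPARQL Protocol GET request against the endpoint URL given above. It assumes the endpoint accepts the standard `query` parameter and can return `application/sparql-results+json`; the helper names `build_query_url` and `run_select` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:7531/sparql"  # endpoint URL from the example above


def build_query_url(endpoint, query):
    """Encode a SPARQL query string as a GET request URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_select(endpoint, query):
    """Send a SELECT query and return the parsed JSON result document.

    Assumption: the server honours the JSON results media type.
    """
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (requires a running server, so it is left as a comment):
#   results = run_select(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
#   for binding in results["results"]["bindings"]:
#       print(binding)
```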

sparqlify-csv

Usage: sparqlify-csv [options]

sparqlify-platform (Deprecated; about to be superseded by sparqlify-web-admin)

The Sparqlify Platform (under /sparqlify-platform) bundles Sparqlify with the Linked Data wrapper Pubby and the SPARQL Web interface Snorql.

Usage: sparqlify-platform config-dir [port]

To build, run mvn compile at the root of the project (outside of the sparqlify-* directories) to build all modules. Afterwards, launch the platform using:

cd sparqlify-platform/bin
./sparqlify-platform <path-to-config> <port>

Assuming the platform runs under http://localhost:7531, you can access the following services relative to this base url:

Configuration

The configDirectory argument is mandatory and must contain a sub-directory for the context path (i.e. sparqlify-platform), which in turn contains the files:

I recommend first creating a copy of the files in /sparqlify-platform/config/example at a different location, then adjusting the parameters, and finally launching the platform with -DconfigDirectory=... set appropriately.

The platform applies autoconfiguration to Pubby and Snorql:

Additionally, you probably want nicer URIs, e.g. by configuring an Apache reverse proxy:

Enable the Apache proxy_http module:

sudo a2enmod proxy_http

Then in your /etc/apache2/sites-available/default add lines such as

ProxyRequests Off
ProxyPass /resource http://localhost:7531/pubby/bizer/bsbm/v01/ retry=1
ProxyPassReverse /resource http://localhost:7531/pubby/bizer/bsbm/v01/

These entries will enable requests to http://localhost/resource/... rather than http://localhost:7531/pubby/bizer/bsbm/v01/.

The retry=1 setting means that Apache waits only 1 second before retrying when it encounters an error (e.g. HTTP code 500) from the proxied resource.

IMPORTANT: ProxyRequests are off by default; DO NOT ENABLE THEM UNLESS YOU KNOW WHAT YOU ARE DOING. Simply enabling them potentially allows anyone to use your computer as a proxy.

SML Mapping Syntax:

A Sparqlification Mapping Language (SML) configuration is essentially a set of CREATE VIEW statements, somewhat similar to the CREATE VIEW statement from SQL. Probably the easiest way to learn the syntax is to look at the following resources:

Two more examples are from

Additionally, for convenience, prefixes can be declared, which are valid throughout the config file. As comments, you can use //, /* */, and #.

For a first impression, here is a quick example:

/* This is a comment
 * /* You can even nest them! */
 */
// Prefixes are valid throughout the file
Prefix dbp:<http://dbpedia.org/ontology/>
Prefix ex:<http://ex.org/>

Create View myFirstView As
    Construct {
        ?s a dbp:Person .
        ?s ex:workPage ?w .
    }
With
    ?s = uri('http://mydomain.org/person', ?id) // Define ?s to be a URI generated from the concatenation of a prefix with mytable's id-column.
    ?w = uri(?work_page) // ?w is assigned the URIs in the column 'work_page' of 'mytable'
Constrain
    ?w prefix "http://my-organization.org/user/" // Constraints can be used for optimization, e.g. to prune unsatisfiable join conditions
From
    mytable; // If you want to use an SQL query, the query (without trailing semicolon) must be enclosed in double square brackets: [[SELECT id, work_page FROM mytable]]
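
To make the meaning of the view more concrete, the following Python sketch computes the triples that myFirstView describes for a single row of mytable. This is only a conceptual illustration: Sparqlify rewrites SPARQL queries to SQL rather than iterating over rows like this, and the sample row and the helper names `my_first_view` and `to_ntriples` are hypothetical.

```python
DBP = "http://dbpedia.org/ontology/"
EX = "http://ex.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # what 'a' abbreviates


def my_first_view(row):
    """Return the (subject, predicate, object) IRIs the view yields for one row."""
    # ?s = uri('http://mydomain.org/person', ?id): prefix concatenated with the id
    s = "http://mydomain.org/person" + str(row["id"])
    # ?w = uri(?work_page): the column value is used as the IRI directly
    w = row["work_page"]
    return [
        (s, RDF_TYPE, DBP + "Person"),
        (s, EX + "workPage", w),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical sample row; its work_page matches the prefix in the Constrain clause
sample_row = {"id": 1, "work_page": "http://my-organization.org/user/alice"}
print(to_ntriples(my_first_view(sample_row)))
```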

Notes for sparqlify-csv

For sparqlify-csv, the view definition syntax is almost the same as above; the differences are:

Columns can be referenced either by name (see the -h option) or by index (1-based).
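
The two addressing modes can be sketched in Python as follows. The helper `resolve_column` is hypothetical and only illustrates the 1-based convention; it is not how sparqlify-csv is implemented.

```python
def resolve_column(header, row, ref):
    """Return the cell addressed by a column name or a 1-based index string."""
    if ref.isdigit():
        index = int(ref)
        if not 1 <= index <= len(row):
            raise IndexError(f"column index {index} out of range 1..{len(row)}")
        return row[index - 1]  # 1-based index, so subtract one
    return row[header.index(ref)]  # raises ValueError for unknown names


header = ["city_name", "country_name"]
row = ["Leipzig", "Germany"]
print(resolve_column(header, row, "2"))             # prints: Germany
print(resolve_column(header, row, "country_name"))  # prints: Germany
```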

Example

// Assume a CSV file with the following columns (osm stands for OpenStreetMap)
// (city_name, country_name, osm_entity_type, osm_id, longitude, latitude)

Prefix fn:<http://aksw.org/sparqlify/> //Needed for urlEncode and urlDecode.
Prefix rdfs:<http://www.w3.org/2000/01/rdf-schema#>
Prefix owl:<http://www.w3.org/2002/07/owl#>
Prefix xsd:<http://www.w3.org/2001/XMLSchema#>
Prefix geo:<http://www.w3.org/2003/01/geo/wgs84_pos#>

Create View Template geocode As
  Construct {
    ?cityUri
      owl:sameAs ?lgdUri .

    ?lgdUri
      rdfs:label ?cityLabel ;
      geo:long ?long ;
      geo:lat ?lat .
  }
  With
    ?cityUri = uri(concat("http://fp7-pp.publicdata.eu/resource/city/", fn:urlEncode(?2), "-", fn:urlEncode(?1)))
    ?cityLabel = plainLiteral(?1)
    ?lgdUri = uri(concat("http://linkedgeodata.org/triplify/", ?3, ?4))
    ?long = typedLiteral(?5, xsd:float)
    ?lat = typedLiteral(?6, xsd:float)
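
As a hedged illustration of what the geocode template computes, the following Python sketch evaluates the With-clause expressions for one CSV row. It addresses columns by the names listed in the example's header comment rather than by index. It also assumes that ?lgdUri is meant to combine the OSM entity type and id, and that fn:urlEncode behaves roughly like `urllib.parse.quote_plus`. The sample data and the helper name `geocode_row` are hypothetical.

```python
import csv
import io
from urllib.parse import quote_plus

# Column names as listed in the example's header comment
COLUMNS = ["city_name", "country_name", "osm_entity_type",
           "osm_id", "longitude", "latitude"]


def geocode_row(row):
    """Evaluate the With-clause expressions of the geocode template for one row."""
    city_uri = (
        "http://fp7-pp.publicdata.eu/resource/city/"
        + quote_plus(row["country_name"])  # assumption: approximates fn:urlEncode
        + "-"
        + quote_plus(row["city_name"])
    )
    # Assumption: entity type followed by id
    lgd_uri = ("http://linkedgeodata.org/triplify/"
               + row["osm_entity_type"] + row["osm_id"])
    return {
        "cityUri": city_uri,
        "cityLabel": row["city_name"],      # plainLiteral
        "lgdUri": lgd_uri,
        "long": float(row["longitude"]),    # typedLiteral(..., xsd:float)
        "lat": float(row["latitude"]),
    }


# Hypothetical sample data, not taken from a real dataset
SAMPLE_CSV = "Leipzig,Germany,node,123,12.37,51.34\n"

for record in csv.DictReader(io.StringIO(SAMPLE_CSV), fieldnames=COLUMNS):
    print(geocode_row(record))
```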