basex-rdf

RDF parsing for XQuery (in BaseX)

Overview and status

This is an extension module for parsing RDF data with the BaseX XQuery processor. It provides an XQuery wrapper for a Java parser generated by Gunther Rademacher's REx Parser Generator. The parser was generated from the EBNF for the TriG serialization of RDF, which extends the Turtle serialization (itself a superset of N-Triples) with a syntax for encoding named graphs. As a result, the parser generates XML parse trees for any of these three serializations. The raw parse trees are not particularly useful on their own, however, and need to be normalized before being further processed and queried. Internally, the normalization stylesheet maps Turtle, N-Triples, and TriG to a common XML serialization format.

Modeling and schema design possibilities for the normalized XML representation of the parsed RDF data are still under consideration, and current approaches are subject to change.

Dependencies

Packaging

Installation

See the BaseX wiki for detailed documentation about installing and using BaseX.

Usage

Currently, basex-rdf includes an XQuery library module, basex-rdf.xqm, and two XSLT stylesheets, process.xsl and postprocess.xsl. The first stylesheet (process.xsl) normalizes the raw XML parse tree, and the second (postprocess.xsl) exposes some simple abstractions for querying the normalized RDF data. The XQuery library module acts as a controller that calls the stylesheets.

Namespaces

The parser component of the combined Java/XQuery module is bound to the following namespace:

http://basex.org/modules/rdf/Graphs (here bound to the prefix "graphs")

The XQuery library module takes the following namespace:

https://metadatafram.es/basex/modules/rdf/graphs/ (here bound to the prefix "basex-rdf")
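
As a rough sketch, the two namespaces might be declared in a query prolog as shown here. It assumes that both the Java parser and the XQuery library module have been installed in the BaseX repository, so that they can be imported by namespace URI alone; otherwise an "at" location hint would be needed. The one-statement N-Triples input is made up for illustration.

```xquery
xquery version "3.1";

(: Assumption: both modules are installed in the BaseX repository. :)
import module namespace graphs = "http://basex.org/modules/rdf/Graphs";
import module namespace basex-rdf = "https://metadatafram.es/basex/modules/rdf/graphs/";

(: Hypothetical one-statement N-Triples input, for illustration only. :)
let $ntriples := '<http://example.org/a> <http://example.org/b> "c" .'

(: Returns the raw, not yet normalized XML parse tree. :)
return graphs:parse($ntriples)
```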

Functions

graphs:parse(xs:string)
  Parses a string of RDF data and returns a raw XML parse tree.
basex-rdf:transform(xs:string)
  Accepts a string of RDF data and calls graphs:parse() to obtain an XML parse tree.
  Passes the parse tree to the process.xsl stylesheet.
  Returns a normalized XML document of RDF statements.
basex-rdf:pass-options(element(options))
  Helper function. Accepts a set of options in an <options> element.
  Passes the options to the basex-rdf:query() function.
basex-rdf:query(document-node(), element(options))
  Accepts the normalized RDF document and a set of query options.
  Calls the postprocess.xsl stylesheet to query the data.
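
The functions above might be chained as in the following minimal sketch, which is not taken from the module's own examples. It assumes the library module is installed in the BaseX repository and that the result of basex-rdf:transform() can be passed directly as the first argument of basex-rdf:query(). The Turtle sample and the ex: prefix are hypothetical.

```xquery
xquery version "3.1";

(: Assumption: the module is installed in the BaseX repository. :)
import module namespace basex-rdf = "https://metadatafram.es/basex/modules/rdf/graphs/";

(: Hypothetical Turtle input, joined line by line for readability. :)
let $turtle := string-join((
  '@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .',
  '@prefix ex: <http://example.org/> .',
  'ex:alice rdf:type ex:Person .'
), '&#10;')

(: Query pattern: any subject, rdf:type as the predicate, any object. :)
let $options :=
  <options>
    <subject/>
    <verb>rdf:type</verb>
    <object/>
  </options>

(: Parse and normalize the RDF string. :)
let $statements := basex-rdf:transform($turtle)

(: Assumption: $statements is acceptable where a document-node() is expected. :)
return basex-rdf:query($statements, $options)
```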

Getting started

The basex-rdf:query function implements a set of simple abstractions for navigating RDF graphs, loosely based on the navigation functions implemented by the RDFLib Python package.

This function is suitable for querying small datasets. For anything larger, it is advisable to store the normalized data in a BaseX database and rely on BaseX's built-in value indexes for text and attributes.
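
One way to get indexed storage, sketched here under assumptions, is to write the normalized output into a BaseX database with the text and attribute indexes switched on. The database name "rdf-data", the path "statements.xml", and the sample input are hypothetical; db:create() belongs to the standard BaseX Database Module, not to basex-rdf.

```xquery
xquery version "3.1";

(: Assumption: the module is installed in the BaseX repository. :)
import module namespace basex-rdf = "https://metadatafram.es/basex/modules/rdf/graphs/";

(: Hypothetical one-statement N-Triples input. :)
declare variable $rdf as xs:string :=
  '<http://example.org/a> <http://example.org/b> "c" .';

(: Updating query: creates (or replaces) the database "rdf-data" and
   stores the normalized statements with both value indexes enabled. :)
db:create(
  'rdf-data',
  basex-rdf:transform($rdf),
  'statements.xml',
  map { 'textindex': true(), 'attrindex': true() }
)
```

Later queries against the database can then be written as ordinary XPath over the stored document, with the BaseX optimizer free to rewrite suitable comparisons to index lookups.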

The basex-rdf.xq main module provides a simple example of querying parsed RDF data in BaseX.

The options element can be used to declare basic graph query patterns. For example:

<options>
  <subject></subject>
  <verb>rdf:type</verb>
  <object></object>  
</options>

can be used to list all subject-object pairs linked by the rdf:type predicate. Note the use of the qualified name in the predicate; prefixes must match the ones declared in the RDF source data. Alternatively, a full IRI (e.g., http://www.w3.org/1999/02/22-rdf-syntax-ns#type) may be used in place of the prefixed name.
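
Patterns like this can also be built programmatically. The helper below, local:pattern(), is a hypothetical convenience function and not part of basex-rdf; it is plain XQuery and only constructs the <options> element. It assumes, following the example above, that an empty element leaves that position of the pattern unconstrained.

```xquery
xquery version "3.1";

(: Hypothetical helper: builds an <options> element from three optional
   strings. An empty sequence yields an empty element. :)
declare function local:pattern(
  $subject as xs:string?,
  $verb    as xs:string?,
  $object  as xs:string?
) as element(options) {
  <options>
    <subject>{ $subject }</subject>
    <verb>{ $verb }</verb>
    <object>{ $object }</object>
  </options>
};

(: Same pattern as above, using the full IRI instead of rdf:type. :)
local:pattern((), 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', ())
```

The resulting element can be passed as the second argument of basex-rdf:query().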

TODO

License

The basex-rdf module is licensed under GPLv3. Parsers generated by the REx Parser Generator are supplied under the Apache 2.0 license.