basex-rdf
RDF parsing for XQuery (in BaseX)
Overview and status
This is an extension module for parsing RDF data with the BaseX XQuery processor. It provides an XQuery wrapper for a Java parser generated by Gunther Rademacher's REx Parser Generator. The parser was generated from the EBNF for the TriG serialization of RDF, which provides a syntax for encoding named graphs, as an extension of the Turtle and N-Triples serializations. The parser itself will generate XML parse trees for any of these three serializations. However, the raw parse trees are not particularly useful and need to be normalized before being further processed and queried. Internally, the normalization stylesheet maps Turtle, N-Triples, and TriG to a common XML serialization format.
Modeling and schema design possibilities for the normalized XML representation of the parsed RDF data are still under consideration, and current approaches are subject to change.
Dependencies
- BaseX 9.0 (currently in beta; see latest development snapshot)
- Saxon-HE 9.8.x (download from SourceForge)
Packaging
- By default, the BaseX 9.0 combined packaging feature is used. This (beta) feature optimizes the packaging of Java extension code in BaseX.
- Alternative packaging formats, such as the EXPath packaging model, may be supported in the future (although the package would still depend on BaseX).
Installation
See the BaseX wiki for detailed documentation about installing and using BaseX.
- Once BaseX has been downloaded, the easiest way to add the basex-rdf module is by launching the BaseX GUI. From the Options menu, select Packages and install the Graphs.jar file from the repo directory of the basex-rdf repository.
- Before executing functions from the module, ensure that the Saxon-HE 9.8.x JAR file is saved in the lib subdirectory of the BaseX installation directory.
Usage
Currently, basex-rdf includes an XQuery library module, basex-rdf.xqm, and two XSLT stylesheets, process.xsl and postprocess.xsl. The first stylesheet (process.xsl) normalizes the raw XML parse tree, and the second (postprocess.xsl) exposes some simple abstractions for querying the RDF data. The XQuery library module acts as a controller for calling the stylesheets.
Namespaces
The parser component of the combined Java/XQuery module is bound to the following namespace:
http://basex.org/modules/rdf/Graphs
(here bound to the prefix "graphs")
The XQuery library module takes the following namespace:
https://metadatafram.es/basex/modules/rdf/graphs/
(here bound to the prefix "basex-rdf")
Functions
graphs:parse(xs:string)
Parses RDF data as a string and returns an XML parse tree.
basex-rdf:transform(xs:string)
Accepts a string of RDF data and calls graphs:parse() to return an XML parse tree.
Passes the parsed data to the process.xsl stylesheet.
Returns a normalized XML document of RDF statements.
basex-rdf:pass-options(element(options))
Helper function. Accepts a set of options in an <options> element.
Passes the options to the basex-rdf:query() function.
basex-rdf:query(document-node(), element(options))
Accepts the normalized RDF document and query options.
Calls the postprocess.xsl stylesheet to query the data.
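As a minimal sketch, the parsing and normalization functions listed above might be called as follows. It assumes that Graphs.jar has already been installed in the BaseX repository and that basex-rdf.xqm sits next to the query file; that location hint and the Turtle sample are illustrative assumptions, not taken from the repository.

```xquery
xquery version "3.1";

(: Assumption: Graphs.jar is installed in the BaseX repository, so the
   parser namespace resolves without a location hint. :)
import module namespace graphs = "http://basex.org/modules/rdf/Graphs";

(: Assumption: the relative path to basex-rdf.xqm is hypothetical;
   adjust it to wherever the library module is stored. :)
import module namespace basex-rdf =
  "https://metadatafram.es/basex/modules/rdf/graphs/" at "basex-rdf.xqm";

(: Illustrative Turtle input. :)
let $turtle := "
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://example.org/> .
ex:alice rdf:type ex:Person .
"
return (
  (: Raw XML parse tree produced by the generated parser. :)
  graphs:parse($turtle),
  (: Normalized XML document produced via process.xsl. :)
  basex-rdf:transform($turtle)
)
```

Comparing the two results side by side is a quick way to see what the normalization step changes. Since the normalized format is still subject to change, the exact output shape should not be relied upon.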
Getting started
The basex-rdf:query function implements a set of simple abstractions for navigating RDF graphs, loosely based on the navigation functions implemented by the RDFLib Python package.
This function is suitable for querying small datasets. For anything larger, relying on BaseX's internal value indexes for text and attributes is advisable.
The basex-rdf.xq main module provides a simple example of querying parsed RDF data in BaseX.
The options element can be used to declare basic graph query patterns. For example:
<options>
  <subject></subject>
  <verb>rdf:type</verb>
  <object></object>
</options>
can be used to list all subject-object pairs linked by the rdf:type predicate. Note the use of the qualified name in the predicate; prefixes must match the ones declared in the RDF source data. Alternatively, the full, unprefixed IRI (e.g., http://www.w3.org/1999/02/22-rdf-syntax-ns#type) may be used instead.
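The following sketch shows how such an options element might be passed to basex-rdf:query(). It assumes that basex-rdf:transform() returns a document node that can be passed to basex-rdf:query() directly, and that basex-rdf.xqm is located next to the query file; the location hint and the sample data are hypothetical.

```xquery
xquery version "3.1";

(: Assumption: the relative path to basex-rdf.xqm is hypothetical. :)
import module namespace basex-rdf =
  "https://metadatafram.es/basex/modules/rdf/graphs/" at "basex-rdf.xqm";

(: Illustrative Turtle input; the rdf prefix declared here is the one
   that the query pattern below has to match. :)
let $turtle := "
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://example.org/> .
ex:alice rdf:type ex:Person .
ex:bob rdf:type ex:Person .
"

(: Empty subject and object, as in the pattern described above. :)
let $options :=
  <options>
    <subject></subject>
    <verb>rdf:type</verb>
    <object></object>
  </options>

(: Normalize first, then query the normalized document. :)
let $normalized := basex-rdf:transform($turtle)
return basex-rdf:query($normalized, $options)
```

The basex-rdf.xq main module in the repository remains the authoritative example of how the query functions are meant to be called.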
TODO
- Incorporate test suite using XSpec
- Back-convert from internal XML representation to Turtle, etc.
- Map internal XML representations to JSON-LD
- Implement internal transformation as XQuery rather than XSLT
- Add further examples, etc.
License
The basex-rdf module is licensed under GPLv3. Parsers generated by the REx Parser Generator are supplied under the Apache 2.0 license.