Tabular Data RDF Reader and JSON serializer

CSV reader for RDF.rb and full JSON serializer.

Features

RDF::Tabular parses CSV or other Tabular Data into RDF and JSON using the W3C CSVW specifications, currently undergoing development.

Installation

Install with gem install rdf-tabular

Description

RDF::Tabular parses CSVs, TSVs, and potentially other tabular data formats. Using rules defined for W3C CSVW, it can also parse metadata files (in JSON-LD format) to find a set of tabular data files, or locate a metadata file given a CSV.
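
As a rough illustration of locating a metadata file from a CSV, the sketch below builds two candidate URLs. It assumes the default patterns named in the CSVW data model (`{+url}-metadata.json` and `csv-metadata.json`). The helper name `candidate_metadata_urls` and the example.org URL are hypothetical, and the sketch does not claim to mirror what RDF::Tabular does internally.

```ruby
require 'uri'

# Hypothetical helper: lists candidate metadata locations for a CSV URL.
# Assumption: the two default patterns from the CSVW data model apply.
def candidate_metadata_urls(csv_url)
  base = URI(csv_url)
  [
    URI("#{csv_url}-metadata.json"),   # file-specific metadata next to the CSV
    base + "csv-metadata.json"         # directory-level metadata, resolved against the CSV URL
  ].map(&:to_s)
end

urls = candidate_metadata_urls("http://example.org/data/mycsv.csv")
puts urls
```

A processor would typically try each candidate in order and use the first one that actually describes the CSV.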

Metadata can then provide datatypes for the columns, express foreign key relationships, and associate subjects and predicates with columns. An example metadata file for the project DOAP description is:

{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "doap.csv",
  "tableSchema": {
    "aboutUrl": "https://rubygems.org/gems/rdf-tabular",
    "propertyUrl": "http://usefulinc.com/ns/doap#{_name}",
    "null": "",
    "columns": [
      {"titles": "name"},
      {"titles": "type", "propertyUrl": "rdf:type", "valueUrl": "{+type}"},
      {"titles": "homepage", "valueUrl": "{+homepage}"},
      {"titles": "license", "valueUrl": "{+license}"},
      {"titles": "shortdesc", "lang": "en"},
      {"titles": "description", "lang": "en"},
      {"titles": "created", "datatype": {"base": "date", "format": "M/d/yyyy"}},
      {"titles": "programming_language", "propertyUrl": "http://usefulinc.com/ns/doap#programming-language"},
      {"titles": "implements", "valueUrl": "{+implements}"},
      {"titles": "category", "valueUrl": "{+category}"},
      {"titles": "download_page", "propertyUrl": "http://usefulinc.com/ns/doap#download-page", "valueUrl": "{+download_page}"},
      {"titles": "mailing_list", "propertyUrl": "http://usefulinc.com/ns/doap#mailing-list", "valueUrl": "{+mailing_list}"},
      {"titles": "bug_database", "propertyUrl": "http://usefulinc.com/ns/doap#bug-database", "valueUrl": "{+bug_database}"},
      {"titles": "blog", "valueUrl": "{+blog}"},
      {"titles": "developer", "valueUrl": "{+developer}"},
      {"titles": "maintainer", "valueUrl": "{+maintainer}"},
      {"titles": "documenter", "valueUrl": "{+documenter}"},
      {"titles": "maker", "propertyUrl": "foaf:maker", "valueUrl": "{+maker}"},
      {"titles": "dc_title", "propertyUrl": "dc:title"},
      {"titles": "dc_description", "propertyUrl": "dc:description", "lang": "en"},
      {"titles": "dc_date", "propertyUrl": "dc:date", "datatype": {"base": "date", "format": "M/d/yyyy"}},
      {"titles": "dc_creator", "propertyUrl": "dc:creator", "valueUrl": "{+dc_creator}"},
      {"titles": "isPartOf", "propertyUrl": "dc:isPartOf", "valueUrl": "{+isPartOf}"}
    ]
  }
}

This associates the metadata with the CSV doap.csv and creates a common subject for all rows in the file. It also sets a common predicate using the URI Template http://usefulinc.com/ns/doap#{_name}, which uses the name of each column (defaulted from titles) to construct a URI in the DOAP vocabulary. Object URIs for object-valued properties are constructed from the contents of the column cells via valueUrl. In some cases, the predicate is changed on a per-column basis by setting a different propertyUrl on that column.
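
To make the template expansion concrete, here is a minimal sketch that handles only the two forms used in the metadata above, `{name}` and `{+name}`. The helper `expand_template` is hypothetical and is not a full RFC 6570 implementation. It assumes that `{+name}` cells already hold valid URIs and passes them through unchanged. Note the single-quoted strings: in a double-quoted Ruby string, `#{_name}` would be treated as interpolation.

```ruby
require 'erb'

# Hypothetical helper covering only "{name}" and "{+name}" expressions.
def expand_template(template, vars)
  template.gsub(/\{(\+?)(\w+)\}/) do
    reserved, name = $1, $2
    value = vars.fetch(name, "").to_s
    # "{+name}" keeps reserved characters; "{name}" percent-encodes them.
    reserved == "+" ? value : ERB::Util.url_encode(value)
  end
end

# Predicate for the "homepage" column, from the table-level propertyUrl.
predicate = expand_template('http://usefulinc.com/ns/doap#{_name}', "_name" => "homepage")

# Object URI for the same column, from its valueUrl and the cell value.
object = expand_template('{+homepage}', "homepage" => "https://ruby-rdf.github.io/rdf-tabular")

puts predicate
puts object
```

With these inputs the predicate is the DOAP homepage property and the object is the cell value unchanged, which matches the doap:homepage triple in the Turtle output.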

This results in the following Turtle:

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix doap: <http://usefulinc.com/ns/doap#> .
@prefix earl: <http://www.w3.org/ns/earl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://rubygems.org/gems/rdf-tabular> a doap:Project,
    earl:TestSubject,
    earl:Software;
  dc:title "RDF::Tabular";
  dc:creator <http://greggkellogg.net/foaf#me>;
  dc:date "2015-01-05"^^xsd:date;
  dc:description "RDF::Tabular processes tabular data with metadata creating RDF or JSON output."@en;
  dc:isPartOf <https://rubygems.org/gems/rdf>;
  doap:blog <http://greggkellogg.net/>;
  doap:bug-database <https://github.com/ruby-rdf/rdf-tabular/issues>;
  doap:category <http://dbpedia.org/resource/Resource_Description_Framework>,
    <http://dbpedia.org/resource/Ruby_(programming_language)>;
  doap:created "2015-01-05"^^xsd:date;
  doap:description "RDF::Tabular processes tabular data with metadata creating RDF or JSON output."@en;
  doap:developer <http://greggkellogg.net/foaf#me>;
  doap:documenter <http://greggkellogg.net/foaf#me>;
  doap:download-page <https://rubygems.org/gems/rdf-tabular>;
  doap:homepage <https://ruby-rdf.github.io/rdf-tabular>;
  doap:implements <http://www.w3.org/TR/tabular-data-model/>,
    <http://www.w3.org/TR/tabular-metadata/>,
    <http://www.w3.org/TR/csv2rdf/>,
    <http://www.w3.org/TR/csv2json/>;
  doap:license <https://unlicense.org/1.0/>;
  doap:mailing-list <http://lists.w3.org/Archives/Public/public-rdf-ruby/>;
  doap:maintainer <http://greggkellogg.net/foaf#me>;
  doap:name "RDF::Tabular";
  doap:programming-language "Ruby";
  doap:shortdesc "Tabular Data RDF Reader and JSON serializer."@en;
  foaf:maker <http://greggkellogg.net/foaf#me> .

 [
    a csvw:TableGroup;
    csvw:table [
      a csvw:Table;
      csvw:row [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 1;
        csvw:url <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=2>
      ], [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 2;
        csvw:url <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=3>
      ], [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 3;
        csvw:url <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=4>
      ], [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 4;
        csvw:url <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=5>
      ];
      csvw:url <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv>
    ];
    prov:wasGeneratedBy [
      a prov:Activity;
      prov:endedAtTime "2022-04-20T12:45:20.616-07:00"^^xsd:dateTime;
      prov:qualifiedUsage [
        a prov:Usage;
        prov:entity <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv>;
        prov:hadRole csvw:csvEncodedTabularData
      ], [
        a prov:Usage;
        prov:entity <https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv-metadata.json>;
        prov:hadRole csvw:tabularMetadata
      ];
      prov:startedAtTime "2022-04-20T12:45:20.351-07:00"^^xsd:dateTime;
      prov:wasAssociatedWith <https://rubygems.org/gems/rdf-tabular>
    ]
  ] .
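
The xsd:date literals above come from columns declared with the format M/d/yyyy. The sketch below shows one way such a cell could be normalized to the xsd:date lexical form using only the Ruby standard library. The cell value 1/5/2015 is an assumption, since the CSV contents are not shown here, and the mapping of M/d/yyyy to the strptime pattern is an approximation for this one format rather than a general date-pattern translator.

```ruby
require 'date'

# Assumed cell value; the actual CSV is not reproduced in this document.
cell = "1/5/2015"

# "M/d/yyyy" (month, day, four-digit year) roughly corresponds to "%m/%d/%Y".
parsed = Date.strptime(cell, "%m/%d/%Y")

# xsd:date uses the ISO 8601 calendar date form.
lexical = parsed.iso8601
puts lexical  # => 2015-01-05
```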

Provenance and table-source information can be excluded by passing the :minimal option to the reader.

It can also generate JSON output (not complete JSON-LD, but compatible with it), using the RDF::Tabular::Reader#to_json method:

{
  "tables": [{
    "url": "https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv",
    "row": [{
      "url": "https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=2",
      "rownum": 1,
      "describes": [{
        "@id": "https://rubygems.org/gems/rdf-tabular",
        "http://usefulinc.com/ns/doap#name": "RDF::Tabular",
        "@type": "http://usefulinc.com/ns/doap#Project",
        "http://usefulinc.com/ns/doap#homepage": "https://ruby-rdf.github.io/rdf-tabular",
        "http://usefulinc.com/ns/doap#license": "https://unlicense.org/1.0/",
        "http://usefulinc.com/ns/doap#shortdesc": "Tabular Data RDF Reader and JSON serializer.",
        "http://usefulinc.com/ns/doap#description": "RDF::Tabular processes tabular data with metadata creating RDF or JSON output.",
        "http://usefulinc.com/ns/doap#created": "2015-01-05",
        "http://usefulinc.com/ns/doap#programming-language": "Ruby",
        "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-data-model/",
        "http://usefulinc.com/ns/doap#category": "http://dbpedia.org/resource/Resource_Description_Framework",
        "http://usefulinc.com/ns/doap#download-page": "https://rubygems.org/gems/rdf-tabular",
        "http://usefulinc.com/ns/doap#mailing-list": "http://lists.w3.org/Archives/Public/public-rdf-ruby/",
        "http://usefulinc.com/ns/doap#bug-database": "https://github.com/ruby-rdf/rdf-tabular/issues",
        "http://usefulinc.com/ns/doap#blog": "http://greggkellogg.net/",
        "http://usefulinc.com/ns/doap#developer": "http://greggkellogg.net/foaf#me",
        "http://usefulinc.com/ns/doap#maintainer": "http://greggkellogg.net/foaf#me",
        "http://usefulinc.com/ns/doap#documenter": "http://greggkellogg.net/foaf#me",
        "foaf:maker": "http://greggkellogg.net/foaf#me",
        "dc:title": "RDF::Tabular",
        "dc:description": "RDF::Tabular processes tabular data with metadata creating RDF or JSON output.",
        "dc:date": "2015-01-05",
        "dc:creator": "http://greggkellogg.net/foaf#me",
        "dc:isPartOf": "https://rubygems.org/gems/rdf"
      }]
    }, {
      "url": "https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=3",
      "rownum": 2,
      "describes": [{
        "@id": "https://rubygems.org/gems/rdf-tabular",
        "@type": "http://www.w3.org/ns/earl#TestSubject",
        "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-metadata/",
        "http://usefulinc.com/ns/doap#category": "http://dbpedia.org/resource/Ruby_(programming_language)"
      }]
    }, {
      "url": "https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=4",
      "rownum": 3,
      "describes": [{
        "@id": "https://rubygems.org/gems/rdf-tabular",
        "@type": "http://www.w3.org/ns/earl#Software",
        "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/csv2rdf/"
      }]
    }, {
      "url": "https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv#row=5",
      "rownum": 4,
      "describes": [{
        "@id": "https://rubygems.org/gems/rdf-tabular",
        "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/csv2json/"
      }]
    }]
  }]
}
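
Because each row carries its own describes entry, a consumer that wants one record per subject has to merge rows itself. The following sketch does that with the standard JSON library over an abbreviated copy of the output above. The variable names and the choice to collect every value into an array are illustrative assumptions, not part of RDF::Tabular.

```ruby
require 'json'

# Abbreviated copy of the JSON output shown above (two rows, one property).
tabular_json = <<~JSON
  {"tables": [{"row": [
    {"rownum": 1, "describes": [{"@id": "https://rubygems.org/gems/rdf-tabular",
      "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-data-model/"}]},
    {"rownum": 2, "describes": [{"@id": "https://rubygems.org/gems/rdf-tabular",
      "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-metadata/"}]}
  ]}]}
JSON

# subject id => property => list of values, across all rows
merged = Hash.new { |h, id| h[id] = Hash.new { |p, key| p[key] = [] } }

JSON.parse(tabular_json)["tables"].each do |table|
  table["row"].each do |row|
    row["describes"].each do |node|
      id = node["@id"]
      node.each do |key, value|
        next if key == "@id"
        merged[id][key] << value
      end
    end
  end
end

puts JSON.pretty_generate(merged)
```

Collecting values into arrays keeps repeated properties such as doap#implements intact instead of letting later rows overwrite earlier ones.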

Tutorials

Command Line

When the linkeddata gem is installed, RDF.rb includes an rdf executable which acts as a wrapper to perform a number of different operations on RDF files using available readers and writers, including RDF::Tabular. The command specific to RDF::Tabular is tabular-json, which parses the CSV file and emits the data as JSON.

To use RDF::Tabular specific features, you must use the --input-format tabular option to the rdf executable.

Other rdf commands and options treat CSV as a standard RDF format.

Example usage:

rdf serialize https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv \
  --output-format ttl
rdf tabular-json --input-format tabular https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv
rdf validate https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv --validate

Note that the --validate option must be used with the validate (or other) command to detect parse-time errors in addition to validating any resulting RDF triples.

RDF Reader

RDF::Tabular also acts as a normal RDF reader, using the standard RDF.rb Reader interface:

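require 'rdf/tabular'

# Load the CSV as RDF; minimal: true leaves out table and provenance information
graph = RDF::Graph.load("etc/doap.csv", minimal: true)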

Documentation

Full documentation is available on RubyDoc.

Principal Classes

Dependencies

Installation

The recommended installation method is via RubyGems. To install the latest official release of the RDF::Tabular gem, do:

% [sudo] gem install rdf-tabular

Mailing List

Author

Contributing

License

This is free and unencumbered public domain software. For more information, see https://unlicense.org/ or the accompanying UNLICENSE file.