Awesome KGC Tools
Links to and descriptions of Knowledge Graph Construction (KGC) tools.
KGC Materializers
- Morph-KGC - R2RML, RML and RML-star processor to generate RDF and RDF-star knowledge graphs from heterogeneous data sources at scale.
- Chimera - Framework based on Apache Camel to define composable semantic data transformation pipelines (lifting/lowering to/from RDF)
- RMLMapper - The RMLMapper executes RML rules to generate high-quality Linked Data from multiple originally (semi-)structured data sources
- RMLStreamer - The RMLStreamer executes RML rules to generate high-quality Linked Data from multiple originally (semi-)structured data sources in a streaming way.
- xls2rdf - Converts Excel files containing a "magic line" into RDF.
- Morph-xR2RML - Implementation of the xR2RML mapping language (extending R2RML and reusing RML terms) for MongoDB databases. Can be used to map JSON data as well as any format that can be imported into MongoDB, in particular CSV/TSV. It has been used in different projects to produce 2.4 billion triples so far.
- SDM-RDFizer - An efficient scaled-up RML-compliant engine for knowledge graph construction from heterogeneous data sources.
- CARML - An extensible RML processor to generate RDF knowledge graphs from heterogeneous data sources.
- R2RML-F - An R2RML processor with support for functions in JavaScript. Allows one to transform the contents of CSV files as virtual relational tables.
- RocketRML - RML processor to generate RDF knowledge graphs from heterogeneous data sources, implemented in JavaScript.
- PyRML - Python-based engine for processing RML files.
- FlexRML - A Memory-Efficient Interpreter for RML written in C++.
- SPARQL Anything - SPARQL Anything is a tool for querying anything with SPARQL.
- Helio Ecosystem - A plugin-based framework for generating and manipulating RDF knowledge graphs with RML or a custom mapping language based on Freemarker.
KGC Virtualizers
- Ontop - Ontop is a platform to query relational databases as Virtual RDF Graphs using SPARQL (R2RML)
KGC Pre-processors
- MEL - (Metadata Extractor & Loader) - A tool to extract metadata (and textual content) from various file formats, as JSON objects.
- Dragoman - An efficient RML+FnO-compliant engine that translates and executes complex functions in RML mapping rules and transforms the data integration system into a function-free one.
- EABlock - A computational block to solve entity alignment over textual attributes in a knowledge graph creation pipeline.
- FunMap - Efficient preprocessing of transformation rules described in RML+FnO mappings.
- Excel in RML - RMLMapper extension to support Excel spreadsheets.
NLP for KGC
- TNNT - (The NLP/NER Toolkit) - A tool that automates the extraction of categorised named entities from the unstructured information encoded in the source documents, using diverse NLP tools and NER models.
Mapping Specifications
- RML by KG Construction W3C Community Group - Modular redesign of the RML mapping language including support for collections and containers, input/output, rdf-star, and functions.
- YARRRML - YARRRML is a human-readable text-based representation for declarative generation rules.
- J2RM - J2RM mappings and their engine together form a tool for mapping JSON data to RDF triples, guided by an OWL2 ontology structure.
- xls2rdf - The documentation for the "magic line" of the xls2rdf converter.
- xR2RML - xR2RML is a language for expressing customized mappings from various types of databases (XML, object-oriented, NoSQL) to RDF datasets.
Previous RML version (and extensions)
- RML - The RDF Mapping Language (RML) is a mapping language defined to express customized mapping rules from heterogeneous data structures and serializations to the RDF data model.
- Target in RML - Alignment between RML and Target to describe how your knowledge graph should be exported to one or multiple targets.
- DataIO - Target, a formal model and a common representation for specifying how a Knowledge Graph should be exported to a given target
- FnO - Function Ontology (FnO), a way to semantically declare and describe implementation-independent functions, and their relations to related concepts such as parameters, outputs, related problems, algorithms, mappings to concrete implementations, and executions.
Mapping Editors
- JUMA - Jigsaw Puzzles for Representing Mappings
- Mapeathor - Definition of Excel-based mappings and translation to [R2]RML mappings.
- Matey - Matey is a web-based editor for YARRRML rules.
- RMLEditor - RMLEditor offers a Graphical User Interface to enable data publishers, who are domain experts, to model knowledge derived from heterogeneous distributed data.
- RMLx Visual Editor - A web-based editor for RML rules.
- Square - SPARQL Queries and R2RML mappings Environment
- Map-On - A web-based editor for visual ontology mapping for R2RML documents (DEPRECATED)
- Karma - A web-based editor for visually creating R2RML mappings in order to create RDF from databases, spreadsheets, delimited text files, XML, JSON, KML and Web APIs according to an ontology of the user's choice.
- Helio Playground - A web-based playground to edit and use RML mappings and custom Freemarker-based mappings.
Mapping Translators
- YARRRML-parser - JavaScript engine that translates from YARRRML/RML/R2RML to YARRRML/RML/R2RML
- YATTER - Python engine, translating from YARRRML/RML/R2RML to YARRRML/RML/R2RML with support for RML-star and easy-to-read outputs
- Mapeathor - From Excel-based mappings to [R2]RML mappings.
Mapping Generators
- Spread2RML - Suggests RML mappings on messy spreadsheets.
- OWL2YARRRML - Generates a mapping template in YARRRML given an ontology.
KGC Pipelines
- KGCP - "KG Construction Pipeline" - A suite of software artifacts to automate the creation of KGs from heterogeneous data sources.
KG Subgraph Extractors
- KGPrune - An API and Web application for extracting subgraphs of interest from Wikidata based on user-input seed entities, to bootstrap a new KG.
KGC Evaluation
- KROWN - A Benchmark for RDF Graph Materialization
- Data Sprout - Excel spreadsheet generator for evaluating KG construction.
- GTFS-Madrid-Bench - Benchmark to evaluate performance & scalability of declarative KG construction engines.
- SDM-Genomics - Dataset to test simple and complex mapping operations in RML.
- LUBM4OBDA - OBDA benchmark for inference and meta knowledge evaluation.