<div align="center"> <a href="https://github.com/Materials-Data-Science-and-Informatics/awesome-fair/"><img width="363" height="246" src="img/awesome_fair_data_logo.png" alt="pyds"></a> <br> <br> <br> </div>
Awesome FAIR
<div align="center"><a href="https://github.com/sindresorhus/awesome"> <img src="https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg" alt="Awesome" border="0"> </a> </div> </br>A list, curated by the Helmholtz Metadata Collaboration (HMC), of awesome resources on the FAIR principles for (scientific) data, i.e. that data is findable, accessible, interoperable and reusable. The list is organized by the use cases of data producers, data users, data curators and data providers. 'FAIR' is not the same as 'open', but there is overlap.
Contents
- Resources about the FAIR principles
- FAIR assessment
- Organizations and Communities
- Metadata formats and standards
- Ontology services
- Finding datasets and software
- Software and software publications
- Provenance tracking
- Metadata management
- Your own repository setup
- Awesome meta data sources
- Related lists
Resources about the FAIR principles
-
Barend Mons article in Nature 578, 491 (2020) - Proposition to invest 5% of research funds in ensuring data are reusable.
-
Cost of not having FAIR research data - A 2018 European Commission cost-benefit analysis for FAIR research data (written by PwC EU Services).
-
The FAIR Guiding Principles for scientific data management and stewardship - This Comment in Sci Data is the first formal publication of the FAIR Principles (2016).
-
GO FAIR Zotero Library - Nice collection of publications around the FAIR principles.
FAIR Digital Object and related projects
-
DataPLANT ARC Tool Talk - NFDI4plants interpretation of the FDO based on GitHub repository and RO Crate.
-
DONA Suggested Reading - The history of the Digital Object Architecture (DOA), going back to the 1980s.
-
FAIR Digital Objects Forum - General platform for discussions on the advancement and development of FAIR Digital Objects.
-
FAIR Digital Object Framework - A WIP specification for an FDO infrastructure based on linked data / RDF.
-
FAIR DO publications - Relevant publications (concept papers and specs) by RDA working groups on FDOs.
-
RO Crate - Pragmatic approach combining existing technologies and ontologies into a metadata standard for annotating scientific datasets.
-
PID services registry - A searchable registry for PID services.
FAIR assessment
-
FAIR Evaluation Services - A FAIR assessment tool from FAIRsharing, code.
-
F-UJI - An (online) tool that provides a FAIR score for a given PID, based on metrics created by FAIRsFAIR, code.
Organizations and Communities
-
EuDat - Collaborative European data infrastructure.
-
FAIRsharing - A curated resource on data and metadata standards, inter-related to databases and data policies.
-
Research Data Alliance - International organization and communication platform for establishing standards and recommendations concerning research data publication.
-
The Turing Way - Handbook and community for reproducible, ethical and collaborative data science.
Metadata formats and standards
-
DataCite - Metadata schema developed by an international community, with increasing adoption by repositories.
-
Data Catalog (DCAT) - RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.
-
Dublin Core Metadata Initiative Terms - The Dublin Core Metadata Element Set is a set of fifteen "core" elements for describing resources.
-
JSON LD Playground - Convert JSON-LD data between various representations.
-
JSON Schema - Standard for describing structural constraints so that JSON documents can be validated against them.
-
Provenance Primer (PROV) - This primer document provides an accessible introduction to the PROV data model for provenance interchange on the Web.
-
Resource Description Framework (RDF) - RDF is a standard model for data interchange on the Web.
-
Schema.org - Well-established and industry-accepted vocabulary providing semantics for common entities like Person, Organization, Dataset, etc.
-
SKOS - The Simple Knowledge Organization System (SKOS) is a common data model for sharing and linking knowledge organization systems via the Semantic Web.
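To make the vocabulary entries above more concrete, here is a minimal sketch of a dataset description that uses Schema.org terms in a JSON-LD document. The helper name `dataset_jsonld` and all values (title, creator, DOI) are made-up placeholders for illustration; which properties a given repository or search engine actually expects is not covered here.

```python
import json


def dataset_jsonld(name, description, creator_name, license_url, identifier):
    """Build a Schema.org `Dataset` description as a JSON-LD document (a plain dict).

    Hypothetical helper: only a handful of common Schema.org properties are used.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "license": license_url,
        "creator": {"@type": "Person", "name": creator_name},
    }


# All values below are placeholders, not a real dataset or DOI.
doc = dataset_jsonld(
    name="Example tensile test measurements",
    description="Hypothetical dataset used to illustrate the markup.",
    creator_name="Jane Doe",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    identifier="https://doi.org/10.1234/example",
)

# Serialize to JSON; the result can be pasted into the JSON LD Playground.
print(json.dumps(doc, indent=2))
```

The serialized output can be explored with the JSON LD Playground listed above, for example to see its expanded form.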
Ontology services
-
Ontobee - A linked ontology data server to support ontology term dereferencing, linkage, query and integration. See also this publication.
-
Ontology Lookup Service - OLS is a repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions.
Related semantics lists
Also see
- awesome-ontology - A curated list of ontology things.
- awesome-semantic-tools - List of projects related to Ontology engineering and Semantic Web technologies.
Finding datasets and software
-
DataCite Commons - Search through the metadata indexed by DataCite.
-
EuDat B2find - Search through metadata of datasets accumulated by EuDat.
-
Microsoft Academic - Search through a PID graph created by Microsoft (shut down at the end of 2021).
-
OpenAIRE explorer - Search through the metadata indexed by OpenAIRE.
-
Schole explorer - A data-literature interlinking service (formerly Scholix) that indexes links between data and journal publications. It also provides interfaces and APIs to query the graph.
-
Research Software Repository - Aggregates research software from various sources with information about the problem it solves and its scientific domain.
Software and software publications
-
CITATION.CFF - Plain text files with human- and machine-readable citation information for software (and datasets). Supported by GitHub, Zenodo, Zotero.
-
Citable code with Zenodo & GitHub - Make GitHub repositories citable with Zenodo DOI.
-
CodeMeta - CodeMeta provides a minimal metadata schema for scientific software and code, in JSON and XML. Its concept vocabulary can be used to standardize the exchange of software metadata across repositories and organizations.
-
fossology - FOSSology is an open source license compliance software system and toolkit. You can run license, copyright and export control scans from the command line.
-
HERMES - A CI-based workflow to create and publish software publications to well-known repositories.
-
SOMEF - Extract software publication metadata from README and other docs automatically using ML and other techniques to reduce the amount of boilerplate work for the developer.
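As a rough illustration of the CITATION.CFF entry above, the following sketch renders a minimal citation file using only the Python standard library. The function `render_citation_cff` is a hypothetical helper, and the title, author, and version are placeholders. It emits the keys `cff-version`, `message`, `title`, and `authors`, plus optional `version` and `doi`; real projects may prefer a YAML library and a validator for the format.

```python
import json


def render_citation_cff(title, authors, version=None, doi=None,
                        message="If you use this software, please cite it as below."):
    """Render a minimal CITATION.cff file as a string.

    `authors` is a list of (family_name, given_name) tuples.
    String values are quoted with json.dumps, which yields valid
    double-quoted YAML scalars.
    """
    lines = [
        "cff-version: 1.2.0",
        f"message: {json.dumps(message)}",
        f"title: {json.dumps(title)}",
    ]
    if version is not None:
        lines.append(f"version: {json.dumps(version)}")
    if doi is not None:
        lines.append(f"doi: {json.dumps(doi)}")
    lines.append("authors:")
    for family, given in authors:
        lines.append(f"  - family-names: {json.dumps(family)}")
        lines.append(f"    given-names: {json.dumps(given)}")
    return "\n".join(lines) + "\n"


# Placeholder metadata for illustration only.
cff_text = render_citation_cff("My Research Tool", [("Doe", "Jane")], version="0.1.0")
print(cff_text)
```

Saving the returned text as `CITATION.cff` in the repository root is where the services named above look for it.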
Related research software lists
- awesome-research-software-registies - Awesome list for where one can register or upload research software.
- awesome-rse - An awesome collection of resources around research software engineering.
Provenance tracking
-
AiiDA - Automated Interactive Infrastructure and Database for Computational Science (AiiDA) to automatically track provenance of simulation workflows and all associated data, code.
-
DataLad - A free and open-source distributed data management system for everyone. It is based on git-annex with manual to automatic provenance tracking, code.
-
MLflow - Tool to track the provenance of machine learning applications, code.
-
CWL - Domain-agnostic and community-driven open standard for description and execution of research workflows that supports provenance tracking (CWLProv) in a standard-compliant way using the existing RO Crate, PROV and BagIt standards.
-
PROV-O Primer - An introduction to the data model of the Provenance Ontology (PROV-O).
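The core idea behind the tools above can be sketched with three PROV concepts: entities (data), activities (processing steps), and agents (people or software). The example below is a toy illustration, not the PROV-JSON or PROV-O serialization and not how any listed tool stores its records. Only the relation names `used`, `wasGeneratedBy`, and `wasAssociatedWith` are borrowed from the PROV data model; the function names and identifiers are hypothetical.

```python
from datetime import datetime, timezone


def record_run(input_id, output_id, activity_id, agent_id):
    """Record one processing step as a small provenance structure (toy format)."""
    ended = datetime.now(timezone.utc).isoformat()
    return {
        "entity": {input_id: {}, output_id: {}},
        "activity": {activity_id: {"endTime": ended}},
        "agent": {agent_id: {}},
        # Relation tuples, named after PROV relations:
        "used": [(activity_id, input_id)],               # activity used entity
        "wasGeneratedBy": [(output_id, activity_id)],    # entity generated by activity
        "wasAssociatedWith": [(activity_id, agent_id)],  # activity run by agent
    }


def inputs_of(record, entity_id):
    """Return the entities used by the activities that generated `entity_id`."""
    activities = {a for e, a in record["wasGeneratedBy"] if e == entity_id}
    return sorted(i for a, i in record["used"] if a in activities)


# Placeholder identifiers for illustration only.
rec = record_run("raw.csv", "clean.csv", "cleaning-run-1", "alice")
print(inputs_of(rec, "clean.csv"))
```

Answering "which inputs produced this file?" from recorded relations, as `inputs_of` does, is the kind of query provenance tracking is meant to support.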
Related workflow tools lists
There is overlap with these more general lists of workflow tools, but not every pipeline or workflow manager includes good provenance tracking.
- awesome-pipeline - A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin.
- Awesome workflow engines - Curated list of awesome open source workflow engines.
- Computational Data Analysis Workflow Systems - A list of existing workflow systems.
Metadata management
Your own repository setup
-
Dataverse - Open source research data repository software, code.
-
EuDat B2share - A repository by EuDat; the software is open source, based on Invenio, and one can set up one's own instances of it, code.
-
Invenio - Open source customizable software to set up large-scale digital repositories, library systems and data repositories, code.
-
InvenioRDM - The turn-key research data management repository based on the Invenio framework and Zenodo.
Awesome meta data sources
-
Microsoft Academic Graph - All the data and links from Microsoft Academic (shut down at the end of 2021).
-
OpenAIRE graph - All metadata contained in the OpenAIRE graph.
-
Scholix - A schema for scholarly links. Implemented and deployed by several scholarly link providers.
-
CrossRef - Organization building connections between related entities, building a queryable graph.
Related lists
Awesome lists that relate to several of the sections above.
-
awesome-rse - An awesome list by HIFIS collecting information about research software engineering, touching on FAIRness and sustainability.
-
awesome-rse-policies - An awesome list by HIFIS collecting information about research software engineering policies, touching on FAIRness and sustainability.
-
Awesome-open-climate-science - An open science related list specific to the domain of Atmospheric, Ocean, and Climate science.
-
Awesome-open-science-software - A list of open science resources and software.
-
Awesome Curated Tools - A curated list of digital tools we use, ranging from accounting and data science to scientific research and liquid democracy.
Contributing
Contributions are welcome! :sunglasses: </br> If you want to contribute, please read the <a href=https://github.com/Materials-Data-Science-and-Informatics/awesome-fair/blob/main/CONTRIBUTING.md>contribution guideline</a>.