Materials-Related Databases

The databases below are sorted by the number of claimed data entries and by whether they offer a method of programmatic access.

| Database | Description | Contact | API/Docs | Size |
| --- | --- | --- | --- | --- |
| Chemspider | ChemSpider is a free chemical structure database providing fast access to over 32 million structures, properties, and associated information. By integrating and linking compounds from ~500 data sources, ChemSpider enables researchers to discover the most comprehensive view of freely available chemical data from a single online search. It is owned by the Royal Society of Chemistry. | | REST, python (chemspipy) | 30 M chemicals and compounds |
| CCDC (Cambridge Crystallographic Data Centre) | CSD is the world’s repository for small-molecule organic and metal-organic crystal structures. Containing the results of over half a million X-ray and neutron diffraction analyses, this unique database of accurate 3D structures has become an essential resource to scientists around the world. | | | >700k X-ray and neutron diffraction analyses and 3D structures |
| Automatic Flow for Materials Discovery (AFLOW) | A distributed materials properties repository from high-throughput ab initio calculations. | Stefano Curtarolo | REST | 630000 compounds |
| Open Quantum Materials Database (OQMD) | The OQMD is a database of DFT-calculated thermodynamic and structural properties. We are providing this online interface for convenient, small-scale access; however, for more powerful utilization we recommend downloading the entire database and the API for interfacing with it, detailed in the link below. | Chris Wolverton | python (qmpy) | 285780 compounds |
| The Materials Project | Harnessing the power of supercomputing and state-of-the-art electronic structure methods, the Materials Project provides open web-based access to computed information on known and predicted materials as well as powerful analysis tools to inspire and design novel materials. | Kristin Persson, Gerbrand Ceder | python | 58000 compounds, 33000 band structures |
| Computational Materials Repository (CMR) | The Computational Materials Repository (CMR) provides ways to easily store, retrieve, and search for your electronic structure calculations. | | docs, python | |
| 3D Materials Atlas | The Materials Atlas contains a repository for 3D experiments and simulations on a variety of material systems. | | | |
| Interatomic Potentials Repository Project | This repository provides a source for interatomic potentials (force fields), related files, and evaluation tools to help researchers obtain interatomic models and judge their quality and applicability. | Chandler Becker (NIST), Zachary Trautt (NIST) | | |
| Web Force-Field (WebFF) | The Web Force-Field (WebFF) repository consists of three main components: 1) a database, 2) a software engine, and 3) a web-client interface. The repository database supports a multi-table format, where each table is a distinct force field. | Frederick Phelan (NIST) | | |
| ThermoML | This page contains links to ThermoML files, which represent experimental thermophysical and thermochemical property data reported in the corresponding articles published by major journals in the field. These files are posted here through cooperation between the Thermodynamics Research Center (TRC) at the National Institute of Standards and Technology (NIST) and the journal publishers. | | | |
| GBRV Pseudopotential Library | This site hosts the GBRV pseudopotential library, a highly accurate and computationally inexpensive open-source pseudopotential library which has been designed and optimized for use in high-throughput DFT calculations and released under the GNU public license. We provide potential files for direct use with the Quantum Espresso, Abinit, and JDFTx plane-wave pseudopotential codes, as well as input files for the Vanderbilt Ultrasoft pseudopotential generator. | Kevin Garrity (NIST) | | |
| PAW atomic datasets | On this page you can find files containing PAW atomic datasets for ABINIT and tools to produce your own. | | | |
| PSlibrary | PSlibrary is a library of inputs for the ld1.x atomic code. It allows the generation of PAW data sets and ultrasoft pseudopotentials for many elements. | | | |
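
The ordering used for this table can be sketched in a few lines of Python. This is only an illustration: the `DatabaseEntry` record and the `ranking_key` function are hypothetical names introduced here, the sizes are the claimed figures copied from the table, and the rule itself (larger claimed size first, then entries with programmatic access ahead of those without) is an assumption inferred from the sentence above and the row order, not a documented procedure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DatabaseEntry:
    """One row of the table (hypothetical record type for illustration)."""
    name: str
    claimed_entries: Optional[int]   # None when the table lists no size
    access: Tuple[str, ...] = ()     # programmatic access methods, if any


def ranking_key(entry: DatabaseEntry) -> Tuple[int, bool]:
    """Assumed rule: larger claimed size first, then entries with an API."""
    size = entry.claimed_entries or 0
    # False sorts before True, so entries that list an access method come first.
    return (-size, not entry.access)


# A subset of the rows, deliberately out of order.
entries = [
    DatabaseEntry("3D Materials Atlas", None),
    DatabaseEntry("OQMD", 285_780, ("python",)),
    DatabaseEntry("CMR", None, ("python",)),
    DatabaseEntry("ChemSpider", 30_000_000, ("REST", "python")),
    DatabaseEntry("AFLOW", 630_000, ("REST",)),
    DatabaseEntry("CCDC", 700_000),  # listed as ">700k"
    DatabaseEntry("Materials Project", 58_000, ("python",)),
]

ranked = sorted(entries, key=ranking_key)
ranked_names = [entry.name for entry in ranked]
```

Because `sorted` is stable, rows with neither a size nor an access method keep their original relative order, which is one reasonable way to handle the entries at the bottom of the table.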

#Cloud Services

#Relevant Codes

| Code | Description | Maintained by | API/Docs |
| --- | --- | --- | --- |
| MAterials Simulation Toolkit (MAST) | MAST is an automated workflow manager and post-processing tool. MAST focuses on diffusion and defect workflows that use density functional theory. It interfaces primarily with the Vienna Ab-initio Simulation Package (VASP). | Dane Morgan | docs |
| NWChem | NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters. | PNNL | docs |
| pymatgen | Pymatgen (Python Materials Genomics) is a robust, open-source Python library for materials analysis. It currently powers the public Materials Project, an initiative to make calculated properties of all known inorganic materials available to materials researchers. | Materials Genome Initiative | docs |
| pymatgen-db | Pymatgen-db is a database add-on for the Python Materials Genomics (pymatgen) materials analysis library. It enables the creation of Materials Project-style MongoDB databases for management of materials data. A query engine is also provided to enable the easy translation of MongoDB docs to useful pymatgen objects for analysis purposes. | Materials Genome Initiative | docs |
| Swift | Fast, easy parallel scripting on multicores, clusters, clouds, and supercomputers. | University of Chicago / Argonne | docs |

Python Machine Learning Stack and Resources

| Code | Description | Docs |
| --- | --- | --- |
| scikit-learn | Machine learning in Python. Simple and efficient tools for data mining and data analysis. | docs |
| SciPy | Python-based ecosystem of open-source software for mathematics, science, and engineering. | docs |
| numpy | NumPy is the fundamental package for scientific computing with Python. It contains, among other things: a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; useful linear algebra, Fourier transform, and random number capabilities. | docs |
| pandas | Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. | docs |
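
As a small illustration of two ideas named in these descriptions, NumPy broadcasting and pandas "labeled" data, the sketch below computes each database's share of a combined compound count. It assumes `numpy` and `pandas` are installed; the three counts are the claimed sizes from the database table on this page, and the variable names are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Claimed compound counts copied from the database table on this page.
sizes = pd.Series(
    {"AFLOW": 630000, "OQMD": 285780, "Materials Project": 58000},
    name="compounds",
)

# NumPy broadcasting: the scalar total is applied to every array element.
counts = sizes.to_numpy(dtype=float)
shares = counts / counts.sum()

# pandas keeps the database labels attached to the derived values.
share_by_db = pd.Series(shares, index=sizes.index, name="share")
largest = share_by_db.idxmax()
```

Keeping the labels in a `Series` means results can be looked up by database name (for example `share_by_db["OQMD"]`) rather than by position.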

API Wrappers

Other Databases, Codes, and Efforts to Sort and Catalog