

Collection of data sets of molecules and properties :gift: :smile:

What is it?


  1. All SMILES in the data sets have been uniformized with RDKit.
  2. The data sets are gathered in one place. They are all here!
  3. Use them to validate machine learning models that infer molecular properties, as proposed by Z. Wu et al.
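
As a rough illustration of what uniformizing SMILES with RDKit can look like, here is a minimal sketch. It assumes the step amounts to RDKit's default canonicalization; the exact procedure used for these data sets is not described here, and the helper name `canonical_smiles` is hypothetical.

```python
from typing import Optional

from rdkit import Chem


def canonical_smiles(smiles: str) -> Optional[str]:
    """Return RDKit's canonical SMILES, or None if the input cannot be parsed.

    Hypothetical helper: an assumption about how SMILES could be uniformized,
    not necessarily the procedure used to build these data sets.
    """
    mol = Chem.MolFromSmiles(smiles)  # None when RDKit cannot parse the string
    if mol is None:
        return None
    return Chem.MolToSmiles(mol)  # canonical output is the default


# Two different spellings of ethanol end up as the same string.
print(canonical_smiles("OCC") == canonical_smiles("C(O)C"))
```

Returning `None` for unparsable strings keeps bad entries easy to filter out before training a model.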


But what are these data sets?

From Moleculenet.ai, here is a short description of each data set, with the inference task given in square brackets (for the regularized data sets reported here):


Source: Moleculenet.ai

Paper: Zhenqin Wu, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, Vijay Pande, "MoleculeNet: A Benchmark for Molecular Machine Learning", arXiv:1703.00564 [cs.LG], 2017