Onto2Vec

UPDATE

An improved version of our method is available at: https://github.com/bio-ontology-research-group/opa2vec/

Introduction

Onto2Vec is a program that produces feature vectors for biological entities based on their annotations to biomedical ontologies. Onto2Vec draws on the full set of axioms describing the structure of an ontology, together with the entity-concept annotation axioms. This document describes in detail how to use Onto2Vec to produce ontology-based feature vectors. The implementation shared here uses the GO ontology as a case study and can be used directly to produce feature vectors for GO functions and proteins. However, Onto2Vec can also produce feature vectors for entities annotated with any other biomedical ontology, and we provide instructions on how to use it with the ontology of your choice.

Prerequisites

The Onto2Vec implementation uses Groovy (with Grape for dependency management, see http://docs.groovy-lang.org/latest/html/documentation/grape.html), Python, and Perl. No other programs are required to run it.

OWL Processing

The first step of Onto2Vec is to process the ontology using the OWL API and to infer new axioms using a reasoner.
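To give a feel for what "inferring new axioms" means, here is a toy sketch in plain Python. It is not the OWL API and not a real reasoner: it only derives the subclass axioms that follow by transitivity from a handful of asserted ones. The class names, the `SubClassOf` text format, and the function `infer_subclass_axioms` are all assumptions made for illustration.

```python
from collections import defaultdict


def infer_subclass_axioms(asserted):
    """Return SubClassOf pairs that follow by transitivity but were not asserted.

    `asserted` is an iterable of (subclass, superclass) pairs.
    """
    asserted = set(asserted)
    parents = defaultdict(set)
    for child, parent in asserted:
        parents[child].add(parent)

    inferred = set()
    for start in list(parents):
        # Walk upwards from `start`, collecting every reachable ancestor.
        stack = list(parents[start])
        ancestors = set()
        while stack:
            node = stack.pop()
            if node in ancestors:
                continue
            ancestors.add(node)
            stack.extend(parents.get(node, ()))
        for ancestor in ancestors:
            if ancestor != start and (start, ancestor) not in asserted:
                inferred.add((start, ancestor))
    return inferred


# Hypothetical class names, not real GO identifiers.
asserted_axioms = [("ClassA", "ClassB"), ("ClassB", "ClassC")]

# Corpus of axioms as text lines: asserted axioms first, then inferred ones.
corpus = [f"{sub} SubClassOf {sup}" for sub, sup in asserted_axioms]
corpus += [f"{sub} SubClassOf {sup}"
           for sub, sup in sorted(infer_subclass_axioms(asserted_axioms))]
print("\n".join(corpus))
```

A real reasoner derives far more than transitive subclass relations; the point here is only that the corpus ends up containing both asserted and entailed axioms.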

Adding Annotation axioms

The second step of Onto2Vec is to add annotation axioms to our corpus of axioms. This can be done with simple text-processing code, without the OWL API, since no OWL processing is needed.
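A minimal sketch of this kind of text processing is shown below. It assumes, purely for illustration, that annotations arrive as whitespace-separated `entity class` pairs, one per line, and that each pair becomes an axiom line of the form `entity hasFunction class`. The input layout, the relation name `hasFunction`, the identifiers, and the function `annotation_axioms` are assumptions, not the format used by the repository's scripts.

```python
def annotation_axioms(lines, relation="hasFunction"):
    """Turn 'entity class' annotation rows into textual axiom lines."""
    axioms = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # Skip malformed rows with no class column.
        entity, ontology_class = fields[0], fields[1]
        axioms.append(f"{entity} {relation} {ontology_class}")
    return axioms


# Placeholder identifiers, not real proteins or GO classes.
annotation_rows = [
    "# entity<TAB>class",
    "EntityA\tClass_1",
    "",
    "EntityB Class_2",
]

# Existing axiom corpus (placeholder content) extended with annotation axioms.
axiom_corpus = ["Class_1 SubClassOf Class_2"]
axiom_corpus.extend(annotation_axioms(annotation_rows))
print("\n".join(axiom_corpus))
```

In practice the rows would be read from an annotation file with `open(path)` instead of an in-memory list, and the relation name would be chosen to match the ontology in use.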

Representation Learning

Now that the axiom corpus is ready, the only remaining step is to run Word2Vec on it to obtain the representation vectors, which is done in two steps.
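For intuition about what Word2Vec sees in such a corpus, the sketch below treats one axiom as a sentence and lists the (target, context) pairs that the skip-gram variant of Word2Vec would train on for a fixed window. It is an illustration only and performs no training. The function `skipgram_pairs`, the window size, and the example axiom are assumptions, and real Word2Vec implementations add details such as randomized window sizes and subsampling.

```python
def skipgram_pairs(sentence, window=2):
    """List (target, context) token pairs within `window` positions of each other."""
    tokens = sentence.split()
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # A token is never its own context.
                pairs.append((target, tokens[j]))
    return pairs


# Placeholder axiom; every token, including the relation, becomes a "word".
for target, context in skipgram_pairs("EntityA hasFunction Class_1", window=1):
    print(target, "->", context)
```

Because entities, classes, and relation names all appear as tokens, each of them receives its own vector once a Word2Vec model is trained on the full corpus.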

Final notes

For comments or help with running Onto2Vec, please send an email to: fzohrasmaili@gmail.com