CogCompNLP

This project collects a number of core libraries for Natural Language Processing (NLP) developed by the Cognitive Computation Group.

How to use it?

Depending on what you are after, follow one of the items:

CogComp's main NLP libraries

Each library contains a detailed README with instructions on how to use it. In addition, the javadoc of the whole project is available here.

| Module | Description |
| --- | --- |
| nlp-pipeline | Provides an end-to-end NLP processing application that runs a variety of NLP tools on input text. |
| core-utilities | Provides a set of NLP-friendly data structures and a number of NLP-related utilities that support writing NLP applications, running experiments, etc. |
| corpusreaders | Provides classes to read documents from corpora into core-utilities data structures. |
| curator | Supports use of CogComp NLP Curator, a tool to run NLP applications as services. |
| edison | A library for feature extraction from core-utilities data structures. |
| lemmatizer | An application that uses WordNet and simple rules to find the root forms of words in plain text. |
| tokenizer | An application that identifies sentence and word boundaries in plain text. |
| transliteration | An application that transliterates names between different scripts. |
| pos | An application that identifies the part of speech (e.g. verb + tense, noun + number) of each word in plain text. |
| ner | An application that identifies named entities in plain text according to two different sets of categories. |
| md | An application that identifies entity mentions in plain text. |
| relation-extraction | An application that identifies entity mentions, then identifies relation pairs among the detected mentions. |
| quantifier | A tool that detects mentions of quantities in text and normalizes them to a standard form. |
| inference | A suite of unified wrappers to a set of optimization libraries, as well as some basic approximate solvers. |
| depparse | An application that identifies the dependency parse tree of a sentence. |
| verbsense | A system that addresses the verb sense disambiguation (VSD) problem for English. |
| prepsrl | An application that identifies semantic relations expressed by prepositions and develops statistical learning models for predicting the relations. |
| commasrl | Software that extracts relations that commas participate in. |
| similarity | Software that compares objects, especially Strings, and returns a score indicating how similar they are. |
| temporal-normalizer | A temporal extractor and normalizer. |
| dataless-classifier | Classifies text into a user-specified label hierarchy from just the textual label descriptions. |
| external-annotators | A collection of useful external annotators. |

Using each library programmatically

To include one of the modules in your Maven project, add the following snippet, replacing the #modulename# and #version# entries with the relevant module name and the version listed in this project's pom.xml file. Note that you also need to add the <repository> element for the CogComp Maven repository to the <repositories> element.

    <dependencies>
        ...
        <dependency>
            <groupId>edu.illinois.cs.cogcomp</groupId>
            <artifactId>#modulename#</artifactId>
            <version>#version#</version>
        </dependency>
        ...
    </dependencies>
    ...
    <repositories>
        <repository>
            <id>CogCompSoftware</id>
            <name>CogCompSoftware</name>
            <url>http://cogcomp.org/m2repo/</url>
        </repository>
    </repositories>

Citing

If you are using the framework, please cite our paper:

@inproceedings{2018_lrec_cogcompnlp,
    author = {Daniel Khashabi and Mark Sammons and Ben Zhou and Tom Redman and Christos Christodoulopoulos and Vivek Srikumar and Nicholas Rizzolo and Lev Ratinov and Guanheng Luo and Quang Do and Chen-Tse Tsai and Subhro Roy and Stephen Mayhew and Zhili Feng and John Wieting and Xiaodong Yu and Yangqiu Song and Shashank Gupta and Shyam Upadhyay and Naveen Arivazhagan and Qiang Ning and Shaoshi Ling and Dan Roth},
    title = {CogCompNLP: Your Swiss Army Knife for NLP},
    booktitle = {11th Language Resources and Evaluation Conference},
    year = {2018},
    url = "http://cogcomp.org/papers/2018_lrec_cogcompnlp.pdf",
}