Home

Awesome

SpExtor: Sparse Entity Extractor

The implementation of the method in our COLING 2018 paper "A Practical Incremental Learning Framework For Sparse Entity Extraction".

Resources

Paper and Slides

Data

Annotated Data mentioned in the paper in addition to many other datasets can be downloaded from This Link.

Source Tree

.
└── src
    ├── main  
    │   ├── java 
    │   │   ├── ActiveLearning.java       :  implements active learning
    │   │   ├── Core.java                 :  core code
    │   │   ├── CoreNLPFeaturizer.java    :  Implements an class to use Core NLP NERFeatureFactory
    │   │   ├── EntitySetExpansion.java   :  implementation of the entity set expansion method
    │   │   └── FeatureFactory.java       :  feature factory for the entity set expansion method
    │   │
    │   └── resources
    │       └── wordnet.dict    : contains WordNet dictionaries files
    │
    └── test 
        ├── java 
        │   ├── Dataset.java                    : prepares the testing data
        │   ├── Test_CoreNLPFeaturizer.java     : testing class using CoreNLPFeaturizer which uses Core NLP NERFeatureFactory
        │   └── Test_SpExtor.java               : main testing class to run SpExtor to learn a model from data
        │   
        └── resources   : contains the gold training and testing data

How to use

Licenses

This work is licensed under GPL-3.0 and CreativesForGood licenses. A copy of the first license can be found in this repository. The other license can be found over this link C4G License.

<p float="left"> <img src="https://upload.wikimedia.org/wikipedia/commons/9/93/GPLv3_Logo.svg" alt="GPLv3 Logo" width="60" /> <img src="https://github.com/halolimat/CreativesForGoodLicense/raw/master/CreativesForGoodLogo2.png" alt="CreativesForGood Logo" width="120" style="margin-left: 10px;" /> </p>

Citing

If you do make use of SpExtor or any of its components please cite the following publication:

Hussein S. Al-Olimat, Steven Gustafson, Jason Mackay, Krishnaprasad Thirunarayan, and Amit Sheth. 2018. 
A practical incremental learning framework for sparse entity extraction. In Proceedings of the 27th International
Conference on Computational Linguistics (COLING 2018), pages 700–710. Association for Computational Linguistics.

@InProceedings{C18-1059,
  author = 	"Al-Olimat, Hussein S.
            and Gustafson, Steven
            and Mackay, Jason
            and Thirunarayan, Krishnaprasad
            and Sheth, Amit",
  title = "A Practical Incremental Learning Framework For Sparse Entity Extraction",
  booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
  year = "2018",
  publisher = "Association for Computational Linguistics",
  pages = "700--710",
  location = "Santa Fe, New Mexico, USA",
  url = "http://aclweb.org/anthology/C18-1059"
}

We would also be very happy if you provide a link to the github repository:

... Sparse Entity Extractor tool (SpExtor)\footnote{
    \url{https://github.com/halolimat/SpExtor}
}