ClausIE -- Clause-Based Open Information Extraction
This is a Mavenized version of the ClausIE project from the Max Planck Institute (the configuration and resource files, as well as some of the code structure, have been modified).
ClausIE is an open information extractor: it identifies and extracts relations and their arguments in natural language text. ClausIE first detects "useful" pieces of information expressed in a sentence, and then represents this information in terms of one or more extractions. The representation of these extractions can be flexibly customized to the underlying application (e.g., binary or n-ary propositions).
An online demo is provided by the Max Planck Institute: ClausIE online demo.
ClausIE code download: [link].
ClausIE tutorials: [link].
Requirements
- Java 1.8.
- JOpt Simple (a Java library for parsing command-line options, version >= 4.4). Maven snippet: [link].
- Stanford Parser (a program that works out the grammatical structure of sentences, version 2.0.4). Maven snippet: [link]. Because ClausIE uses the pre-trained parser model of Stanford Parser, please also add the following model dependency:
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-parser</artifactId>
<version>${stanford-parser.version}</version>
<classifier>models</classifier>
</dependency>
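The `${stanford-parser.version}` placeholder in the snippet above only resolves if a property of that name is defined in the pom. The following is a minimal sketch of the matching entries, not the project's actual pom.xml. The property name `jopt-simple.version` is hypothetical, and the JOpt Simple coordinates are assumed to be the usual Maven Central ones, so please check them against the linked snippets.

```xml
<!-- Sketch only: entries assumed to live in the same pom.xml as the model dependency. -->
<properties>
  <!-- Stanford Parser version listed under Requirements -->
  <stanford-parser.version>2.0.4</stanford-parser.version>
  <!-- Hypothetical property name; 4.4 is the minimum version listed under Requirements -->
  <jopt-simple.version>4.4</jopt-simple.version>
</properties>

<dependencies>
  <!-- Assumed Maven Central coordinates for JOpt Simple -->
  <dependency>
    <groupId>net.sf.jopt-simple</groupId>
    <artifactId>jopt-simple</artifactId>
    <version>${jopt-simple.version}</version>
  </dependency>
  <!-- Parser code; the models classifier dependency shown above is declared in addition to this one -->
  <dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-parser</artifactId>
    <version>${stanford-parser.version}</version>
  </dependency>
</dependencies>
```

Defining the version once as a property keeps the parser and its models artifact on the same release, which matters because the pre-trained model is tied to the parser version.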
Usage
A pre-built ClausIE jar file is located at target/ClausIE.jar. It already includes JOpt Simple, Stanford Parser, and the parser model, so it can be added to your project and used directly. (To build the jar file with IntelliJ, please refer to: [link].)
Alternatively, you can clone this repository, follow the article Maven in 5 Minutes to make any changes you need, and build it with Maven for use in your own project. To add it as a Maven dependency:
<dependencies>
...
<dependency>
<groupId>de.mpii.clausie</groupId>
<artifactId>ClausIE</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
...
</dependencies>