omitie

OCaml bindings to MIT's MITIE information extraction library

Synopsis

Experimental bindings to the newly-released MITIE library.
Also see this blog post. MITIE currently only provides functions for named entity recognition, but work is apparently under way on support for other kinds of information extraction.

Install

  1. Install MITIE via the instructions in the MITIE readme
  2. ocaml setup.ml -configure
  3. ocaml setup.ml -build
  4. ocaml setup.ml -install

If you just want to see omitie in action, build and run simple.native, which mirrors the ner_example executable in the MITIE distribution. Provided you've downloaded ner_model.dat, it extracts named entities from a text file and prints the results to stdout.

Example use of simple.native:

$ ./simple.native ner_model.dat sample_text.txt
The tagger supports 4 tags:
    PERSON
    LOCATION
    ORGANIZATION
    MISC
   Tag: ORGANIZATION Pegasus Airlines
   Tag: LOCATION Istanbul
   Tag: LOCATION Sochi
   Tag: LOCATION Russia
   Tag: LOCATION Turkey
   Tag: ORGANIZATION Transportation Ministry
   Tag: PERSON Meredith Vieira
   Tag: MISC Olympics
   Tag: PERSON Bob Costas
   Tag: PERSON Vieira
   Tag: MISC Olympics
   Tag: PERSON Bob Costas
   Tag: PERSON Bob
   Tag: ORGANIZATION NBC Olympics
   Tag: PERSON Jim Bell
   Tag: LOCATION Sochi
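
Because the output is line-oriented, it is easy to post-process with standard tools. The following is a minimal sketch, not part of omitie: count_tags is a hypothetical helper name, and it assumes every entity line has the form "Tag: <TAG> <entity>" as in the sample above.

```shell
#!/bin/sh
# count_tags (hypothetical helper): read simple.native output on stdin
# and print how many entities were found for each tag.
# Assumes entity lines look like "   Tag: PERSON Bob Costas".
count_tags() {
    # Only lines whose first field is "Tag:" are entity lines; the
    # header listing the supported tags is skipped. The result is
    # sorted because awk's array iteration order is unspecified.
    awk '$1 == "Tag:" { counts[$2]++ }
         END { for (t in counts) print t, counts[t] }' | sort
}
```

For example, `./simple.native ner_model.dat sample_text.txt | count_tags` on the run shown above would report LOCATION 5, MISC 2, ORGANIZATION 3 and PERSON 6, one per line.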

Fire up the included example HTTP micro-service (you must compile with --enable-webserver):

$ ./server.native -p 9000 -d

Then query it from another shell:

$ curl "http://localhost:9000/extract?text=have+you+been+to+columbia+university+in+new+york"  | jq '.'
{
  "results": [
    {
      "entity": "columbia university",
      "tag": "ORGANIZATION"
    },
    {
      "entity": "new york",
      "tag": "LOCATION"
    }
  ]
}
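
For scripting, the JSON response can be flattened into one line per entity. This is a rough sketch that assumes the response shape shown above (a "results" array of objects with "entity" and "tag" fields) and that jq is installed; format_entities is a hypothetical helper name, not something omitie provides.

```shell
#!/bin/sh
# format_entities (hypothetical helper): read an /extract response on
# stdin and print one tab-separated "TAG<TAB>entity" line per result.
# Assumes the {"results": [{"entity": ..., "tag": ...}]} shape above.
format_entities() {
    jq -r '.results[] | [.tag, .entity] | @tsv'
}
```

It can be combined with curl's own URL encoding so the query text does not need "+" substituted by hand: `curl -sG --data-urlencode "text=have you been to columbia university in new york" http://localhost:9000/extract | format_entities`.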