emDepPy

A wrapper and REST API implemented in Python for emDep (Bohnet parser a.k.a. Mate Tools).

WARNING: This module is not thread-safe! Multiple models cannot be loaded simultaneously!

WARNING: This wrapper is only compatible with Java 11 or higher!

Requirements

Install on local machine

Usage

It is recommended to use the program as part of the e-magyar language processing framework.

If all required input columns already exist, one can use python3 -m emdeppy with the unified xtsv CLI API.

When --maxlen [n: Int > 0] is supplied, only sentences with at most n tokens are parsed; longer ones get _ for all fields.
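The effect of --maxlen can be illustrated with a minimal Python sketch. This is not the wrapper's actual implementation: parse_with_maxlen, dummy_parser and the n_fields default of three output fields are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Optional

Sentence = List[str]      # one token per item
Parse = List[List[str]]   # one list of output fields per token


def parse_with_maxlen(sentence: Sentence,
                      parse_fn: Callable[[Sentence], Parse],
                      maxlen: Optional[int] = None,
                      n_fields: int = 3) -> Parse:
    """Parse the sentence unless it has more than maxlen tokens (hypothetical helper)."""
    if maxlen is not None and len(sentence) > maxlen:
        # Too long: skip the parser and emit the placeholder for every field.
        return [['_'] * n_fields for _ in sentence]
    return parse_fn(sentence)


def dummy_parser(sentence: Sentence) -> Parse:
    """Hypothetical stand-in for the real parser: attaches every token to the root."""
    return [[str(i), 'ROOT', '0'] for i, _ in enumerate(sentence, start=1)]
```

With maxlen=2, a two-token sentence is passed to the parser, while a three-token sentence comes back with _ in every field. Skipping long sentences this way bounds the parsing time and memory use per sentence at the cost of leaving those sentences unanalysed.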

Train

Training is currently available from the Java CLI only:

  1. cat SzegedDep/*.conll-2009 | awk -F$'\t' -v OFS=$'\t' '{if ($0 != "") print $0,"_","_","_","_","_","_","_"; else print $0}{}' > train_corpus.txt
  2. java -Xmx2G -classpath ./emdeppy/anna-3.61.jar is2.parser.Parser -model szk.mate.new.model -train train_corpus.txt
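For readers less familiar with awk, the corpus preparation in step 1 can be sketched in Python. The function name pad_conll_lines and its parameters are hypothetical; the sketch only mirrors what the awk command does: it appends seven tab-separated _ columns to every non-empty line and leaves the empty (sentence separator) lines unchanged.

```python
from typing import Iterable, Iterator


def pad_conll_lines(lines: Iterable[str],
                    n_extra: int = 7,
                    placeholder: str = '_') -> Iterator[str]:
    """Append n_extra placeholder columns to every non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.rstrip('\n')
        if line != '':
            # Token line: keep the original columns and add the placeholders.
            yield '\t'.join([line] + [placeholder] * n_extra)
        else:
            # Empty line: sentence boundary, passed through untouched.
            yield line
```

To use it as a filter, one could iterate over sys.stdin and print each yielded line, redirecting the output to train_corpus.txt as in step 1.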

For more training parameters see the documentation or the source code.

License

This Python wrapper is licensed under the LGPL 3.0 license. The model and the included jar file have their own licenses.