emmorph2conll

The script converts each output tag of the emMorph morphological analyzer to the corresponding tag in a version of the Szeged Treebank.
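A tag-to-tag conversion of this kind can be pictured as a table lookup. The sketch below is only an illustration under stated assumptions: the names `TAG_MAP` and `convert_tag`, the `UNKNOWN` fallback, and the two table entries are hypothetical and are not taken from the repository's own mapping.

```python
# Hypothetical mapping table: both entries are made up for illustration
# and are NOT copied from the repository's conversion tables.
TAG_MAP = {
    "[/N][Nom]": "SubPOS=c|Num=s|Cas=n",
    "[/N][Acc]": "SubPOS=c|Num=s|Cas=a",
}


def convert_tag(emmorph_tag, tag_map=TAG_MAP, unknown="UNKNOWN"):
    """Return the target tag for an emMorph tag, or a fallback marker."""
    return tag_map.get(emmorph_tag, unknown)


if __name__ == "__main__":
    for tag in ("[/N][Nom]", "[/X]"):
        print(tag, "->", convert_tag(tag))
```

Returning an explicit fallback instead of raising an exception keeps a batch conversion running when it meets a tag that the table does not cover; whether the actual script behaves this way is not stated here.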

What's in this repo?

The tagsets :hungary:

A detailed description of the tagsets is available here.

emMorph

emMorph is the current morphological analyzer for Hungarian and it is integrated into the e-magyar language processing toolchain. The list of emMorph tags is from here.

CoNLL

What we call CoNLL here is a modified version of the MULTEXT morphosyntactic tagset, transformed into a feature-value pair structure. This modified tagset is the annotation scheme of one version of the Szeged Treebank, the largest fully manually annotated corpus of Hungarian.
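To make the feature-value pair structure concrete, here is a minimal sketch of reading such a tag into a dictionary. It assumes the common CoNLL convention of `|` between pairs, `=` inside a pair, and `_` for an empty feature set; the function name `parse_features` and the example tag string are illustrative assumptions, not part of this repository.

```python
def parse_features(feats):
    """Split a 'Key=value|Key=value' string into a dict of features."""
    # Assumption: "_" (or an empty string) stands for "no features".
    if feats in ("", "_"):
        return {}
    # Split each pair on the first "=" only, so values may contain "=".
    return dict(pair.split("=", 1) for pair in feats.split("|"))


if __name__ == "__main__":
    # Illustrative tag string, not copied from the tagset description.
    features = parse_features("SubPOS=c|Num=s|Cas=n")
    print(features)
```

A dictionary makes it straightforward to compare single features, for example the case value, without depending on the order of the pairs in the string.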

How to use the converter?

Dependencies

Python3

License

GNU General Public License v3.0

Our converters