emmorph2msd
The script converts each output tag of the emMorph morphological analyzer to the corresponding tag of magyarlanc 2.0.
What's in this repo?
- the main script of the converter: converter.py
- auxiliary files in the converterdata folder
- license
- this readme
The tagsets :hungary:
A detailed description of the tagsets is available here.
emMorph
emMorph is the current morphological analyzer for Hungarian, integrated into the e-magyar language processing toolchain. The list of emMorph tags is from here.
MSD
What we call MSD here is a modified version of the MULTEXT morphosyntactic tagset. This modified tagset is the output of the second version of magyarlanc, a toolkit for linguistic processing of Hungarian texts. It is also the annotation scheme of one version of the Szeged Treebank, the largest fully manually annotated corpus of Hungarian.
How to use the converter?
- standard input: token, lemma and emMorph tag, separated by tabs
- standard output: MSD tag
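The input format described above can be sketched in Python as follows. This is only an illustration, not part of the repository: the helper names format_input and convert are hypothetical, the sample row and its tag are only meant to show the emMorph bracket notation, and the sketch assumes that converter.py sits in the current directory, reads one tab-separated triple per line and prints one MSD tag per line.

```python
import subprocess
import sys


def format_input(rows):
    """Join (token, lemma, emmorph_tag) triples into tab-separated lines."""
    return "".join("\t".join(row) + "\n" for row in rows)


def convert(rows, script="converter.py"):
    """Pipe the rows through the converter and return its output lines.

    Assumption: the script is in the current directory and writes
    one MSD tag per input line to standard output.
    """
    result = subprocess.run(
        [sys.executable, script],
        input=format_input(rows),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Illustrative row: token, lemma, emMorph tag (sample values, not from the repo).
rows = [("alma", "alma", "[/N][Nom]")]
print(repr(format_input(rows)))
```

With the repository checked out, calling convert(rows) from the repository root should return the converter's output for these rows. The same data saved to a file can also be redirected to the script's standard input from a shell.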
Dependencies
Python 3
License
GNU General Public License v3.0