<img src="docs/img/logo_title.svg" width="100"/> Finite-state morphology for German
This package started as a migration of a set of finite-state grammars for the morphological analysis of German words, delivered with SFST, a finite-state transducer (FST) toolkit by Helmut Schmid, to Pynini, another FST toolkit. The latter has the advantage that it is implemented as a Python library, which allows seamless interaction with many other useful Python packages. Since then, a number of morphological operations have been added and some analysis strategies have been adjusted relative to the original rule set.
Installation
timur is implemented in Python 3. The following assumes a working Python 3 installation (tested with versions 3.5 and 3.6) as well as a working C++ compiler supporting C++11.
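The prerequisites above can be checked before building anything. The following is a minimal sketch, not part of timur: the helper names `python_ok` and `find_cxx_compiler` and the list of candidate compiler names are assumptions made for illustration, and the check only confirms that a compiler executable is on the `PATH`, not that it supports C++11.

```python
import shutil
import sys

# Assumption: 3.5 is the lowest version listed as tested above.
MIN_PYTHON = (3, 5)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def find_cxx_compiler(candidates=("c++", "g++", "clang++")):
    """Return the path of the first candidate found on PATH, or None."""
    # The candidate names are common defaults, not a timur requirement.
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print("Python version ok:", python_ok())
    print("C++ compiler:", find_cxx_compiler() or "not found")
```

If no compiler is reported, install one through your system's package manager before building OpenFST.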
OpenFST
The underlying FST toolkit Pynini is itself based on OpenFST, a C++ library for constructing, combining, optimizing, and searching weighted FSTs. Get the latest version of OpenFST that works with the current version of Pynini (finding a working combination can be a little tricky since Pynini is usually a bit behind OpenFST; comparing the release dates helps), then unpack the archive, build, and install via:
$ ./configure --enable-grm
$ make
$ [sudo] make install && [sudo ldconfig]
re2
TODO
virtualenv
Using virtualenv is highly recommended, although not strictly necessary for installing timur. It may be installed via:
$ [sudo] pip install virtualenv
Create a virtual environment in a subdirectory of your choice (e.g. env) using
$ virtualenv -p python3 env
and activate it:
$ . env/bin/activate
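To confirm that the activation took effect, the interpreter itself can be asked whether it runs inside a virtual environment. This is a small sketch using only the standard library; the function name `in_virtual_environment` is hypothetical and not part of timur.

```python
import sys


def in_virtual_environment():
    """Return True if the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```

Run with the `python` found in the activated shell, this should report that an environment is active.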
Python requirements
timur uses various third-party Python packages (including Pynini), which are best installed using pip:
(env) $ pip install -r requirements.txt
Finally, timur itself can be installed via pip:
(env) $ pip install .
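After both pip steps, a quick sanity check is to see whether the expected modules can be located without actually importing them. The sketch below rests on two assumptions: that the packages are importable under the module names `pynini` and `timur`, and the helper name `missing_modules`, which is made up for this example.

```python
import importlib.util


def missing_modules(names=("pynini", "timur")):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None for a top-level module that is not installed.
    # The default names are assumptions about the import names.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("not found:", ", ".join(missing))
    else:
        print("all modules found")
```

An empty result only shows that the modules are discoverable; a failing OpenFST build can still surface later when Pynini is actually imported.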