Apertium + Python
Introduction
- The codebase is in development for the GSoC '19 project called Apertium API in Python
- The Apertium core modules are written in C++.
- This project makes the Apertium modules available in Python, whose simplicity makes it more appealing to users.
About the Existing Code Base
- The existing codebase has Subprocess and SWIG wrapper implementations of the higher-level functions of the Apertium and CG modules.
Installation
- Installation on Debian/Ubuntu and Windows is natively supported:
pip install apertium
- For developers, pipenv can be used to install the development dependencies and enter a shell with them:
pip install pipenv
pipenv install --dev
pipenv shell
- Apertium packages can also be installed from the Python interpreter.
  - Install apertium-all-dev by calling apertium.installer.install_apertium()
  - Install language packages with apertium.installer.install_module(language_name). For example, apertium-eng can be installed by executing apertium.installer.install_module('eng')
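The installer call for language packages can be wrapped in a small setup helper. The following is a minimal sketch, not part of the package: `install_languages` is a hypothetical name, and its `install_module` parameter is an assumption added only so the helper can be exercised without touching the system. By default it falls back to `apertium.installer.install_module` as described above.

```python
def install_languages(codes, install_module=None):
    """Install one Apertium language package per code, e.g. 'eng'.

    Hypothetical helper, not part of apertium. When no installer
    callable is passed, apertium.installer.install_module is used.
    """
    if install_module is None:
        # Imported lazily so this file can be loaded without apertium present.
        from apertium.installer import install_module
    installed = []
    for code in codes:
        install_module(code)
        installed.append(code)
    return installed
```

With the default installer, `install_languages(['eng'])` would amount to the `install_module('eng')` call shown above. Whether any other code resolves to a package depends on what the Apertium repositories provide.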
Usage
- For multiple invocations, Method 1 is more performant, as the dictionary needs to be loaded only once.
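The difference between the two methods can be illustrated without Apertium installed. The sketch below uses `StubAnalyzer`, a hypothetical stand-in that only counts how often it is constructed, and `stub_analyze`, which assumes that the one-shot function builds a fresh object on every call, as the note above implies. Neither name belongs to the package.

```python
class StubAnalyzer:
    """Hypothetical stand-in for apertium.Analyzer that counts dictionary loads."""

    loads = 0  # how many times the (pretend) dictionary was loaded

    def __init__(self, lang):
        self.lang = lang
        StubAnalyzer.loads += 1  # stands in for the costly load step

    def analyze(self, text):
        # Placeholder result; a real analyzer would return lexical units.
        return f"{text}/*{text}"


def stub_analyze(lang, text):
    """Method 2 shape: assumed to build a fresh object on every call."""
    return StubAnalyzer(lang).analyze(text)


words = ["cats", "dogs", "birds"]

# Method 1: one object, reused for every word.
StubAnalyzer.loads = 0
analyzer = StubAnalyzer("en")
method1 = [analyzer.analyze(w) for w in words]
method1_loads = StubAnalyzer.loads

# Method 2: one-shot function called once per word.
StubAnalyzer.loads = 0
method2 = [stub_analyze("en", w) for w in words]
method2_loads = StubAnalyzer.loads

print(method1_loads, method2_loads)  # prints "1 3"
```

Both approaches return the same results; only the number of simulated loads differs, which is the cost the note refers to.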
Analysis
Performing Morphological Analysis
Method 1: Create an Analyzer object and call its analyze method.
In [1]: import apertium
In [2]: a = apertium.Analyzer('en')
In [3]: a.analyze('cats')
Out[3]: [cats/cat<n><pl>, ./.<sent>]
Method 2: Call analyze() directly.
In [1]: import apertium
In [2]: apertium.analyze('en', 'cats')
Out[2]: cats/cat<n><pl>
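The analysis output follows the Apertium stream format: a surface form, then one or more readings separated by `/`, each a lemma followed by tags in angle brackets. As a rough sketch of how such a string could be taken apart, the hypothetical helper `parse_lexical_unit` below works on plain text only. It does not use the objects returned by the library, and it ignores escaped characters.

```python
import re

TAG_RE = re.compile(r"<([^<>]+)>")


def parse_lexical_unit(unit):
    """Split 'surface/lemma<tag>...' into (surface, [(lemma, [tags]), ...]).

    Hypothetical helper for illustration; escaped '/' or '<' characters
    inside a form are not handled.
    """
    unit = unit.strip()
    # The delimiters ^ and $ are optional in this sketch.
    if unit.startswith("^"):
        unit = unit[1:]
    if unit.endswith("$"):
        unit = unit[:-1]
    surface, *analyses = unit.split("/")
    readings = []
    for analysis in analyses:
        lemma = analysis.split("<", 1)[0]  # text before the first tag
        tags = TAG_RE.findall(analysis)
        readings.append((lemma, tags))
    return surface, readings


print(parse_lexical_unit("cats/cat<n><pl>"))
# prints ('cats', [('cat', ['n', 'pl'])])
```

A word with several possible analyses yields one `(lemma, tags)` pair per reading.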
Generation
Performing Morphological Generation
Method 1: Create a Generator object and call its generate method.
In [1]: import apertium
In [2]: g = apertium.Generator('en')
In [3]: g.generate('^cat<n><pl>$')
Out[3]: 'cats'
Method 2: Call generate() directly.
In [1]: import apertium
In [2]: apertium.generate('en', '^cat<n><pl>$')
Out[2]: 'cats'
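The generator input shown above is a lemma followed by its tags, wrapped in `^` and `$`. When the lemma and tags are held separately, a helper along these lines could assemble that string. `to_lexical_unit` is a hypothetical name used for illustration, not a function of the package.

```python
def to_lexical_unit(lemma, tags):
    """Build a generator input such as '^cat<n><pl>$' from a lemma and tags.

    Hypothetical helper; it does no escaping of special characters.
    """
    return "^" + lemma + "".join(f"<{tag}>" for tag in tags) + "$"


print(to_lexical_unit("cat", ["n", "pl"]))  # prints "^cat<n><pl>$"
```

The result could then be passed to the generate method or function from the examples above.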
Tagger
Method 1: Create a Tagger object and call its tag method.
In [1]: import apertium
In [2]: tagger = apertium.Tagger('eng')
In [3]: tagger.tag('cats')
Out[3]: [cats/cat<n><pl>]
Method 2: Call tag() directly.
In [1]: import apertium
In [2]: apertium.tag('en', 'cats')
Out[2]: [cats/cat<n><pl>]
Translation
Method 1: Create a Translator object and call its translate method.
In [1]: import apertium
In [2]: t = apertium.Translator('eng', 'spa')
In [3]: t.translate('cats')
Out[3]: 'Gatos'
Method 2: Call translate() directly.
In [1]: import apertium
In [2]: apertium.translate('en', 'spa', 'cats')
Out[2]: 'Gatos'
Installing more modes from other language data
One can also install modes by providing the path to the lang-data:
In [1]: import apertium
In [2]: apertium.append_pair_path('..')