Indic NLP Library

This repository is a de-bloated fork of the original Indic NLP Library that integrates the UrduHack submodule and Indic NLP Resources directly. This makes it possible to use Urdu normalization and tokenization without installing urduhack and indic_nlp_resources separately, which can sometimes be a problem because urduhack is TensorFlow-based. This repository is created and maintained mainly for IndicTrans2 and IndicTransTokenizer.

For any queries, please get in touch with the original authors/maintainers of the respective libraries:

Usage:

git clone https://github.com/VarunGumma/indic_nlp_library.git

cd indic_nlp_library
pip install --editable ./

Updates: