Indic NLP Library
This repository is a de-bloated fork of the original Indic NLP Library that integrates the UrduHack submodule and Indic NLP Resources directly. This makes it possible to work with Urdu normalization and tokenization without installing urduhack and indic_nlp_resources
separately. Installing urduhack separately can sometimes be a problem because it is TensorFlow-based.
This repository is mainly created and maintained for IndicTrans2 and IndicTransTokenizer.
For any queries, please get in touch with the original authors/maintainers of the respective libraries:
- Indic NLP Library: anoopkunchukuttan
- Indic NLP Resources: anoopkunchukuttan
- UrduHack: UrduHack
Usage:
git clone https://github.com/VarunGumma/indic_nlp_library.git
cd indic_nlp_library
pip install --editable ./
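Once the package is installed, a quick smoke test along the following lines may be useful. This is only a sketch: it assumes the fork keeps the upstream Indic NLP Library import paths (`indicnlp.normalize.indic_normalize.IndicNormalizerFactory` and `indicnlp.tokenize.indic_tokenize.trivial_tokenize`), and the helper name `normalize_and_tokenize` is hypothetical, introduced here purely for illustration.

```python
import importlib.util


def normalize_and_tokenize(text, lang="hi"):
    """Normalize and then tokenize `text`.

    Hypothetical helper for illustration. Returns None when the
    `indicnlp` package cannot be found, so the script still runs
    before installation.
    """
    if importlib.util.find_spec("indicnlp") is None:
        return None

    # Assumption: import paths match the upstream Indic NLP Library.
    from indicnlp.normalize.indic_normalize import IndicNormalizerFactory
    from indicnlp.tokenize import indic_tokenize

    normalizer = IndicNormalizerFactory().get_normalizer(lang)
    normalized = normalizer.normalize(text)
    return indic_tokenize.trivial_tokenize(normalized, lang)


tokens = normalize_and_tokenize("यह एक वाक्य है।", lang="hi")
print(tokens if tokens is not None else "indicnlp is not installed")
```

If the import fails even though the installation succeeded, the fork's module layout may differ from upstream, so check the repository source before relying on these paths.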
Updates:
- Integrated urduhack directly into the repository.
- Renamed master branch as main.
- Integrated indic_nlp_resources directly into the repository.
- De-bloated the repository.