Assem's Arabic Stemmer

This is an Arabic stemming algorithm written in the Snowball language. It offers light stemming and text normalization.

@article{Chelli2018,
author = "Assem Chelli",
title = "{Assem's Arabic Stemmer}",
year = "2018",
month = "11",
url = "https://figshare.com/articles/Assem_s_Arabic_Stemmer/7295690",
doi = "10.6084/m9.figshare.7295690.v1"
}

This is a sample of results:

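| Word | Light Stemmer | Root-Based Stemmer |
|------|---------------|--------------------|
| طفل | طفل | طفل |
| اطفال | اطفال | طفل |
| الاطفال | اطفال | طفل |
| اطفالكم | اطفال | طفل |
| فأطفالكم | اطفال | طفل |
| اطفالهم | اطفال | طفل |
| والاطفال | اطفال | طفل |
| فاطفالهم | اطفال | طفل |
| وطفل | طفل | طفل |
| الطفولة | طفول | طفل |
| والطفلتين | طفل | طفل |
| طفلتان | طفل | طفل |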
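
To make the idea of light stemming concrete, here is a minimal Python sketch that reproduces the Light Stemmer results in the sample above. It is a toy illustration, not the Snowball implementation: the rule lists (`CONJUNCTIONS`, `ARTICLE`, `SUFFIXES`), the `MIN_STEM` threshold, and the function names are all assumptions chosen to cover only these sample words.

```python
import re

# Hypothetical toy rules for illustration; the real rules live in the Snowball source.
CONJUNCTIONS = ("و", "ف")   # leading "and" / "so"
ARTICLE = "ال"              # definite article
SUFFIXES = ("تان", "تين", "كم", "هم", "ان", "ين", "ون", "ات", "ة")  # longest first
MIN_STEM = 3                # never cut a word below three letters

# Diacritics (U+064B..U+0652) and tatweel (U+0640) are dropped during normalization.
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Alef with hamza above/below and alef with madda are folded into bare alef.
_ALEF_VARIANTS = str.maketrans("\u0623\u0625\u0622", "\u0627\u0627\u0627")


def normalize(word):
    """Strip diacritics and tatweel, and unify alef variants."""
    return _DIACRITICS.sub("", word).translate(_ALEF_VARIANTS)


def light_stem(word):
    """Remove at most one conjunction, the article, and one suffix."""
    stem = normalize(word)
    if stem.startswith(CONJUNCTIONS) and len(stem) - 1 >= MIN_STEM:
        stem = stem[1:]
    if stem.startswith(ARTICLE) and len(stem) - len(ARTICLE) >= MIN_STEM:
        stem = stem[len(ARTICLE):]
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM:
            stem = stem[:-len(suffix)]
            break
    return stem


if __name__ == "__main__":
    for w in ("الاطفال", "فأطفالكم", "الطفولة", "والطفلتين"):
        print(w, light_stem(w))
```

Unlike the root-based variant, this kind of stemmer leaves broken plurals such as اطفال untouched, which is why the two result columns differ.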

Requirements:

The dependencies are already attached as git submodules, so just run:

$ git submodule update --init --recursive

Build:

$ make build

Run:

$ make run
الطالب
طالب
$ make run_root
الطالب
طلب

Test:

The tests run against the Arabic sample from snowball-data and measure speed, grouping factor, and precision.

$ make test
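
The three metrics are not defined here, so the Python sketch below shows only one plausible reading of them. It assumes precision is the share of words whose stem equals the expected stem, grouping factor is the number of distinct words divided by the number of distinct stems, and speed is words stemmed per second. The `evaluate` function and its parameters are hypothetical and are not part of this repository's test suite.

```python
import time


def evaluate(stem, words, expected):
    """Score a stemmer callable against parallel lists of words and expected stems.

    The metric definitions are assumptions made for illustration.
    """
    if not words or len(words) != len(expected):
        raise ValueError("words and expected must be non-empty and the same length")

    start = time.perf_counter()
    stems = [stem(w) for w in words]
    elapsed = time.perf_counter() - start

    matches = sum(1 for got, want in zip(stems, expected) if got == want)
    return {
        # Share of words stemmed exactly as expected.
        "precision": matches / len(words),
        # How many distinct surface forms collapse into each distinct stem.
        "grouping_factor": len(set(words)) / len(set(stems)),
        # Throughput; guarded in case the timer reports zero elapsed time.
        "words_per_second": len(words) / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in stemmer backed by a lookup table, used only to exercise evaluate().
    table = {"الطفل": "طفل", "الطالب": "طالب"}
    report = evaluate(lambda w: table.get(w, w),
                      ["الطفل", "طفل", "الطالب"],
                      ["طفل", "طفل", "طلب"])
    print(report)
```

Any callable that maps a word to its stem can be passed as `stem`, so the same harness could score the light and root-based variants on the same word list.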

Distributions:

$ make dist