SymSpellChecker.jl


A Julia port of SymSpell, an extremely fast spelling correction and fuzzy search algorithm.

TL;DR

using SymSpellChecker

d = SymSpell()
push!(d, "hello")
push!(d, "world")

d["wrold"] # ["world"]

Dictionary creation

Dictionaries can be created as follows:

using SymSpellChecker

# Loading from file
d = SymSpell("assets/frequency_dictionary_en_30_000.txt")

# Manual update
d = SymSpell()
push!(d, "hello", 100)
push!(d, "world", 50)

The third argument of push! is the word frequency, which lookup later uses to sort results from highest to lowest frequency.
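The effect of the frequency on result ordering can be illustrated with a small standalone sketch. It does not use the package; the candidate tuples and their fields (term, distance, count) are hypothetical, and the assumption that closer matches come first with frequency breaking ties follows the original SymSpell rather than anything stated here.

```julia
# Hypothetical candidates: (term, edit distance to the query, dictionary frequency).
candidates = [
    (term = "word",  distance = 1, count = 30),
    (term = "world", distance = 1, count = 50),
    (term = "would", distance = 2, count = 900),
]

# Assumed ordering: smaller distance first, then higher frequency first.
ranked = sort(candidates; by = c -> (c.distance, -c.count))

ranked_terms = [c.term for c in ranked]
println(ranked_terms)
```

Under this assumed ordering, a very frequent word at distance 2 still ranks below less frequent words at distance 1.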

The SymSpell constructor has the following arguments

Lookup procedure

A word lookup can be performed as follows:

lookup(d, "wrold") # [SuggestItem("world", 1, 50)]

Here 1 is the Damerau-Levenshtein distance between world and wrold, and 50 is the word's frequency in the current dictionary.
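To see why the distance is 1, here is a minimal standalone sketch of the restricted Damerau-Levenshtein distance (also called optimal string alignment), in which swapping two adjacent characters counts as a single edit. The function name osa_distance is hypothetical and is not part of the package, and which exact variant the package uses internally is an assumption.

```julia
# Optimal string alignment distance: insertions, deletions, substitutions,
# and transpositions of adjacent characters each cost 1.
function osa_distance(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    m, n = length(s), length(t)
    # d[i + 1, j + 1] is the distance between the first i chars of s
    # and the first j chars of t.
    d = zeros(Int, m + 1, n + 1)
    d[:, 1] .= 0:m
    d[1, :] .= 0:n
    for i in 1:m, j in 1:n
        cost = s[i] == t[j] ? 0 : 1
        d[i + 1, j + 1] = min(d[i, j + 1] + 1,   # deletion
                              d[i + 1, j] + 1,   # insertion
                              d[i, j] + cost)    # substitution or match
        # Adjacent transposition, e.g. "ro" <-> "or".
        if i > 1 && j > 1 && s[i] == t[j - 1] && s[i - 1] == t[j]
            d[i + 1, j + 1] = min(d[i + 1, j + 1], d[i - 1, j - 1] + 1)
        end
    end
    return d[m + 1, n + 1]
end

println(osa_distance("wrold", "world"))
```

The only difference between wrold and world is the swapped pair "ro" / "or", which the transposition rule counts as one edit; plain Levenshtein distance would count it as two substitutions.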

To extract only the words from the lookup result:

term.(lookup(d, "wrold")) # ["world"]

A more convenient form of lookup is also available:

d["wrold"] # ["world"]

Search arguments can be passed directly to the lookup function or set globally with set_options!(d::SymSpell; kwargs...).

set_options!(d, include_unknown = true, verbosity = "closest")
d["wrold"] # ["wrold", "world"]

# this is equivalent to
term.(lookup(d, "wrold", include_unknown = true, verbosity = "closest"))

The following arguments are supported

License

The SymSpellChecker.jl package is licensed under the MIT License. This package is based on SymSpell and its Python adaptation. Some parts of the code are based on StringDistances.jl.