Awesome
Multi-String Pattern Matching algorithm.
This implementation is inspired from Aho-Corasick algorithm
Getting Started
modelA = mspm.NewModel("mspm_model_A")
patternsToSearch = strings.NewReader(words) // words is newline seperated list of words/keywords
// build mspm model with patterns
modelA.Build(patternsToSearch)
inputString := "input document where the above pattern is searched for."
document := strings.NewReader(inputString)
output, err := modelA.MultiTermMatch(document)
// output ~= [{matched_word: n_count}, ..]
Test Coverage
TrieNode vs TrieHashNode benchmark
$ cd github.com/BlackRabbitt/mspm/ds/trie
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/BlackRabbitt/mspm/ds/trie
BenchmarkTrieNodeInsert-4 500000 2582 ns/op
BenchmarkTrieNodeSearch-4 10000000 205 ns/op
BenchmarkTrieHashNodeInsert-4 1000000 1491 ns/op
BenchmarkTrieHashNodeSearch-4 10000000 206 ns/op
PASS
ok github.com/BlackRabbitt/mspm/ds/trie 8.795s
Resources
- Trie
- mspm - Multi-String Pattern Matching algorithm. Generally used for Information Retrieval.
- Aho-Corasick algorithm