Ruby Natural Language Processing Resources

A collection of Natural Language Processing (NLP) Ruby libraries, tools and software. Suggestions and contributions are welcome.

Categories

APIs

3rd party NLP services

Client libraries for various third-party NLP API services.

Instant Messaging Bots

Client/server libraries for the chatbot APIs of various third-party instant messengers.

Facebook Messenger

Kik

Microsoft Bot Framework (Skype)

Slack

Telegram Messenger

WeChat

Natural Language Understanding Tools

Voice-based device bots

Client/server libraries for the APIs of various third-party voice-based devices.

Amazon Echo Alexa skills

Books

Bitext Alignment

Bitext alignment is the process of aligning two parallel documents on a segment-by-segment basis. In other words, if you have one document in English and its translation in Spanish, bitext alignment matches each segment of document A with its corresponding translation in document B.
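As a rough illustration of the idea, the sketch below pairs segments purely by position. It assumes both documents are already split into the same number of segments, which real texts often violate; the method name `naive_align` is hypothetical and does not come from any library listed here.

```ruby
# Naive 1:1 alignment: pair the i-th source segment with the i-th target segment.
# Assumption: both inputs are arrays of pre-segmented text of equal length.
def naive_align(source_segments, target_segments)
  unless source_segments.size == target_segments.size
    raise ArgumentError, "segment counts differ; positional alignment is not possible"
  end
  source_segments.zip(target_segments)
end

english = ["Hello.", "How are you?"]
spanish = ["Hola.", "¿Cómo estás?"]

naive_align(english, spanish).each do |en, es|
  puts "#{en} <-> #{es}"
end
```

Dedicated aligners also handle segments that merge or split in translation, which this positional sketch does not attempt.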

Case

Chatbot

Classification

Classification aims to assign a document or piece of text to one or more classes or categories, making it easier to manage or sort.
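To make the definition concrete, here is a minimal keyword-counting sketch. The categories and keyword lists in `KEYWORDS` are invented for illustration, and the `classify` method is hypothetical; trained classifiers (for example naive Bayes) learn such associations from labeled data instead.

```ruby
# Hypothetical keyword lists per category; illustrative only.
KEYWORDS = {
  sports: %w[match goal team score],
  tech:   %w[ruby code software server]
}.freeze

# Returns the category whose keywords occur most often, or nil if none occur.
def classify(text)
  tokens = text.downcase.scan(/[a-z]+/)
  scores = KEYWORDS.transform_values do |words|
    tokens.count { |token| words.include?(token) }
  end
  best, score = scores.max_by { |_category, count| count }
  score.zero? ? nil : best
end

puts classify("The team scored a late goal").inspect
puts classify("Deploy the Ruby code to the server").inspect
```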

Date and Time

Emoji

Error Correction

Full-Text Search

Keyword Ranking

Language Detection

Language Localization

Lexical Databases and Ontologies

Lexical databases, common-sense knowledge bases, multilingual lexicalized semantic networks, and ontologies.

BabelNet

ConceptNet

MediaWiki, Wikipedia

WordNet

Machine Learning

Machine Translation

Miscellaneous

Multipurpose Tools

The following are libraries that integrate multiple NLP tools or functionality.

Named Entity Recognition

Ngrams

Numbers

Parsers

A natural language parser is a program that works out the grammatical structure of sentences, for instance, which groups of words go together (as "phrases") and which words are the subject or object of a verb.

Part-of-Speech Taggers

Readability

Regular Expressions

Online Resources

Ruby NLP Presentations

Sentence Generation

Sentence Segmentation

Sentence segmentation (aka sentence boundary disambiguation, sentence boundary detection) is the problem in natural language processing of deciding where sentences begin and end. Sentence segmentation is the foundation of many common NLP tasks (machine translation, bitext alignment, summarization, etc.).
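A minimal sketch of the problem, assuming English-like text in which sentences end with `.`, `!` or `?` followed by whitespace. The method name `naive_segment` is hypothetical. This rule breaks on abbreviations such as "Dr. Smith", which is one reason dedicated segmenters exist.

```ruby
# Split after sentence-final punctuation that is followed by whitespace.
# The lookbehind keeps the punctuation attached to its sentence.
def naive_segment(text)
  text.strip.split(/(?<=[.!?])\s+/)
end

naive_segment("Hello world. How are you? Fine!").each do |sentence|
  puts sentence
end
```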

Speech-to-Text

Stemmers

Stemming is the term used in linguistic morphology and information retrieval to describe the process for reducing inflected (or sometimes derived) words to their word stem, base or root form.
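The following sketch strips a handful of common English suffixes to show the basic idea. The suffix list and the `naive_stem` method are assumptions made for illustration; established algorithms such as the Porter stemmer apply far more careful rules.

```ruby
# Illustrative suffix list, checked in order; not a complete rule set.
SUFFIXES = %w[ing ed ly es s].freeze

# Removes the first matching suffix, provided at least 3 characters remain.
def naive_stem(word)
  w = word.downcase
  suffix = SUFFIXES.find do |s|
    w.end_with?(s) && w.length - s.length >= 3
  end
  suffix ? w[0, w.length - suffix.length] : w
end

%w[jumping jumped cats quickly sing].each do |word|
  puts "#{word} -> #{naive_stem(word)}"
end
```

Words with doubled consonants show the limits of this approach: "running" becomes "runn" rather than "run".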

Stop Words

Summarization

Automatic summarization is the process of reducing a text document with a computer program in order to create a summary that retains the most important points of the original document.
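One simple way to approximate this is extractive summarization by word frequency: score each sentence by how often its words appear in the whole text and keep the top scorers. The sketch below assumes English-like text and uses a hypothetical `summarize` method; it ignores stop words and favors long sentences, so treat it as a toy rather than a reference implementation.

```ruby
# Extractive summary: keep the sentences whose words are most frequent overall.
def summarize(text, max_sentences = 1)
  sentences = text.strip.split(/(?<=[.!?])\s+/)

  # Count how often each word occurs in the whole text.
  frequency = Hash.new(0)
  text.downcase.scan(/[a-z']+/).each { |word| frequency[word] += 1 }

  # Score a sentence as the sum of its word frequencies.
  top = sentences.max_by(max_sentences) do |sentence|
    sentence.downcase.scan(/[a-z']+/).sum { |word| frequency[word] }
  end

  # Return the selected sentences in their original order.
  sentences.select { |sentence| top.include?(sentence) }
end

text = "Ruby is fun. Ruby has many NLP gems for Ruby developers. Cats sleep."
puts summarize(text, 1)
```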

Text Extraction

Text Similarity

Text-to-Speech

Tokenizers

Word Count