Awesome

Types: anatomy, award, broadcaster, company, crime, drug, email address, facility, geographic feature, health condition, hashtag, ip address, job title, location, movie, music group, natural event, organization, person, print media, quantity, sport, sporting event, television show, twitter handle, vehicle
Subtypes

Microsoft Cognitive Services Text Analytics (Preview)

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

What are the entities mentioned in the document?
Where in the document are they mentioned?
What are the URLs to the corresponding Wikipedia entries?
What are their Wikipedia and Bing IDs?

Keyphrase Extraction

Sample input

Amazon Comprehend

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Which keywords can be extracted for the given document?
How often does each of these keywords occur?
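
The occurrence question above can be illustrated locally. This is a minimal sketch, not a call to Amazon Comprehend: `countKeyword` is a hypothetical helper that counts case-insensitive, non-overlapping matches of one already-extracted keyword in a document.

```javascript
// Hypothetical helper (not part of any SDK): counts how often a keyword
// appears in a text, ignoring case and not counting overlapping matches.
function countKeyword(text, keyword) {
  const haystack = text.toLowerCase();
  const needle = keyword.toLowerCase();
  if (needle.length === 0) return 0; // an empty keyword never matches

  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count += 1;
    // Continue searching after the end of the current match.
    index = haystack.indexOf(needle, index + needle.length);
  }
  return count;
}
```

A service may tokenize or normalize text differently, so counts from this sketch are only a rough local approximation.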

Google Cloud Natural Language

Not supported

IBM Watson Natural Language Understanding

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Which keywords can be extracted for the given document?

Microsoft Cognitive Services Text Analytics

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

Which keywords can be extracted for the given document?

Machine Translation

Sample input

Amazon Translate

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for seven languages

Google Cloud Translation API

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 98 language pairs in neural machine translation model

IBM Watson Language Translator

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 33 language pairs

Microsoft Cognitive Services Translator Text

General: Overview | Sample output | Pricing

JavaScript: Node

JVM: Java | Kotlin

Support for 39 language pairs

Sentiment Analysis

Overview | Sample input

Amazon Comprehend

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

To what extent does the document express an overall positive, negative, neutral or mixed sentiment?

Google Cloud Natural Language

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

To what extent does the document express an overall positive, negative, neutral or mixed sentiment?

IBM Watson Natural Language Understanding

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

To what extent does the document express an overall positive, negative or neutral sentiment?

Microsoft Cognitive Services Text Analytics

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

To what extent does the document express an overall positive, negative or neutral sentiment?
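
The sentiment question asked of each service above can be pictured as choosing the strongest label from a set of per-label scores. This is a minimal sketch under an assumed input shape (a plain object mapping label names to numeric scores); `dominantSentiment` is a hypothetical helper, and the actual response formats differ from service to service.

```javascript
// Hypothetical helper: given an object such as
// { positive: 0.7, negative: 0.1, neutral: 0.15, mixed: 0.05 },
// returns the label with the highest score, or null for an empty object.
function dominantSentiment(scores) {
  let best = null;
  for (const [label, score] of Object.entries(scores)) {
    if (best === null || score > best.score) {
      best = { label, score };
    }
  }
  return best;
}
```

For services that report only positive, negative or neutral, the same helper works with three keys instead of four.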

Speech

<h2 id="speech-to-text">Speech to Text / Speech Recognition</h2>

Sample input

Amazon Transcribe

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for US English and Spanish

Google Cloud Speech-to-Text

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 119 languages/locales

IBM Watson Speech to Text

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Support for 9 languages

Microsoft Cognitive Services Speech to Text (Preview)

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

The REST API is limited to utterances of up to 14 seconds.

Support for 8 languages
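
The 14-second limit noted for the REST API above can be checked before audio is sent. This is a minimal sketch, assuming uncompressed PCM audio with a known sample rate, channel count and bit depth. The helper names and the 16 kHz, mono, 16-bit defaults are illustrative assumptions, not part of the Microsoft API.

```javascript
// Limit quoted in this list for the REST API, in seconds.
const MAX_UTTERANCE_SECONDS = 14;

// Hypothetical helper: duration of raw PCM audio data in seconds.
// byteLength is the size of the audio samples only (no file header).
function pcmDurationSeconds(byteLength, sampleRate, channels, bitsPerSample) {
  const bytesPerSecond = sampleRate * channels * (bitsPerSample / 8);
  return byteLength / bytesPerSecond;
}

// Hypothetical helper: true if the audio is short enough for the limit.
// The defaults (16 kHz, mono, 16-bit) are assumptions for illustration.
function fitsRestLimit(byteLength, sampleRate = 16000, channels = 1, bitsPerSample = 16) {
  const seconds = pcmDurationSeconds(byteLength, sampleRate, channels, bitsPerSample);
  return seconds <= MAX_UTTERANCE_SECONDS;
}
```

Longer recordings would need to be split into shorter utterances or sent through a different interface than the REST API.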

<h2 id="text-to-speech">Text to Speech / Speech Synthesis</h2>

Overview | Sample input

Amazon Polly

General: Overview | Sample output | UI | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

34 voices in 25 languages

SSML extensions:

Breathing
Dynamic Range Compression
Speaking softly
Timbre
Whispering

Google Cloud Text-to-Speech (Beta)

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

28 voices in 14 languages

IBM Watson Text to Speech

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

13 voices in 7 languages

SSML extensions:

Good news
Apology
Uncertainty

Customization:

Pitch
Glottal tension
Breathiness
Timbre

Microsoft Cognitive Services Text to Speech (Preview)

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

80 voices in 32 languages

Customization in private preview

Vision

Face Detection

Sample input

Amazon Rekognition

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Where are the faces and face parts located in the image?
What are the age ranges of the persons shown?
Are they smiling?
Do they wear eyeglasses or sunglasses?
What are their genders?
Do they have a beard or mustache?
Are their eyes or mouth open?
Do they express emotions of happiness, sadness, anger, confusion, disgust, surprise or calmness?
Given a face image, what other image shows the most similar face?
Are the faces in two images of the same person?

Google Cloud Vision

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Where are the faces and face parts located in the image?
What is the pose of the faces?
Do the faces express emotional states of joy, sorrow, anger or surprise?
Is the person wearing headwear?
Is the photo underexposed or blurred?

IBM Watson Visual Recognition

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java | Kotlin

Where are the faces located in the image?
What are the age ranges of the persons shown?
What are their genders?

Microsoft Cognitive Services Face

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java | Kotlin

Where are the faces and face parts located in the image?
Are parts of the faces occluded?
What is the pose of the heads?
How old are they?
What are their genders?
Does the face express the emotional states of anger, contempt, disgust, fear, happiness, sadness, surprise or a neutral state?
Are they smiling?
Is the hair visible? What is the hair color? Or is the person bald?
Do they have a moustache, a beard or sideburns?
Are they wearing make-up?
What kind of accessories is the person wearing, if any?
What kind of glasses is the person wearing, if any?
Is the photo blurred? What is the exposure level? What is the noise level?
Are the faces in two images of the same person?

Text Recognition

Sample input

Amazon Rekognition

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java

Where in the image file is text located?
What is the text content?
Which boxes do individual words belong to?

Google Cloud Vision

General: Overview | Sample output | Demo | Pricing

JavaScript: NPM | Node

JVM: Maven | Java

Where in the image file is text located?
What is the text content?
What is the language of the text content?

IBM Watson Visual Recognition

This feature is currently in private beta.

Microsoft Cognitive Services Computer Vision

General: Overview | Sample output | Demo | Pricing

JavaScript: Node

JVM: Java

Where in the image file is text located?
What is the text content?

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

awesome-ai-services

Table of Contents

Natural Language

Speech

Vision

Natural Language

Entity Recognition

Amazon Comprehend

Google Cloud Natural Language

IBM Watson Natural Language Understanding

Microsoft Cognitive Services Text Analytics (Preview)

Keyphrase Extraction

Amazon Comprehend

Google Cloud Natural Language

IBM Watson Natural Language Understanding

Microsoft Cognitive Services Text Analytics

Machine Translation

Amazon Translate

Google Cloud Translation API

IBM Watson Language Translator

Microsoft Cognitive Services Translator Text

Sentiment Analysis

Amazon Comprehend

Google Cloud Natural Language

IBM Watson Natural Language Understanding

Microsoft Cognitive Services Text Analytics

Speech

Amazon Transcribe

Google Cloud Speech-to-Text

IBM Watson Speech to Text

Microsoft Cognitive Services Speech to Text (Preview)

Amazon Polly

Google Cloud Text-to-Speech (Beta)

IBM Watson Text to Speech

Microsoft Cognitive Services Text to Speech (Preview)

Vision

Face Detection

Amazon Rekognition

Google Cloud Vision

IBM Watson Visual Recognition

Microsoft Cognitive Services Face

Text Recognition

Amazon Rekognition

Google Cloud Vision

IBM Watson Visual Recognition

Microsoft Cognitive Services Computer Vision

License