awesome-ai-services
An overview of the AI-as-a-service landscape
Table of Contents
Natural Language
Speech
Vision
Natural Language
Entity Recognition
Amazon Comprehend
General: Overview | Sample output | UI | Pricing
- What are the entities mentioned in the document?
- What are their types?
- How often is each of these entities mentioned?
Supported entity types: commercial items, dates, events, locations, organizations, persons, quantities, other types, titles
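The third question above (how often each entity is mentioned) can be answered by tallying the entity list in the response. The following is a minimal sketch, assuming a response shaped like the `Entities` list of Comprehend's DetectEntities operation; the sample values are made up for illustration, and `count_entity_mentions` is a hypothetical helper, not part of any SDK.

```python
from collections import Counter

# Hypothetical response with made-up values. The field names (Text, Type,
# Score, BeginOffset, EndOffset) are assumed to match the DetectEntities output.
sample_response = {
    "Entities": [
        {"Text": "Seattle", "Type": "LOCATION", "Score": 0.99, "BeginOffset": 0, "EndOffset": 7},
        {"Text": "Amazon", "Type": "ORGANIZATION", "Score": 0.98, "BeginOffset": 20, "EndOffset": 26},
        {"Text": "Seattle", "Type": "LOCATION", "Score": 0.97, "BeginOffset": 50, "EndOffset": 57},
    ]
}


def count_entity_mentions(response):
    """Count how often each (text, type) pair appears in the entity list."""
    return Counter((entity["Text"], entity["Type"]) for entity in response["Entities"])


mentions = count_entity_mentions(sample_response)
for (text, entity_type), count in mentions.most_common():
    print(f"{text} ({entity_type}): {count}")
```

Keying the counter on both text and type keeps homonyms apart, for example a person and a location that share a name.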
Google Cloud Natural Language
General: Overview | Sample output | Demo | Pricing
- What are the entities mentioned in the document?
- What are their types?
- How salient is each of these entities in the document?
- Where in the text are these entities mentioned?
- What are the URLs to the corresponding Wikipedia entries?
Supported entity types: consumer good, event, location, organization, person, work of art, other types
IBM Watson Natural Language Understanding
General: Overview | Sample output | Demo | Pricing
- What are the entities mentioned in the document?
- What are their types and subtypes?
Supported entity types:
- Types: anatomy, award, broadcaster, company, crime, drug, email address, facility, geographic feature, health condition, hashtag, ip address, job title, location, movie, music group, natural event, organization, person, print media, quantity, sport, sporting event, television show, twitter handle, vehicle
- Subtypes
Microsoft Cognitive Services Text Analytics (Preview)
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
- What are the entities mentioned in the document?
- Where in the document are they mentioned?
- What are the URLs to the corresponding Wikipedia entries?
- What are their Wikipedia and Bing IDs?
Keyphrase Extraction
Amazon Comprehend
General: Overview | Sample output | Demo | Pricing
- Which keywords can be extracted for the given document?
- How often does each of these keywords occur?
Google Cloud Natural Language
Not supported
IBM Watson Natural Language Understanding
General: Overview | Sample output | Demo | Pricing
- Which keywords can be extracted for the given document?
Microsoft Cognitive Services Text Analytics
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
- Which keywords can be extracted for the given document?
Machine Translation
Amazon Translate
General: Overview | Sample output | UI | Pricing
Support for seven languages
Google Cloud Translation API
General: Overview | Sample output | Demo | Pricing
Support for 98 language pairs in neural machine translation model
IBM Watson Language Translator
General: Overview | Sample output | Demo | Pricing
Support for 33 language pairs
Microsoft Cognitive Services Translator Text
General: Overview | Sample output | Pricing
JavaScript: Node
Support for 39 language pairs
Sentiment Analysis
Amazon Comprehend
General: Overview | Sample output | Demo | Pricing
- To what extent does the document express an overall positive, negative, neutral or mixed sentiment?
Google Cloud Natural Language
General: Overview | Sample output | Demo | Pricing
- To what extent does the document express an overall positive, negative, neutral or mixed sentiment?
IBM Watson Natural Language Understanding
General: Overview | Sample output | Demo | Pricing
- To what extent does the document express an overall positive, negative or neutral sentiment?
Microsoft Cognitive Services Text Analytics
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
- To what extent does the document express an overall positive, negative or neutral sentiment?
Speech
<h2 id="speech-to-text">Speech to Text / Speech Recognition</h2>
Amazon Transcribe
General: Overview | Sample output | UI | Pricing
Support for US English and Spanish
Google Cloud Speech-to-Text
General: Overview | Sample output | Demo | Pricing
Support for 119 languages/locales
IBM Speech to Text
General: Overview | Sample output | Demo | Pricing
Support for 9 languages
Microsoft Cognitive Services Speech to Text (Preview)
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
The REST API is limited to utterances of up to 14 seconds.
Support for 8 languages
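Because the REST API is limited to utterances of up to 14 seconds, it can help to check a recording's length before uploading it. The following is a minimal sketch using only Python's standard `wave` module; it assumes uncompressed WAV input, and `fits_rest_limit` and `make_silent_wav` are hypothetical helpers that are not part of any Microsoft SDK.

```python
import io
import wave

MAX_UTTERANCE_SECONDS = 14.0  # limit quoted in the entry above


def wav_duration_seconds(source):
    """Return the duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_rest_limit(source, limit=MAX_UTTERANCE_SECONDS):
    """Return True if the recording is no longer than the stated limit."""
    return wav_duration_seconds(source) <= limit


def make_silent_wav(seconds, rate=16000):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer


print(fits_rest_limit(make_silent_wav(10)))
print(fits_rest_limit(make_silent_wav(15)))
```

Longer recordings would need to be split into shorter utterances or sent through a different interface; which alternatives exist is not covered by this list.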
<h2 id="text-to-speech">Text to Speech / Speech Synthesis</h2>
Amazon Polly
General: Overview | Sample output | UI | Pricing
34 voices in 25 languages
SSML extensions:
- Breathing
- Dynamic Range Compression
- Speaking softly
- Timbre
- Whispering
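The extensions listed above are expressed as `amazon:`-prefixed SSML tags. The following is a minimal sketch that assembles such markup as a plain string; the tag and attribute names are written as recalled from the Polly SSML documentation and should be verified against the current reference, and the helper functions are hypothetical and not part of any AWS SDK.

```python
from xml.sax.saxutils import escape


def whispered(text):
    """Wrap text in the whispering effect."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def soft(text):
    """Wrap text in the speaking-softly effect."""
    return f'<amazon:effect phonation="soft">{escape(text)}</amazon:effect>'


def timbre(text, change="+15%"):
    """Wrap text in the timbre effect; the default change is an arbitrary example."""
    return f'<amazon:effect vocal-tract-length="{change}">{escape(text)}</amazon:effect>'


def drc(text):
    """Wrap text in dynamic range compression."""
    return f'<amazon:effect name="drc">{escape(text)}</amazon:effect>'


def breath():
    """Insert a single breathing sound."""
    return '<amazon:breath duration="medium" volume="medium"/>'


def speak(*parts):
    """Join the parts inside the root <speak> element."""
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = speak(whispered("Keep this quiet."), breath(), soft("Speak gently & slowly."))
print(ssml)
```

Escaping the text before wrapping it keeps characters such as `&` from producing invalid markup.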
Google Cloud Text-to-Speech (Beta)
General: Overview | Sample output | Demo | Pricing
28 voices in 14 languages
IBM Watson Text to Speech
General: Overview | Sample output | Demo | Pricing
13 voices in 7 languages
SSML extensions:
- Good news
- Apology
- Uncertainty
Customization:
- Pitch
- Glottal tension
- Breathiness
- Timbre
Microsoft Cognitive Services Text to Speech (Preview)
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
80 voices in 32 languages
Customization in private preview
Vision
Face Detection
Amazon Rekognition
General: Overview | Sample output | Demo | Pricing
- Where are the faces and face parts located in the image?
- What are the age ranges of the persons shown?
- Are they smiling?
- Do they wear eyeglasses or sunglasses?
- What are their genders?
- Do they have a beard or mustache?
- Are their eyes or mouth open?
- Do they express emotions of happiness, sadness, anger, confusion, disgust, surprise or calmness?
- Given a face image, what other image shows the most similar face?
- Are the faces in two images of the same person?
Google Cloud Vision
General: Overview | Sample output | Demo | Pricing
- Where are the faces and face parts located in the image?
- What is the pose of the faces?
- Do the faces express emotional states of joy, sorrow, anger or surprise?
- Is the person wearing headwear?
- Is the photo underexposed or blurred?
IBM Watson Visual Recognition
General: Overview | Sample output | Demo | Pricing
- Where are the faces located in the image?
- What are the age ranges of the persons shown?
- What are their genders?
Microsoft Cognitive Services Face
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
- Where are the faces and face parts located in the image?
- Are parts of the faces occluded?
- What is the pose of the heads?
- How old are they?
- What are their genders?
- Does the face express the emotional states of anger, contempt, disgust, fear, happiness, sadness, surprise or a neutral state?
- Are they smiling?
- Is the hair visible? What is the hair color? Or is the person bald?
- Do they have a moustache, a beard or sideburns?
- Are they wearing make-up?
- What kind of accessories is the person wearing, if any?
- What kind of glasses is the person wearing, if any?
- Is the photo blurred? What is the exposure level? What is the noise level?
- Are the faces in two images of the same person?
Text Recognition
Amazon Rekognition
General: Overview | Sample output | Demo | Pricing
- Where in the image file is text located?
- What is the text content?
- Which boxes do individual words belong to?
Google Cloud Vision
General: Overview | Sample output | Demo | Pricing
- Where in the image file is text located?
- What is the text content?
- What is the language of the text content?
IBM Watson Visual Recognition
This feature is currently in private beta.
Microsoft Cognitive Services Computer Vision
General: Overview | Sample output | Demo | Pricing
JavaScript: Node
JVM: Java
- Where in the image file is text located?
- What is the text content?
License
This work is licensed under a Creative Commons Attribution 4.0 International License.