Angular Web Audio melodies sample

A developer sample written in Angular that demonstrates Gemini's multimodal (image and audio) input and understanding. The user enters a prompt, and the app generates images with Vertex AI's image generation, which the user can then preview in a three-dimensional gallery. An input field lets the user ask a question about the images, and the app reads Gemini's answer aloud using the speech synthesis (SpeechSynthesis) interface of the Web Speech API.
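
The read-aloud step can be sketched roughly as follows. This is a minimal illustration, not the sample's actual code: the function names `splitIntoSentences` and `speakAnswer`, the sentence-splitting step, and the injectable `env` parameter are all assumptions made for this example. Only `speechSynthesis.cancel()`, `speechSynthesis.speak()` and `SpeechSynthesisUtterance` come from the browser's Web Speech API.

```typescript
// Hypothetical helper: break an answer into sentence-sized chunks so that
// each chunk can be queued as its own utterance.
function splitIntoSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]*/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Hypothetical helper: queue the answer for speech output.
// `env` defaults to the global object; passing it explicitly is an
// assumption of this sketch that keeps the function usable outside a browser.
// Returns the number of utterances queued (0 if speech synthesis is unavailable).
function speakAnswer(answer: string, lang = "en-US", env: any = globalThis): number {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || !Utterance) {
    return 0; // No Web Speech API support in this environment.
  }

  synth.cancel(); // Drop anything still queued from a previous answer.

  const chunks = splitIntoSentences(answer);
  for (const chunk of chunks) {
    const utterance = new Utterance(chunk);
    utterance.lang = lang;
    synth.speak(utterance); // Utterances are played in the order they are queued.
  }
  return chunks.length;
}
```

In a browser, calling `speakAnswer(answerText)` once Gemini's response arrives would queue the answer for playback; where speech synthesis is not supported, the call simply returns 0.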

Pre-requisites

  1. Node.js and npm
  2. Gemini API key

Getting started

Install the dependencies, then compile and run the app:

npm i
npm start

In the text box with the placeholder "API key", enter your Gemini API key. Instructions on how to use the app are available under "Instructions" in the user interface.