Pod.Cast 🎱 🐋 | Annotation system

License: MIT

Developed by Prakruti Gogia, Akash Mahajan and Nithya Govindarajan during Microsoft AI4Earth & OneWeek hackathons. (This is volunteer-driven and is not an official product.)

For a general introduction to the Pod.Cast project, initiated in 2019, and its relationship to other AI for Orcas efforts, see the project overview at ai4orcas.net.

Technical Overview

podcast_server.py is a prototype Flask-based web app for labelling unlabelled bioacoustic recordings while viewing predictions from a model. It is useful for setting up quick-and-dirty labelling sessions that don't need advanced features such as automated model inference, user access roles, interfacing with other backends, or gamification.

(See prediction-explorer for a related tool to quickly visualize and browse model predictions on a set of audio files; it runs locally.)

<img src="doc/podcast-screenshot.png" alt="Screenshot of Pod.Cast annotation UI" width="80%">

Dataset Creation

This tool has been used in an active learning style to create & release new training & test sets at orcadata/wiki.

<img src="doc/podcast-activelearning-flowchart.png" alt="Flowchart of feedback loop between model & human listeners" width="70%">

Architecture

This prototype is a single-page application with a simple Flask backend that interfaces with Azure blob storage. For simplicity and ease of access, this version doubles up blob storage as a makeshift database: each JSON file acts as a single entry, and separate containers act as tables/collections. For this hack, that makes quick-and-dirty viewing/editing easy in Azure Storage Explorer (or any equivalent blob viewer for S3 etc.).
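As a rough illustration of that pattern, here is a plain-Python sketch with a dict standing in for the blob service; the helper names are made up for illustration and are not the app's actual code:

```python
import json

# Each "container" is a table; each JSON blob is a row keyed by its blob name.
store = {"getcontainer": {}, "postcontainer": {}}

def write_entry(container, session_id, record):
    # One JSON file == one entry; the blob name doubles as the primary key.
    store[container][session_id + ".json"] = json.dumps(record)

def read_entry(container, session_id):
    return json.loads(store[container][session_id + ".json"])

write_entry("postcontainer", "session42",
            {"annotations": [{"start": 1.5, "end": 3.0}]})
entry = read_entry("postcontainer", "session42")
```

In the real app the same reads/writes go through the Azure blob SDK, which is why the data stays human-browsable in Azure Storage Explorer.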

<img src="doc/podcast-arch-diagram.png" alt="Architecture diagram showing API interactions between frontend, backend & blob storage" width="100%">

Backend API:

GET /fetch/session/roundid

Scans the getcontainer blob container for unlabelled sessions, randomly picks one, and returns a {sessionid=X} response. The sessionid is simply the name of the corresponding X.json file on the blob. Also updates/resets the internal global variable backend_state, which holds info for the progress bar.
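The selection logic can be sketched as follows. This is a guess at the behaviour described above (a session counts as unlabelled if no submission with the same name exists yet); the real criterion lives in podcast_server.py:

```python
import random

def pick_unlabelled_session(get_blobs, post_blobs, rng=random):
    """Pick a random session present in getcontainer but not yet in postcontainer."""
    unlabelled = [name for name in get_blobs if name not in post_blobs]
    if not unlabelled:
        return None
    chosen = rng.choice(unlabelled)
    # sessionid is the blob name minus its .json extension
    return {"sessionid": chosen[: -len(".json")]}
```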

GET /load/session/roundid/sessionid

Fetches the corresponding JSON file from the getcontainer blob (for an example, see example-load.json). The JSON contains backend_state for the progress bar, and a uri that the client uses to GET the *.wav audio file directly from Azure blob storage.
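Based on the description above, a load response might look roughly like the following. Only the backend_state and uri fields are named in this doc; the other fields and all values are illustrative, so see example-load.json for the actual schema:

```json
{
  "uri": "https://<account>.blob.core.windows.net/audiocontainer/session42.wav",
  "backend_state": { "completed": 12, "total": 40 },
  "annotations": [
    { "start": 1.5, "end": 3.5, "confidence": 0.87 }
  ]
}
```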

POST /submit/session/roundid/sessionid

Writes a JSON file to the postcontainer blob (for an example, see example-submit.json, which has the same schema). Also updates the internal global variable backend_state that holds info for the progress bar.

Client logic:

Primary logic is defined in main.js.

Use & setup

Setup & local debugging

  1. Create an isolated Python environment, then pip install --upgrade pip && pip install -r requirements.txt. (Python 3.6.8 has been tested; more recent versions should likely work too, as the dependencies are quite simple.)

  2. Set the environment variables FLASK_APP=podcast_server.py and FLASK_ENV=development. If you haven't made your own CREDS file yet, see step 3 first. Once that's done, start the server from this directory with python -m flask run, and browse to the link shown in the terminal (e.g. http://127.0.0.1:5000/) in your browser (Edge and Chrome are tested).

  3. The CREDS.yaml file specifies how the backend authenticates with blob storage and the specific container names to use. The provided file is a template and should be replaced with your own values.

Note that when you run this locally, you will still be connecting and writing to the actual blob storage specified in CREDS.yaml, so be careful.
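Steps 1-3 above might look like this in a bash shell (the virtualenv name is arbitrary; on Windows use set instead of export):

```shell
# one-time setup (step 1)
python -m venv .venv && source .venv/bin/activate
pip install --upgrade pip && pip install -r requirements.txt

# per-session (step 2); assumes CREDS.yaml is already filled in (step 3)
export FLASK_APP=podcast_server.py
export FLASK_ENV=development
python -m flask run  # then browse to http://127.0.0.1:5000/
```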

Using your own blob storage

This assumes you have already created an Azure Storage account & know how to view & access it using Azure Storage Explorer.

  1. Enable a CORS rule on the account. In short, this allows a browser client to make requests directly to blob storage to retrieve *.wav files.
<img src="doc/az-blob-configure-cors.png" alt="Screenshot of Azure Storage explorer showing CORS permissions" width="35%">
  2. Make sure you have 3 containers:
     - audiocontainer: *.wav audio files (~1 min duration, since each file forms one page/session)
     - getcontainer: model predictions in JSON format (example-load.json), corresponding to each *.wav file
     - postcontainer: destination for user-submitted annotations in JSON format (example-submit.json)

  3. Enable public read-only access to blobs in audiocontainer (select the "blobs" option). Along with step 1, this is required for the browser to retrieve *.wav files directly.

<img src="doc/az-blob-configure-public.png" alt="Screenshot of Azure Storage explorer to set public access level" width="25%">
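A quick way to sanity-check the naming convention across containers is sketched below. This is a hypothetical helper, assuming each X.wav in audiocontainer pairs with an X.json in getcontainer:

```python
def missing_predictions(audio_blobs, get_blobs):
    """Return audio files that have no matching prediction JSON."""
    # stems of prediction files, e.g. "a.json" -> "a"
    stems = {name[: -len(".json")] for name in get_blobs if name.endswith(".json")}
    return [w for w in audio_blobs
            if w.endswith(".wav") and w[: -len(".wav")] not in stems]
```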

Deployment to Azure App Service

Prerequisite: Install the Azure CLI.

  1. Authenticate and set up your local environment to use the right subscription:

```shell
az login
az account list --output table
az account set --subscription SUBSCRIPTIONID
```
  2. In the root directory of your application, create a deployment config file at .azure/config. This contains details about your resource group, the App Service plan to use, etc. (An example file is at .azure/config.)

  3. Now run the following commands to deploy the app. The first command packages your local directory into a *.zip and deploys it to Azure; if an app with the name in the deployment config file already exists it is updated, otherwise a new app is created. The second command only needs to be run the first time, to register the entry point of the app (see note below).

```shell
az webapp up --sku B1 --dryrun
az webapp config set -g mldev -n aifororcas-podcast --startup-file "gunicorn --bind=0.0.0.0 --timeout 600 podcast_server:app"
```
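For reference, the .azure/config deployment settings file is an INI-style file. A sketch consistent with the commands above might look like the following; the key names follow what az webapp up caches locally, the group/app values just mirror the example command, and the plan name and location are placeholders, so check the example file in the repo for the authoritative version:

```ini
[defaults]
group = mldev
sku = B1
appserviceplan = aifororcas-podcast-plan
location = westus2
web = aifororcas-podcast
```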

This deployment example is loosely based on the Azure App Service Quickstart. We change the startup command to register the different name of our app file, podcast_server.py. (Some more details about the CLI commands used here are at: az-webapp-up, configuring-python-app.)

References

This code uses a fork of audio-annotator for the frontend code. audio-annotator uses wavesurfer.js for rendering and playing audio. Please refer to the respective references for more info on the core functions/classes used in this repo. (Note: the wavesurfer.js version used here is older than the current docs).

Icons used in readme flowcharts were made by Prosymbols from www.flaticon.com.