Visualization Tools for Speaker Diarization

Introduction

No robust tool currently exists for visualizing speaker diarization, even though visualization is critical for analyzing datasets and algorithm outputs. This repository offers intuitive ways to display speaker diarization results, using Praat for audio-only data and VIA for audio-visual data. The key criterion in choosing this software was support for interactive operation. These tools still have room for improvement, but they are the best options available at present.

Go to: Visualization tool for Audio-only datasets

Go to: Visualization tool for Audio-visual datasets

<p id="anchor_ao"></p>

Visualization for Audio-only datasets

Step 1: Generating Praat format:

python audio_visualized.py -rttm audio_cases/afjiv.rttm -audio_path audio_cases/afjiv.wav -praat_result audio_cases/afjiv.txt

(Example is from VoxConverse)
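For readers unfamiliar with what the conversion in Step 1 involves, here is a minimal sketch of the general idea. It is not the code in audio_visualized.py. It assumes standard RTTM `SPEAKER` records (onset in the fourth field, duration in the fifth, speaker name in the eighth) and builds, for each speaker, a gap-free list of intervals of the kind a Praat interval tier expects. The names `parse_rttm` and `to_tier`, the sample records, and the 5-second total duration are all hypothetical.

```python
from collections import defaultdict


def parse_rttm(lines):
    """Group RTTM SPEAKER records into {speaker: [(start, end), ...]}."""
    segments = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Assumed layout: type file chan onset duration ortho stype name conf slat
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        onset, duration = float(fields[3]), float(fields[4])
        segments[fields[7]].append((onset, onset + duration))
    return dict(segments)


def to_tier(segments, total_duration, label):
    """Turn one speaker's segments into contiguous (start, end, text) intervals."""
    # Merge overlapping or touching segments of the same speaker.
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Fill the silence between segments with empty-text intervals.
    tier, cursor = [], 0.0
    for start, end in merged:
        if start > cursor:
            tier.append((cursor, start, ""))
        tier.append((start, end, label))
        cursor = end
    if cursor < total_duration:
        tier.append((cursor, total_duration, ""))
    return tier


# Hypothetical sample records, not taken from the VoxConverse file.
rttm = [
    "SPEAKER afjiv 1 0.50 1.50 <NA> <NA> spk00 <NA> <NA>",
    "SPEAKER afjiv 1 1.00 2.00 <NA> <NA> spk01 <NA> <NA>",
    "SPEAKER afjiv 1 4.00 1.00 <NA> <NA> spk00 <NA> <NA>",
]
by_speaker = parse_rttm(rttm)
tiers = {spk: to_tier(segs, 5.0, spk) for spk, segs in by_speaker.items()}
```

Keeping one tier per speaker means overlapping speech from different speakers stays visible on separate timelines instead of being collapsed into a single track.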

Step 2: Import praat_result into Praat:

Step 3: Overview

Scroll horizontally to move through the audio. Speaker labels (e.g., spk00, spk01, ...) are shown on each timeline.

Some useful shortcuts:

<p id="anchor_av"></p>

Visualization for Audio-visual datasets

Step 1: Generating VIA format

python audio_visual_visualized.py -rttm audio_visual_cases/00115.rttm -mp4_path audio_visual_cases/00115.mp4 -via_json_result audio_visual_cases/00115.json

(Example is from MSDWild)

If the video cannot be previewed, or is slow to preview, try converting it to an MP4 format that HTML5 supports:

ffmpeg -i original.mp4 -vcodec libx264 -acodec aac -preset fast -movflags +faststart previewed.mp4
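When many videos need the same conversion, the command above can be wrapped in a short script. The following is a sketch only, assuming ffmpeg is on the PATH. The helper names `ffmpeg_command` and `convert_all`, and any directory names passed to them, are hypothetical and not part of this repository. The ffmpeg flags are exactly those shown above.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(src, dst):
    """Build the same ffmpeg invocation as above for one input/output pair."""
    return [
        "ffmpeg", "-i", str(src),
        "-vcodec", "libx264", "-acodec", "aac",
        "-preset", "fast", "-movflags", "+faststart",
        str(dst),
    ]


def convert_all(in_dir, out_dir, dry_run=True):
    """Convert every .mp4 in in_dir; with dry_run=True only return the commands."""
    out = Path(out_dir)
    commands = []
    for src in sorted(Path(in_dir).glob("*.mp4")):
        cmd = ffmpeg_command(src, out / src.name)
        commands.append(cmd)
        if not dry_run:
            out.mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if ffmpeg exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_all` with `dry_run=True` first makes it possible to inspect the generated commands before any file is written.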

Step 2: Import the generated JSON file (the -via_json_result output) into the VIA tool

Step 3: Overview

Press the Space key to play or pause the media.

More keys can be found on:

<img src='imgs/via_shortcut.png' width=20% />

References