Home

Awesome

Voice Activity Detection for Javascript

npm vad-web npm vad-react

Run callbacks on segments of audio with user speech in a few lines of code

This package aims to provide an accurate, user-friendly voice activity detector (VAD) that runs in the browser. By using this package, you can prompt the user for microphone permissions, start recording audio, send segments of audio with speech to your server for processing, or show a certain animation or indicator when the user is speaking. Note that I have decided discontinue node support in order to focus on the browser use case.

Under the hood, these packages run Silero VAD [1] using ONNX Runtime Web / ONNX Runtime Node.js. Thanks a lot to those folks for making this possible.

Sponsorship

Please contribute to the project financially - especially if your commercial product relies on this package. Become a Sponsor

Important update about node support - Oct 2024

I am going to wind down support for ricky0123/vad-node, the voice activity detection package for server-side node environments. I do not plan to publish any updates to the node package from here on out. I made this decision for the following reasons:

Quick Start

To use the VAD via a script tag in the browser, include the following script tags:

<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.2/dist/ort.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@ricky0123/vad-web@0.0.19/dist/bundle.min.js"></script>
<script>
  async function main() {
    const myvad = await vad.MicVAD.new({
      onSpeechStart: () => {
        console.log("Speech start detected")
      },
      onSpeechEnd: (audio) => {
        // do something with `audio` (Float32Array of audio samples at sample rate 16000)...
      }
    })
    myvad.start()
  }
  main()
</script>

Documentation for bundling the voice activity detector for the browser or using it in node or React projects can be found on vad.ricky0123.com.

References

<a id="1">[1]</a> Silero Team. (2021). Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier. GitHub, GitHub repository, https://github.com/snakers4/silero-vad, hello@silero.ai.