jtkim-kaist/VAD

Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM-, and ACAM-based VAD models. A dataset recorded by the authors is also provided.

43 / 100

Emerging

This toolkit helps signal processing researchers and audio engineers identify when speech is present in an audio recording and distinguish it from background noise or silence. It takes raw audio recordings as input and outputs timestamps or labels that mark the speech segments. It suits anyone working with spoken language data where speech detection feeds further analysis or processing.
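To illustrate the two output forms mentioned above (labels and timestamps), here is a minimal Python sketch that merges per-frame speech labels into timestamped segments. It is not part of the toolkit and does not use its API: the function name `labels_to_segments`, the 0/1 label convention, and the 10 ms frame hop are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def labels_to_segments(labels: List[int], hop_ms: int = 10) -> List[Tuple[int, int]]:
    """Merge runs of speech frames (label 1) into (start_ms, end_ms) pairs.

    Hypothetical helper: assumes one label per frame and a fixed hop
    between frames, given in milliseconds.
    """
    segments = []
    start = None  # index of the first frame in the current speech run
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i  # a speech run begins at this frame
        elif label != 1 and start is not None:
            segments.append((start * hop_ms, i * hop_ms))  # run ended before frame i
            start = None
    if start is not None:
        # The recording ended while speech was still active.
        segments.append((start * hop_ms, len(labels) * hop_ms))
    return segments
```

For example, `labels_to_segments([0, 0, 1, 1, 1, 0, 1, 1])` returns `[(20, 50), (60, 80)]`, meaning speech from 20 ms to 50 ms and from 60 ms to 80 ms.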

869 stars. No commits in the last 6 months.

Use this if you need to reliably separate speech from non-speech in noisy real-world audio, especially for research or advanced audio processing applications.

Not ideal if you're looking for a simple, off-the-shelf voice recorder or transcription service without needing to understand the underlying speech detection models.

speech-processing audio-analysis signal-processing noise-reduction speech-research

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 25 / 25


Stars: 869

Forks: 233

Language: MATLAB

License: none

Higher-rated alternatives

FluidInference/FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity...

k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn...

phuc-nt/my-translator

Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only

pot-app/pot-desktop

🌈 A cross-platform application for selected-text translation and OCR.

Blaizzy/mlx-audio-swift

A modular Swift SDK for audio processing with MLX on Apple Silicon
