k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

Established

This project helps you add real-time voice features to phones, tablets, and small computers such as the Raspberry Pi, without an internet connection. It converts spoken audio to text as it arrives, detects when someone is speaking, and can generate speech from text. It is aimed at developers building offline voice assistants, accessibility tools, or interactive voice applications across a wide range of hardware.

1,648 stars. Available on PyPI.

Use this if you need to integrate robust, on-device speech recognition, voice activity detection, or text-to-speech capabilities into your application without relying on cloud services.

Not ideal if your application requires highly specialized custom speech models that need extensive online training, or if it depends on very large language models.

voice-user-interface edge-ai mobile-app-development accessibility-tech offline-processing

Maintenance 6 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 21 / 25

Stars 1,648

Forks 210

Language C++

License Apache-2.0

Related tools

FluidInference/FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity...

phuc-nt/my-translator

Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only

pot-app/pot-desktop

🌈 A cross-platform application for translating selected text and for OCR.

Blaizzy/mlx-audio-swift

A modular Swift SDK for audio processing with MLX on Apple Silicon

soniqo/speech-swift

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered...
