k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.
This project lets you add real-time voice features to phones, tablets, and small single-board computers without an internet connection. It converts spoken audio to text as it arrives, detects when someone is speaking, and can generate speech from text. It suits developers building offline voice assistants, accessibility tools, or interactive voice applications across a wide range of hardware.
1,648 stars. Available on PyPI.
Use this if you need to integrate robust, on-device speech recognition, voice activity detection, or text-to-speech capabilities into your application without relying on cloud services.
Not ideal if your application requires highly specialized, custom speech models that need extensive online training or very large language models.
Stars: 1,648
Forks: 210
Language: C++
License: Apache-2.0
Category:
Last pushed: Oct 20, 2025
Commits (30d): 0
Dependencies: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/k2-fsa/sherpa-ncnn"
Open to everyone: up to 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
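The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and treating the `voice-ai` path segment as a category slug is an assumption based on the single URL shown. The response schema is not documented here, so the sketch returns the raw body rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "voice-ai") -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path segment before the owner ("voice-ai") is a
    category slug; only the single URL shown above is confirmed.
    """
    # Percent-encode each path segment so unusual names stay valid.
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Return the raw response body as text (the schema is not documented here)."""
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("k2-fsa", "sherpa-ncnn")` would request the same URL as the curl command. Each call counts against the daily request limit, so caching the result locally is advisable.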
Related tools
FluidInference/FluidAudio
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity...
phuc-nt/my-translator
Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only
pot-app/pot-desktop
🌈 A cross-platform application for selected-text translation and OCR.
Blaizzy/mlx-audio-swift
A modular Swift SDK for audio processing with MLX on Apple Silicon
soniqo/speech-swift
AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered...