FluidInference/FluidAudio
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
This project helps Apple app developers integrate audio AI features directly into their macOS and iOS applications. Given audio input, it can transcribe speech to text, detect voice activity, and distinguish between speakers; given text input, it can synthesize spoken audio. All processing runs on the device itself, so developers can add voice features to their products without sending audio to a server.
1,689 stars. Actively maintained with 98 commits in the last 30 days.
Use this if you are an Apple app developer looking to add fast, private, on-device speech-to-text, text-to-speech, voice activity detection, or speaker diarization to your macOS or iOS application.
Not ideal if you need a cloud-based audio processing solution or are developing for platforms other than Apple devices.
Stars: 1,689
Forks: 214
Language: Swift
License: Apache-2.0
Last pushed: Mar 18, 2026
Commits (30d): 98
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FluidInference/FluidAudio"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
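The curl call above can also be made from a script. The following is a minimal sketch in Python, assuming the URL pattern is base path, then category, owner, and repository, as the single example suggests. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response format is not documented on this page, so the body is returned as raw text rather than parsed. How an API key is supplied is also not stated here, so the sketch only covers the keyless tier.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path is /<category>/<owner>/<repo>, inferred from the
    one example URL shown on the page.
    """
    # Percent-encode each segment so unusual characters stay inside it.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the record and return the raw body as text (hypothetical helper).

    The response format is not documented here, so nothing is parsed.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Prints the same URL used in the curl example; no network access needed.
    print(build_quality_url("voice-ai", "FluidInference", "FluidAudio"))
```

With the keyless limit of 100 requests/day, it is worth caching the result of `fetch_quality` rather than calling it on every run.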
Related tools
k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn...
phuc-nt/my-translator
Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only
pot-app/pot-desktop
🌈 A cross-platform app for translating selected text and for OCR (text recognition).
Blaizzy/mlx-audio-swift
A modular Swift SDK for audio processing with MLX on Apple Silicon
soniqo/speech-swift
AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered...