FluidAudio and speech-swift
These are direct competitors offering overlapping functionality (ASR, TTS, VAD, diarization) for on-device speech processing on Apple platforms. FluidAudio is the more mature project (more GitHub stars), while speech-swift covers a broader feature set (adding speech-to-speech); both emphasize local ML inference.
About FluidAudio
FluidInference/FluidAudio
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
This project helps Apple app developers integrate audio AI features directly into their macOS and iOS applications. It takes raw audio as input and can output transcribed text, detect voice activity, distinguish speakers, or synthesize spoken audio from text, all running on the device itself. App developers can use it to add robust voice capabilities to their products.
About speech-swift
soniqo/speech-swift
AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML
This project offers a collection of AI speech models that run directly on an Apple Silicon Mac or iOS device, without needing internet access. It can turn spoken words into text, generate natural-sounding speech from text, and analyze audio to determine who spoke when. This makes it well suited to developers building privacy-focused audio features for Apple users.