mbzuai-oryx/LLMVoX

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

46 / 100 (Emerging)

LLMVoX helps create highly responsive, voice-based conversational AI systems. It takes text output from any Large Language Model (LLM) or Vision-Language Model and converts it into natural-sounding speech as a low-latency stream, allowing for real-time spoken dialogue. This is ideal for developers building interactive voice agents, virtual assistants, or any application that requires an LLM to "speak" quickly and clearly.

299 stars. No commits in the last 6 months.

Use this if you are a developer looking to integrate high-quality, low-latency streaming speech generation into your Large Language Model applications without needing to fine-tune the LLM itself.

Not ideal if you need a non-streaming, batch text-to-speech solution or if you don't have access to modern GPU hardware.

conversational-ai voice-assistants speech-synthesis LLM-integration real-time-audio
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 299
Forks: 40
Language: Python
License: (not shown)
Last pushed: May 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mbzuai-oryx/LLMVoX"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
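
The same request can be made from Python. The sketch below is a minimal example using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, reading the `transformers` path segment as a category is a guess based on the curl URL, and treating the response as JSON is an assumption, since the response format is not documented on this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    The meaning of the first segment ("transformers" in the example) is
    assumed to be a category; this page does not document it.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for one repository.

    Assumes the endpoint returns a JSON body. Without a key the service
    is limited to 100 requests/day, so cache results where possible.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "mbzuai-oryx", "LLMVoX")
# print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes the path logic easy to check offline and keeps accidental requests from counting against the daily limit.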