ictnlp/LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

47 / 100 (Emerging)

LLaMA-Omni helps you have natural, fast voice conversations with an AI. You speak to the AI, and it quickly understands your speech, generates a text response, and speaks its answer back to you. This is ideal for anyone needing quick, spoken information or interaction, like a customer service agent interacting with a bot or a language learner practicing conversation.

3,128 stars. No commits in the last 6 months.

Use this if you need an AI that can understand spoken questions and respond instantly with both text and high-quality generated speech.

Not ideal if your primary need is for purely text-based AI interaction or if you require an AI for commercial products without obtaining a specific license.

voice-assistants speech-to-text text-to-speech conversational-ai interactive-systems
Stale 6m · No Package · No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25
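
The four category scores listed here add up to the overall figure shown earlier. The page does not state the formula, so treating the total as a plain sum of four 25-point categories is an inference from the numbers, not a documented rule. A minimal sketch of that arithmetic:

```python
# Category scores as listed on this page, each out of 25.
scores = {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 19}

# Assumption: the overall score is the plain sum of the four categories.
total = sum(scores.values())
maximum = 25 * len(scores)

print(f"{total} / {maximum}")  # prints "47 / 100"
```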


Stars: 3,128
Forks: 222
Language: Python
License: Apache-2.0
Last pushed: May 19, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ictnlp/LLaMA-Omni"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
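
The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the URL layout is copied from the curl example above, while the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON (its schema is not documented on this page, so the code does not rely on any field names).

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper mirroring the curl path)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the body is JSON (not confirmed here)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)


# Example usage (performs a network request, counted against the daily quota):
# data = fetch_quality("transformers", "ictnlp", "LLaMA-Omni")
# print(json.dumps(data, indent=2))
```

With 100 unauthenticated requests per day, it is worth caching the result locally instead of calling `fetch_quality` on every run.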