VITA-MLLM/Freeze-Omni

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

/ 100

Emerging

This project is a speech-to-speech dialogue system for intelligent, near real-time spoken conversation. You speak to it, and it processes your input and generates a spoken response with low latency. It is designed for teams that need a low-latency conversational AI, such as customer service operations or virtual assistant developers.

369 stars. No commits in the last 6 months.

Use this if you need a highly responsive conversational AI that understands spoken language and generates intelligent spoken replies with minimal delay.

Not ideal if your primary need is text-based interaction or if you operate in an environment with poor network connectivity or low-performance hardware.

conversational-ai speech-recognition natural-language-processing virtual-assistants customer-service

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 13 / 25

Stars: 369

Forks:

Language: Python

License: —

Higher-rated alternatives

KimMeen/Time-LLM

[ICLR 2024] Official implementation of " 🦙 Time-LLM: Time Series Forecasting by Reprogramming...

om-ai-lab/VLM-R1

Solve Visual Understanding with Reinforced VLMs

bytedance/SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

NVlabs/OmniVinci

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

fixie-ai/ultravox

A fast multimodal LLM for real-time voice
