ictnlp/ComSpeech

Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".

31 / 100

Emerging

ComSpeech translates speech in one language directly into speech in another. It takes an audio recording in a source language and outputs an audio recording of the same content spoken in a target language. It suits researchers and developers working with multilingual audio who need to build high-quality speech-to-speech translation systems without extensive parallel speech datasets.

No commits in the last 6 months.

Use this if you need to translate speech directly from one language to another, especially when you lack a large parallel speech dataset (recordings paired with their spoken translations).

Not ideal if you primarily need text-to-text or speech-to-text translation, or if you require real-time, simultaneous translation capabilities.

speech-translation language-AI audio-processing natural-language-processing multilingual-communication

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 16 / 25

Language: Python

License: none

Higher-rated alternatives

index-tts/index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

stepfun-ai/Step-Audio-EditX

A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

unilight/seq2seq-vc

A sequence-to-sequence voice conversion toolkit.

FireRedTeam/FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System
