ictnlp/ComSpeech
Code for the ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
ComSpeech performs direct speech-to-speech translation: it takes an audio recording in a source language and outputs audio of the same speech in a target language. It is aimed at researchers and developers who work with multilingual audio and want to build high-quality speech-to-speech translation systems without extensive parallel speech datasets.
No commits in the last 6 months.
Use this if you need to translate speech directly from one language to another, especially when you don't have a large parallel speech dataset (source-language audio paired with its target-language audio).
Not ideal if you primarily need text-to-text or speech-to-text translation, or if you require real-time, simultaneous translation capabilities.
Stars: 26
Forks: 6
Language: Python
License: (none listed)
Category:
Last pushed: Jul 02, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ictnlp/ComSpeech"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
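The same request can be made from Python. The sketch below is a minimal example using only the standard library. It assumes the endpoint returns a JSON body (the response schema is not documented here) and that other projects follow the same `category/owner/repo` path pattern as the curl URL above. The helper names `quality_url` and `fetch_quality` are hypothetical, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path pattern is <category>/<owner>/<repo>, as in the
    one URL shown on this page.
    """
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it.

    Assumes the response body is JSON; no API key is sent, so the
    keyless daily request limit applies.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url("voice-ai", "ictnlp", "ComSpeech"))
```

Calling `fetch_quality("voice-ai", "ictnlp", "ComSpeech")` performs the actual request and counts against the daily limit.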
Higher-rated alternatives
index-tts/index-tts: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
stepfun-ai/Step-Audio-EditX: A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
lucasnewman/f5-tts-mlx: Implementation of F5-TTS in MLX
unilight/seq2seq-vc: A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS: An Open-Sourced LLM-empowered Foundation TTS System