jzmzhong/Automatic-Prosody-Annotator-with-SSWP-CLAP
An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS).
This tool helps Text-to-Speech (TTS) developers and researchers automatically mark prosodic boundaries in speech data. It takes aligned text and audio features as input and outputs annotations marking the boundaries of prosodic words and prosodic phrases. Boundary annotations of this kind are essential for creating more natural and controllable synthetic speech.
No commits in the last 6 months.
Use this if you are a Text-to-Speech developer or researcher who needs to automatically annotate large amounts of speech data with prosodic boundaries to improve speech synthesis quality.
Not ideal if you need a general-purpose audio transcription or natural language processing tool rather than one focused on prosody for TTS.
Stars: 51
Forks: 1
Language: Python
License: Apache-2.0
Category
Last pushed: Jun 11, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jzmzhong/Automatic-Prosody-Annotator-with-SSWP-CLAP"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
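The same request can be made from Python. The sketch below uses only the standard library and rests on several assumptions not stated on this page: that the path after `/quality/` follows a category/owner/repo layout (inferred from the single URL above), and that the response body is JSON. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client, and how an API key would be supplied is not shown here because the page does not say.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each segment so unusual characters cannot alter the path.
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + path


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming it is JSON."""
    request = Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url(
        "voice-ai",
        "jzmzhong",
        "Automatic-Prosody-Annotator-with-SSWP-CLAP",
    ))
```

With the keyless limit of 100 requests/day, it is worth caching the result of `fetch_quality` locally rather than calling it on every run.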
Higher-rated alternatives
index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
stepfun-ai/Step-Audio-EditX
A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
unilight/seq2seq-vc
A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System