ShawnPi233/SynParaSpeech
Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (ICASSP 2026)
This project helps researchers and developers build speech generation and understanding systems that handle natural human sounds such as laughter, sighs, and gasps. It automatically produces high-quality datasets consisting of audio clips, the corresponding text, and precise timing for these 'paralinguistic' sounds. It is aimed at speech AI developers, voice model trainers, and academic researchers focused on natural language processing.
Use this if you need large-scale, accurately annotated datasets containing paralinguistic sounds to train or evaluate speech synthesis (Text-to-Speech) or speech understanding models.
Not ideal if you need a dataset for general speech recognition with no specific emphasis on paralinguistic events, or if your primary language of interest is not Chinese.
Stars: 66
Forks: 4
Language: JavaScript
License: not listed
Category: not listed
Last pushed: Jan 27, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ShawnPi233/SynParaSpeech"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
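The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical. Reading the URL path as category, owner, and repository is an assumption based on the curl example above. Because the response format is not documented here, the body is returned as plain text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repo.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the listing data and return the raw response body as text.

    No API key is sent, so the keyless daily limit stated above applies.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_quality("voice-ai", "ShawnPi233", "SynParaSpeech")` issues the same GET request as the curl command. Each call counts against the daily request limit.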
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A Python library to generate speech datasets from YouTube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
taresh18/TTSizer
Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets
Hecate2/sukasuka-vocal-dataset-builder
1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...