ShawnPi233/SynParaSpeech

Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (ICASSP 2026)

/ 100

Emerging

This project helps researchers and developers build realistic speech generation and understanding systems that handle natural human sounds such as laughter, sighs, and gasps. It automatically produces high-quality datasets consisting of audio clips, the corresponding text, and precise timing for these 'paralinguistic' sounds. Its intended users are speech AI developers, voice model trainers, and academic researchers focused on natural language processing.

Use this if you need large-scale, accurately annotated datasets containing paralinguistic sounds to train or evaluate speech synthesis (Text-to-Speech) or speech understanding models.

Not ideal if you need a dataset for general speech recognition without a specific emphasis on paralinguistic events, or if your primary language of interest is not Chinese.

speech-synthesis speech-recognition natural-language-processing voice-AI machine-learning-datasets

No License No Package No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 7 / 25

Community 8 / 25

Language

JavaScript

License

—

Higher-rated alternatives

hetpandya/youtube_tts_data_generator

A python library to generate speech dataset from Youtube videos

IS2AI/Kazakh_TTS

An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...

taresh18/TTSizer

🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨

Hecate2/sukasuka-vocal-dataset-builder

SukaSuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...

youmebangbang/TTS-dataset-tools

Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...
