nonverbalspeech38k/nonverspeech38k
The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understanding”.
This project helps researchers and developers who work with audio data generate and understand non-verbal speech. It provides a dataset and tools for synthesizing new non-verbal vocalizations and analyzing existing ones. Typical outputs are synthesized non-verbal cues for virtual assistants or analyses of patterns in human non-verbal communication.
Use this if you need to generate or analyze non-verbal vocalizations like gasps, laughs, or sighs for applications such as virtual assistants, character animation, or communication research.
Not ideal if your primary focus is on understanding or generating spoken language, as this project specifically targets non-verbal vocal cues.
Stars: 63
Forks: 2
Language: HTML
License: not listed
Category:
Last pushed: Dec 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/nonverbalspeech38k/nonverspeech38k"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
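The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl command, using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the split of the path into category, owner, and repository is inferred from the example URL, and the assumption that the response is JSON is not confirmed by this page, so the code falls back to returning the raw text.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is /<category>/<owner>/<repo>, inferred from
    the single example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(url: str, timeout: float = 10.0):
    """Fetch the endpoint without an API key.

    Assumption: the body is JSON. The response format is not documented
    here, so the raw text is returned if JSON parsing fails.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


if __name__ == "__main__":
    url = build_quality_url("voice-ai", "nonverbalspeech38k", "nonverspeech38k")
    print(url)
    # Uncomment to perform the actual request (counts against the daily limit):
    # print(fetch_quality(url))
```

Because unauthenticated access is limited to 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` repeatedly.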
Higher-rated alternatives
hetpandya/youtube_tts_data_generator
A Python library to generate a speech dataset from YouTube videos
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis...
taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
Hecate2/sukasuka-vocal-dataset-builder
SukaSuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from video based on .ass...
youmebangbang/TTS-dataset-tools
Automatically generates TTS dataset using audio and associated text. Make cuts under a custom...