NatGr/annotate_audio
Helper scripts to split a large audio file into smaller chunks and annotate these chunks
This tool prepares long audio recordings for speech-to-text (STT) or text-to-speech (TTS) training. It takes a large audio file, automatically splits it into shorter clips, and generates an initial text transcript for each clip. The output is a collection of short audio files with matching text annotations that data scientists or linguists can use to train speech models.
No commits in the last 6 months.
Use this if you have lengthy audio recordings and need to quickly segment them and create initial transcriptions for STT or TTS model development.
Not ideal if you need precise, human-quality transcription with no automated first pass, or if your audio is already split into short clips.
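The splitting step described above can be illustrated with a minimal sketch. This is not the repository's code: it assumes uncompressed WAV input, cuts at fixed time intervals rather than at pauses, and skips the transcription step entirely. The function name `split_wav` and the `chunk_NNNN.wav` naming are hypothetical, chosen only for this example.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=10.0):
    """Split a WAV file into fixed-length chunks and return the written paths.

    Hypothetical helper for illustration; the real tool may choose
    split points differently (for example, at silences).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(params.framerate * chunk_seconds)
        if frames_per_chunk < 1:
            raise ValueError("chunk_seconds is too small for this sample rate")

        index = 0
        while True:
            # readframes returns an empty bytes object at end of file.
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break

            chunk_path = out_dir / f"chunk_{index:04d}.wav"
            with wave.open(str(chunk_path), "wb") as writer:
                # Copy the audio format of the source so chunks stay playable.
                writer.setnchannels(params.nchannels)
                writer.setsampwidth(params.sampwidth)
                writer.setframerate(params.framerate)
                writer.writeframes(frames)

            written.append(chunk_path)
            index += 1

    return written
```

Each returned path could then be passed to whatever speech recognizer produces the first-pass transcript. The final chunk is usually shorter than the others, since the recording length is rarely an exact multiple of `chunk_seconds`.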
Stars: 8
Forks: 1
Language: Python
License: not listed
Category: not listed
Last pushed: Oct 19, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/NatGr/annotate_audio"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
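For scripted access, the curl call above could be reproduced in Python roughly as follows. This is a sketch under assumptions: the path pattern `/quality/<category>/<owner>/<repo>` is inferred from the single example URL, the function names `quality_url` and `fetch_quality` are hypothetical, and the response format is not stated here, so the body is returned as raw text. How an API key would be supplied is also not documented on this page, so the sketch uses keyless access only.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the example curl command; the rest of the
# pattern is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the listing data and return the response body as text."""
    url = quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "NatGr", "annotate_audio")` would request the same URL as the curl example. With the stated limit of 100 keyless requests per day, caching results locally is a sensible default.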
Higher-rated alternatives
Spr-Aachen/Easy-Voice-Toolkit
A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.
PrzemyslawSwiderski/python-gradle-plugin
Gradle plugin to run Python projects.
alphacep/awesome-russian-speech
Russian speech technology links
ftyers/commonvoice-utils
Linguistic processing for Common Voice
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech