abinashmeher999/voice-data-extract

A command-line interface that combines text from subtitles with the voice data in a video. It provides a convenient way to generate training data for speech recognition.


Emerging

This tool helps speech recognition engineers create high-quality audio datasets for training machine learning models. It takes a video file and its corresponding subtitle file as input, and outputs precisely clipped audio files for each subtitle line. Each audio clip has the subtitle text embedded within it, making it easy to build datasets for training new speech recognition systems.
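The subtitle-driven clipping described above can be sketched roughly as follows. This is an illustrative sketch only, not the tool's actual code: the names `Cue`, `parse_srt`, and `ffmpeg_command` are hypothetical, the subtitle file is assumed to be in SRT format, and `ffmpeg` is assumed as the clipping backend. How the tool itself embeds the subtitle text is not stated here, so the sketch stands in a `title` metadata tag for that step.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    """One subtitle line (hypothetical helper type for this sketch)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as '00:01:02,500' to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(srt_text: str) -> List[Cue]:
    """Parse SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                # Drop any positioning info that may follow the end time.
                end = end.split()[0]
                # Join multi-line subtitles into a single transcript line.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_to_seconds(start), _to_seconds(end), text))
                break
    return cues


def ffmpeg_command(video: str, cue: Cue, out_path: str) -> List[str]:
    """Build (but do not run) an ffmpeg call that clips one cue's audio."""
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-ss", f"{cue.start:.3f}",   # clip start
        "-to", f"{cue.end:.3f}",     # clip end
        "-vn",                       # drop the video stream, keep audio
        "-metadata", f"title={cue.text}",  # assumed way to attach the text
        out_path,
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg` is available on the PATH, producing one audio file per subtitle line.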

No commits in the last 6 months.

Use this if you need to quickly generate labeled audio training data for speech recognition models from existing videos with subtitles.

Not ideal if you're looking for a solution that automatically handles complex audio cleaning or speaker diarization.

speech-recognition machine-learning-training audio-dataset-creation voice-data-annotation natural-language-processing

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 15 / 25


Language: Python

License: MIT

Higher-rated alternatives

speechmatics/speechmatics-python

Python library and CLI for Speechmatics

gooofy/py-nltools

A collection of basic python modules for spoken natural language processing

IBM/MAX-Speech-to-Text-Converter

Converts spoken words into text form.

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition,...

snakers4/open_stt

Open STT
