i4Ds/whisper-prep

Data preparation utility for the finetuning of OpenAI's Whisper model.


Emerging

This tool helps AI engineers, machine learning practitioners, and data scientists prepare audio and transcript data for fine-tuning speech-to-text models such as OpenAI's Whisper. It takes raw audio files, sentence-level datasets, or existing SRT/VTT transcripts and outputs segmented, cleaned, and formatted datasets ready for fine-tuning. Cleaner, consistently formatted training data generally leads to better-performing transcription models.

Use this if you need to create a high-quality, normalized dataset of audio and text for training an automatic speech recognition (ASR) model, especially when working with varied raw audio sources or sentence-level datasets.

Not ideal if you simply need to transcribe a single audio file or process a small number of transcripts without the intent of training or fine-tuning a machine learning model.
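To make the SRT-to-segments step described above more concrete, here is a minimal sketch of the general idea. It is not whisper-prep's own code or API: the names `Cue`, `parse_srt`, and `group_cues` are hypothetical, and the 30-second limit is an assumption based on the fixed 30-second audio window that Whisper models process.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # cue start time in seconds
    end: float    # cue end time in seconds
    text: str


# Matches an SRT/VTT-style timing line, e.g. "00:00:12,500 --> 00:00:28,000".
_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text):
    """Parse SRT text into a list of Cue objects (hypothetical helper)."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, line in enumerate(lines):
            match = _TIME.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


def group_cues(cues, max_seconds=30.0):
    """Greedily merge consecutive cues while the merged span stays within
    max_seconds. A single cue longer than max_seconds is kept as its own
    (over-long) segment rather than being split."""
    windows, current = [], []
    for cue in cues:
        if current and cue.end - current[0].start > max_seconds:
            windows.append(current)
            current = []
        current.append(cue)
    if current:
        windows.append(current)
    return [
        {"start": w[0].start, "end": w[-1].end, "text": " ".join(c.text for c in w)}
        for w in windows
    ]


SAMPLE = """1
00:00:00,000 --> 00:00:12,500
Hello and welcome.

2
00:00:12,500 --> 00:00:28,000
Today we talk about speech data.

3
00:00:28,000 --> 00:00:41,250
Let's get started.
"""

segments = group_cues(parse_srt(SAMPLE))
for seg in segments:
    print(seg)
```

With the sample above, the first two cues fit in one window and the third starts a new one. A real pipeline would also cut the audio at the segment boundaries and normalize the text, which this sketch leaves out.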

speech-to-text-training audio-data-preparation machine-learning-engineering natural-language-processing dataset-curation

No Package No Dependents

Maintenance 10 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 7 / 25

Language: Python

License: MIT

Higher-rated alternatives

AbdullahHendy/live-translation

Real-time speech-to-text translation over WebSocket. Streams Opus or raw PCM audio from client...

i4Ds/whisper-finetune

This repository contains code for fine-tuning the Whisper speech-to-text model.

512z/podlens

Free Podwise: AI Podcast & YouTube Transcription & Understanding Agent | Podcast + YouTube transcription, learning, and visualization AI tool

Gr122lyBr/voicetag

Speaker identification powered by pyannote and resemblyzer

aws-solutions/content-localization-on-aws

Automatically generate multi-language subtitles using AWS AI/ML services. Machine generated...
