vasistalodagala/whisper-finetune

Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.

49 / 100 (Emerging)

This project helps machine learning engineers and researchers improve ASR performance for specific languages or accents. It takes audio recordings and their human-made transcripts as input and fine-tunes a pretrained Whisper model on them. The output is a specialized ASR model that is more accurate on your own audio data.

361 stars. No commits in the last 6 months.

Use this if you need an ASR model that performs exceptionally well on audio data with characteristics not fully captured by general-purpose models.

Not ideal if you simply need to transcribe general audio with an off-the-shelf ASR model, or if you don't have labeled audio-text pairs for fine-tuning.

speech-to-text voice-recognition audio-transcription natural-language-processing machine-learning-engineering
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 23 / 25
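
The four category scores listed above add up to the overall 49 / 100 shown at the top of the page. The short sketch below checks that arithmetic. It assumes the overall score is a plain sum of the four categories, which matches the numbers here but is not stated on the page; the variable names are illustrative only.

```python
# Category scores as listed on this page; each is out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 23,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
max_total = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {max_total}")  # prints "49 / 100"
```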

Stars: 361
Forks: 87
Language: Python
License: MIT
Last pushed: May 23, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/vasistalodagala/whisper-finetune"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
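
The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path pattern is inferred from the single URL above, and the response format is not documented here, so the body is returned as plain text.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Join the three path segments onto the base URL, percent-encoding each one."""
    segments = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/{'/'.join(segments)}"


def fetch_quality(url: str, timeout: float = 10.0) -> str:
    """Send a GET request and return the response body as text (format not assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_quality_url("voice-ai", "vasistalodagala", "whisper-finetune")
print(url)
# body = fetch_quality(url)  # uncomment to send a real request (counts toward the daily limit)
```

The network call is left commented out so that running the snippet does not use up any of the 100 keyless requests per day.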