jonatasgrosman/huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
This toolkit helps you convert spoken audio recordings into written text. You provide audio files (like MP3s or WAVs) and receive the transcription, including timestamps for each word. It's designed for researchers or data analysts who need to quickly process and analyze spoken content.
470 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to accurately convert audio speech into text, evaluate the performance of transcription models, or fine-tune an existing model with your own specific audio data.
Not ideal if you require advanced natural language understanding beyond basic transcription, or if you're not comfortable writing Python code, since the toolkit is used as a Python library.
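The transcription workflow described above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's reference usage: it assumes the package is installed from PyPI as `huggingsound`, that `SpeechRecognitionModel` and its `transcribe` method behave as in the project's README, and that each result is a dict with a `"transcription"` key. The model ID and the audio paths are placeholders chosen for illustration.

```python
def transcribe_files(audio_paths, model_id="jonatasgrosman/wav2vec2-large-xlsr-53-english"):
    """Transcribe audio files with HuggingSound and return its result dicts.

    Assumption: SpeechRecognitionModel(model_id).transcribe(paths) returns one
    dict per input file, as described in the project's README.
    """
    # Imported lazily so this module still loads when huggingsound is absent.
    from huggingsound import SpeechRecognitionModel

    model = SpeechRecognitionModel(model_id)  # fetches the checkpoint on first use
    return model.transcribe(audio_paths)


def texts_only(results):
    """Return just the text of each result (assumes a "transcription" key)."""
    return [result["transcription"] for result in results]


# Example call with hypothetical file paths:
#   results = transcribe_files(["/path/to/interview.mp3", "/path/to/lecture.wav"])
#   for text in texts_only(results):
#       print(text)
```

Keeping the model call separate from the result handling makes it possible to check the post-processing without downloading a checkpoint.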
Stars: 470
Forks: 46
Language: Python
License: MIT
Category:
Last pushed: Sep 20, 2023
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jonatasgrosman/huggingsound"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
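The same request as the curl command above can be made from Python with only the standard library. This is a sketch under stated assumptions: the path segments are read as category, owner, and repository purely from the shape of the example URL, and the response is assumed to be JSON, which the page does not state. The function names are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL (segment meanings are inferred, not documented)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. No API key is sent, so the
    request counts against the keyless daily limit mentioned above.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request):
#   data = fetch_quality("voice-ai", "jonatasgrosman", "huggingsound")
```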
Related tools
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
MattyB95/Jabberjay
🦜 Synthetic Voice Detection
notAI-tech/IndicASR
Speech Recognition for Indic languages.
henilp105/TeluguASR
Telugu ASR model trained on IIIT Hyderabad ASR Challenge dataset and OpenSLR66 dataset
FernandoLpz/SpeechRecognition
This repository contains the implementation of an Automatic Speech Recognition system in python,...