jonatasgrosman/huggingsound

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

52 / 100
Established

This toolkit helps you convert spoken audio recordings into written text. You provide audio files (like MP3s or WAVs) and receive the transcription, including timestamps for each word. It's designed for researchers or data analysts who need to quickly process and analyze spoken content.
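As a rough illustration of that workflow, here is a minimal sketch. It assumes the package is installed from PyPI and exposes `SpeechRecognitionModel` with a `transcribe` method that takes a list of file paths. The model ID and the result keys (`transcription`, `start_timestamps`, `end_timestamps`) are assumptions that may vary by version, and `summarize` is a hypothetical helper written for this example, not part of the toolkit.

```python
from typing import Dict, List


def transcribe_files(
    audio_paths: List[str],
    model_id: str = "jonatasgrosman/wav2vec2-large-xlsr-53-english",  # assumed model ID
) -> List[Dict]:
    """Transcribe audio files with HuggingSound (needs `pip install huggingsound`)."""
    # Imported lazily so summarize() below works without the library installed.
    from huggingsound import SpeechRecognitionModel

    model = SpeechRecognitionModel(model_id)
    return model.transcribe(audio_paths)


def summarize(results: List[Dict]) -> List[str]:
    """Hypothetical helper: render each result as 'text [first_start-last_end]'."""
    lines = []
    for item in results:
        text = item.get("transcription", "")
        # Key names are assumed; missing or empty timestamps fall back to text only.
        starts = item.get("start_timestamps") or []
        ends = item.get("end_timestamps") or []
        if starts and ends:
            lines.append(f"{text} [{starts[0]}-{ends[-1]}]")
        else:
            lines.append(text)
    return lines
```

Calling `summarize(transcribe_files(["/path/to/recording.wav"]))` would download the model on first use, so expect the first run to take a while.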

470 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to accurately convert audio speech into text, evaluate the performance of transcription models, or fine-tune an existing model with your own specific audio data.

Not ideal if you require advanced natural language understanding beyond basic transcription, or if you're not comfortable working with Python code, since the toolkit is used as a Python library.

audio-transcription speech-to-text voice-data-analysis linguistic-research audio-content-processing
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 17 / 25

Stars: 470
Forks: 46
Language: Python
License: MIT
Last pushed: Sep 20, 2023
Commits (30d): 0
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jonatasgrosman/huggingsound"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
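The same request can be made from Python with only the standard library. This is a sketch under stated assumptions: the path layout (category, owner, repository) is inferred from the single URL shown above, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are hypothetical names chosen for this example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl command above."""
    # Percent-encode each path segment; the three-part layout is an assumption.
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> dict:
    """GET the endpoint and decode the body as JSON (response shape is assumed)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For this project the call would be `fetch_quality("voice-ai", "jonatasgrosman", "huggingsound")`. Unkeyed requests count against the 100/day limit, so cache the result if you poll it regularly.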