shenasa-ai/speech2text
A Deep-Learning-Based Persian Speech Recognition System
This project provides tools and datasets for converting spoken Persian into written text. It targets data scientists and machine learning engineers working with Persian audio, and includes both the code for an Automatic Speech Recognition (ASR) system and large datasets of transcribed Persian speech. Given audio files, the system outputs the corresponding text for use in downstream applications.
234 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer or data scientist looking to build or train a Persian speech-to-text system, and you need data or a starting point for implementation.
Not ideal if you are an end user who simply needs to transcribe audio without deep machine learning knowledge, or if you need a ready-to-use, commercial-grade ASR API.
Stars: 234
Forks: 33
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: May 22, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/shenasa-ai/speech2text"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
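The same request can be made from Python. The sketch below is a minimal illustration using only the standard library. It assumes the endpoint path follows the category/owner/repo pattern seen in the curl example and that the response body is JSON; neither is confirmed by this page, and the function names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the curl example.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON. If it does not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("voice-ai", "shenasa-ai", "speech2text"))
```

Keeping URL construction separate from the network call makes it possible to check the request target without consuming any of the daily request quota.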
Higher-rated alternatives
julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
rolczynski/Automatic-Speech-Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
tabahi/formantfeatures
Extract frequency, power, width and dissonance of formants from wav files
libdriver/ld3320
LD3320 full-featured driver library for general-purpose MCU and Linux.
awsaf49/audio_classification_models
Tensorflow Audio Classification Models