Umbaji/NMTMD
Official repository for the open-source text dataset for NMT for local languages in West Africa (Ewe Corpus)
This project provides essential language resources for developing translation and speech recognition tools for Ewe, a West African language. It takes existing Ewe-English dictionaries and presents them in a readily usable digital format. Language technologists and researchers focusing on West African languages would use this to build new applications.
No commits in the last 6 months.
Use this if you are a language technologist or researcher working to develop machine translation or speech recognition systems for the Ewe language.
Not ideal if you are looking for a ready-to-use translation application; this project provides foundational data for building such tools.
Stars: 26
Forks: 10
Language: —
License: MIT
Last pushed: Oct 22, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Umbaji/NMTMD"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
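The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the endpoint returns JSON, which this page does not state. The function names `build_quality_url` and `fetch_quality`, and the parameter name `category` for the `voice-ai` path segment, are hypothetical labels chosen for illustration. API-key handling is left out because the page does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL following the pattern in the curl command.

    The name "category" for the first path segment is an assumption.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (performs a network request).

    Assumption: the response body is JSON; json.loads raises ValueError
    if it is not.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to hit the API.
    print(build_quality_url("voice-ai", "Umbaji", "NMTMD"))
```

Keeping URL construction separate from the network call makes it easy to check the address without spending any of the 100 keyless requests per day.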
Higher-rated alternatives
ynop/audiomate
Python library for handling audio datasets.
reazon-research/ReazonSpeech
Massive open Japanese speech corpus
common-voice/cv-dataset
Metadata and versioning details for the Common Voice dataset
davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset...
EgorLakomkin/KTSpeechCrawler
Automatically constructing corpus for automatic speech recognition from YouTube videos