asmelashteka/HornMT

Machine translation (MT) benchmark dataset for languages in the Horn of Africa.

Score: 34 / 100 (Emerging)

This project provides a collection of news snippets translated across several languages spoken in the Horn of Africa, together with English. The parallel text is available in formats such as plain text, Excel, and JSON, and each snippet carries metadata such as its category, source, and publication date. It is aimed at researchers, language service providers, and AI developers working on machine translation for languages such as Amharic, Oromo, Somali, and Tigrinya.
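As a rough illustration of working with the JSON form of the data, the sketch below pulls aligned sentence pairs out of a list of records. The record layout, the field names (`category`, `source`, `date`, `translations`), and the language codes are assumptions made for this example, not the repository's documented schema, and the sample values are placeholders rather than real data.

```python
import json

# Hypothetical record layout: every field name and language code here is an
# assumption for illustration, not the repository's documented schema.
SAMPLE_JSON = """
[
  {
    "category": "<category>",
    "source": "<source>",
    "date": "<publication date>",
    "translations": {
      "eng": "<English snippet>",
      "amh": "<Amharic snippet>",
      "som": "<Somali snippet>"
    }
  }
]
"""


def extract_pairs(records, src, tgt):
    """Return (source_text, target_text) pairs for records that contain both languages."""
    pairs = []
    for record in records:
        texts = record.get("translations", {})
        if src in texts and tgt in texts:
            pairs.append((texts[src], texts[tgt]))
    return pairs


records = json.loads(SAMPLE_JSON)
pairs = extract_pairs(records, "eng", "amh")
```

Records that lack either language are skipped, so the same helper can be reused for any language pair once the real field names are known.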

No commits in the last 6 months.

Use this if you need high-quality, pre-aligned textual data to train or evaluate machine translation systems for Horn of Africa languages.

Not ideal if you are looking for a translation API or an end-user translation tool, as this provides raw data for development.

Machine Translation, Natural Language Processing, Linguistics Research, African Languages, Corpus Development
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 18 / 25
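
The four category scores are consistent with the overall score of 34 / 100 if the total is a plain sum of the categories. The page does not state the formula, so the summation below is an assumption rather than the documented scoring method.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {"Maintenance": 0, "Adoption": 8, "Maturity": 8, "Community": 18}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
maximum = 25 * len(category_scores)

print(f"{total} / {maximum}")  # prints "34 / 100"
```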


Stars: 42
Forks: 13
Language: not listed
License: none listed
Last pushed: Oct 13, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/asmelashteka/HornMT"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
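
For scripted access, the same request can be sketched in Python with only the standard library. The helper names `build_url` and `fetch_quality` are hypothetical, the meaning of the `nlp` path segment is inferred from the URL above, and decoding the response as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(domain, repo):
    """Build the endpoint URL, e.g. domain='nlp', repo='asmelashteka/HornMT'."""
    return f"{BASE_URL}/{domain}/{repo}"


def fetch_quality(domain, repo, timeout=10):
    """Request the record and decode it as JSON (a JSON body is assumed, not confirmed)."""
    with urllib.request.urlopen(build_url(domain, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "asmelashteka/HornMT")
```

With the keyless limit of 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` repeatedly.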