gentaiscool/miners

MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models. (EMNLP 2024 Findings)

37 / 100 (Emerging)

This project evaluates how well multilingual language models (LMs) retrieve relevant information across languages. It runs models over a range of multilingual text datasets and measures their ability to retrieve semantically similar content, such as finding matching sentences across languages (bitext mining) or classifying text. It is aimed at researchers and developers working on natural language processing, especially multilingual applications and language model evaluation.
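As a rough illustration of what "retrieving semantically similar content" means in a bitext-mining setting, the sketch below matches each source sentence to its nearest target sentence by cosine similarity and scores the matches against gold pairs. It is not taken from the MINERS codebase: the function names, the three-dimensional vectors, and the gold alignment are all made up for illustration, and a real evaluation would use embeddings produced by the language model under test.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(queries, candidates):
    """For each query vector, return the index of the most similar candidate."""
    return [
        max(range(len(candidates)), key=lambda j: cosine(q, candidates[j]))
        for q in queries
    ]


def accuracy(predicted, gold):
    """Fraction of queries whose retrieved index matches the gold index."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy embeddings: three source sentences and three target
# sentences in another language (values invented for this example).
source_vecs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
target_vecs = [[0.1, 0.9, 0.1], [0.9, 0.0, 0.1], [0.0, 0.2, 0.8]]
gold = [1, 0, 2]  # assumed gold alignment: source i pairs with target gold[i]

predicted = retrieve(source_vecs, target_vecs)
score = accuracy(predicted, gold)
print(predicted, score)
```

Nearest-neighbor search over cosine similarity is only one common way to set up such a task; the benchmark itself may use different similarity measures and metrics.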

No commits in the last 6 months.

Use this if you need to benchmark and understand the effectiveness of different language models for tasks involving semantic retrieval, particularly across many languages and in challenging cross-lingual or code-switching scenarios.

Not ideal if you are an end-user simply looking to apply a pre-trained multilingual model for a specific task without needing to evaluate or compare its underlying retrieval capabilities.

multilingual-NLP language-model-evaluation semantic-search bitext-mining cross-lingual-text-classification
Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 16 / 25


Stars: 14
Forks: 6
Language: Python
License: Apache-2.0
Last pushed: Oct 03, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/gentaiscool/miners"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
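The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the three trailing path segments are a category, a repository owner, and a repository name, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout inferred from the curl example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """GET the endpoint and parse the body, assuming it is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "gentaiscool", "miners")` issues the same GET request as the curl command. Each call counts against the daily request limit, so caching the result locally is worth considering.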