gentaiscool/miners

MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models. (EMNLP 2024 Findings)


Emerging

This project benchmarks how well multilingual language models (LMs) retrieve relevant information across languages. It takes multilingual text datasets and measures each LM's ability to retrieve semantically similar content, for example by finding matching sentences across languages (bitext mining) or by classifying text. It is aimed at researchers and developers working on natural language processing, especially those focused on multilingual applications and language model evaluation.
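As a rough illustration of the retrieval setup described above, the sketch below scores a toy bitext-mining task: each source-language embedding should retrieve its own translation from a pool of target-language embeddings by cosine similarity. This is not the MINERS code or API. The function names (`cosine`, `retrieve_nearest`, `bitext_accuracy`) and the three-dimensional vectors are hypothetical, and in a real evaluation the vectors would come from the language model under test.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_nearest(query_vecs, candidate_vecs):
    """For each query, return the index of the most similar candidate."""
    return [
        max(range(len(candidate_vecs)), key=lambda j: cosine(q, candidate_vecs[j]))
        for q in query_vecs
    ]


def bitext_accuracy(source_vecs, target_vecs):
    """Fraction of source sentences whose nearest target is the aligned one.

    Assumes source_vecs[i] and target_vecs[i] embed a translation pair.
    """
    preds = retrieve_nearest(source_vecs, target_vecs)
    hits = sum(1 for i, p in enumerate(preds) if p == i)
    return hits / len(preds)


# Toy embeddings (hypothetical values standing in for model outputs).
source = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
target = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.7, 0.1, 0.6]]

predictions = retrieve_nearest(source, target)
accuracy = bitext_accuracy(source, target)
print(predictions, accuracy)
```

A retrieval-based classification check can reuse `retrieve_nearest` in the same way, assigning each query the label of its nearest labeled example instead of comparing indices.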

No commits in the last 6 months.

Use this if you need to benchmark and understand the effectiveness of different language models for tasks involving semantic retrieval, particularly across many languages and in challenging cross-lingual or code-switching scenarios.

Not ideal if you are an end-user simply looking to apply a pre-trained multilingual model for a specific task without needing to evaluate or compare its underlying retrieval capabilities.

multilingual-NLP language-model-evaluation semantic-search bitext-mining cross-lingual-text-classification

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 16 / 25


Language: Python

License: Apache-2.0

Higher-rated alternatives

yuanzhoulvpi2017/DocumentSearch

A document search tool built on sentence transformers and ChatGLM

IAmPara0x/Yuno

Yuno is a context-based search engine for anime.

canjiali/PARADE

Code and data to facilitate BERT/ELECTRA for document ranking. For details, refer to the paper -...

D-Roberts/transformers-retrieval-ranking-nli-ECIR2021

Multilingual retrieval, ranking and natural language inference with transformers (mBERT);...

atapour/rank-over-class

Source code for the training pipeline of the text ranking model used in the paper entitled "Rank...
