gentaiscool/miners
MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models. (EMNLP 2024 Findings)
This project evaluates how well multilingual language models (LMs) find relevant information across languages. It runs the models over a range of multilingual text datasets and measures their ability to retrieve semantically similar content, for example finding matching sentences or classifying text. It is aimed at researchers and developers working on natural language processing, especially multilingual applications and language model evaluation.
No commits in the last 6 months.
Use this if you need to benchmark and understand the effectiveness of different language models for tasks involving semantic retrieval, particularly across many languages and in challenging cross-lingual or code-switching scenarios.
Not ideal if you are an end-user simply looking to apply a pre-trained multilingual model for a specific task without needing to evaluate or compare its underlying retrieval capabilities.
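As a rough picture of what "retrieving semantically similar content" means in the description above, here is a minimal sketch of sentence matching by cosine similarity. It is not the MINERS code or API: the function names, the tiny hand-made 3-dimensional vectors, and the top-1 accuracy metric are all illustrative assumptions. In practice the vectors would come from the language model being evaluated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query, candidates):
    """Return the index of the candidate vector most similar to the query."""
    return max(range(len(candidates)), key=lambda i: cosine(query, candidates[i]))


def top1_accuracy(queries, candidates, gold):
    """Fraction of queries whose nearest candidate is the gold match."""
    hits = sum(retrieve(q, candidates) == g for q, g in zip(queries, gold))
    return hits / len(queries)


# Hypothetical embeddings: sentences in one language (queries) and their
# translations in another (candidates), stored in shuffled order.
source_vecs = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]
target_vecs = [[0.1, 0.9, 0.0], [1.0, 0.0, 0.1], [0.1, 0.1, 1.0]]
gold = [1, 0, 2]  # gold[i] is the index in target_vecs that matches source_vecs[i]

accuracy = top1_accuracy(source_vecs, target_vecs, gold)
print(f"top-1 accuracy: {accuracy:.2f}")
```

A model whose embeddings place translations close together scores high on this kind of check; one that separates languages in embedding space scores low.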
Stars: 14
Forks: 6
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 03, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/gentaiscool/miners"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
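The same request can be made from Python with only the standard library. This is a sketch under assumptions: the URL pattern is inferred from the single curl example above, the name `section` for the `transformers` path segment is a guess, and the response is assumed to be JSON (its fields are not documented here, so the code just returns whatever it parses).

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is an inferred pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(section, owner, repo):
    """Build the endpoint URL. `section` is a hypothetical name for the
    path segment that reads 'transformers' in the curl example."""
    return f"{BASE_URL}/{section}/{owner}/{repo}"


def fetch_quality(section, owner, repo, timeout=10.0):
    """GET the endpoint and parse the body, assuming it is JSON."""
    url = quality_url(section, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "gentaiscool", "miners")
# print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, it is worth caching results locally rather than refetching on every run.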
Higher-rated alternatives
yuanzhoulvpi2017/DocumentSearch
A document search tool built on sentence transformers and ChatGLM.
IAmPara0x/Yuno
Yuno is a context-based search engine for anime.
canjiali/PARADE
Code and data to facilitate BERT/ELECTRA for document ranking. For details, refer to the paper -...
D-Roberts/transformers-retrieval-ranking-nli-ECIR2021
Multilingual retrieval, ranking and natural language inference with transformers (mBERT);...
atapour/rank-over-class
Source code for the training pipeline of the text ranking model used in the paper entitled "Rank...