ImadSaddik/Benchmark_Embedding_Models

The goal of this project is to create a high-quality (golden) dataset from your data, which will serve as a benchmark for evaluating various embedding models.

42 / 100 (Emerging)

This project helps data scientists and AI/ML engineers assess and select the best text embedding models for their specific applications. You provide your own text data, and it generates a high-quality "golden" dataset for evaluating various embedding models. The outcome is a clear comparison of model performance using metrics and statistical tests, enabling you to confidently choose the most effective model.
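The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the listing does not say which metrics or statistical tests the project uses, so mean reciprocal rank (MRR), recall@k, and a paired sign-flip permutation test are stand-ins, and every function name and toy vector below is hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def relevant_ranks(query_vecs, doc_vecs, golden):
    """1-based rank of each query's relevant document; golden[i] is its index."""
    ranks = []
    for query, relevant in zip(query_vecs, golden):
        order = sorted(range(len(doc_vecs)),
                       key=lambda d: cosine(query, doc_vecs[d]), reverse=True)
        ranks.append(order.index(relevant) + 1)
    return ranks


def mrr(ranks):
    """Mean reciprocal rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks, k):
    """Fraction of queries whose relevant document is in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def paired_permutation_p(ranks_a, ranks_b, rounds=2000, seed=0):
    """Two-sided sign-flip permutation test on per-query reciprocal ranks."""
    diffs = [1.0 / a - 1.0 / b for a, b in zip(ranks_a, ranks_b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(rounds):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    return hits / rounds


# Toy golden dataset: query i should retrieve document golden[i].
# Real models would each embed the documents too; one set is shared here for brevity.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
golden = [0, 1, 2]
queries_model_a = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]]  # hypothetical model A
queries_model_b = [[0.1, 0.9], [0.1, 0.9], [0.7, 0.7]]  # hypothetical model B

ranks_a = relevant_ranks(queries_model_a, docs, golden)
ranks_b = relevant_ranks(queries_model_b, docs, golden)
p_value = paired_permutation_p(ranks_a, ranks_b)

print("MRR A:", mrr(ranks_a), "MRR B:", mrr(ranks_b))
print("recall@1 A:", recall_at_k(ranks_a, 1), "recall@1 B:", recall_at_k(ranks_b, 1))
print("permutation p-value:", p_value)
```

With only three toy queries the test cannot separate the two models even though their MRR differs, which is why a golden dataset of realistic size matters before drawing conclusions.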

Use this if you need to objectively compare and select an optimal text embedding model for tasks like search, recommendation, or classification using your own company's data.

Not ideal if you are looking for a pre-trained, ready-to-use embedding model without needing to evaluate multiple options on custom data.

AI model evaluation, natural language processing, information retrieval, text analytics, MLOps
No Package, No Dependents
Maintenance 10 / 25
Adoption 6 / 25
Maturity 15 / 25
Community 11 / 25

Stars: 21
Forks: 3
Language: Jupyter Notebook
License: MIT
Last pushed: Jan 15, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ImadSaddik/Benchmark_Embedding_Models"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
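The same request can be made from Python with the standard library. This is a minimal sketch with two assumptions not confirmed by this page: that the endpoint returns JSON, and that other projects follow the same `/quality/<category>/<owner>/<repo>` path shape as the URL shown above. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; the path layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the request URL for a repo given as 'owner/name'."""
    safe_category = urllib.parse.quote(category, safe="")
    safe_repo = urllib.parse.quote(repo, safe="/")  # keep the owner/name slash
    return f"{BASE_URL}/{safe_category}/{safe_repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "ImadSaddik/Benchmark_Embedding_Models")` issues the same GET request as the curl command. The response schema is not documented here, so inspect the returned object before relying on specific keys.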