ImadSaddik/Benchmark_Embedding_Models
The goal of this project is to create a high-quality (golden) dataset from your data, which will serve as a benchmark for evaluating various embedding models.
This project helps data scientists and AI/ML engineers assess and select text embedding models for their specific applications. You provide your own text data, and the project generates a high-quality "golden" dataset from it. The outcome is a comparison of model performance based on metrics and statistical tests, which helps you choose the most effective model for your data.
Use this if you need to objectively compare and select an optimal text embedding model for tasks like search, recommendation, or classification using your own company's data.
Not ideal if you are looking for a pre-trained, ready-to-use embedding model without needing to evaluate multiple options on custom data.
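To make the idea of comparing models "using metrics and statistical tests" concrete, here is a minimal sketch in plain Python. It is not taken from the repository: the choice of mean reciprocal rank (MRR) as the metric, the sign-flip permutation test, the function names, and the hand-made 2-D vectors standing in for real embeddings are all assumptions made for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def reciprocal_ranks(query_vecs, doc_vecs, gold):
    """Per-query reciprocal rank; gold[i] is the index of the relevant document."""
    scores = []
    for query, target in zip(query_vecs, gold):
        sims = [cosine(query, doc) for doc in doc_vecs]
        ranked = sorted(range(len(doc_vecs)), key=lambda i: sims[i], reverse=True)
        scores.append(1.0 / (ranked.index(target) + 1))
    return scores


def paired_permutation_p(a, b, trials=2000, seed=0):
    """Two-sided sign-flip permutation test on paired per-query scores."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(trials):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    return hits / trials


# Toy "golden" dataset (hypothetical): query i should retrieve document gold[i].
gold = [0, 1, 2]
queries = [[1, 0], [0, 1], [1, 1]]

# Hand-made document vectors standing in for the output of two embedding models.
docs_model_a = [[1, 0.1], [0.1, 1], [1, 1]]
docs_model_b = [[0.1, 1], [1, 0.1], [1, 1]]

rr_a = reciprocal_ranks(queries, docs_model_a, gold)
rr_b = reciprocal_ranks(queries, docs_model_b, gold)
mrr_a = sum(rr_a) / len(rr_a)
mrr_b = sum(rr_b) / len(rr_b)
p_value = paired_permutation_p(rr_a, rr_b)

print(f"MRR model A: {mrr_a:.3f}")
print(f"MRR model B: {mrr_b:.3f}")
print(f"paired permutation p-value: {p_value:.3f}")
```

With only three queries the test has almost no power. In practice the golden dataset would contain many more query/document pairs, and the vectors would come from the embedding models under comparison.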
Stars: 21
Forks: 3
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Jan 15, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ImadSaddik/Benchmark_Embedding_Models"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
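The same request can be made from Python with only the standard library. This is a rough sketch: the helper names `build_url` and `fetch_listing` are hypothetical, the assumption that other repositories follow the same `owner/repo` URL pattern is not confirmed by this page, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path copied from the curl example; reuse for other repositories is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner, repo):
    """Build the endpoint URL for a given repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_listing(owner, repo, timeout=10):
    """Return the raw response body as text; the format is not assumed."""
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example (performs a network request and counts against the daily limit):
# print(fetch_listing("ImadSaddik", "Benchmark_Embedding_Models"))
```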