worldbank/GISTEmbed
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings
This project helps developers fine-tune text embedding models to better capture the relationships between pieces of text. It takes a collection of text data and an existing embedding model, then produces a more accurate, specialized embedding model. It is aimed at machine learning engineers and NLP researchers who want to improve the performance of their text-based AI applications.
No commits in the last 6 months.
Use this if you are a machine learning engineer or NLP researcher who needs to fine-tune a text embedding model for better performance on specific retrieval or classification tasks.
Not ideal if you are looking for a ready-to-use embedding model for general-purpose tasks without any custom fine-tuning.
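The core idea behind GISTEmbed is that in-batch negatives can be false negatives: some "negative" candidates are actually relevant to the anchor, and training against them hurts the model. A guide model scores each candidate, and candidates the guide rates roughly as similar as the labeled positive are excluded from the contrastive loss. The sketch below illustrates that masking step on a synthetic guide-similarity matrix; the function name and margin heuristic are illustrative only, not the repository's actual API (the production implementation, e.g. `GISTEmbedLoss` in sentence-transformers, applies this masking to loss logits).

```python
import numpy as np

def gist_negative_mask(sim_guide: np.ndarray, pos_idx: np.ndarray,
                       margin: float = 0.0) -> np.ndarray:
    """Flag likely false negatives among in-batch candidates.

    sim_guide: (batch, batch) guide-model similarity matrix,
               row i = anchor i vs. every candidate in the batch.
    pos_idx:   index of the labeled positive for each anchor.
    Returns a boolean mask; True = exclude candidate from the loss.
    """
    batch = sim_guide.shape[0]
    # Guide score of each anchor's labeled positive.
    pos_sim = sim_guide[np.arange(batch), pos_idx]
    # Candidates the guide rates within `margin` of the positive are
    # suspected false negatives.
    mask = sim_guide >= (pos_sim[:, None] - margin)
    # Never mask the positive itself.
    mask[np.arange(batch), pos_idx] = False
    return mask

# Synthetic example: diagonal entries are the labeled positives.
sim = np.array([[1.0, 0.9, 0.1],
                [0.2, 1.0, 0.95],
                [0.1, 0.2, 1.0]])
mask = gist_negative_mask(sim, np.array([0, 1, 2]), margin=0.15)
# Candidate 1 is masked for anchor 0, candidate 2 for anchor 1:
# the guide rated them nearly as similar as the true positives.
```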
Stars: 44
Forks: 3
Language: Python
License: —
Category: —
Last pushed: Mar 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/worldbank/GISTEmbed"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
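For programmatic use, the same endpoint can be called from Python with the standard library. This is a minimal sketch built on the documented URL above; the `Authorization: Bearer` header name for the optional API key is an assumption, so check the API's own documentation before relying on it.

```python
import json
from urllib.request import Request, urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data endpoint URL for a repository."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """Fetch quality data; an API key raises the daily rate limit."""
    req = Request(quality_url(owner, repo))
    if api_key:
        # Assumed auth scheme; verify against the API docs.
        req.add_header("Authorization", f"Bearer {api_key}")
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)

# No key needed for up to 100 requests/day:
# data = fetch_quality("worldbank", "GISTEmbed")
```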
Higher-rated alternatives
shibing624/similarities
Similarities: a toolkit for similarity calculation and semantic search....
explosion/sense2vec
🦆 Contextually-keyed word vectors
chakki-works/chakin
Simple downloader for pre-trained word vectors
sebischair/Lbl2Vec
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with...
pdrm83/sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.