worldbank/GISTEmbed
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings
This project helps developers fine-tune text embedding models to better capture the relationships between pieces of text. It takes a collection of text data and an existing embedding model, then produces a more accurate, specialized embedding model. It is aimed at machine learning engineers and NLP researchers who want to improve the performance of their text-based AI applications.
No commits in the last 6 months.
Use this if you are a machine learning engineer or NLP researcher who needs to fine-tune a text embedding model for better performance on specific retrieval or classification tasks.
Not ideal if you are looking for a ready-to-use embedding model for general-purpose tasks without any custom fine-tuning.
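The core idea behind GISTEmbed is that in-batch negatives can be false negatives: some "negative" candidates are actually relevant to the anchor, and training against them hurts the model. A guide model scores each candidate, and candidates the guide rates roughly as similar as the labeled positive are excluded from the contrastive loss. The sketch below illustrates that masking step on a synthetic guide-similarity matrix; the function name and margin heuristic are illustrative only, not the repository's actual API (the production implementation, e.g. `GISTEmbedLoss` in sentence-transformers, applies this masking to loss logits).

```python
import numpy as np

def gist_negative_mask(sim_guide: np.ndarray, pos_idx: np.ndarray,
                       margin: float = 0.0) -> np.ndarray:
    """Flag likely false negatives among in-batch candidates.

    sim_guide: (batch, batch) guide-model similarity matrix,
               row i = anchor i vs. every candidate in the batch.
    pos_idx:   index of the labeled positive for each anchor.
    Returns a boolean mask; True = exclude candidate from the loss.
    """
    batch = sim_guide.shape[0]
    # Guide score of each anchor's labeled positive.
    pos_sim = sim_guide[np.arange(batch), pos_idx]
    # Candidates the guide rates within `margin` of the positive are
    # suspected false negatives.
    mask = sim_guide >= (pos_sim[:, None] - margin)
    # Never mask the positive itself.
    mask[np.arange(batch), pos_idx] = False
    return mask

# Synthetic example: diagonal entries are the labeled positives.
sim = np.array([[1.0, 0.9, 0.1],
                [0.2, 1.0, 0.95],
                [0.1, 0.2, 1.0]])
mask = gist_negative_mask(sim, np.array([0, 1, 2]), margin=0.15)
# Candidate 1 is masked for anchor 0, candidate 2 for anchor 1:
# the guide rated them nearly as similar as the true positives.
```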
Stars: 44
Forks: 3
Language: Python
License: —
Category: —
Last pushed: Mar 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/worldbank/GISTEmbed"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
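For programmatic use, the same endpoint can be called from Python with the standard library. This is a minimal sketch built on the documented URL above; the `Authorization: Bearer` header name for the optional API key is an assumption, so check the API's own documentation before relying on it.

```python
import json
from urllib.request import Request, urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data endpoint URL for a repository."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """Fetch quality data; an API key raises the daily rate limit."""
    req = Request(quality_url(owner, repo))
    if api_key:
        # Assumed auth scheme; verify against the API docs.
        req.add_header("Authorization", f"Bearer {api_key}")
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)

# No key needed for up to 100 requests/day:
# data = fetch_quality("worldbank", "GISTEmbed")
```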
Higher-rated alternatives
shibing624/similarities
Similarities: a toolkit for similarity calculation and semantic search....
explosion/sense2vec
🦆 Contextually-keyed word vectors
chakki-works/chakin
Simple downloader for pre-trained word vectors
sebischair/Lbl2Vec
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with...
pdrm83/sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.