KarineAyrs/knowledge-distillation-semantic-search

KDSS is a framework for knowledge distillation from LLMs

Emerging

This framework helps machine learning practitioners and researchers fine-tune smaller, more efficient language models for semantic search. You provide your domain-specific documents, and the framework uses a large language model (such as OpenAI's models or Alpaca) to generate training data from them. The output is a smaller, fine-tuned model (e.g., BERT) that generates embeddings for your documents, making them semantically searchable.
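The data-generation step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not KDSS's actual API: `stub_teacher` and `build_training_pairs` are hypothetical names, and the teacher is a trivial offline stand-in for a real LLM call that would normally be prompted to write a query the document answers.

```python
from typing import Callable, List, Tuple


def stub_teacher(document: str) -> str:
    """Hypothetical stand-in for an LLM call (assumption, not a real API).

    A real teacher model would be prompted to write a query that the
    document answers. Here we lowercase the first five words so the
    example runs offline and deterministically.
    """
    words = document.split()
    return " ".join(words[:5]).lower()


def build_training_pairs(
    documents: List[str],
    teacher: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each document with a teacher-generated query (no manual labels)."""
    pairs = []
    for doc in documents:
        query = teacher(doc)
        if query:  # skip documents for which the teacher produced nothing
            pairs.append((query, doc))
    return pairs


documents = [
    "Knowledge distillation transfers behaviour from a large model to a small one.",
    "Semantic search ranks documents by meaning rather than exact keyword overlap.",
]

pairs = build_training_pairs(documents, stub_teacher)
for query, doc in pairs:
    print(f"{query!r} -> {doc[:40]!r}")
```

In a real pipeline, the resulting (query, document) pairs would serve as training examples for fine-tuning the smaller student model, whose embeddings are then used for search.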

Use this if you need to create a custom semantic search engine for your specific domain and want to leverage the power of large language models to train a smaller, faster model on your data without manual labeling.

Not ideal if you don't have a collection of domain-specific documents or if you require an off-the-shelf, immediately deployable semantic search solution without model training.

semantic-search information-retrieval natural-language-processing machine-learning-engineering

No package, no dependents

Maintenance 6 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 11 / 25

Language: Python

License: MIT

Higher-rated alternatives

airaria/TextBrewer

A PyTorch-based knowledge distillation toolkit for natural language processing

sunyilgdx/NSP-BERT

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...

princeton-nlp/CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

kssteven418/LTP

[KDD'22] Learned Token Pruning for Transformers

georgian-io/Transformers-Domain-Adaptation

[DEPRECATED] Adapt Transformer-based language models to new text domains
