microsoft/AdaMix
This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.12410).
This project helps machine learning engineers improve the performance of large language models like BERT and RoBERTa on specific natural language processing tasks. It takes an existing pre-trained model and fine-tunes it using a 'mixture-of-adaptations' approach, resulting in a specialized model that performs better on your chosen task while being more parameter-efficient than traditional fine-tuning.
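The mixture-of-adaptations idea (several small bottleneck adapters, stochastically routed during training and weight-averaged for inference) can be sketched as follows. This is a conceptual NumPy illustration of the technique described in the paper, not the repository's actual PyTorch code; the class and method names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class AdapterMixture:
    """Conceptual sketch of a mixture-of-adaptations layer.

    Holds several small bottleneck adapters. During training, each forward
    pass is routed through one randomly chosen adapter (stochastic routing);
    for inference, the adapter weights are averaged into a single adapter,
    so serving costs no more than a single-adapter model.
    """

    def __init__(self, d_model=8, bottleneck=2, num_adapters=4):
        # Each adapter is a down-projection + ReLU + up-projection, added
        # residually to the frozen backbone's hidden states.
        self.down = [rng.normal(0, 0.02, (d_model, bottleneck))
                     for _ in range(num_adapters)]
        self.up = [rng.normal(0, 0.02, (bottleneck, d_model))
                   for _ in range(num_adapters)]

    def forward_train(self, x):
        # Stochastic routing: send this forward pass through one random adapter.
        i = rng.integers(len(self.down))
        return x + np.maximum(x @ self.down[i], 0) @ self.up[i]

    def forward_eval(self, x):
        # Adapter merging: average the weights so inference uses one adapter.
        down = np.mean(self.down, axis=0)
        up = np.mean(self.up, axis=0)
        return x + np.maximum(x @ down, 0) @ up
```

Only the adapter parameters would be trained; the backbone stays frozen, which is where the parameter efficiency comes from.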
138 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer working on fine-tuning large language models for tasks such as text classification, natural language inference, or semantic similarity, and want to achieve better performance with fewer trainable parameters.
Not ideal if you are looking for a general-purpose, out-of-the-box solution for end-users or if you are not comfortable working with deep learning model training and evaluation scripts.
Stars: 138
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Aug 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/AdaMix"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
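The same endpoint can be queried from Python. A minimal sketch, assuming the endpoint returns JSON; the `api_key` query parameter is a hypothetical placeholder, so check the API docs for the real authentication scheme.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(repo, api_key=None):
    # Build the endpoint URL for a given "owner/name" repo slug.
    url = f"{BASE}/{repo}"
    if api_key:
        # Hypothetical auth parameter; the real scheme may differ.
        url += f"?api_key={api_key}"
    return url

def fetch_quality(repo, api_key=None):
    # Fetch and decode the response (assumes the endpoint returns JSON).
    with urllib.request.urlopen(quality_url(repo, api_key)) as resp:
        return json.load(resp)
```

For example, `fetch_quality("microsoft/AdaMix")` requests the same URL as the curl command above.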
Related models
pphuc25/distil-cd
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation
taissirboukrouba/Structured-Information-Retrieval-with-LLMs
Academic Sequence Labelling Between DistillBERT & Encoder-only Transformer
mominalix/LLM-Model-Distillation-for-Text-Classification-Models-GUI
GUI application that performs knowledge distillation from OpenAI models to smaller Hugging Face...