microsoft/AdaMix
This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.12410).
This project helps machine learning engineers improve the performance of large language models like BERT and RoBERTa on specific natural language processing tasks. It takes an existing pre-trained model and fine-tunes it using a 'mixture-of-adaptations' approach, resulting in a specialized model that performs better on your chosen task while being more parameter-efficient than traditional fine-tuning.
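The mixture-of-adaptations idea (several small bottleneck adapters, stochastically routed during training and weight-averaged for inference) can be sketched as follows. This is a conceptual NumPy illustration of the technique described in the paper, not the repository's actual PyTorch code; the class and method names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class AdapterMixture:
    """Conceptual sketch of a mixture-of-adaptations layer.

    Holds several small bottleneck adapters. During training, each forward
    pass is routed through one randomly chosen adapter (stochastic routing);
    for inference, the adapter weights are averaged into a single adapter,
    so serving costs no more than a single-adapter model.
    """

    def __init__(self, d_model=8, bottleneck=2, num_adapters=4):
        # Each adapter is a down-projection + ReLU + up-projection, added
        # residually to the frozen backbone's hidden states.
        self.down = [rng.normal(0, 0.02, (d_model, bottleneck))
                     for _ in range(num_adapters)]
        self.up = [rng.normal(0, 0.02, (bottleneck, d_model))
                   for _ in range(num_adapters)]

    def forward_train(self, x):
        # Stochastic routing: send this forward pass through one random adapter.
        i = rng.integers(len(self.down))
        return x + np.maximum(x @ self.down[i], 0) @ self.up[i]

    def forward_eval(self, x):
        # Adapter merging: average the weights so inference uses one adapter.
        down = np.mean(self.down, axis=0)
        up = np.mean(self.up, axis=0)
        return x + np.maximum(x @ down, 0) @ up
```

Only the adapter parameters would be trained; the backbone stays frozen, which is where the parameter efficiency comes from.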
138 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer working on fine-tuning large language models for tasks such as text classification, natural language inference, or semantic similarity, and want to achieve better performance with fewer trainable parameters.
Not ideal if you are looking for a general-purpose, out-of-the-box solution for end-users or if you are not comfortable working with deep learning model training and evaluation scripts.
Stars: 138
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Aug 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/AdaMix"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
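The same endpoint can be queried from Python. A minimal sketch, assuming the endpoint returns JSON; the `api_key` query parameter is a hypothetical placeholder, so check the API docs for the real authentication scheme.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(repo, api_key=None):
    # Build the endpoint URL for a given "owner/name" repo slug.
    url = f"{BASE}/{repo}"
    if api_key:
        # Hypothetical auth parameter; the real scheme may differ.
        url += f"?api_key={api_key}"
    return url

def fetch_quality(repo, api_key=None):
    # Fetch and decode the response (assumes the endpoint returns JSON).
    with urllib.request.urlopen(quality_url(repo, api_key)) as resp:
        return json.load(resp)
```

For example, `fetch_quality("microsoft/AdaMix")` requests the same URL as the curl command above.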
Related models
pphuc25/distil-cd
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation
taissirboukrouba/Structured-Information-Retrieval-with-LLMs
Academic Sequence Labelling Between DistillBERT & Encoder-only Transformer
mominalix/LLM-Model-Distillation-for-Text-Classification-Models-GUI
GUI application that performs knowledge distillation from OpenAI models to smaller Hugging Face...