MarsJacobs/kd-qat-large-enc
[EMNLP 2022 main] Code for "Understanding and Improving Knowledge Distillation for Quantization-Aware-Training of Large Transformer Encoders"
This project helps machine learning engineers and researchers optimize large transformer models for deployment on resource-constrained devices. It takes a pre-trained, full-precision BERT model and applies knowledge distillation during quantization-aware training (QAT) to produce a significantly smaller ternary (2-bit) version while preserving most of its accuracy. The output is a highly compressed, efficient transformer model suitable for mobile or edge applications.
No commits in the last 6 months.
Use this if you need to deploy large language models like BERT on hardware with limited memory or computational power, without sacrificing too much accuracy.
Not ideal if you are working with architectures other than BERT-like transformer encoders, or if you don't need compression as aggressive as ternary precision.
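The KD-QAT recipe described above combines two ingredients: quantizing weights to ternary values during training, and training the quantized student against a full-precision teacher's soft outputs. The sketch below is a minimal illustration of both, assuming TWN-style ternarization (the 0.7 threshold heuristic from Ternary Weight Networks) and a standard temperature-scaled distillation loss; the function names are hypothetical and none of this is taken from the repository's code.

```python
import numpy as np

def ternarize(w: np.ndarray) -> np.ndarray:
    """Map full-precision weights to {-alpha, 0, +alpha} (TWN-style heuristic)."""
    delta = 0.7 * np.abs(w).mean()           # threshold below which weights snap to 0
    mask = np.abs(w) > delta                 # weights that remain non-zero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return np.where(mask, np.sign(w) * alpha, 0.0)

def kd_loss(student_logits: np.ndarray,
            teacher_logits: np.ndarray,
            temperature: float = 2.0) -> float:
    """Soft cross-entropy between teacher and student class distributions."""
    t = teacher_logits / temperature
    s = student_logits / temperature
    # teacher softmax (numerically stable)
    t_prob = np.exp(t - t.max(-1, keepdims=True))
    t_prob /= t_prob.sum(-1, keepdims=True)
    # student log-softmax (numerically stable)
    s_log = s - s.max(-1, keepdims=True)
    s_log -= np.log(np.exp(s_log).sum(-1, keepdims=True))
    # cross-entropy, rescaled by T^2 as in standard distillation
    return float(-(t_prob * s_log).sum(-1).mean() * temperature ** 2)
```

In a full QAT loop, `ternarize` would sit inside a straight-through-estimator forward pass, and `kd_loss` would replace (or supplement) the hard-label loss; both details are omitted here for brevity.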
Stars: 8
Forks: —
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Feb 07, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/MarsJacobs/kd-qat-large-enc"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
cdqa-suite/cdQA
⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
AMontgomerie/question_generator
An NLP system for generating reading comprehension questions
KristiyanVachev/Leaf-Question-Generation
Easy to use and understand multiple-choice question generation algorithm using T5 Transformers.
robinniesert/kaggle-google-quest
Google QUEST Q&A Labeling Kaggle Competition 6th Place Solution
cooelf/AwesomeMRC
IJCAI 2021 Tutorial & code for Retrospective Reader for Machine Reading Comprehension (AAAI 2021)