MarsJacobs/kd-qat-large-enc

[EMNLP 2022 main] Code for "Understanding and Improving Knowledge Distillation for Quantization-Aware-Training of Large Transformer Encoders"

Overall score: 12 / 100 (Experimental)

This project helps machine learning engineers and researchers optimize large transformer models for deployment on resource-constrained devices. It takes a pre-trained, full-precision BERT model and applies knowledge distillation during quantization-aware training to create a significantly smaller, ternary (2-bit) version while maintaining performance. The output is a highly compressed, efficient transformer model suitable for mobile or edge applications.
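As a rough illustration of the idea only (not the repository's actual code), the sketch below ternarizes weights with a TWN-style quantizer and a straight-through estimator, and distills the full-precision teacher's logits into the quantized student. The class and function names, the 0.7 threshold heuristic, and the temperature value are assumptions for the example; the paper itself also explores distillation of intermediate representations.

```python
# Hypothetical sketch of ternary QAT with logit-level knowledge distillation.
import torch
import torch.nn.functional as F


class TernaryQuant(torch.autograd.Function):
    """TWN-style ternary weight quantizer with a straight-through estimator."""

    @staticmethod
    def forward(ctx, w):
        delta = 0.7 * w.abs().mean()                                 # pruning threshold
        mask = (w.abs() > delta).float()
        alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)   # per-tensor scale
        return alpha * torch.sign(w) * mask                          # {-alpha, 0, +alpha}

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output                                           # straight-through gradient


def kd_qat_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft cross-entropy between teacher and quantized-student logits."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)


# Usage inside a training step (teacher frozen, student weights ternarized
# on the fly via TernaryQuant.apply before each forward pass):
#   loss = kd_qat_loss(student(input_ids), teacher(input_ids))
```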

No commits in the last 6 months.

Use this if you need to deploy large language models like BERT on hardware with limited memory or computational power, without sacrificing too much accuracy.

Not ideal if you are not working with BERT-like transformer encoders or if you don't require extreme model compression to ternary precision.

model compression · edge AI · natural language processing · deep learning · deployment · transformer optimization
No License · Stale (6m) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 8 / 25
Community: 0 / 25


Stars: 8
Forks:
Language: Jupyter Notebook
License: None
Last pushed: Feb 07, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/MarsJacobs/kd-qat-large-enc"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
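The same endpoint can be called from Python; a minimal sketch using the requests library is shown below. The exact structure of the returned JSON is an assumption, so the example simply prints whatever comes back.

```python
# Hypothetical example of fetching the quality data shown on this page.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/MarsJacobs/kd-qat-large-enc"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
data = resp.json()
print(data)  # inspect the returned JSON; field names may differ from this page's labels
```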