raymin0223/fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
This project helps machine learning engineers and researchers speed up text generation with large language models on tasks such as summarization, question answering, and translation. It applies an early-exiting framework with synchronized parallel decoding to an existing autoregressive language model, aiming to make generation faster without losing accuracy.
No commits in the last 6 months.
Use this if you need to accelerate the text generation process of large language models for tasks like summarization or translation, and you are comfortable working with machine learning model deployments.
Not ideal if you are looking for a no-code solution or are unfamiliar with integrating and fine-tuning deep learning models.
Stars: 65
Forks: 10
Language: Python
License: none listed
Category:
Last pushed: Sep 28, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/raymin0223/fast_robust_early_exit"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
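The same request can be sketched in Python with only the standard library. This is a minimal sketch, not a documented client: the helper names `quality_url` and `fetch_quality` are hypothetical, the URL layout is inferred from the single curl example above, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path taken from the curl example above; the layout
# <base>/<category>/<owner>/<repo> is inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL; `repo` is an "owner/name" slug."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. The page does not
    show the response format, so inspect the raw output before
    relying on any field.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to make the request.
    print(quality_url("transformers", "raymin0223/fast_robust_early_exit"))
```

With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than calling `fetch_quality` on every run.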
Higher-rated alternatives
ShiZhengyan/InstructionModelling
[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"
SALT-NLP/Adaptive-Compositional-Modules
Code for the ACL 2022 paper "Continual Sequence Generation with Adaptive Compositional Modules"
oooranz/Baby-CoThought
🍼 Baby's CoThought: Leveraging LLMs for Enhanced Reasoning in Compact Models (BabyLM Challenge)
joisino/zeh
Code for "Even GPT-5.2 Can’t Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs"
yhy1117/X-Mixup
Implementation of ICLR 2022 paper "Enhancing Cross-lingual Transfer by Manifold Mixup".