raymin0223/fast_robust_early_exit

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)

31 / 100 (Emerging)

This project helps machine learning engineers and researchers speed up text generation with large language models on tasks such as summarization, question answering, and translation. It applies early exiting to an existing autoregressive language model, so that a token can be predicted without always running every layer, with the aim of faster decoding at little or no loss in accuracy.
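As a rough illustration of the general early-exiting idea named in the title, the sketch below stops running layers once the top prediction is confident enough. It is a toy under stated assumptions, not this repository's code: `early_exit_predict`, the `threshold` parameter, and the stand-in `double` and `identity` functions are all hypothetical names chosen for the example, and the synchronized parallel decoding part of the framework is not modeled.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    hidden: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    head: Callable[[Vector], Vector],
    threshold: float,
) -> Tuple[int, int]:
    """Return (token_id, layers_used), stopping once the top probability
    reaches `threshold` or the layers run out."""
    if not layers:
        raise ValueError("at least one layer is required")
    probs: List[float] = []
    used = 0
    for used, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        probs = softmax(head(hidden))
        if max(probs) >= threshold:
            break  # confident enough: skip the remaining layers
    token_id = max(range(len(probs)), key=probs.__getitem__)
    return token_id, used


# Hypothetical stand-ins: each "layer" doubles the state, the head is identity.
def double(v: Vector) -> Vector:
    return [2.0 * x for x in v]


def identity(v: Vector) -> Vector:
    return v
```

With four toy layers, `early_exit_predict([1.0, 0.0], [double] * 4, identity, 0.9)` exits after the second layer, while a stricter threshold of 0.999 needs a third. A real model trades this threshold against output quality.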

No commits in the last 6 months.

Use this if you need to accelerate the text generation process of large language models for tasks like summarization or translation, and you are comfortable working with machine learning model deployments.

Not ideal if you are looking for a no-code solution or are unfamiliar with integrating and fine-tuning deep learning models.

natural-language-generation large-language-models model-inference-optimization text-summarization machine-translation
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 15 / 25


Stars: 65
Forks: 10
Language: Python
License: (none listed)
Last pushed: Sep 28, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/raymin0223/fast_robust_early_exit"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
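The same request can be made from Python with the standard library. This is a minimal sketch under assumptions: the `category/owner/repo` path layout is inferred from the single URL above, the response is assumed (not confirmed) to be JSON, and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, assuming a category/owner/repo path layout."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the body is JSON (unverified)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("transformers", "raymin0223", "fast_robust_early_exit"))
```

Keeping URL construction separate from the network call makes it possible to check the address without spending any of the daily request quota.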