Ki6an/fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
T5 models used for natural language tasks like summarization, translation, or question answering can be slow at inference time, especially at larger sizes. This tool optimizes an existing T5 model to run up to 5 times faster at roughly a third of its original size. It suits machine learning engineers and data scientists deploying T5 models in production environments where speed and memory efficiency are crucial.
589 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to speed up the inference of your T5-based natural language processing models and reduce their memory footprint.
Not ideal if you are not working with T5 models or if your primary concern is model training speed rather than inference performance.
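The typical workflow, based on the project's README, is to export a Hugging Face T5 checkpoint to quantized ONNX with `export_and_get_onnx_model` and then generate as usual. This is a sketch: it assumes `fastt5` and `transformers` are installed, and the function returns None if they are not. Verify the exact API against the current fastT5 docs.

```python
def quantized_translate(prompt: str):
    """Run a prompt through a quantized ONNX export of t5-small,
    or return None when fastT5 is not installed."""
    try:
        from fastT5 import export_and_get_onnx_model
        from transformers import AutoTokenizer
    except ImportError:
        return None  # pip install fastt5

    model_name = "t5-small"
    # Exports the model to ONNX and quantizes it; cached after the first run.
    model = export_and_get_onnx_model(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    batch = tokenizer(prompt, return_tensors="pt")
    tokens = model.generate(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        num_beams=2,
    )
    return tokenizer.decode(tokens.squeeze(), skip_special_tokens=True)


if __name__ == "__main__":
    print(quantized_translate("translate English to German: Hello, world."))
```

The first call is slow because it downloads and converts the model; subsequent calls reuse the cached ONNX files, which is where the 5x inference speedup applies.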
Stars: 589
Forks: 74
Language: Python
License: Apache-2.0
Category: ml-frameworks
Last pushed: Apr 24, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Ki6an/fastT5"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
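The same endpoint can be called from Python with the standard library. This is a sketch built from the curl example above: the URL structure is taken from that example, but the `Authorization: Bearer` header for the API key and the JSON response shape are assumptions, so check them against the API's own docs.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the quality-API URL for a repo in a given category."""
    return f"{BASE}/{category}/{repo}"


def fetch_quality(category: str, repo: str, api_key=None) -> dict:
    """Fetch quality data as a dict; pass api_key for the 1,000/day tier.
    The Bearer header name is an assumption, not documented here."""
    req = urllib.request.Request(quality_url(category, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

For example, `quality_url("ml-frameworks", "Ki6an/fastT5")` reproduces exactly the URL used in the curl command above.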
Related frameworks
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit...
mlcommons/inference
Reference implementations of MLPerf® inference benchmarks
mlcommons/training
Reference implementations of MLPerf® training benchmarks
datamade/usaddress
:us: a python library for parsing unstructured United States address strings into address components
GRAAL-Research/deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning