Ki6an/fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
T5 models used for natural language tasks like summarization, translation, or question answering can be slow at inference time, especially at larger sizes. This tool optimizes an existing T5 model to run up to 5 times faster at roughly a third of its original size. It suits machine learning engineers and data scientists deploying T5 models in production environments where speed and memory efficiency are crucial.
589 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to speed up the inference of your T5-based natural language processing models and reduce their memory footprint.
Not ideal if you are not working with T5 models or if your primary concern is model training speed rather than inference performance.
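The typical workflow, based on the project's README, is to export a Hugging Face T5 checkpoint to quantized ONNX with `export_and_get_onnx_model` and then generate as usual. This is a sketch: it assumes `fastt5` and `transformers` are installed, and the function returns None if they are not. Verify the exact API against the current fastT5 docs.

```python
def quantized_translate(prompt: str):
    """Run a prompt through a quantized ONNX export of t5-small,
    or return None when fastT5 is not installed."""
    try:
        from fastT5 import export_and_get_onnx_model
        from transformers import AutoTokenizer
    except ImportError:
        return None  # pip install fastt5

    model_name = "t5-small"
    # Exports the model to ONNX and quantizes it; cached after the first run.
    model = export_and_get_onnx_model(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    batch = tokenizer(prompt, return_tensors="pt")
    tokens = model.generate(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        num_beams=2,
    )
    return tokenizer.decode(tokens.squeeze(), skip_special_tokens=True)


if __name__ == "__main__":
    print(quantized_translate("translate English to German: Hello, world."))
```

The first call is slow because it downloads and converts the model; subsequent calls reuse the cached ONNX files, which is where the 5x inference speedup applies.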
Stars: 589
Forks: 74
Language: Python
License: Apache-2.0
Category: ml-frameworks
Last pushed: Apr 24, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Ki6an/fastT5"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
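The same endpoint can be called from Python with the standard library. This is a sketch built from the curl example above: the URL structure is taken from that example, but the `Authorization: Bearer` header for the API key and the JSON response shape are assumptions, so check them against the API's own docs.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the quality-API URL for a repo in a given category."""
    return f"{BASE}/{category}/{repo}"


def fetch_quality(category: str, repo: str, api_key=None) -> dict:
    """Fetch quality data as a dict; pass api_key for the 1,000/day tier.
    The Bearer header name is an assumption, not documented here."""
    req = urllib.request.Request(quality_url(category, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

For example, `quality_url("ml-frameworks", "Ki6an/fastT5")` reproduces exactly the URL used in the curl command above.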
Related frameworks
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit...
mlcommons/inference
Reference implementations of MLPerf® inference benchmarks
mlcommons/training
Reference implementations of MLPerf® training benchmarks
datamade/usaddress
:us: a python library for parsing unstructured United States address strings into address components
GRAAL-Research/deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning