PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.
This tool helps machine learning engineers and AI developers make their AI models faster, smaller, and more cost-effective. You input your existing, trained AI model (like an LLM or a vision transformer), and it outputs an optimized version that runs more efficiently. It's designed for developers building and deploying AI solutions who need to improve model performance and resource usage.
1,142 stars. Actively maintained with 17 commits in the last 30 days.
Use this if you are a machine learning engineer looking to reduce the inference time, memory footprint, or operational cost of your deployed AI models.
Not ideal if you are an end-user without programming skills or if your primary goal is to train a new AI model from scratch rather than optimize an existing one.
Stars: 1,142
Forks: 85
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 26, 2026
Commits (30d): 17
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/PrunaAI/pruna"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
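For scripted access, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the path layout (category, then owner, then repository) is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is an
# assumed pattern of <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and parse the body as JSON (assumed format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("diffusion", "PrunaAI", "pruna"))
```

Calling `fetch_quality("diffusion", "PrunaAI", "pruna")` performs the actual request and counts against the daily request limit. The shape of the returned data is not described here, so inspect it before relying on specific fields.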
Related models
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...
ivanvovk/WaveGrad
Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.