OpenBMB/BMInf

Efficient Inference for Big Models

Quality score: 45 / 100 (Emerging)

This package helps machine learning engineers and researchers run very large language models, such as those used for text generation or question answering, on modest hardware. It wraps an existing large language model so that inference runs efficiently, even on a single consumer-grade GPU. The output is the same high-quality results from the large model, but with significantly lower memory requirements and better speed.

587 stars. No commits in the last 6 months.

Use this if you need to deploy or experiment with extremely large pre-trained language models (10+ billion parameters) but are limited by GPU memory or want to achieve better performance on powerful GPUs.

Not ideal if you are working with smaller models that already fit comfortably within your GPU's memory or if you prefer not to modify your model's internal structure for optimization.

large-language-models NLP-deployment AI-inference-optimization natural-language-generation deep-learning-research
Badges: Stale (6 months) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 19 / 25


Stars: 587
Forks: 66
Language: Python
License: Apache-2.0
Last pushed: Jan 24, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/OpenBMB/BMInf"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
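The same request can be issued from Python. This is a minimal sketch using only the standard library; the `quality_url` and `fetch_quality` helpers are hypothetical, and the response is assumed (not confirmed) to be JSON:

```python
import json
import urllib.request

def quality_url(category: str, owner: str, repo: str) -> str:
    # Build the endpoint URL following the pattern of the curl example above.
    return f"https://pt-edge.onrender.com/api/v1/quality/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    # Assumption: the endpoint returns a JSON body; adjust if the API differs.
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

print(quality_url("ml-frameworks", "OpenBMB", "BMInf"))
```

Calling `fetch_quality("ml-frameworks", "OpenBMB", "BMInf")` performs the same lookup as the curl command, subject to the daily rate limit.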