dvmazur/mixtral-offloading

Run Mixtral-8x7B models in Colab or on consumer desktops

Quality score: 46/100 (Emerging)

This tool helps developers and researchers run the large Mixtral-8x7B language model on hardware with limited GPU memory, such as consumer desktops or Google Colab. It splits the model between GPU and CPU memory, moving parts back and forth as needed, so that text generation and other inference tasks fit on machines that would otherwise require more powerful, expensive hardware. It is aimed at machine learning practitioners experimenting with large language models.
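
The project is not published as a package (see the badges below), so there is no documented API to quote here; the snippet below is only a concept sketch of one way this kind of GPU/CPU offloading can work for a mixture-of-experts model like Mixtral: keep a small, least-recently-used cache of experts on the GPU and page the rest in from CPU RAM on demand. Every name in it (ExpertLRUCache, gpu_budget, and so on) is illustrative, not the repo's actual code.

from collections import OrderedDict

import torch
import torch.nn as nn

class ExpertLRUCache:
    """Concept sketch of MoE expert offloading: keep at most `gpu_budget`
    experts resident on the GPU and page the least-recently-used one back
    to CPU RAM when a new expert is needed. Illustrative only; not the
    repo's actual implementation."""

    def __init__(self, experts, gpu_budget, device="cuda"):
        self.experts = experts            # expert_id -> nn.Module (start on CPU)
        self.gpu_budget = gpu_budget      # max experts resident on the GPU
        self.device = device
        self.resident = OrderedDict()     # expert_id -> module currently on GPU

    def get(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)      # mark as recently used
            return self.resident[expert_id]
        if len(self.resident) >= self.gpu_budget:     # evict the LRU expert
            old_id, old_expert = self.resident.popitem(last=False)
            old_expert.to("cpu")                      # page it back to CPU RAM
        expert = self.experts[expert_id].to(self.device)
        self.resident[expert_id] = expert
        return expert

# Illustrative usage: eight tiny stand-in experts, at most two on the GPU at once.
device = "cuda" if torch.cuda.is_available() else "cpu"
experts = {i: nn.Linear(16, 16) for i in range(8)}
cache = ExpertLRUCache(experts, gpu_budget=2, device=device)
out = cache.get(3)(torch.randn(1, 16, device=device))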

2,327 stars. No commits in the last 6 months.

Use this if you need to run Mixtral-8x7B models for inference but only have access to consumer-grade GPUs or cloud environments like Google Colab.

Not ideal if you already have access to high-end GPUs with ample memory for large language models, or if you want a ready-made command-line interface rather than driving the model from Python code.

Tags: large-language-models · machine-learning-inference · resource-constrained-ml · ai-model-deployment · ml-experimentation
Stale (6m) · No Package · No Dependents
Maintenance: 0/25
Adoption: 10/25
Maturity: 16/25
Community: 20/25

Stars: 2,327
Forks: 234
Language: Python
License: MIT
Category: mistral-ai-tools
Last pushed: Apr 08, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/dvmazur/mixtral-offloading"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
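
The same data can also be fetched from Python. A minimal sketch using the requests library, assuming the endpoint returns JSON (the response schema isn't documented on this page):

import requests

# Quality-score endpoint shown above; the response is assumed to be JSON.
URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/dvmazur/mixtral-offloading"

resp = requests.get(URL, timeout=30)
resp.raise_for_status()  # surfaces errors such as hitting the 100-requests/day limit
print(resp.json())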