dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
This tool helps developers and researchers run the large Mixtral-8x7B language model on hardware with limited GPU memory, such as consumer desktops or Google Colab environments. It offloads parts of the model between GPU and CPU memory, enabling text generation and other inference tasks that would otherwise require more powerful, expensive hardware. It is designed for machine learning practitioners experimenting with large language models.
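The core idea behind this kind of offloading can be illustrated with a small sketch. This is not the library's actual API; it is a generic illustration, assuming a least-recently-used (LRU) policy where a fixed number of expert weights stay in fast (GPU) memory and the rest live in slow (CPU) memory until needed:

```python
# Illustrative sketch only (not mixtral-offloading's real API): a small LRU
# cache stands in for GPU memory; evicted experts fall back to a CPU store.
from collections import OrderedDict

class ExpertOffloader:
    def __init__(self, gpu_slots: int):
        self.gpu_slots = gpu_slots      # how many experts fit in "GPU" memory
        self.cpu_store = {}             # all expert weights live here
        self.gpu_cache = OrderedDict()  # subset currently resident on "GPU"

    def register(self, expert_id, weights):
        self.cpu_store[expert_id] = weights

    def load(self, expert_id):
        """Return an expert's weights, copying them into the GPU cache if absent."""
        if expert_id in self.gpu_cache:
            self.gpu_cache.move_to_end(expert_id)   # mark as recently used
        else:
            if len(self.gpu_cache) >= self.gpu_slots:
                self.gpu_cache.popitem(last=False)  # evict least-recently-used
            self.gpu_cache[expert_id] = self.cpu_store[expert_id]
        return self.gpu_cache[expert_id]

off = ExpertOffloader(gpu_slots=2)
for i in range(4):
    off.register(i, f"weights-{i}")
off.load(0); off.load(1); off.load(2)  # loading expert 2 evicts expert 0
print(sorted(off.gpu_cache))           # → [1, 2]
```

The trade-off is latency: each cache miss pays a CPU-to-GPU transfer, which is why only a subset of experts is kept resident.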
2,327 stars. No commits in the last 6 months.
Use this if you need to run Mixtral-8x7B models for inference but only have access to consumer-grade GPUs or cloud environments like Google Colab.
Not ideal if you already have access to high-end GPUs with ample memory for large language models, or if you need a command-line interface for local execution without coding.
Stars
2,327
Forks
234
Language
Python
License
MIT
Category
Last pushed
Apr 08, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/dvmazur/mixtral-offloading"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
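The same endpoint can be queried from Python. A minimal sketch, assuming only that the URL follows the pattern shown in the curl example above and that the response body is JSON:

```python
# Minimal sketch: build the quality-API URL shown above and fetch it.
# The URL pattern is taken from the curl example; the response schema
# is an assumption (JSON) and is not documented here.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, repo: str) -> str:
    """Build the API URL, e.g. quality_url('transformers', 'dvmazur/mixtral-offloading')."""
    return f"{BASE}/{ecosystem}/{repo}"

def fetch_quality(ecosystem: str, repo: str) -> dict:
    """Fetch and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(quality_url(ecosystem, repo)) as resp:
        return json.load(resp)

print(quality_url("transformers", "dvmazur/mixtral-offloading"))
```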
Higher-rated alternatives
mistralai/mistral-inference
Official inference library for Mistral models
open-compass/MixtralKit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
vicuna-tools/vicuna-installation-guide
The "vicuna-installation-guide" provides step-by-step instructions for installing and...
pleisto/yuren-13b
Yuren 13B is an information synthesis large language model that has been continuously trained...
hkproj/mistral-llm-notes
Notes on the Mistral AI model