laelhalawani/glai
glai - GGUF LLAMA AI - Package for simplified model handling and text generation with Llama models quantized to GGUF format. APIs for downloading and loading models automatically, includes a db with models of various scale and quantizations. With this high level API you need one line to load the model and one to generate text completions.
No commits in the last 6 months. Available on PyPI.
Stars
6
Forks
—
Language
Python
License
—
Category
Last pushed
Jan 14, 2024
Commits (30d)
0
Dependencies
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/laelhalawani/glai"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
ModelCloud/GPTQModel
LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD...
intel/auto-round
🎯An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality...
pytorch/ao
PyTorch native quantization and sparsity for training and inference
bodaay/HuggingFaceModelDownloader
Simple go utility to download HuggingFace Models and Datasets
NVIDIA/kvpress
LLM KV cache compression made easy