Xinference and PowerInfer

These are competitors: both provide local LLM inference engines with unified interfaces for running open-source models, though Xinference emphasizes multi-modal support and cloud/on-prem flexibility while PowerInfer focuses on speed optimization through GPU-CPU co-inference.

Scores

  Project      Overall  Status       Maintenance  Adoption  Maturity  Community
  inference    76       Verified     22/25        10/25     25/25     19/25
  PowerInfer   54       Established  10/25        10/25     16/25     18/25

Repository stats

                 inference    PowerInfer
  Stars          9,129        8,808
  Forks          805          501
  Downloads
  Commits (30d)  63           0
  Language       Python       C++
  License        Apache-2.0   MIT

Notes: no risk flags; no published package and no dependents listed.

About inference

xorbitsai/inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

This tool helps AI developers and researchers deploy and manage various artificial intelligence models, including large language models (LLMs), speech recognition, and multimodal models. It takes trained AI models and makes them accessible through a unified API, allowing other applications to easily interact with them. Anyone building AI-powered applications, from chatbots to image analysis tools, would use this to put their models into production.

Tags: AI-application-development, model-serving, LLM-deployment, speech-recognition-systems, multimodal-AI
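The "single line of code" claim above refers to Xinference exposing an OpenAI-compatible HTTP API, so existing GPT client code can be repointed at a local server. A minimal sketch of what such a request payload looks like, using only the standard library (the port, endpoint path, and model name below are assumptions for illustration, not verified Xinference defaults):

```python
import json

# Hypothetical local Xinference endpoint; Xinference serves an
# OpenAI-compatible API, so the payload shape follows the OpenAI
# chat-completions convention. Host, port, and model name are assumed.
XINFERENCE_URL = "http://localhost:9997/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for a local model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


payload = build_chat_request("llama-3-instruct", "Hello!")
print(json.dumps(payload, indent=2))
```

In practice the "one line" changed is typically just the client's base URL, switched from the hosted GPT endpoint to the local server.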

About PowerInfer

Tiiny-AI/PowerInfer

High-speed Large Language Model Serving for Local Deployment

PowerInfer helps you run large AI language models directly on your personal computer using a single consumer-grade graphics card, making them faster and more accessible. It takes a model file and your input, then rapidly generates responses, allowing individuals or small businesses to use powerful AI locally without needing expensive server hardware. This is ideal for researchers, developers, or anyone needing to run LLMs privately and quickly on their own machine.

Tags: AI-on-device, local-LLM-deployment, personal-AI, consumer-AI, edge-AI
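The GPU-CPU co-inference mentioned earlier rests on activation sparsity: frequently activated ("hot") neurons are kept on the GPU while rarely activated ("cold") ones run on the CPU. A toy sketch of that placement decision (the activation counts and GPU budget are made-up numbers; this is a conceptual illustration, not PowerInfer's actual code):

```python
def split_neurons(activation_counts: list[int], gpu_budget: int):
    """Partition neuron indices into hot (GPU) and cold (CPU) sets.

    Neurons are ranked by how often they fired in profiling runs; the
    top `gpu_budget` go to the GPU, the rest stay on the CPU.
    """
    order = sorted(range(len(activation_counts)),
                   key=lambda i: activation_counts[i], reverse=True)
    hot = set(order[:gpu_budget])
    cold = set(order[gpu_budget:])
    return hot, cold


# Hypothetical per-neuron activation counts from a profiling pass;
# only 2 neurons fit in the GPU budget.
hot, cold = split_neurons([50, 3, 47, 1, 22], gpu_budget=2)
```

The design point is that a small subset of neurons accounts for most activations, so keeping only that subset in GPU memory lets a consumer-grade card serve a model that would otherwise not fit.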

Scores are updated daily from GitHub, PyPI, and npm data.