rtp-llm and FastFlowLM
About rtp-llm
alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
This is a high-performance engine for serving large language models (LLMs) in real-world applications. It takes a trained LLM, including models with multimodal inputs such as images and text, and efficiently generates responses for large numbers of concurrent users. It is aimed at engineers and AI product managers who run LLM-powered services at scale, such as AI assistants or smart search features.
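The "large number of users" point is about concurrency: many independent prompts arrive at once and the serving layer fans them out to the engine. A minimal client-side sketch of that pattern is below; the `generate` function is a hypothetical stand-in, not rtp-llm's actual API, and a real deployment would forward each prompt to the engine's serving endpoint instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an inference call; in a real deployment this
# would send the prompt to the rtp-llm serving endpoint.
def generate(prompt: str) -> str:
    return f"response to: {prompt}"

def serve_many(prompts: list[str], max_workers: int = 8) -> list[str]:
    """Fan many user prompts out concurrently, as a serving layer would,
    returning responses in the same order as the input prompts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(generate, prompts))

replies = serve_many([f"user question {i}" for i in range(4)])
print(replies)
```

The engine itself applies further optimizations (batching requests together on the GPU, for example), but from the client's perspective the contract is this simple: many prompts in, many responses out.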
About FastFlowLM
FastFlowLM/FastFlowLM
Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs.
This tool runs large language models (LLMs) such as Llama, including models with vision and audio capabilities, directly on an AMD Ryzen AI NPU. You interact with the models through simple commands or an API and get responses without needing a powerful graphics card. It is designed for anyone who wants to run AI locally for tasks like content generation, data analysis, or creative work, especially on AMD NPU-equipped laptops or desktops.
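Since the project describes itself as Ollama-like, a local API interaction can be sketched in the common OpenAI-style chat-completions shape. The endpoint URL, port, and model tag below are assumptions for illustration, not confirmed FastFlowLM defaults; check the project's documentation for the actual values.

```python
import json

# Assumed local endpoint and model tag -- adjust to whatever your
# FastFlowLM installation actually exposes.
BASE_URL = "http://localhost:11434/v1/chat/completions"
MODEL = "llama3.2:1b"

def build_chat_request(prompt: str, model: str = MODEL) -> str:
    """Build an OpenAI-style chat-completions request body as JSON."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return json.dumps(payload)

body = build_chat_request("Summarize this quarter's sales data.")
# Actually sending the request requires a running local server, e.g.:
#   req = urllib.request.Request(BASE_URL, data=body.encode(),
#                                headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
print(body)
```

Because the request shape matches the widely used chat-completions format, existing OpenAI-compatible client libraries can typically be pointed at the local base URL unchanged.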