NetEase-Media/grps_trtllm
A higher-performance OpenAI-compatible LLM service than vLLM serve: a pure C++ implementation built on GRPS + TensorRT-LLM + Tokenizers.cpp, supporting chat and function calling, AI agents, distributed multi-GPU inference, multimodal models, and a Gradio chat interface.
This project helps large organizations and technology companies deploy high-performance large language models (LLMs) and multimodal models for internal and external applications. It accepts user prompts, images, and other inputs, runs them through the deployed models, and generates text, drives AI-agent workflows, and executes function calls. It is aimed at AI product managers, MLOps engineers, and technical leads who need to serve advanced AI capabilities with maximum efficiency.
Use this if you need to run large language model and multimodal AI services with greater speed and efficiency than existing solutions, especially for AI-agent or function-calling applications.
Not ideal if you are a single user or small team without significant GPU resources, as this project is designed for high-scale, production-grade AI inference.
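Because the service exposes an OpenAI-compatible API, a deployed instance can be queried with a standard chat-completions request. A minimal stdlib-only sketch follows; the host, port, and model name (`localhost:9997`, `"llm"`) are placeholders for your own deployment, not values defined by this listing.

```python
import json
import urllib.request


def build_chat_request(base_url, messages, model="llm"):
    """Build an OpenAI-style chat-completions request for a grps_trtllm server.

    Returns the endpoint URL and the JSON-encoded request body. The model
    name defaults to a hypothetical "llm"; use whatever your deployment serves.
    """
    url = f"{base_url}/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return url, body


def chat(base_url, messages, model="llm"):
    """POST the request to a running server and decode the JSON response."""
    url, body = build_chat_request(base_url, messages, model)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Splitting request construction from transport keeps the payload logic testable without a live GPU server; `chat()` is only a thin wrapper around `urllib`.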
Stars: 158
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 08, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NetEase-Media/grps_trtllm"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
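The same endpoint can be fetched from Python with the standard library. A minimal sketch; the URL structure is taken from the curl example above, and no assumptions are made about the JSON fields the API returns.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner, repo):
    """Build the per-repository quality endpoint shown in the curl example."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner, repo):
    """GET the endpoint and decode the JSON body (response schema not assumed)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.loads(resp.read())
```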
Higher-rated alternatives
hassancs91/SimplerLLM
Simplify interactions with Large Language Models
tylerelyt/LLM-Workshop
🌟 Learn Large Language Model development through hands-on projects and real-world implementations
avilum/minrlm
Token-efficient Recursive Language Model. 3.6x fewer tokens than vanilla LLMs. Data never enters...
kyegomez/SingLoRA
This repository provides a minimal, single-file implementation of SingLoRA (Single Matrix...
parvbhullar/superpilot
LLMs based multi-model framework for building AI apps.