PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
FastDeploy helps machine learning engineers and AI researchers deploy large language models (LLMs) and vision-language models (VLMs) efficiently. It takes trained PaddlePaddle models and optimizes them for high-performance inference, producing a production-ready serving solution. Use it when you need to serve advanced AI models such as ERNIE-4.5 or PaddleOCR-VL in real-world applications with speed and reliability.
3,659 stars. Actively maintained with 221 commits in the last 30 days.
Use this if you need to rapidly deploy and serve large language or vision-language models from the PaddlePaddle ecosystem with high performance and broad hardware compatibility.
Not ideal if your primary focus is training new models, or if you are not working with PaddlePaddle-based LLMs or VLMs.
Stars: 3,659
Forks: 720
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Commits (30d): 221
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/PaddlePaddle/FastDeploy"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
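The curl command above can also be scripted. The sketch below builds the endpoint URL shown in that command and parses a response body; note that the field names in the sample payload (`stars`, `forks`, `commits_30d`) are assumptions mirroring the stats on this page, not a documented schema.

```python
import json
import urllib.parse

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository.

    Each path segment is percent-encoded so unusual repo names
    cannot break the URL.
    """
    path = "/".join(
        urllib.parse.quote(part, safe="") for part in (ecosystem, owner, repo)
    )
    return f"{BASE}/{path}"

url = quality_url("transformers", "PaddlePaddle", "FastDeploy")
print(url)

# Illustrative response handling; the real schema may differ.
# This sample payload is hypothetical, based only on the numbers
# displayed on this page.
sample = json.loads('{"stars": 3659, "forks": 720, "commits_30d": 221}')
print(sample["stars"], sample["commits_30d"])
```

Fetching the URL with any HTTP client (curl, `urllib.request`, requests) and feeding the body to `json.loads` follows the same pattern.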
Related models
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips
AmpereComputingAI/ampere_model_library
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)