inclusionAI/asystem-awex
A high-performance RL training-inference weight synchronization framework, designed to propagate parameter updates from training to inference within seconds in RL workflows
This project helps machine learning engineers and researchers quickly update large-scale Reinforcement Learning (RL) models in production. It takes newly trained model parameters (weights) and synchronizes them with the models serving inference, so real-time applications always run the latest trained version. It's designed for anyone managing RL systems where rapid model iteration and deployment are critical for performance.
Use this if you need to update trillion-parameter RL models running in inference environments within seconds, ensuring minimal latency between training and deployment.
Not ideal if you are working with small models, non-RL workflows, or if your application does not require extremely low-latency model updates.
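The core idea, propagating a trainer's updated weights into a serving replica, can be sketched conceptually. This is a minimal illustration of the pattern, not awex's actual API; the `sync_weights` function and the dict-based state are hypothetical stand-ins for real tensor state dicts:

```python
# Conceptual sketch of training-to-inference weight synchronization.
# NOTE: an illustration of the general idea, NOT the awex API;
# `sync_weights` and the plain-dict "state dicts" are hypothetical.

def sync_weights(trained: dict, serving: dict) -> int:
    """Copy updated parameters from the trainer's state dict into the
    serving replica's state dict, in place. Returns how many changed."""
    updated = 0
    for name, value in trained.items():
        if serving.get(name) != value:
            serving[name] = value
            updated += 1
    return updated

# Toy example with scalar "weights" standing in for tensors.
trainer_state = {"layer1.w": 0.42, "layer1.b": 0.01, "layer2.w": -0.7}
serving_state = {"layer1.w": 0.40, "layer1.b": 0.01, "layer2.w": -0.7}

changed = sync_weights(trainer_state, serving_state)
print(changed)                          # 1
print(serving_state == trainer_state)   # True
```

In a real deployment the copy would move GPU tensors across processes or nodes, which is where a framework like awex earns its keep; the in-place update pattern is the same.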
Stars: 138
Forks: 12
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/inclusionAI/asystem-awex"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
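The same endpoint can be called from Python using only the standard library. This is a sketch: the URL is copied from the curl example above, the `quality_endpoint` helper is hypothetical, and the response schema is not documented here, so the JSON is parsed generically:

```python
# Fetch the same quality data from Python (stdlib only, no key needed).
# The URL path is taken verbatim from the curl example above;
# `quality_endpoint` is a hypothetical helper, not part of any SDK.
import json
from urllib.request import urlopen

def quality_endpoint(owner: str, repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair."""
    return f"https://pt-edge.onrender.com/api/v1/quality/transformers/{owner}/{repo}"

url = quality_endpoint("inclusionAI", "asystem-awex")
print(url)

# Uncomment to perform the actual request (counts against the daily quota):
# with urlopen(url) as resp:
#     data = json.load(resp)
#     print(data)
```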
Higher-rated alternatives
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...