inclusionAI/asystem-awex
A high-performance RL training-inference weight synchronization framework, designed to propagate parameter updates from training to inference within seconds in RL workflows
This project helps machine learning engineers and researchers quickly update large-scale Reinforcement Learning (RL) models in production. It takes newly trained model parameters (weights) and synchronizes them with the models serving inference, so real-time applications always run the latest trained version. It's designed for anyone managing RL systems where rapid model iteration and deployment are critical for performance.
Use this if you need to update trillion-parameter RL models running in inference environments within seconds, ensuring minimal latency between training and deployment.
Not ideal if you are working with small models, non-RL workflows, or if your application does not require extremely low-latency model updates.
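The core idea, propagating a trainer's updated weights into a serving replica, can be sketched conceptually. This is a minimal illustration of the pattern, not awex's actual API; the `sync_weights` function and the dict-based state are hypothetical stand-ins for real tensor state dicts:

```python
# Conceptual sketch of training-to-inference weight synchronization.
# NOTE: an illustration of the general idea, NOT the awex API;
# `sync_weights` and the plain-dict "state dicts" are hypothetical.

def sync_weights(trained: dict, serving: dict) -> int:
    """Copy updated parameters from the trainer's state dict into the
    serving replica's state dict, in place. Returns how many changed."""
    updated = 0
    for name, value in trained.items():
        if serving.get(name) != value:
            serving[name] = value
            updated += 1
    return updated

# Toy example with scalar "weights" standing in for tensors.
trainer_state = {"layer1.w": 0.42, "layer1.b": 0.01, "layer2.w": -0.7}
serving_state = {"layer1.w": 0.40, "layer1.b": 0.01, "layer2.w": -0.7}

changed = sync_weights(trainer_state, serving_state)
print(changed)                          # 1
print(serving_state == trainer_state)   # True
```

In a real deployment the copy would move GPU tensors across processes or nodes, which is where a framework like awex earns its keep; the in-place update pattern is the same.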
Stars: 138
Forks: 12
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/inclusionAI/asystem-awex"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
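The same endpoint can be called from Python using only the standard library. This is a sketch: the URL is copied from the curl example above, the `quality_endpoint` helper is hypothetical, and the response schema is not documented here, so the JSON is parsed generically:

```python
# Fetch the same quality data from Python (stdlib only, no key needed).
# The URL path is taken verbatim from the curl example above;
# `quality_endpoint` is a hypothetical helper, not part of any SDK.
import json
from urllib.request import urlopen

def quality_endpoint(owner: str, repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair."""
    return f"https://pt-edge.onrender.com/api/v1/quality/transformers/{owner}/{repo}"

url = quality_endpoint("inclusionAI", "asystem-awex")
print(url)

# Uncomment to perform the actual request (counts against the daily quota):
# with urlopen(url) as resp:
#     data = json.load(resp)
#     print(data)
```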
Higher-rated alternatives
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...