ray_vllm_inference and ray-llm
The former is a specific implementation that pairs vLLM with Ray Serve for scalable inference, while the latter was a broader, now-archived project for running LLMs on Ray. They are ecosystem siblings: both live in the Ray ecosystem, and one may have built on or leveraged components from the other.
About ray_vllm_inference
asprenger/ray_vllm_inference
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
This service helps developers serve large language models (LLMs) quickly and efficiently. It takes an LLM from Hugging Face and serves it as an API endpoint that returns generated text for a given prompt. It is aimed at machine learning engineers and MLOps teams who need to deploy LLMs for applications requiring high throughput and responsiveness.
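As a rough illustration of how a client might talk to such a service, the sketch below builds an HTTP request for a text-generation endpoint using only the Python standard library. The endpoint path (`/generate`), port, and JSON field names (`prompt`, `max_tokens`) are assumptions based on common vLLM/Ray Serve examples, not the repository's confirmed API; check the project README for the actual interface.

```python
import json
from urllib import request


def build_generate_request(prompt: str, max_tokens: int = 128,
                           url: str = "http://localhost:8000/generate") -> request.Request:
    """Build a POST request for a text-generation endpoint.

    NOTE: the URL path and payload fields here are illustrative
    assumptions; the real service may use different names.
    """
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Construct (but do not send) a request; sending it would require
# the service to be running locally.
req = build_generate_request("What is Ray Serve?")
print(req.get_full_url())
```

To actually call a running deployment you would pass `req` to `urllib.request.urlopen` and read the JSON response body.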
About ray-llm
ray-project/ray-llm
RayLLM - LLMs on Ray (Archived). Read README for more info.