ray_vllm_inference and ray-llm

ray_vllm_inference is a specific implementation that runs vLLM on Ray Serve for scalable inference, while ray-llm was a broader, now-archived framework for running LLMs on Ray. The two are ecosystem siblings: both belong to the Ray ecosystem, and one may have built on or reused components from the other.

                 ray_vllm_inference            ray-llm
Overall score    39 (Emerging)                 36 (Emerging)
Maintenance      0/25                          0/25
Adoption         9/25                          10/25
Maturity         16/25                         8/25
Community        14/25                         18/25
Stars            78                            1,267
Forks            11                            91
Downloads        n/a                           n/a
Commits (30d)    0                             0
Language         Python                        n/a
License          Apache-2.0                    n/a
Status flags     Stale 6m, No Package,         Archived, No License, Stale 6m,
                 No Dependents                 No Package, No Dependents

About ray_vllm_inference

asprenger/ray_vllm_inference

A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.

This service helps developers serve large language models (LLMs) quickly and efficiently: it takes an LLM from Hugging Face and exposes it as an API endpoint that returns generated text for a given prompt. It is aimed at machine learning engineers and MLOps teams who need to deploy LLMs for applications that require high throughput and responsiveness.
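The sketch below is not the repository's actual code: the deployment class, model name, request schema, and port are illustrative assumptions. It only shows the general shape of the pattern the project describes, wrapping a vLLM engine in a Ray Serve deployment:

```python
# Minimal sketch of serving a Hugging Face model with vLLM behind Ray Serve.
# The deployment name, model, and request schema are illustrative assumptions;
# see the repository's README for its actual interface.
from ray import serve
from starlette.requests import Request
from vllm import LLM, SamplingParams


@serve.deployment(ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    def __init__(self, model_name: str):
        # vLLM loads the model once per replica and handles GPU batching.
        self.llm = LLM(model=model_name)

    async def __call__(self, request: Request) -> dict:
        body = await request.json()
        params = SamplingParams(max_tokens=body.get("max_tokens", 64))
        # llm.generate is synchronous here; a production service would use
        # vLLM's async engine instead of blocking the event loop.
        outputs = self.llm.generate([body["prompt"]], params)
        return {"text": outputs[0].outputs[0].text}


app = VLLMDeployment.bind(model_name="facebook/opt-125m")
serve.run(app)  # serves on http://localhost:8000/ while the process is alive
```

A client would then POST a prompt to the endpoint. Assuming the hypothetical request shape from the sketch above:

```python
import requests

resp = requests.post(
    "http://localhost:8000/",
    json={"prompt": "Explain Ray Serve in one sentence.", "max_tokens": 64},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["text"])
```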

Tags: LLM deployment, MLOps, AI infrastructure, API serving, scalable inference

About ray-llm

ray-project/ray-llm

RayLLM - LLMs on Ray (Archived). Read README for more info.

Scores are updated daily from GitHub, PyPI, and npm data.