jdaln/dgx-spark-inference-stack

Serve the home! Inference stack for your Nvidia DGX Spark, aka the Grace Blackwell AI supercomputer on your desk. Mostly vLLM-based for now.

Score: 38 / 100 (Emerging)

This project helps owners of an Nvidia DGX Spark AI supercomputer run and serve large language models (LLMs) efficiently on the device. It takes LLM model files as input and exposes an API endpoint for text completions and chat interactions. It is aimed at AI researchers, enthusiasts, and small businesses with a DGX Spark who want to use its hardware for local AI inference.

Use this if you own an Nvidia DGX Spark and want to easily deploy and manage large language models for local inference, getting the most out of your hardware.

Not ideal if you do not have an Nvidia DGX Spark or are looking for a cloud-based LLM serving solution.
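
Because the stack is described as mostly vLLM-based, the served endpoint is likely OpenAI-compatible. The following is a minimal sketch, assuming vLLM's standard server listening on localhost port 8000; the model name is a placeholder, not one shipped by this project:

# Request a chat completion from the locally served model
# (OpenAI-compatible API; port and model name are assumptions)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Hello from my DGX Spark!"}]
  }'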

Tags: AI-inference, large-language-models, home-lab, AI-supercomputer-management, local-AI-deployment
No package · No dependents
Maintenance: 10 / 25
Adoption: 7 / 25
Maturity: 11 / 25
Community: 10 / 25


Stars: 26
Forks: 3
Language: JavaScript
License: Apache-2.0
Last pushed: Feb 24, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jdaln/dgx-spark-inference-stack"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
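
To work with the response programmatically, you can pipe it through jq. A minimal sketch, assuming the endpoint returns JSON; the .score field name is a guess and may differ in the actual payload:

# Pretty-print the full JSON response
curl -s "https://pt-edge.onrender.com/api/v1/quality/transformers/jdaln/dgx-spark-inference-stack" | jq .

# Extract a single field (the ".score" key is hypothetical)
curl -s "https://pt-edge.onrender.com/api/v1/quality/transformers/jdaln/dgx-spark-inference-stack" | jq '.score'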