robert-mcdermott/ollama-batch-cluster
Large Scale Batch Processing with Ollama
This project lets you process a large number of text prompts with an LLM across multiple Ollama servers and GPUs, far faster than a single instance. You provide a file containing many prompts, and the system returns individual response files or a combined output. It is aimed at researchers, data analysts, and content creators who need to generate many LLM responses efficiently.
No commits in the last 6 months.
Use this if you need to run a large batch of prompts through an Ollama-hosted LLM and want multiple GPUs or servers to speed up the process significantly.
Not ideal if you only have a few prompts to process, or if you are not comfortable setting up and managing multiple Ollama instances and GPU configurations.
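The fan-out described above (one prompt file split across multiple Ollama servers) can be sketched roughly as follows. The host URLs and the `round_robin` helper are hypothetical illustrations, not the project's actual code; only Ollama's documented `/api/generate` endpoint is assumed.

```python
# Minimal sketch of round-robin prompt distribution across Ollama hosts.
# Hosts and prompts below are made-up examples; in the real project the
# prompts would come from the input file and the hosts from configuration.
from itertools import cycle


def round_robin(prompts, hosts):
    """Pair each prompt with a host in turn, returning (host, prompt) tuples."""
    return list(zip(cycle(hosts), prompts))


hosts = ["http://gpu-node-1:11434", "http://gpu-node-2:11434"]  # hypothetical
prompts = ["Summarize A", "Summarize B", "Summarize C"]

assignments = round_robin(prompts, hosts)
# Each pair would then be sent as an HTTP POST to f"{host}/api/generate"
# with a JSON body such as {"model": "llama3", "prompt": prompt}, and the
# per-prompt responses collected into individual or combined output files.
```

Because the requests are independent, each host can work through its share of the prompts in parallel, which is where the speedup over a single server comes from.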
Stars: 30
Forks: 8
Language: Python
License: —
Category:
Last pushed: Nov 25, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/robert-mcdermott/ollama-batch-cluster"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
anmolg1997/Multi-LoRA-Serve
Multi-adapter inference gateway — one base model, many LoRA adapters per-request,...
kimmmmyy223/llm-batch
🚀 Process JSON data in batches with `llm-batch`, leveraging sequential or parallel modes for...
Rohit2sali/vllm-multi-tenant-llm-gateway
A vLLM multi-tenant large language model gateway. The system is built to serve a lot of...