datallmhub/ragctl
A powerful CLI tool to manage, test, and optimize RAG pipelines. Streamline your Retrieval-Augmented Generation workflows from the terminal.
This tool helps AI engineers and developers prepare documents such as PDFs, Word files, and images for use in Retrieval-Augmented Generation (RAG) applications. It takes raw documents, extracts their text (using OCR for scanned pages and images), splits the text into meaningful chunks, and exports the chunks in formats like JSON or loads them directly into a vector store. This streamlines the data preparation step of building a RAG system.
Available on PyPI.
Use this if you need a robust, command-line solution to process a wide variety of documents, including scanned ones, into semantically meaningful chunks ready for your RAG pipeline or vector database.
Not ideal if you need a graphical user interface for document processing or are not working with RAG systems that require text chunking.
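To make the chunk-and-export step described above concrete, here is a minimal sketch in plain Python. It is not ragctl's code or API: the function names `chunk_text` and `export_json` are hypothetical, and it uses simple fixed-size character windows with overlap rather than the semantic chunking the tool advertises. It only illustrates the general shape of turning extracted text into JSON-ready chunks.

```python
import json


def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[dict]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    step = max_chars - overlap  # how far each new window advances
    for start in range(0, len(text), step):
        piece = text[start:start + max_chars]
        chunks.append({"id": len(chunks), "start": start, "text": piece})
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
    return chunks


def export_json(chunks: list[dict]) -> str:
    """Serialize chunks to a JSON string (hypothetical helper)."""
    return json.dumps(chunks, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    sample = "Retrieval-Augmented Generation pairs a retriever with a generator. " * 5
    print(export_json(chunk_text(sample)))
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which is a common reason to prefer overlapping windows over hard cuts.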
Stars: 18
Forks: 7
Language: Python
License: MIT
Category
Last pushed: Jan 12, 2026
Commits (30d): 0
Dependencies: 31
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/datallmhub/ragctl"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
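The same request can be made from Python using only the standard library. This is a sketch under assumptions: the endpoint URL comes from the curl command above, but the response is assumed to be JSON, its fields are not documented here, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; counts against the daily limit.
    print(json.dumps(fetch_quality("datallmhub", "ragctl"), indent=2))
```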
Related tools
Bessouat40/RAGLight
RAGLight is a modular framework for Retrieval-Augmented Generation (RAG). It makes it easy to...
superagent-ai/super-rag
Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters...
feld-m/rag_blueprint
A modular framework for building and deploying Retrieval-Augmented Generation (RAG) systems with...
McKern3l/RAGdrag
RAG pipeline security testing toolkit - 27 techniques across 6 kill chain phases, mapped to MITRE ATLAS
mburaksayici/RAG-Boilerplate
RAG boilerplate with semantic/propositional chunking, hybrid search (BM25 + dense), LLM...