datallmhub/ragctl

A powerful CLI tool to manage, test, and optimize RAG pipelines. Streamline your Retrieval-Augmented Generation workflows from the terminal.

50 / 100 (Established)

This tool helps AI engineers and developers prepare documents such as PDFs, Word files, and images for use in Retrieval-Augmented Generation (RAG) applications. It takes raw documents, extracts their text with OCR, splits the text into semantically meaningful chunks, and exports the chunks in formats such as JSON or loads them directly into a vector store. This covers the data preparation step of building a RAG system.
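The chunk-and-export step described above can be illustrated with a short sketch. This is not ragctl's implementation or API. It swaps in plain fixed-size character windows with overlap in place of semantic chunking, and the names `chunk_text` and `export_json` and the `id`/`start`/`text` fields are hypothetical, chosen only for illustration.

```python
import json


def chunk_text(text, max_chars=200, overlap=40):
    """Split text into overlapping fixed-size character windows.

    Hypothetical helper: a simple stand-in for semantic chunking,
    not the strategy ragctl itself uses.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")

    step = max_chars - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + max_chars]
        chunks.append({"id": len(chunks), "start": start, "text": piece})
        # Stop once this window already reaches the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


def export_json(chunks):
    """Serialize chunks to a JSON string (assumed export shape)."""
    return json.dumps(chunks, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    sample = "Retrieval-Augmented Generation pairs a retriever with a generator. " * 8
    print(export_json(chunk_text(sample, max_chars=120, overlap=20)))
```

The overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.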

Available on PyPI.

Use this if you need a robust, command-line solution to process a wide variety of documents, including scanned ones, into semantically meaningful chunks ready for your RAG pipeline or vector database.

Not ideal if you need a graphical user interface for document processing or are not working with RAG systems that require text chunking.

AI-engineering NLP-data-prep document-processing RAG-application-development vector-database-ingestion
Maintenance 6 / 25
Adoption 6 / 25
Maturity 22 / 25
Community 16 / 25

Stars: 18
Forks: 7
Language: Python
License: MIT
Last pushed: Jan 12, 2026
Commits (30d): 0
Dependencies: 31

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/datallmhub/ragctl"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
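The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path follows a `category/owner/repo` pattern (inferred from the single URL above) and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the path layout beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Unauthenticated calls count against the 100 requests/day limit.
    print(fetch_quality("rag", "datallmhub", "ragctl"))
```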