GiovanniPasq/chunky
Validate, visualize, edit, and export chunks for RAG pipelines.
This tool helps AI engineers and data scientists build more reliable Retrieval-Augmented Generation (RAG) applications by checking the quality of source documents before they are indexed. It takes PDFs as input and produces validated Markdown and structured chunks ready to load into a vector database. It's designed for anyone setting up RAG pipelines who needs to visually inspect and refine their document processing.
Use this if you are building RAG applications and frequently run into document conversion errors or poor chunking that degrades the quality of AI responses.
Not ideal if you want a fully automated, hands-off RAG solution with no visual inspection or manual adjustment of document processing steps.
Stars: 17
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/GiovanniPasq/chunky"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
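The same request can be made from Python using only the standard library. This is a minimal sketch under stated assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category` parameter name is a guess at what the `rag` path segment means, and the response is assumed to be JSON, which the page does not confirm. Only the URL itself comes from the curl command above.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    `category` is an assumed name for the path segment that is `rag`
    in the curl command; the page does not document it.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send an unauthenticated GET and decode the body.

    Assumption: the endpoint returns JSON. The response schema is not
    documented on the page, so the decoded value is returned as-is.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("rag", "GiovanniPasq", "chunky")` performs the network request and counts against the keyless daily limit of 100 requests, so caching the result locally is worth considering.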
Compare
Higher-rated alternatives
chonkie-inc/chonkie
🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust...
speedyk-005/chunklet-py
One library to split them all: Sentence, Code, Docs. Chunk smarter, not harder — built for LLMs,...
jchunk-io/jchunk
JChunk is a lightweight and flexible library designed to provide multiple strategies for text...
andreshere00/Splitter_MR
Chunk your data into markdown text blocks for your LLM applications
chonkie-inc/chonkiejs
🦛 CHONK your texts with Chonkie ✨ Type-friendly, light-weight, fast and super-simple chunking library