chonkie and chunky

These are complementary tools: Chonkie handles the core chunking and ingestion for RAG pipelines, while Chunky provides validation, visualization, and editing capabilities for inspecting and refining the chunks that Chonkie produces.

                 chonkie          chunky
Score            80               32
Status           Verified         Emerging
Maintenance      22/25            10/25
Adoption         15/25            6/25
Maturity         25/25            11/25
Community        18/25            5/25
Stars            3,829            17
Forks            256              1
Downloads        (not listed)     (not listed)
Commits (30d)    82               0
Language         Python           Python
License          MIT              MIT
Risk flags       No risk flags    No Package, No Dependents

About chonkie

chonkie-inc/chonkie

🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines

This is a lightweight tool for developers building Retrieval-Augmented Generation (RAG) applications. It takes various forms of text data, processes it by intelligently splitting it into smaller, meaningful parts (chunks), and then refines and embeds these chunks. The output is optimized text chunks ready to be stored in a vector database for efficient retrieval by large language models.

RAG development, LLM application development, text preprocessing, vector database integration, AI application engineering
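
To make the "splitting into chunks" step concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is not Chonkie's API: the `Chunk` class, the `chunk_words` function, and the `chunk_size` and `overlap` parameters are hypothetical names chosen for illustration, and it counts whitespace-separated words where a real library would more likely count tokens.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    start: int  # index of the first word of this chunk in the source
    end: int    # index one past the last word of this chunk


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[Chunk]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        end = min(start + chunk_size, len(words))
        chunks.append(Chunk(" ".join(words[start:end]), start, end))
        if end == len(words):
            break  # this window already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some words twice. Each `Chunk.text` would then be passed to an embedding model and written to a vector database.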

About chunky

GiovanniPasq/chunky

Validate, visualize, edit, and export chunks for RAG pipelines.

This tool helps AI engineers and data scientists build more reliable Retrieval-Augmented Generation (RAG) applications by checking the quality of the processed source documents. It takes PDFs as input and produces validated Markdown and structured chunks ready for a vector database. It is aimed at anyone setting up a RAG pipeline who needs to visually inspect and refine the output of their document processing.

RAG-pipeline-development document-processing data-preparation LLM-engineering AI-application-development
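
The kind of check that chunk validation involves can be sketched in a few lines of Python. This is an assumption about what "validate" might cover, not Chunky's implementation: `validate_chunks`, its thresholds, and the problem labels are all hypothetical, and Chunky itself is described as a visual inspection tool, not a function like this.

```python
from __future__ import annotations


def validate_chunks(
    chunks: list[str], min_chars: int = 20, max_chars: int = 2000
) -> list[tuple[int, str]]:
    """Return (index, problem) pairs for chunks that look suspicious."""
    problems = []
    seen = set()  # stripped text of every non-empty chunk seen so far
    for i, chunk in enumerate(chunks):
        stripped = chunk.strip()
        if not stripped:
            problems.append((i, "empty"))
            continue
        if len(stripped) < min_chars:
            problems.append((i, "too short"))
        if len(stripped) > max_chars:
            problems.append((i, "too long"))
        if stripped in seen:
            problems.append((i, "duplicate"))
        seen.add(stripped)
    return problems
```

A report like this only narrows down which chunks to look at; deciding whether a short chunk is a heading worth keeping or extraction debris still takes a human looking at the document.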

Scores updated daily from GitHub, PyPI, and npm data.