chonkie and chunking-strategies

A production-ready chunking library and a research overview repository are **complements**: the research overview can inform design and benchmarking decisions for the library, while practitioners using the library can consult the overview to understand the algorithmic tradeoffs behind their chosen chunking strategy.

| | chonkie | chunking-strategies |
|---|---|---|
| Status | Verified | Emerging |
| Maintenance | 22/25 | 0/25 |
| Adoption | 15/25 | 9/25 |
| Maturity | 25/25 | 8/25 |
| Community | 18/25 | 19/25 |
| Stars | 3,829 | 85 |
| Forks | 256 | 18 |
| Downloads | n/a | n/a |
| Commits (30d) | 82 | 0 |
| Language | Python | Jupyter Notebook |
| License | MIT | n/a |
| Risk flags | No risk flags | No License, Stale 6m, No Package, No Dependents |

About chonkie

chonkie-inc/chonkie

🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines

This is a lightweight tool for developers building Retrieval-Augmented Generation (RAG) applications. It takes various forms of text data, processes it by intelligently splitting it into smaller, meaningful parts (chunks), and then refines and embeds these chunks. The output is optimized text chunks ready to be stored in a vector database for efficient retrieval by large language models.

Topics: RAG development, LLM application development, text preprocessing, vector database integration, AI application engineering
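
To make the splitting step in the chonkie description concrete, here is a minimal sketch of fixed-size chunking with overlap, written in plain Python. It is an illustration only and does not use or reproduce Chonkie's API; the names `Chunk` and `chunk_words` and the parameters `chunk_size` and `overlap` are hypothetical, and words stand in for tokens.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    start_word: int  # index of the first word in the source document
    end_word: int    # index one past the last word


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[Chunk]:
    """Split text into windows of at most chunk_size words.

    Consecutive windows share `overlap` words. This is a hypothetical
    helper for illustration, not a function from any library.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        end = min(start + chunk_size, len(words))
        chunks.append(Chunk(" ".join(words[start:end]), start, end))
        if end == len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

In a full RAG pipeline, each `Chunk.text` would then be passed to an embedding model and stored in a vector database. The overlap trades extra storage for a lower chance of cutting a relevant passage in half.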

About chunking-strategies

ALucek/chunking-strategies

An Overview of the Latest Document Chunking Research

This project helps you prepare large text documents for use with AI systems like chatbots or question-answering tools. It takes your raw, unstructured text and breaks it down into smaller, optimized pieces that improve how accurately the AI can understand and respond to your queries. Anyone building or managing RAG (Retrieval Augmented Generation) applications, from content managers to data scientists, would find this useful.

Topics: AI-application-development, natural-language-processing, text-retrieval, knowledge-management, generative-AI
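
One tradeoff that chunking research tends to examine is whether chunk boundaries should respect sentence structure. The sketch below greedily packs whole sentences into chunks under a character budget. It is not taken from the chunking-strategies repository; the function name `chunk_sentences`, the `max_chars` parameter, and the naive regex sentence splitter are all assumptions made for illustration.

```python
import re


def chunk_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration. A sentence longer than max_chars
    is kept whole as its own chunk rather than being cut.
    """
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            # Either the sentence fits, or it starts a new chunk on its own.
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Compared with fixed-size windows, this keeps each chunk readable on its own, at the cost of uneven chunk lengths and a dependence on how well the sentence splitter handles abbreviations and similar edge cases.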

Related comparisons

chonkie and chunklet-py, chonkie and jchunk, chonkie and chonkiejs, chonkie and chonkify, chonkie and rag-chunk, chonkie and SmartChunk

Scores updated daily from GitHub, PyPI, and npm data.