Document Chunking NLP Tools
Five document chunking tools are tracked; one scores above 50 (Established tier). The highest-rated is mirth/chonky at 51/100 with 407 stars.
Get all 5 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=document-chunking&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
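The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented here, the code only parses the JSON and makes no assumptions about its fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="document-chunking", limit=20):
    """Assemble the query URL (hypothetical helper, mirrors the curl call)."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and return the parsed JSON body, whatever its shape."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request and returns the decoded JSON; inspect the result before relying on any particular key, and keep the 100 requests/day limit in mind when running it without a key.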
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | mirth/chonky - Fully neural approach for text chunking | | Established |
| 2 | sentencizer/sentencizer - A sentence splitting (sentence boundary disambiguation) library for Go. It... | | Emerging |
| 3 | jackfsuia/bert-chunker - bert-chunker: efficient and trained chunking for unstructured documents. ... | | Emerging |
| 4 | prajwal10001/semantic-chunker-langchain - Token-aware, LangChain-compatible semantic chunker with PDF, markdown, and... | | Experimental |
| 5 | bgokden/fast-text-splitter - fast text splitter with onnx | | Experimental |