smart-models/Normalized-Semantic-Chunker

A semantic chunking tool that splits documents into coherent, token-normalized segments

Score: 39 / 100 (Emerging)

This tool helps knowledge managers and AI engineers prepare long documents for large language models (LLMs) and retrieval systems. You input raw text, Markdown, or JSON files, and it produces semantically coherent document segments. These segments are optimized for consistent token counts, preventing issues like context window overflow in LLMs.
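The repository's exact algorithm isn't shown here, but the general technique it describes, splitting on sentence boundaries, then starting a new chunk when either a token budget is exceeded or semantic similarity drops, can be sketched as follows. This is a minimal illustration, not the project's implementation: `toy_embed` is a stand-in bag-of-letters vector for a real sentence-embedding model, and `count_tokens` is a crude whitespace count rather than a model tokenizer.

```python
import math
import re

def toy_embed(sentence):
    # Stand-in for a real sentence-embedding model: normalized bag-of-letters vector.
    vec = [0.0] * 26
    for ch in sentence.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Both vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def count_tokens(text):
    # Crude whitespace token count; a real chunker would use the LLM's tokenizer.
    return len(text.split())

def semantic_chunks(text, max_tokens=50, min_similarity=0.8):
    # Split on sentence boundaries, then grow a chunk until it would exceed
    # the token budget or the next sentence drifts semantically.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sent in sentences:
        if current:
            over_budget = count_tokens(" ".join(current + [sent])) > max_tokens
            drifted = cosine(toy_embed(" ".join(current)), toy_embed(sent)) < min_similarity
            if over_budget or drifted:
                chunks.append(" ".join(current))
                current = []
        current.append(sent)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Keeping every chunk under a fixed token budget is what prevents context-window overflow downstream; the similarity threshold is what keeps each chunk topically coherent.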

No commits in the last 6 months.

Use this if you need to precisely control the token size of text chunks while maintaining their semantic meaning, especially for RAG pipelines or other token-sensitive NLP applications.

Not ideal if you only need basic text splitting without concern for semantic coherence or precise control over chunk token counts.

knowledge-management text-processing retrieval-augmented-generation natural-language-processing large-language-models
Flags: Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 15 / 25


Stars: 21
Forks: 5
Language: Python
License: GPL-3.0
Last pushed: Sep 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/smart-models/Normalized-Semantic-Chunker"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
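The same endpoint can be called from Python with only the standard library. A minimal sketch, assuming the URL pattern shown in the curl example above; the response schema is not documented here, so the JSON is returned as-is:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def quality_url(owner, repo):
    # Build the per-repository quality endpoint URL from the pattern above.
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner, repo, timeout=10):
    # Fetch and decode the JSON quality report (schema not documented here).
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Usage: `fetch_quality("smart-models", "Normalized-Semantic-Chunker")` returns the same data as the curl command; no API key is needed within the free 100 requests/day tier.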