jparkerweb/semantic-chunking
🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows
When preparing long documents for AI models, it's crucial to break them into smaller, meaningful pieces. This tool takes your raw text documents and automatically splits them into semantically coherent chunks, making the input more digestible and effective for large language models. This is ideal for anyone working with AI applications that process extensive text, like researchers analyzing scientific papers or content strategists summarizing articles.
134 stars. Used by 1 other package. Available on npm.
Use this if you need to intelligently divide large text documents into smaller, related sections for better performance in AI models.
Not ideal if your primary goal is simple, fixed-length text splitting without considering the meaning or context of sentences.
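To illustrate the general idea behind semantic chunking (this is a minimal sketch of the technique, not this library's actual API or embedding model), the text is split into sentences, each sentence gets a vector, and a new chunk starts whenever the similarity between neighbouring sentences drops below a threshold. A toy bag-of-words vector stands in here for a real embedding:

```javascript
// Toy "embedding": word-frequency vector for a sentence.
function embed(sentence) {
  const counts = {};
  for (const word of sentence.toLowerCase().match(/[a-z']+/g) ?? []) {
    counts[word] = (counts[word] ?? 0) + 1;
  }
  return counts;
}

// Cosine similarity between two sparse word-count vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const k in a) { na += a[k] * a[k]; if (k in b) dot += a[k] * b[k]; }
  for (const k in b) nb += b[k] * b[k];
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Group sentences into chunks; split where adjacent-sentence similarity
// falls below `threshold` (an illustrative default, not the library's).
function semanticChunks(text, threshold = 0.2) {
  const sentences = text.match(/[^.!?]+[.!?]+/g)?.map((s) => s.trim()) ?? [];
  const chunks = [];
  let current = [];
  let prev = null;
  for (const sentence of sentences) {
    const vec = embed(sentence);
    if (prev && cosine(prev, vec) < threshold && current.length) {
      chunks.push(current.join(" "));
      current = [];
    }
    current.push(sentence);
    prev = vec;
  }
  if (current.length) chunks.push(current.join(" "));
  return chunks;
}
```

A real implementation would replace `embed` with a transformer embedding model; the control flow (embed, compare neighbours, split on low similarity) is the part this sketch demonstrates.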
Stars: 134
Forks: 14
Language: JavaScript
License: MIT
Category:
Last pushed: Feb 03, 2026
Commits (30d): 0
Dependencies: 5
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/jparkerweb/semantic-chunking"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
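The same endpoint can be called programmatically. The sketch below only constructs the URL shown in the curl example above; the response's JSON shape is not documented here, so the fetch itself is left as a commented usage example:

```javascript
// Endpoint base taken from the curl example above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings";

// Build the per-repository URL (helper name is illustrative, not part of the API).
function buildApiUrl(owner, repo) {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Usage (network call; logs whatever JSON the service returns):
// fetch(buildApiUrl("jparkerweb", "semantic-chunking"))
//   .then((res) => res.json())
//   .then((data) => console.log(data));
```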
Related tools
drittich/SemanticSlicer
🧠✂️ SemanticSlicer — A smart text chunker for LLM-ready documents.
smart-models/Normalized-Semantic-Chunker
Cutting-edge tool that unlocks the full potential of semantic chunking
ndgigliotti/afterthoughts
Sentence-aware embeddings using late chunking with transformers.
ReemHal/Semantic-Text-Segmentation-with-Embeddings
Uses GloVe embeddings and greedy sequence segmentation to semantically segment a text document...
agamm/semantic-split
A Python library to chunk/group your texts based on semantic similarity.