moeki0/baran

Text Splitter for Large Language Model (LLM) datasets.

Experimental

When preparing long documents or articles for use with AI models, you often need to break them into smaller pieces that fit within token limits and improve search accuracy. This tool takes raw text (such as a research paper, book chapter, or website content) and splits it into smaller chunks while preserving relevant context across the breaks. It is designed for developers who build applications on large language models and need to preprocess text data.
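
One common way to keep context across chunk boundaries is to let consecutive chunks overlap by a few characters. The sketch below illustrates that general idea in plain Ruby. It is not taken from baran's source, and the method name `split_with_overlap` and its `chunk_size` and `overlap` parameters are hypothetical names chosen for illustration.

```ruby
# Hypothetical helper, not part of baran's API: split text into
# fixed-size character chunks, where each chunk repeats the last
# `overlap` characters of the previous chunk.
def split_with_overlap(text, chunk_size: 200, overlap: 20)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size.positive?
  raise ArgumentError, "overlap must be in 0...chunk_size" unless (0...chunk_size).cover?(overlap)

  step = chunk_size - overlap # how far the window advances each time
  chunks = []
  start = 0
  while start < text.length
    chunks << text[start, chunk_size]
    # Stop once the current window has reached the end of the text.
    break if start + chunk_size >= text.length
    start += step
  end
  chunks
end

chunks = split_with_overlap("abcdefghij", chunk_size: 4, overlap: 1)
# => ["abcd", "defg", "ghij"]
```

A character-based window is the simplest variant. A real splitter would more likely break on separators such as paragraphs or sentences, so that chunks do not cut words in half.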

No commits in the last 6 months.

Use this if you are a developer building an application that needs to break down lengthy text documents into smaller, context-aware segments for large language models, vector databases, or information retrieval systems.

Not ideal if you need a non-programmatic, visual tool for document splitting or if your primary need is simple text extraction without context preservation for AI models.

AI application development, text preprocessing, natural language processing, information retrieval, data engineering

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 5 / 25

Language: Ruby

License: MIT

Higher-rated alternatives

microsoft/multilspy

multilspy is an LSP client library in Python intended to be used to build applications around...

mlc-ai/xgrammar

Fast, Flexible and Portable Structured Generation

vicentereig/dspy.rb

The Ruby framework for programming—rather than prompting—language models.

feenkcom/gt4llm

A GT package for working with LLMs

Evref-BL/Pharo-LLMAPI

Use LLM API from Pharo
