moeki0/baran
Text Splitter for Large Language Model (LLM) datasets.
When preparing long documents or articles for use with AI models, it's often necessary to break them into smaller, manageable pieces to fit within token limits and improve search accuracy. This tool takes your raw text (like a research paper, book chapter, or website content) and splits it into smaller chunks, ensuring relevant context is maintained across the breaks. It's designed for developers building applications that interact with large language models, helping them preprocess text data effectively.
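The idea of carrying context across chunk boundaries can be illustrated with a minimal sketch of overlapping, fixed-size chunking. This is not Baran's implementation or API: `split_with_overlap` and its `chunk_size` and `overlap` parameters are hypothetical names chosen for illustration, and the sketch counts characters rather than tokens.

```ruby
# Hypothetical helper for illustration only; not part of Baran's API.
# Splits text into character chunks of at most `chunk_size`, where each
# chunk repeats the last `overlap` characters of the previous one.
def split_with_overlap(text, chunk_size: 20, overlap: 5)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size.positive?
  raise ArgumentError, "overlap must be in 0...chunk_size" unless (0...chunk_size).cover?(overlap)

  step = chunk_size - overlap # how far the window advances each time
  chunks = []
  cursor = 0
  while cursor < text.length
    # Record the starting offset so a chunk can be traced back to the source.
    chunks << { cursor: cursor, text: text[cursor, chunk_size] }
    break if cursor + chunk_size >= text.length # this window reached the end
    cursor += step
  end
  chunks
end

sample = "The quick brown fox jumps over the lazy dog."
chunks = split_with_overlap(sample, chunk_size: 20, overlap: 5)
chunks.each { |c| puts "#{c[:cursor]}: #{c[:text].inspect}" }
```

A larger overlap repeats more text across neighboring chunks, which tends to help retrieval at the cost of more chunks to store and embed. A real splitter would usually also prefer to break on separators such as paragraph or sentence boundaries rather than at arbitrary character offsets.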
No commits in the last 6 months.
Use this if you are a developer building an application that needs to break down lengthy text documents into smaller, context-aware segments for large language models, vector databases, or information retrieval systems.
Not ideal if you need a non-programmatic, visual tool for document splitting or if your primary need is simple text extraction without context preservation for AI models.
Stars: 19
Forks: 1
Language: Ruby
License: MIT
Category:
Last pushed: May 31, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/moeki0/baran"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
microsoft/multilspy
multilspy is an LSP client library in Python intended to be used to build applications around...
mlc-ai/xgrammar
Fast, Flexible and Portable Structured Generation
vicentereig/dspy.rb
The Ruby framework for programming—rather than prompting—language models.
feenkcom/gt4llm
A GT package for working with LLMs
Evref-BL/Pharo-LLMAPI
Use LLM API from Pharo