scurkovic/cutters
A rule-based sentence segmentation library.
It splits raw text into individual sentences, even when the text contains abbreviations or quoted speech. You pass in a block of Croatian or English text and get back a list of separated sentences. This is useful for researchers, linguists, and data analysts preparing text for further processing.
No commits in the last 6 months.
Use this if you need to accurately split large volumes of text into individual sentences for analysis, translation, or other natural language processing tasks.
Not ideal if you need to process text in languages other than Croatian or English, or if you require highly specialized segmentation rules beyond standard grammatical structures.
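To make the idea of rule-based segmentation concrete, here is a minimal sketch in Rust using only the standard library. It is not the cutters API or its actual rule set: the function name `split_sentences` and the tiny `ABBREVIATIONS` list are hypothetical and chosen purely for illustration. The sketch splits after `.`, `!`, or `?` when whitespace (or the end of input) follows, unless the token ending there is a listed abbreviation. It does not attempt quoted speech.

```rust
/// Hypothetical abbreviation list for illustration only;
/// a real rule set would be much larger and language-specific.
const ABBREVIATIONS: [&str; 5] = ["Mr.", "Mrs.", "Dr.", "e.g.", "i.e."];

/// Splits `text` after '.', '!' or '?' when the next character is
/// whitespace or the input ends, unless the token ending at that
/// point is a known abbreviation.
fn split_sentences(text: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut start = 0; // byte offset where the current sentence begins
    let mut chars = text.char_indices().peekable();

    while let Some((i, c)) = chars.next() {
        if !matches!(c, '.' | '!' | '?') {
            continue;
        }
        // A candidate boundary needs end of input or following whitespace.
        let at_boundary = match chars.peek() {
            None => true,
            Some(&(_, next)) => next.is_whitespace(),
        };
        if !at_boundary {
            continue;
        }
        let end = i + c.len_utf8();
        // Last whitespace-delimited token of the candidate sentence.
        let last_token = text[start..end].split_whitespace().last().unwrap_or("");
        if ABBREVIATIONS.contains(&last_token) {
            continue; // e.g. "Dr." does not end a sentence
        }
        let sentence = text[start..end].trim();
        if !sentence.is_empty() {
            sentences.push(sentence.to_string());
        }
        start = end;
    }

    // Keep trailing text that has no terminal punctuation.
    let rest = text[start..].trim();
    if !rest.is_empty() {
        sentences.push(rest.to_string());
    }
    sentences
}
```

Calling `split_sentences("Dr. Smith arrived. He met Mrs. Jones! Was it late?")` treats the periods in `Dr.` and `Mrs.` as part of the sentence rather than as boundaries. A production library needs many more rules than this, which is why language coverage is limited to the languages it has rules for.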
Stars: 14
Forks: —
Language: Rust
License: MIT
Category:
Last pushed: Jul 17, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/scurkovic/cutters"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PyThaiNLP/nlpo3
Thai natural language processing library in Rust, with Python and Node bindings.
forzagreen/n2words
Convert numerical numbers to written numbers, in 52+ languages.
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and...