UglyToad/PragmaticSegmenterNet
Port of PragmaticSegmenter for sentence boundary detection
When you need to break a block of text into individual sentences, this tool does the splitting. It takes raw text in various languages and returns a clean list of separate sentences, which is useful for anyone working with textual data, such as researchers, content analysts, or linguists.
No commits in the last 6 months.
Use this if you need to accurately split paragraphs or longer text into distinct sentences, especially when dealing with multiple languages or text from different sources like PDFs or HTML.
Not ideal if your primary need is word-level tokenization or if you are working exclusively with highly structured data that doesn't require complex sentence boundary detection.
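To illustrate why dedicated sentence boundary detection is worth having, here is a minimal Python sketch of a naive splitter that breaks after any period, exclamation mark, or question mark followed by whitespace. It is not PragmaticSegmenterNet's algorithm or API; the function name `naive_split` and the sample text are hypothetical and exist only for this illustration.

```python
import re
from typing import List


def naive_split(text: str) -> List[str]:
    """Split after '.', '!' or '?' when followed by whitespace.

    Deliberately simplistic baseline (an assumption for illustration),
    not the rule set used by PragmaticSegmenter or its .NET port.
    """
    return re.split(r"(?<=[.!?])\s+", text.strip())


sample = "Dr. Smith arrived. He sat down."
pieces = naive_split(sample)
print(pieces)  # ['Dr.', 'Smith arrived.', 'He sat down.']
```

The sample contains two sentences, but the naive rule returns three pieces because it treats the period in the abbreviation "Dr." as a sentence end. Handling cases like this is what a sentence boundary detection library is for.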
Stars: 39
Forks: 12
Language: C#
License: none listed
Category:
Last pushed: Sep 21, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/UglyToad/PragmaticSegmenterNet"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
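The same request can be sketched in Python using only the standard library. Two things here are assumptions rather than documented behavior: the `category/owner/repo` path layout is inferred from the single example URL above, and the response is assumed to be JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    The category/owner/repo layout is inferred from the one example
    URL on this page; it is an assumption, not documented API.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumes the body is JSON, which this page does not confirm.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to hit the network.
    print(build_quality_url("nlp", "UglyToad", "PragmaticSegmenterNet"))
```

Calling `fetch_quality("nlp", "UglyToad", "PragmaticSegmenterNet")` performs the actual request and counts against the daily limit.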
Higher-rated alternatives
wooorm/franc: Natural language detection
microsoft/Recognizers-Text: Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time,...
winkjs/wink-pos-tagger: English Part-of-speech (POS) tagger
sillsdev/machine: Machine is a natural language processing library for .NET that is focused on providing tools for...
ayoungprogrammer/Lango: Language Lego