segment-any-text/wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Score: 62 / 100 (Established)

This tool helps you break down large blocks of text into individual sentences or paragraphs, making it easier to analyze or process. You input raw text, and it outputs a list of cleanly separated sentences or paragraphs. Anyone who works with text data, such as researchers, content analysts, or data scientists, can use this to prepare text for further tasks like translation, summarization, or information extraction.
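The input-to-output flow described above can be sketched as follows. This is a minimal illustration, not the project's documented recipe: `split_sentences` and its `splitter` parameter are hypothetical names introduced here, and the default model name `sat-3l-sm` is an assumption about which published checkpoint you want. The sketch assumes the `wtpsplit` package exposes a `SaT` class with a `split` method; it imports the package lazily so the helper can also wrap any object that offers a compatible `split`.

```python
from typing import List, Optional


def split_sentences(text: str, splitter: Optional[object] = None,
                    model_name: str = "sat-3l-sm") -> List[str]:
    """Return the sentences found in `text` as a list of strings.

    `splitter` is a hypothetical injection point: any object with a
    `split(text)` method. When omitted, a wtpsplit SaT model is loaded
    (assumed API: `SaT(model_name).split(text)`).
    """
    if splitter is None:
        # Lazy import: only needed when no splitter is supplied.
        # Loading a model may download weights on first use.
        from wtpsplit import SaT
        splitter = SaT(model_name)
    # list() normalizes the result whether split() returns a list or an iterator.
    return list(splitter.split(text))
```

With the package installed, calling `split_sentences("This is a test This is another test.")` should return the sentences as separate list items, even though the first boundary has no punctuation.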

1,255 stars. Used by 1 other package. Available on PyPI.

Use this if you need to reliably split text into meaningful, distinct sentences or paragraphs, especially across many languages or diverse text styles.

Not ideal if you only need very basic punctuation-based sentence splitting for a single language, as simpler tools might suffice.
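For comparison, here is a rough sketch of what "very basic punctuation-based" splitting might look like using only the Python standard library. The function name `naive_split` is hypothetical, and the rule (break after `.`, `!`, or `?` followed by whitespace) is one simple choice among many, not what any particular tool does.

```python
import re

# Break wherever whitespace follows sentence-ending punctuation.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def naive_split(text: str) -> list:
    """Split `text` on whitespace that follows '.', '!' or '?'."""
    return [part for part in _BOUNDARY.split(text.strip()) if part]


print(naive_split("Hello world. How are you? Fine!"))
# The same rule wrongly breaks after an abbreviation:
print(naive_split("Dr. Smith arrived."))
```

A rule like this is adequate for clean, well-punctuated text in one language, but it breaks after abbreviations and cannot find boundaries in text that lacks punctuation.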

Tags: text-analysis, natural-language-processing, data-preparation, content-management, multilingual-data

Maintenance: 10 / 25
Adoption: 11 / 25
Maturity: 25 / 25
Community: 16 / 25


Stars: 1,255
Forks: 82
Language: Python
License: MIT
Last pushed: Feb 26, 2026
Commits (30d): 0
Dependencies: 8
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/segment-any-text/wtpsplit"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
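The same request can be made from Python with the standard library. This is a sketch under assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category / owner / repository is inferred from the URL above, and the response is returned as raw text because its format is not documented here.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the three-segment layout is an assumption."""
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + path


def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the body as text (UTF-8 assumed)."""
    with urlopen(build_quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("nlp", "segment-any-text", "wtpsplit")` issues the same GET request as the curl command and counts against the daily request limit.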