skorani/tokenizer

An open-source, high-level Persian tokenizer

34 / 100 · Emerging

If you're working with Persian text, this tool breaks sentences or documents into meaningful units, or "tokens." Tokenization is a crucial first step for tasks like text analysis and search: the tool takes raw Persian text and outputs a list of its constituent semantic tokens. It is designed for anyone who needs to process Persian-language data.
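To illustrate what tokenization means here, the sketch below splits a Persian sentence into word tokens. This is a minimal, generic illustration of the concept, not this repository's actual API; the `simple_tokenize` function is a hypothetical name introduced for the example.

```python
import re

def simple_tokenize(text: str) -> list[str]:
    # Find runs of word characters. Python 3's re module is
    # Unicode-aware by default, so \w also matches Persian
    # (Arabic-script) letters.
    return re.findall(r"\w+", text)

# "این یک جمله است" means "This is a sentence."
tokens = simple_tokenize("این یک جمله است")
# tokens is a list of the four words in the sentence
```

A real Persian tokenizer has to handle more than whitespace, for example the zero-width non-joiner (ZWNJ) inside compound words, which this naive regex does not address.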

No commits in the last 6 months.

Use this if you need to prepare Persian text for any kind of computational analysis or natural language processing.

Not ideal if your Persian text is not already cleaned of punctuation and extra spaces, as this tool requires normalized input.
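Since the tool expects normalized input, you may need a preprocessing pass first. The sketch below shows one simple way to strip punctuation and collapse extra spaces in Python; the `normalize` function is a hypothetical helper for illustration, not part of this repository.

```python
import re

def normalize(text: str) -> str:
    # Replace anything that is neither a word character nor
    # whitespace (this covers Persian punctuation such as ،
    # and ؟) with a space, then collapse whitespace runs.
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

# "سلام،  دنیا!" ("Hello,  world!") becomes "سلام دنیا"
clean = normalize("سلام،  دنیا!")
```

For production Persian text, a dedicated normalizer (handling ZWNJ, Arabic-vs-Persian letter variants, and diacritics) would be a better fit than this minimal regex approach.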

Persian-text-analysis Natural-Language-Processing Text-mining Data-preparation
Stale (6m) · No package · No dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 13 / 25


Stars: 10
Forks: 2
Language: Jupyter Notebook
License: MIT
Last pushed: Feb 20, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/skorani/tokenizer"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.