ARBML/tkseem
An Arabic tokenization library that provides many tokenization algorithms.
This library helps developers prepare Arabic text for machine learning tasks by breaking it down into smaller, meaningful units. It takes raw Arabic text as input and outputs tokens, which natural language processing models consume in place of raw strings. It is intended for software developers and data scientists working on Arabic language applications.
110 stars. No commits in the last 6 months.
Use this if you are a developer building an NLP application, such as sentiment analysis or machine translation, and need to prepare Arabic text data.
Not ideal if you are a non-technical user looking for a ready-to-use application to analyze Arabic text without writing code.
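To make "raw Arabic text in, tokens out" concrete, here is a minimal sketch using only Python's standard library. It does not use tkseem's API and is not one of its algorithms; the helper name `simple_arabic_tokenize` and the regular expression are assumptions chosen for illustration.

```python
import re

# Hypothetical helper, NOT part of tkseem.
# First alternative: runs of Arabic letters plus diacritics (U+0621..U+0652),
#   so vocalized words stay in one piece.
# Second alternative: runs of any other word characters (Latin letters, digits).
# Third alternative: a single punctuation mark, including Arabic ones.
_TOKEN_RE = re.compile(r"[\u0621-\u0652]+|\w+|[^\w\s]")


def simple_arabic_tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens, dropping whitespace."""
    return _TOKEN_RE.findall(text)


tokens = simple_arabic_tokenize("السلام عليكم، كيف الحال؟")
print(len(tokens), tokens)
```

A rule-based splitter like this only separates words from punctuation. Trained or morphology-aware tokenizers go further by splitting words into smaller units.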
Stars: 110
Forks: 21
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Jan 04, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ARBML/tkseem"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
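The same request can be made from Python. The sketch below uses only the standard library and the URL from the curl command. The response format is not documented here, so treating the body as JSON is an assumption, and the code falls back to raw text when parsing fails. The names `fetch_quality` and `parse_body` are hypothetical.

```python
import json
import urllib.request

# URL copied verbatim from the curl command above.
QUALITY_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp/ARBML/tkseem"


def parse_body(body: str):
    """Return parsed JSON if the body is valid JSON, else the raw text."""
    try:
        return json.loads(body)  # assumption: the API responds with JSON
    except json.JSONDecodeError:
        return body


def fetch_quality(url: str = QUALITY_URL, timeout: float = 10.0):
    """Send an unauthenticated GET request and return the decoded response."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_body(resp.read().decode("utf-8"))
```

Calling `fetch_quality()` with no arguments requests the tkseem record without a key, so it counts against the 100 requests/day limit.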
Higher-rated alternatives
CAMeL-Lab/camel_tools: A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York...
PetrKorab/Arabica: Python package for text mining of time-series data
markuskiller/textblob-de: German language support for TextBlob.
MagedSaeed/farasapy: A Python implementation of Farasa toolkit
adhaamehab/textblob-ar: Arabic support for textblob