Arabic NLP Tools
NLP libraries, toolkits, and resources built specifically for Arabic and its dialects (including Modern Standard Arabic, Moroccan Darija, Tunisian Derja, and Sudanese Arabic). Covers tokenization, POS tagging, stemming, diacritization, syntactic analysis, and dialect-specific datasets. Does not include general multilingual NLP tools, non-Arabic language resources, or downstream applications (sentiment analysis, translation, etc.) unless Arabic processing is the primary focus.
36 Arabic NLP tools are tracked. Two score above 50 (the Established tier). The highest-rated is CAMeL-Lab/camel_tools at 67/100 with 538 stars.
Get the projects as JSON (the example request sets limit=20, so it may not return all 36 in one call)
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=arabic-nlp-tools&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
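For scripted access, the same request can be assembled in Python. This is a minimal sketch using only the standard library: the endpoint and the `domain`, `subcategory`, and `limit` parameters are taken from the curl command, while the helper names `build_url` and `fetch` are hypothetical. The response schema is not documented on this page, so the sketch only decodes the body as JSON and makes no assumption about its fields. Whether the API accepts a `limit` above 20 is also an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="arabic-nlp-tools", limit=20):
    """Build the query URL from the three parameters shown in the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The shape of the returned object is not specified here, so callers
    should inspect it before relying on any particular key.
    """
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch(build_url()) to perform the request.
    print(build_url())
```

Keeping URL construction separate from the network call makes it easy to check the request before spending one of the 100 daily unauthenticated requests.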
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | CAMeL-Lab/camel_tools: A suite of Arabic natural language processing tools developed by the CAMeL... | | Established |
| 2 | PetrKorab/Arabica: Python package for text mining of time-series data | | Established |
| 3 | markuskiller/textblob-de: German language support for TextBlob. | | Emerging |
| 4 | MagedSaeed/farasapy: A Python implementation of Farasa toolkit | | Emerging |
| 5 | adhaamehab/textblob-ar: Arabic support for textblob | | Emerging |
| 6 | ARBML/tkseem: Arabic Tokenization Library. It provides many tokenization algorithms. | | Emerging |
| 7 | 01walid/awesome-arabic: A curated list of awesome projects and dev/design resources for supporting... | | Emerging |
| 8 | AnwarCS/Sudanese-Arabic-LLM: Building a Sudanese Arabic dataset and fine-tuning LLMs to improve... | | Emerging |
| 9 | CompLin/nheengatu: Tools and resources for the computational processing of Nheengatu (Modern Tupi) | | Emerging |
| 10 | ARBML/tnkeeh: Arabic cleaning, normalization and segmentation library. | | Emerging |
| 11 | Ruqyai/Ruqia-Library: Python library used for Arabic NLP to process, prepare and clean the Arabic text | | Emerging |
| 12 | linuxscout/arabicnlptoolslist: Arabic NLP tools List inventory | | Emerging |
| 13 | Seen-Arabic/Arabic-Services: Some programming services for Arabic-language text | | Emerging |
| 14 | AsoSoft/AsoSoft-Library-py: AsoSoft's Library for Kurdish language processing tasks in python | | Emerging |
| 15 | mohabmes/Arabycia: Arabic NLP tool used to perform Text Search, POS tagging, Translation,... | | Emerging |
| 16 | sudaverse/sudaverse: The Sudaverse ecosystem - Building Sudanese Arabic into the Heart of AI | | Experimental |
| 17 | ARBML/nmatheg: A simple strategy for training and finetuning NLP models for Arabic. Specify... | | Experimental |
| 18 | bahaeddinmselmi/derja-smart-scraper: A lightweight CLI tool for collecting Tunisian Derja text snippets from the... | | Experimental |
| 19 | OussamaBenSlama/safwaText: safwaText is a Python package designed to clean, stem, and transform Arabic... | | Experimental |
| 20 | AliOsamaHassan/Quran-and-Arabic-Language-Repository: Projects & Libraries related to Quran & Arabic Language | | Experimental |
| 21 | gtoffoli/spacy-ar_core_news_md: Unofficial Arabic language model for spaCy | | Experimental |
| 22 | iamjazzar/matn: A shared space for Arabic text processors. | | Experimental |
| 23 | Rashidbm/pysarf: Python-native Arabic morphology engine powered by NumPy: root extraction,... | | Experimental |
| 24 | SssiiiSssiii/ArabicTextCleaner: Arabic Text Cleaner | | Experimental |
| 25 | sinaahmadi/ScriptNormalization: Script Normalization for Unconventional Writing of Perso-Arabic scripts (ACL 2023) | | Experimental |
| 26 | MujtabaMohsin/Syntactic-Positioning-for-Short-Arabic-Sentences: Irab Al-Ishraf (إعراب الأشراف) is a Java application for syntactic... | | Experimental |
| 27 | jerbarnes/nordial: NorDial is a project that aims to create resources and collect knowledge... | | Experimental |
| 28 | ayzem88/syntactic-selector: An advanced tool for analyzing Arabic linguistic structures | | Experimental |
| 29 | abjed/Arabic-NLP-resources: 📚 This project holds an inventory of NLP resources for Arabic. | | Experimental |
| 30 | gtoffoli/spacy-cameltokenizer: Tokenizer extension for the Arabic language (MSA), integrating the... | | Experimental |
| 31 | Kwimoad/ToDarija: Automatic translation application into Moroccan Darija. This project... | | Experimental |
| 32 | wa3dbk/Barcha: Open source NLP resources for the Tunisian Arabic dialect. | | Experimental |
| 33 | theRealProHacker/dmg: An application that provides automatic transliteration to orientalists,... | | Experimental |
| 34 | sudaverse/sudaverse-normalizer: Sudanese Arabic text normalization and cleaning toolkit | | Experimental |
| 35 | bahaeddinmselmi/tunisian-arabic-ai-dataset: The largest open-source dataset for Tunisian Arabic (Derja) NLP, featuring... | | Experimental |
| 36 | KhaledTofailieh/Location-Normalizer: In this notebook I've built a machine-learning model that normalizes region... | | Experimental |
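Several of the listed projects describe themselves as Arabic cleaning or normalization libraries. As a rough illustration of what that step typically involves, the sketch below strips diacritics (tashkeel), removes the tatweel elongation character, and folds hamza-carrying alef forms into a bare alef. It uses only the Python standard library and is not the API of any tool in the table: the function name `normalize_arabic` and the particular set of characters folded are assumptions chosen for illustration, and real libraries make different (and often configurable) choices.

```python
import re

# Arabic diacritics (tashkeel): fathatan U+064B through sukun U+0652.
DIACRITICS = re.compile("[\u064B-\u0652]")
# Tatweel (kashida), used only to stretch words visually.
TATWEEL = "\u0640"
# Alef with madda above, hamza above, or hamza below.
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")
BARE_ALEF = "\u0627"


def normalize_arabic(text):
    """Hypothetical helper: apply a few common Arabic normalization steps."""
    text = DIACRITICS.sub("", text)            # drop short-vowel marks
    text = text.replace(TATWEEL, "")           # drop elongation characters
    text = ALEF_VARIANTS.sub(BARE_ALEF, text)  # unify alef forms
    return " ".join(text.split())              # collapse runs of whitespace


if __name__ == "__main__":
    # Fully vowelled "ahlan" written with escapes to avoid encoding issues.
    print(normalize_arabic("\u0623\u064E\u0647\u0652\u0644\u064B\u0627"))
```

Folding alef variants loses information that some tasks need (for example diacritization or spelling correction), which is one reason dedicated libraries expose each step separately.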