overfit-ir/parstwiner
Named Entity Recognition (NER) on the Persian Twitter dataset.
This corpus helps researchers and developers working with Persian-language data identify and extract key information from informal text, specifically tweets. It pairs raw Persian tweets with annotations that mark named entities such as people, organizations, locations, and events. It is aimed at language technology researchers and data scientists working on natural language processing for Persian.
No commits in the last 6 months.
Use this if you need a high-quality, annotated dataset to train or evaluate machine learning models for named entity recognition in informal Persian text, such as social media content.
Not ideal if you are looking for a pre-trained model to use directly, rather than data for training or evaluating your own models.
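As a rough illustration of what working with annotated NER data of this kind can look like, the sketch below converts per-token tags into entity spans. It assumes a BIO tagging scheme (B-/I- prefixes, O for non-entities) and the label names PER, LOC, ORG, and EVENT; the listing does not confirm the file format or tag set of this corpus, so treat both as hypothetical and adjust them to the actual data.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert BIO tags into (label, start, end) spans, end exclusive.

    Assumption: tags look like "B-PER", "I-PER", "O". The corpus itself
    may use a different scheme or different label names.
    """
    spans: List[Tuple[str, int, int]] = []
    start, label = None, None
    # A trailing "O" sentinel flushes a span that runs to the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((label, start, i))
            start, label = None, None
        # A stray "I-" with no open span is treated as the start of a new one.
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
    return spans


if __name__ == "__main__":
    # Made-up example sentence, not taken from the corpus.
    tokens = ["Sara", "Ahmadi", "visited", "Tehran", "with", "Example", "Corp"]
    tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG"]
    for label, begin, end in bio_to_spans(tags):
        print(label, " ".join(tokens[begin:end]))
```

Span-level output like this is the usual input for entity-level (rather than token-level) scoring when evaluating a model trained on the data.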
Stars: 53
Forks: 3
Language: Jupyter Notebook
License: MIT
Category: (none listed)
Last pushed: Nov 10, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/overfit-ir/parstwiner"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
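The same request can be made from Python with only the standard library. This is a minimal sketch: the URL is the one from the curl command above, but the names given to the path segments (category, owner, repo) and the assumption that the endpoint returns JSON are guesses, since the listing does not document the response format.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL. The segment names are hypothetical labels."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming (unconfirmed) a JSON body."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network call; counts against the daily request limit.
    data = fetch_quality("nlp", "overfit-ir", "parstwiner")
    print(json.dumps(data, indent=2, ensure_ascii=False))
```

Keeping URL construction separate from the network call makes it easy to check the request target without spending any of the 100 keyless requests per day.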
Higher-rated alternatives
MantisAI/nervaluate: Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
dice-group/gerbil: GERBIL - General Entity annotatoR Benchmark
bltlab/seqscore: SeqScore: Scoring for named entity recognition and other sequence labeling tasks
syuoni/eznlp: Easy Natural Language Processing
LHNCBC/metamaplite: A near real-time named-entity recognizer