Hironsan/IOB2Corpus
Japanese IOB2 tagged corpus for Named Entity Recognition.
This project provides a Japanese corpus in which each token carries an IOB2 tag: B- marks the first token of a named entity, I- marks a continuation, and O marks everything outside an entity. Entities include names and locations. The text is drawn from Japanese news articles and Wikipedia and converted into this tagged format. It is aimed at researchers and developers working on Japanese natural language processing, particularly information extraction.
No commits in the last 6 months.
Use this if you need a pre-tagged dataset of Japanese text for training or evaluating models that identify named entities.
Not ideal if you need text in languages other than Japanese, or a corpus annotated for something other than named entities.
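To make the IOB2 format concrete, here is a minimal Python sketch that reads tagged lines and groups tokens back into entity spans. It assumes a CoNLL-style layout (one token per line, whitespace-separated columns with the tag last, blank lines between sentences); the corpus's actual file layout and tag names are not stated on this page, so check them before use. The function names `read_iob2` and `extract_entities`, the sample sentence, and the `PER`/`LOC` labels are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, tag) pairs


def read_iob2(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences from CoNLL-style lines (assumed layout, see lead-in)."""
    sentence: Sentence = []
    for line in lines:
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split()
        # First column is the token, last column is the IOB2 tag.
        sentence.append((columns[0], columns[-1]))
    if sentence:
        yield sentence


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Group B-/I- tagged tokens into (entity_text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label = None
    for token, tag in sentence:
        if tag.startswith("B-"):
            # B- always opens a new entity, closing any open one.
            if tokens:
                entities.append(("".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-") and tokens and tag[2:] == label:
            # I- continues the open entity of the same type.
            tokens.append(token)
        else:
            # O, or an I- tag with no matching open entity, ends the span.
            if tokens:
                entities.append(("".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append(("".join(tokens), label))
    return entities


# Invented sample sentence, not taken from the corpus.
sample = ["太郎\tB-PER", "は\tO", "東京\tB-LOC", "都\tI-LOC", "に\tO", "住む\tO", ""]
for sent in read_iob2(sample):
    print(extract_entities(sent))
```

Tokens are joined without spaces because Japanese text is not space-delimited. Entity-level spans like these are also the unit that evaluation tools score, as opposed to per-token tag accuracy.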
Stars: 61
Forks: 18
Language: —
License: —
Category:
Last pushed: Feb 25, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Hironsan/IOB2Corpus"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
MantisAI/nervaluate: Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
dice-group/gerbil: GERBIL - General Entity annotatoR Benchmark
bltlab/seqscore: SeqScore: Scoring for named entity recognition and other sequence labeling tasks
syuoni/eznlp: Easy Natural Language Processing
LHNCBC/metamaplite: A near real-time named-entity recognizer