Hironsan/IOB2Corpus

Japanese IOB2 tagged corpus for Named Entity Recognition.

Score: 35 / 100 (Emerging)

This project provides a Japanese corpus in which every token carries an IOB2 tag: B- marks the first token of a named entity (such as a person or location name), I- marks a continuation token, and O marks tokens outside any entity. It takes raw Japanese news articles and Wikipedia text and processes them into this tagged format. It is useful for researchers and developers working on Japanese natural language processing, particularly for tasks like information extraction.
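As a rough illustration of how IOB2-tagged text is typically consumed, the sketch below groups tagged lines into sentences and collects entity spans. It assumes a CoNLL-style layout (one token and its tag per line, separated by whitespace, with a blank line between sentences); the corpus's actual file layout is not described here, so treat that as an assumption. The sample sentence, the `PER`/`LOC` labels, and the helper names `read_sentences` and `extract_entities` are hypothetical and not taken from the corpus.

```python
from typing import Iterable, List, Optional, Tuple

Token = Tuple[str, str]  # (surface form, IOB2 tag)

# Hypothetical sample in an ASSUMED "token<TAB>tag" layout; not corpus data.
SAMPLE = "太郎\tB-PER\nは\tO\n東京\tB-LOC\n都\tI-LOC\nに\tO\n行っ\tO\nた\tO\n"


def read_sentences(lines: Iterable[str]) -> List[List[Token]]:
    """Group 'token tag' lines into sentences separated by blank lines."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.rsplit(None, 1)  # the tag is the last field
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: List[Token]) -> List[Tuple[str, str]]:
    """Return (entity text, entity type) pairs found via IOB2 tags."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label: Optional[str] = None
    for token, tag in sentence:
        if tag.startswith("B-"):
            # B- always opens a new entity, closing any open one first.
            if tokens:
                entities.append(("".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-") and tokens and tag[2:] == label:
            # I- continues the open entity of the same type.
            tokens.append(token)
        else:
            # O (or a stray I-) closes any open entity.
            if tokens:
                entities.append(("".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append(("".join(tokens), label))
    return entities
```

Tokens are joined without spaces because Japanese text is not whitespace-delimited. A call such as `extract_entities(read_sentences(SAMPLE.splitlines())[0])` yields the person and location spans from the sample.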

No commits in the last 6 months.

Use this if you need a pre-tagged dataset of Japanese text for training or evaluating models that identify named entities.

Not ideal if you need to process text in languages other than Japanese, or if you require a corpus tagged for different linguistic phenomena.

Tags: Japanese NLP, Named Entity Recognition, Corpus Linguistics, Information Extraction
Status: No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 19 / 25


Stars: 61
Forks: 18
Language: not listed
License: none
Last pushed: Feb 25, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Hironsan/IOB2Corpus"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
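The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL path follows a category/owner/repository pattern (inferred from the single example URL above). The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example; the rest of the path is appended.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern ASSUMED from the one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it, ASSUMING a JSON response."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "Hironsan", "IOB2Corpus")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is worth considering if the call runs in a loop or a scheduled job.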