cfiltnlp/HiNER

This repository contains the HiNER dataset, released with our paper at LREC 2022.

Score: 37 / 100 (Emerging)

This dataset is a comprehensive collection of Hindi text in which named entities, such as people, places, and organizations, have been identified and tagged. It is intended for natural language processing engineers and researchers who build or improve systems that extract key information from Hindi content. A model trained on it takes raw Hindi text as input and outputs the same text with its named entities marked.
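To make the idea of tagged text concrete, here is a minimal sketch of how token-level entity tags are commonly turned into entity spans. It assumes a BIO-style scheme (`B-` starts an entity, `I-` continues it, `O` is outside any entity). The tag names `PERSON` and `LOCATION`, the example sentence, and the function `extract_entities` are illustrative assumptions and are not taken from the HiNER files, so check the dataset's own label set before relying on them.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumes a BIO scheme; the label names used by a real dataset may differ.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that was still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and treat this token as outside.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical example sentence with illustrative tags.
example_tokens = ["सचिन", "तेंदुलकर", "मुंबई", "में", "रहते", "हैं"]
example_tags = ["B-PERSON", "I-PERSON", "B-LOCATION", "O", "O", "O"]
example_entities = extract_entities(example_tokens, example_tags)
```

Here `example_entities` holds one `PERSON` span covering the first two tokens and one `LOCATION` span covering the third.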

No commits in the last 6 months.

Use this if you are building, training, or evaluating a system that needs to automatically recognize and categorize named entities in Hindi text.

Not ideal if you are looking for a pre-built tool or API to perform named entity recognition directly without needing to train a model, or if your focus is on languages other than Hindi.

Tags: Hindi language processing, information extraction, named entity recognition, machine learning datasets, natural language processing
Status: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 15 / 25


Stars: 16
Forks: 5
Language: Jupyter Notebook
License: CC-BY-SA-4.0
Last pushed: Jun 06, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/cfiltnlp/HiNER"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
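The same request can be made from Python. The sketch below uses only the standard library and reuses the URL from the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns JSON, which this page does not state; the response fields are therefore left uninterpreted.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "cfiltnlp", "HiNER")` issues the same request as the curl command; unauthenticated calls count against the 100 requests/day limit.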