cfiltnlp/HiNER
This repository contains the HiNER dataset released with our paper at LREC 2022.
HiNER is a collection of Hindi text in which named entities, such as the names of people, places, and organizations, have been identified and tagged. It is intended for NLP engineers and researchers who are building or improving systems that extract key information from Hindi content. A model trained on it takes raw Hindi text as input and outputs the same text with its named entities marked.
No commits in the last 6 months.
Use this if you are building, training, or evaluating a system that needs to automatically recognize and categorize named entities in Hindi text.
Not ideal if you are looking for a pre-built tool or API to perform named entity recognition directly without needing to train a model, or if your focus is on languages other than Hindi.
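The "raw text in, tagged text out" idea described above can be sketched with a small Python helper. This is a minimal illustration, assuming token-level tags in the common BIO scheme (B- opens an entity, I- continues it, O is outside any entity). The tag names PERSON and LOCATION, the sample sentence, and the function name extract_entities are illustrative assumptions, not taken from the dataset files.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far (if any) and reset the buffer.
        nonlocal current_tokens, current_type
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity, closes the
            # current entity; the token itself is not kept.
            flush()
    flush()
    return entities


# Illustrative sentence (assumed, not from the dataset):
# "Sachin Tendulkar lives in Mumbai."
sample_tokens = ["सचिन", "तेंदुलकर", "मुंबई", "में", "रहते", "हैं"]
sample_tags = ["B-PERSON", "I-PERSON", "B-LOCATION", "O", "O", "O"]
sample_entities = extract_entities(sample_tokens, sample_tags)
```

Here sample_entities holds one PERSON span covering the first two tokens and one LOCATION span for the third. A trained model would predict the tag sequence; the helper only shows how such tags map back to entity spans.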
Stars: 16
Forks: 5
Language: Jupyter Notebook
License: CC-BY-SA-4.0
Category:
Last pushed: Jun 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/cfiltnlp/HiNER"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
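The same request can be made from Python with the standard library. This is a rough sketch under two assumptions: that the URL follows a category/owner/repo pattern (inferred only from the single curl example above), and that the response body can be read as UTF-8 text. The response format is not documented here, so the body is returned unparsed. The function names build_quality_url and fetch_quality are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, assuming a category/owner/repo path layout."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Return the raw response body as text; the format is not assumed."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("nlp", "cfiltnlp", "HiNER"))
```

Without a key the request counts against the 100 requests/day limit, so results are worth caching locally if the call is made repeatedly.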
Higher-rated alternatives
chakki-works/seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Hironsan/anago
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.
jbesomi/texthero
Text preprocessing, representation and visualization from zero to hero.
hamelsmu/ktext
Utilities for preprocessing text for deep learning with Keras
asahi417/tner
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An...