Yinghao-Li/CHMM-ALT
Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition"
This project helps researchers and data scientists automatically identify specific entities, such as disease names or product features, within large collections of text. It takes raw text together with weak labels (noisy, less precise annotations) from multiple labeling sources and combines them to produce a dataset with identified named entities. This is useful for anyone working with unstructured text who needs to extract key information without extensive manual annotation.
No commits in the last 6 months.
Use this if you need to extract specific named entities from text, have access to multiple sources of weakly labeled data, and want to leverage advanced machine learning models for improved accuracy.
Not ideal if you have a small, perfectly labeled dataset or if you need to perform general text classification rather than named entity recognition.
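To make "multiple sources of weakly labeled data" concrete, here is a minimal sketch of the simplest way to combine such sources: a per-token majority vote. This is a common baseline, not the hidden Markov model approach the repository implements. The function name `majority_vote`, the `Disease` label set, and the toy sentence are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(weak_labels, outside="O"):
    """Combine per-token labels from several weak sources by majority vote.

    weak_labels: list of label sequences, one per source, all the same length.
    A source that does not fire on a token is expected to emit `outside`.
    NOTE: illustrative baseline only; this is not the CHMM-ALT method.
    """
    if not weak_labels:
        return []
    length = len(weak_labels[0])
    if any(len(seq) != length for seq in weak_labels):
        raise ValueError("all sources must label the same tokens")

    combined = []
    for token_votes in zip(*weak_labels):
        # Count only entity votes; fall back to `outside` when no source fires.
        votes = Counter(v for v in token_votes if v != outside)
        combined.append(votes.most_common(1)[0][0] if votes else outside)
    return combined


# Toy example: three hypothetical weak sources labeling
# the tokens ["Patients", "with", "lung", "cancer"].
source_a = ["O", "O", "B-Disease", "I-Disease"]
source_b = ["O", "O", "O", "B-Disease"]
source_c = ["O", "O", "B-Disease", "I-Disease"]
merged = majority_vote([source_a, source_b, source_c])
```

In this sketch abstentions are ignored, so a single source that fires outvotes any number of sources that stay silent. That suits sparse weak sources but amplifies a noisy one, which is one reason to model source reliability rather than vote.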
Stars: 32
Forks: 8
Language: Python
License: Apache-2.0
Category:
Last pushed: Jun 20, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Yinghao-Li/CHMM-ALT"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
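The same request can be made from Python using only the standard library. This is a sketch under two assumptions the page does not confirm: that the URL path follows a category/owner/repository layout (inferred from the single curl example) and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    ASSUMPTION: the path is <category>/<owner>/<repo>, as in the curl example.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([API_BASE, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    ASSUMPTION: the response body is JSON; the page does not document it.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "Yinghao-Li", "CHMM-ALT")` would issue the same GET request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is advisable when polling many repositories.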
Higher-rated alternatives
dnanhkhoa/python-vncorenlp
A Python wrapper for VnCoreNLP using a bidirectional communication channel.
datquocnguyen/RDRPOSTagger
A fast and accurate POS and morphological tagging toolkit (EACL 2014)
OpenSextant/SolrTextTagger
A text tagger based on Lucene / Solr, using FST technology
ankane/informers
Fast transformer inference for Ruby
bentrevett/pytorch-pos-tagging
A tutorial on how to implement models for part-of-speech tagging using PyTorch and TorchText.