yumoxu/detnet
Code and dataset for TACL 19: Weakly Supervised Domain Detection.
This project helps identify specific sentences or phrases within a larger text that strongly indicate a particular subject area or 'domain.' It takes any textual content as input and outputs segments that are highly relevant to certain domains, even with minimal initial examples. This is designed for natural language processing engineers and researchers who build tools that need to understand the thematic focus of text.
No commits in the last 6 months.
Use this if you need to automatically pinpoint and extract the most domain-specific parts of a document, such as identifying key sentences about 'biotechnology' within a general science article.
Not ideal if you're looking for a tool to classify entire documents by topic or if you need a solution for very short, keyword-based texts.
Stars: 19
Forks: 2
Language: Python
License: —
Category:
Last pushed: Feb 17, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/yumoxu/detnet"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
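For scripted access, the same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the path layout (category, then owner, then repository) is an assumption inferred from the single URL shown above. The response format is not documented here, so the sketch falls back to returning raw text if the body is not valid JSON.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL (hypothetical helper, layout is an assumption)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and return parsed JSON, or raw text if parsing fails."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The response schema is unknown, so do not assume JSON.
        return body
```

Calling `fetch_quality("nlp", "yumoxu", "detnet")` should issue the same request as the curl command. Keyless use is limited to 100 requests/day, so cache results rather than polling.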
Higher-rated alternatives
luheng/deep_srl
Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next
sileod/tasksource
Dataset collection and preprocessing framework for NLP extreme multitask learning
loomchild/maligna
Bilingual sentence aligner
CK-Explorer/DuoSubs
Semantic subtitle aligner and merger for bilingual subtitle syncing.
coastalcph/lex-glue
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English