wss1996/Name-disambiguation
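An engineering-oriented solution for disambiguating papers by authors who share a name (based on the first-place solution in the 2019 BAAI-AMiner name disambiguation competition)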
This project helps research institutions and large academic publishers identify unique authors within large datasets of scientific publications, even when several authors share the same name. It takes raw paper metadata and author-to-paper records as input and outputs a CSV file that assigns each distinct author an author_id. It is aimed at research administrators, librarians, and data scientists who manage large academic databases.
No commits in the last 6 months.
Use this if you need to precisely distinguish between multiple authors who share identical names across millions of scientific papers to improve data accuracy and integrity.
Not ideal if you're dealing with smaller datasets or if you don't have the significant computational resources (150GB+ disk space, 16-core 64GB Linux server) and time (2-3 working days) required for processing.
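The resource requirements above can be checked before committing to a multi-day run. The following is a minimal sketch and not part of the project: the function names `find_shortfalls` and `probe_machine` are hypothetical, the 64GB figure is assumed to refer to RAM, sizes are assumed to be decimal gigabytes, and the RAM probe relies on `os.sysconf` names available on Linux.

```python
import os
import shutil

# Thresholds copied from the requirements quoted above.
# Assumptions: "64GB" means RAM, and GB means decimal gigabytes.
GB = 10**9
REQUIRED_DISK_BYTES = 150 * GB
REQUIRED_CORES = 16
REQUIRED_RAM_BYTES = 64 * GB


def find_shortfalls(free_disk_bytes, cores, ram_bytes):
    """Return human-readable shortfalls; an empty list means the machine qualifies."""
    problems = []
    if free_disk_bytes < REQUIRED_DISK_BYTES:
        problems.append(f"disk: {free_disk_bytes / GB:.0f} GB free, need 150 GB")
    if cores < REQUIRED_CORES:
        problems.append(f"cpu: {cores} cores, need 16")
    if ram_bytes < REQUIRED_RAM_BYTES:
        problems.append(f"ram: {ram_bytes / GB:.0f} GB, need 64 GB")
    return problems


def probe_machine(path="."):
    """Measure free disk at `path`, logical cores, and physical RAM (Linux only)."""
    free_disk = shutil.disk_usage(path).free
    cores = os.cpu_count() or 0
    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return free_disk, cores, ram
```

Calling `find_shortfalls(*probe_machine())` on the target server returns an empty list when all three thresholds are met. Keeping the comparison separate from the probing makes the thresholds easy to adjust if the actual requirements turn out to differ.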
Stars: 25
Forks: 3
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Dec 08, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/wss1996/Name-disambiguation"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
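For scripted use, the same request can be made from Python with only the standard library. This is a sketch under stated assumptions: the path segments are taken to be category, owner, and repository (inferred from the URL shape in the curl example), the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL in the same shape as the curl example."""
    # Percent-encode each segment so unusual repository names stay valid.
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """GET the endpoint and decode the body as JSON (JSON is an assumption)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`fetch_quality("nlp", "wss1996", "Name-disambiguation")` issues the same request as the curl command. Given the 100 requests/day limit for keyless access, it may be worth caching results rather than calling the endpoint repeatedly.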
Higher-rated alternatives
ShawnyXiao/2017-CCF-BDCI-AIJudge
2017 CCF BDCI "Let AI Be the Judge" (preliminary round): 7th/415 (top 1.68%)
ShawnyXiao/2018-DC-DataGrand-TextIntelProcess
2018 DC "DataGrand Cup" Text Intelligent Processing Challenge: champion (1st/3131)
beader/tianchi_nl2sql
Third-place solution and code from the finals of Zhuiyi Technology's first Chinese NL2SQL Challenge
rogeroyer/2019-CCF-BDCI-Finance-Information-Negative-Judgment
top1-solution
zhanzecheng/SOHU_competition
First-place solution for Sohu's 2018 content recognition competition