GeekDream-x/SemEval2022-Task8-TonyX
Deep-learning system proposed by HFL for SemEval-2022 Task 8: Multilingual News Similarity
This project helps news analysts, content managers, and media researchers determine how similar two news articles are, even when the articles are written in different languages. You input two news articles, and it outputs a similarity score from 0 to 1 indicating how closely related their content is.
No commits in the last 6 months.
Use this if you need to quantitatively assess the semantic similarity between news articles, especially when dealing with a mix of different languages.
Not ideal if you need the full augmented training dataset, which is not provided for copyright reasons.
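To make the input/output contract above concrete (two article texts in, one score between 0 and 1 out), here is a minimal sketch. It is not the HFL system: it swaps in a plain character-trigram cosine similarity as a stand-in, and the function names `char_ngrams` and `similarity` are hypothetical. Unlike the deep-learning model described here, this stand-in only measures surface overlap and will not relate articles written in different languages.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(max(len(text) - n + 1, 0)))


def similarity(article_a, article_b, n=3):
    """Return a score in [0, 1]; higher means more surface overlap.

    Stand-in technique (assumption): cosine similarity over character
    n-gram counts, not the model used by the project.
    """
    va, vb = char_ngrams(article_a, n), char_ngrams(article_b, n)
    if not va or not vb:
        return 0.0
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    # Clamp to guard against floating-point results slightly above 1.
    return min(1.0, dot / (norm_a * norm_b))


if __name__ == "__main__":
    a = "The central bank raised interest rates by half a point on Tuesday."
    b = "On Tuesday the central bank increased interest rates by 0.5 points."
    print(round(similarity(a, b), 3))
```

A real multilingual system would replace `char_ngrams` with representations from a multilingual encoder; the surrounding contract (two texts in, one bounded score out) stays the same.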
Stars: 40
Forks: 6
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 15, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/GeekDream-x/SemEval2022-Task8-TonyX"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
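The same request can be made from Python using only the standard library. This is a sketch under assumptions: the page does not document the response format, so the code assumes the endpoint returns JSON, and it assumes the path segment `nlp` is a category name followed by `owner/repo`. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the request URL for a category and an "owner/name" repo string."""
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the record for one repository.

    Assumption: the body is JSON. If the service returns something else,
    json.loads will raise a ValueError.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network call; counts against the daily request limit.
    print(fetch_quality("nlp", "GeekDream-x/SemEval2022-Task8-TonyX"))
```

No API key is sent here, which matches the keyless tier mentioned above; how a key would be passed is not stated on this page, so it is left out.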
Higher-rated alternatives
DerwenAI/pytextrank
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
Tiiiger/bert_score
BERT score for text generation
BrikerMan/Kashgari
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for...
asyml/texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. ...
yohasebe/wp2txt
A command-line tool to extract plain text from Wikipedia dumps with category and section filtering