louisowen6/quora_paraphrasing_id
Quora Paraphrasing Dataset Bahasa Indonesia Version
This dataset helps content managers, customer support teams, and researchers working with Indonesian text identify when differently phrased questions mean the same thing. It centers on pairs of Indonesian questions and whether each pair is a duplicate, which makes it useful for anyone who needs to measure semantic similarity between the ways people ask questions in Bahasa Indonesia.
No commits in the last 6 months.
Use this if you need to build or evaluate systems that understand the intent behind Indonesian questions, even when worded differently, like for a chatbot or a search engine.
Not ideal if you are working with languages other than Bahasa Indonesia or need to generate new paraphrases rather than identify existing ones.
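As a rough illustration of the evaluation use case described above, the sketch below scores a simple word-overlap baseline on question pairs. The three pairs, the `(question1, question2, label)` layout, and the 0.5 threshold are hypothetical and chosen only for illustration; they are not taken from the dataset, whose actual file format and column names are not described here.

```python
import re

# Hypothetical toy pairs (question1, question2, label); label 1 = duplicate.
# These are NOT rows from the dataset, only an assumed layout.
PAIRS = [
    ("Bagaimana cara belajar bahasa Inggris?",
     "Bagaimana cara belajar bahasa Inggris dengan cepat?", 1),
    ("Apa ibu kota Indonesia?",
     "Berapa harga emas hari ini?", 0),
    ("Apa ibu kota Indonesia?",
     "Kota mana yang menjadi ibu kota Indonesia?", 1),
]


def tokens(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(q1, q2):
    """Jaccard similarity between the token sets of two questions."""
    t1, t2 = tokens(q1), tokens(q2)
    if not t1 and not t2:
        return 1.0
    return len(t1 & t2) / len(t1 | t2)


def predict_duplicate(q1, q2, threshold=0.5):
    """Predict 1 (duplicate) when token overlap reaches the threshold."""
    return int(jaccard(q1, q2) >= threshold)


def accuracy(pairs, threshold=0.5):
    """Fraction of pairs where the baseline prediction matches the label."""
    correct = sum(
        predict_duplicate(q1, q2, threshold) == label
        for q1, q2, label in pairs
    )
    return correct / len(pairs)


if __name__ == "__main__":
    print(f"baseline accuracy: {accuracy(PAIRS):.2f}")
```

The third pair asks the same question with mostly different words, so a purely lexical baseline misses it. That gap is what a model trained or evaluated on paraphrase pairs is meant to close.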
Stars: 11
Forks: 2
Language: Python
License: not listed
Category:
Last pushed: Apr 18, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/louisowen6/quora_paraphrasing_id"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
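The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_url` and `fetch_quality` are hypothetical, the split of the path into a category segment and an `owner/repo` segment is inferred from the single URL above, and the response is assumed to be JSON, which the page does not confirm. How an API key would be supplied is not documented here, so the sketch uses keyless access only.

```python
import json
import urllib.request

# Base path copied from the curl example; the segment layout after it
# (category, then owner/repo) is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, repo):
    """Build the endpoint URL for a repository, e.g. category 'nlp'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the endpoint and parse the body, assuming it is JSON."""
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; counts against the daily limit.
    data = fetch_quality("nlp", "louisowen6/quora_paraphrasing_id")
    print(json.dumps(data, indent=2))
```

If the endpoint returns something other than JSON, `json.loads` raises an error, and the raw body should be inspected instead.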
Higher-rated alternatives
PaddlePaddle/RocketQA
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both...
shuaihuaiyi/QA
A Chinese question answering system implemented with deep learning algorithms
allenai/deep_qa
A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)
worldbank/iQual
iQual is a package that leverages natural language processing to scale up interpretative...
fhamborg/Giveme5W1H
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did...