YuriyGuts/kaggle-quora-question-pairs
My solution to the Kaggle Quora Question Pairs competition (Top 2%, Private LB log loss 0.13497).
This project identifies duplicate questions in large text datasets so that similar inquiries can be grouped together. It takes pairs of questions as input and predicts, for each pair, whether the two questions are duplicates or distinct. This is useful for anyone managing user-generated content, such as forum moderators, social media analysts, or customer support knowledge base managers.
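For context on the score quoted above, here is a minimal Python sketch of binary log loss, the metric named in the description. It is an illustration only and is not taken from the repository; the function name `log_loss`, the clipping constant `eps`, and the sample labels and probabilities are all made up for the example.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip so that log(0) can never occur (eps is an illustrative choice).
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Hypothetical data: 1 = the pair is a duplicate, 0 = the pair is distinct.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.3]
score = log_loss(labels, probs)
print(f"log loss: {score:.5f}")
```

Lower values are better: confident correct predictions contribute little to the sum, while confident wrong predictions are penalized heavily.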
102 stars. No commits in the last 6 months.
Use this if you need to automatically detect and flag semantically similar questions in a large corpus of text to avoid redundancy or improve information retrieval.
Not ideal if you are looking for a simple, off-the-shelf solution without any setup or if your primary goal is not question deduplication.
Stars: 102
Forks: 38
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Nov 26, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/YuriyGuts/kaggle-quora-question-pairs"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
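The same request can be sketched in Python with only the standard library. This is a rough sketch under two assumptions not stated on this page: that the URL follows a category/owner/repository pattern (inferred from the single example above), and that the endpoint returns JSON. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Assemble the endpoint URL (the path pattern is an assumption)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and decode it, assuming the response body is JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = build_url("ml-frameworks", "YuriyGuts", "kaggle-quora-question-pairs")
print(url)
```

Calling `fetch_quality("ml-frameworks", "YuriyGuts", "kaggle-quora-question-pairs")` performs the actual network request and counts against the daily request limit described above.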
Higher-rated alternatives
patil-suraj/question_generation
Neural question generation using transformers
vinhkhuc/MemN2N-babi-python
End-To-End Memory Networks for bAbI question-answering tasks
nelson-liu/paraphrase-id-tensorflow
Various models and code (Manhattan LSTM, Siamese LSTM + Matching Layer, BiMPM) for the...
krishnap25/mauve
Package to compute Mauve, a similarity score between neural text and human text. Install with...
dtrizna/slp
Shell Language Processing (SLP). Pre-processing of sh/bash/zsh/.. commands for Machine Learning models.