esantus/Outlier_Detection
Data and code for the experiments in the Outlier Detection task proposed by Camacho-Collados et al.
This project helps researchers and computational linguists identify unusual words within a group and estimate how similar two words are. You provide word lists or clusters and pre-computed word embeddings, and it outputs scores indicating outlier words and word similarity. It's designed for someone working with semantic relationships between words in linguistic datasets.
No commits in the last 6 months.
Use this if you need to programmatically find the 'odd one out' in a set of words or quantify the semantic relatedness between word pairs in research datasets.
Not ideal if you're looking for a user-friendly application with a graphical interface, as this requires command-line execution and Python scripting knowledge.
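To make the "odd one out" idea concrete, here is a minimal sketch, assuming each word already has a pre-computed embedding vector. It scores every word by its mean cosine similarity to the rest of the group and flags the lowest-scoring word as the outlier. The function names (`cosine`, `outlier_scores`, `find_outlier`) and the toy vectors are hypothetical and chosen for illustration only; the repository's own scoring and file formats may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def outlier_scores(words, embeddings):
    """Map each word to its mean cosine similarity with the other words."""
    if len(words) < 2:
        raise ValueError("need at least two words to compare")
    scores = {}
    for w in words:
        others = [o for o in words if o != w]
        sims = [cosine(embeddings[w], embeddings[o]) for o in others]
        scores[w] = sum(sims) / len(sims)
    return scores


def find_outlier(words, embeddings):
    """Return the word that is least similar, on average, to the rest."""
    scores = outlier_scores(words, embeddings)
    return min(scores, key=scores.get)


# Toy 3-dimensional vectors (assumed values, not real embeddings).
toy_embeddings = {
    "apple": [0.9, 0.1, 0.0],
    "banana": [0.8, 0.2, 0.1],
    "cherry": [0.85, 0.15, 0.05],
    "car": [0.0, 0.2, 0.9],
}
cluster = ["apple", "banana", "cherry", "car"]
outlier = find_outlier(cluster, toy_embeddings)
```

With these toy vectors, `outlier` is `"car"`, since its vector points away from the three fruit vectors. The same `cosine` helper also covers the word-pair similarity use case: call it directly on the two words' vectors.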
Stars: 13
Forks: 1
Language: Python
License: —
Category:
Last pushed: Aug 28, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/esantus/Outlier_Detection"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
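The same request can be made from Python with the standard library. This is a sketch under stated assumptions: the path layout (category, then owner, then repository) is inferred from the single URL shown above, and the response format is not documented here, so the body is returned as raw text. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the segment order after it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the endpoint and return the response body as text (format unspecified)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the URL needs no network access; fetch_quality() performs the request.
url = build_quality_url("embeddings", "esantus", "Outlier_Detection")
```

Calling `fetch_quality("embeddings", "esantus", "Outlier_Detection")` issues the request and counts against the daily limit.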
Higher-rated alternatives
TorchDR/TorchDR
TorchDR - PyTorch Dimensionality Reduction
derrickburns/generalized-kmeans-clustering
Production-ready K-Means clustering for Apache Spark with pluggable Bregman divergences (KL,...
abhilash1910/ClusterTransformer
Topic clustering library built on Transformer embeddings and cosine similarity...
md-experiments/picture_text
Interactive tree-maps with SBERT & Hierarchical Clustering (HAC)
mainlp/semantic_components
Finding semantic components in your neural representations.