vgupta123/P-SIF

Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging

/ 100

Emerging

This project helps researchers and data scientists represent text data as numerical vectors for machine learning tasks. It takes raw text documents (like news articles or tweets) and converts them into fixed-dimension vectors, which can then be used as input for classification, information retrieval, or semantic similarity models. This is useful for anyone working with large text datasets who needs to prepare them for computational analysis.
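To make the idea of "documents in, fixed-dimension vectors out" concrete, here is a minimal sketch of one common baseline: a frequency-weighted average of word vectors. It is not the repository's code and not the P-SIF algorithm itself. The toy 3-dimensional `WORD_VECTORS`, the helper names `word_probabilities`, `embed`, and `cosine`, and the smoothing constant `a` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy word vectors; real use would load pretrained embeddings.
WORD_VECTORS = {
    "stock": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "shares": [0.85, 0.15, 0.05],
    "football": [0.0, 0.9, 0.2],
    "match": [0.1, 0.8, 0.3],
    "goal": [0.05, 0.85, 0.25],
}

def word_probabilities(documents):
    """Estimate unigram probabilities from whitespace-tokenized documents."""
    counts = Counter(tok for doc in documents for tok in doc.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def embed(document, probs, a=1e-3):
    """Average the known word vectors, down-weighting frequent words by a / (a + p(w))."""
    dim = len(next(iter(WORD_VECTORS.values())))
    acc = [0.0] * dim
    n = 0
    for tok in document.lower().split():
        vec = WORD_VECTORS.get(tok)
        if vec is None:
            continue  # skip out-of-vocabulary tokens
        weight = a / (a + probs.get(tok, 0.0))
        for i, x in enumerate(vec):
            acc[i] += weight * x
        n += 1
    # A document with no known words maps to the zero vector.
    return [x / n for x in acc] if n else acc

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

docs = ["stock market shares", "football match goal", "market shares"]
probs = word_probabilities(docs)
vectors = [embed(d, probs) for d in docs]
```

Every document ends up with the same dimensionality regardless of its length, so the vectors can be fed to a classifier or compared directly, for example with `cosine(vectors[0], vectors[2])` to score how similar two documents are.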

No commits in the last 6 months.

Use this if you are a researcher or data scientist needing to transform diverse text documents into numerical representations for tasks like categorizing content or finding semantically similar texts.

Not ideal if you are looking for a plug-and-play solution without any programming or deep learning expertise.

text-classification information-retrieval natural-language-processing semantic-text-similarity machine-learning

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 17 / 25


Language: Python

License: none

Higher-rated alternatives

MilaNLProc/contextualized-topic-models

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings...

vinid/cade

Compass-aligned Distributional Embeddings. Align embeddings from different corpora

spcl/ncc

Neural Code Comprehension: A Learnable Representation of Code Semantics

criteo-research/CausE

Code for the RecSys 2018 paper entitled Causal Embeddings for Recommendation.

vintasoftware/entity-embed

PyTorch library for transforming entities like companies, products, etc. into vectors to support...
