EsterHlav/Quantitative-Comparison-NLP-Embeddings-from-GloVe-to-RoBERTa
Fair quantitative comparison of NLP embeddings from GloVe to RoBERTa with Sequential Bayesian Optimization fine-tuning using Flair and SentEval. Extension of HyperOpt library to log_b priors.
This project helps machine-learning researchers and NLP practitioners understand which text-embedding models perform best on common tasks such as sentiment analysis or question answering. Given a set of NLP embedding architectures, it produces a quantitative comparison of their performance after fine-tuning. It is aimed at anyone who needs to choose the most effective embedding model for a specific natural language processing application.
No commits in the last 6 months.
Use this if you are developing an NLP system and need to quickly compare the performance of different text embedding models (like GloVe, BERT, or RoBERTa) on standard downstream tasks.
Not ideal if you are looking for a plug-and-play solution for a specific business problem, as this is a research benchmark rather than an application.
Stars
18
Forks
2
Language
Jupyter Notebook
License
—
Category
Last pushed
Sep 04, 2019
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/EsterHlav/Quantitative-Comparison-NLP-Embeddings-from-GloVe-to-RoBERTa"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
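The same endpoint can be called programmatically. A minimal Python sketch, assuming the endpoint returns JSON (the response schema is not documented here, so the payload is returned as-is):

```python
# Sketch: querying the quality API shown above from Python.
# Assumption: the endpoint returns a JSON body; its schema is not
# documented on this page, so we decode it generically.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (requires network access)."""
    with urllib.request.urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)


url = build_url(
    "EsterHlav",
    "Quantitative-Comparison-NLP-Embeddings-from-GloVe-to-RoBERTa",
)
print(url)
```

Within the free tier this can be called up to 100 times per day without a key.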
Higher-rated alternatives
MinishLab/model2vec
Fast State-of-the-Art Static Embeddings
AnswerDotAI/ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
tensorflow/hub
A library for transfer learning by reusing parts of TensorFlow models.
Embedding/Chinese-Word-Vectors
100+ pretrained Chinese word vectors
twang2218/vocab-coverage
An analysis of the Chinese-language cognitive ability of language models