EsterHlav/Quantitative-Comparison-NLP-Embeddings-from-GloVe-to-RoBERTa
Fair quantitative comparison of NLP embeddings from GloVe to RoBERTa with Sequential Bayesian Optimization fine-tuning using Flair and SentEval. Extension of HyperOpt library to log_b priors.
This project helps machine-learning researchers and NLP practitioners understand which text-embedding models perform best on common tasks such as sentiment analysis or question answering. Given a set of NLP embedding architectures, it produces a quantitative comparison of their performance after fine-tuning. It is aimed at anyone who needs to choose the most effective embedding model for a specific natural language processing application.
No commits in the last 6 months.
Use this if you are developing an NLP system and need to quickly compare the performance of different text embedding models (like GloVe, BERT, or RoBERTa) on standard downstream tasks.
Not ideal if you are looking for a plug-and-play solution for a specific business problem, as this is a research benchmark rather than an application.
Stars
18
Forks
2
Language
Jupyter Notebook
License
—
Category
Last pushed
Sep 04, 2019
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/EsterHlav/Quantitative-Comparison-NLP-Embeddings-from-GloVe-to-RoBERTa"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
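The same endpoint can be called programmatically. A minimal Python sketch, assuming the endpoint returns JSON (the response schema is not documented here, so the payload is returned as-is):

```python
# Sketch: querying the quality API shown above from Python.
# Assumption: the endpoint returns a JSON body; its schema is not
# documented on this page, so we decode it generically.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (requires network access)."""
    with urllib.request.urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)


url = build_url(
    "EsterHlav",
    "Quantitative-Comparison-NLP-Embeddings-from-GloVe-to-RoBERTa",
)
print(url)
```

Within the free tier this can be called up to 100 times per day without a key.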
Higher-rated alternatives
MinishLab/model2vec
Fast State-of-the-Art Static Embeddings
AnswerDotAI/ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
tensorflow/hub
A library for transfer learning by reusing parts of TensorFlow models.
Embedding/Chinese-Word-Vectors
100+ pretrained Chinese word vectors
twang2218/vocab-coverage
An analysis of the Chinese-language cognitive ability of language models