NLP Resource Collections ML Frameworks
Curated lists, datasets, and reference materials for Natural Language Processing across languages and domains. This category does not include implementations of NLP models, tutorials, or frameworks; it covers only aggregated resources and paper collections.
17 NLP resource collections are tracked. Two score above 50 (Established tier). The highest-rated is jonathanwvd/awesome-industrial-datasets at 55/100 with 359 stars.
Get all 17 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=nlp-resource-collections&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
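The same request can be made from a script. The following is a minimal Python sketch that assumes only the endpoint and query parameters shown in the curl command above. The helper names `build_url` and `fetch_projects` are illustrative and not part of the API, and the shape of the returned JSON is not documented here, so the code returns the parsed response as-is rather than assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; nothing else about the API is assumed.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks",
              subcategory="nlp-resource-collections",
              limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and parse the body as JSON (hypothetical helper).

    The response structure is not assumed; the parsed object is returned unchanged.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request and counts against the daily limit):
#   data = fetch_projects(build_url())
#   print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending one of the 100 keyless requests allowed per day.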
| # | Framework | Score | Tier |
|---|---|---|---|
| 1 | jonathanwvd/awesome-industrial-datasets<br>A curated collection of public industrial datasets. | | Established |
| 2 | leomaurodesenv/game-datasets<br>🎮 A curated list of awesome game datasets, and tools to... | | Established |
| 3 | jsbroks/awesome-dataset-tools<br>🔧 A curated list of awesome dataset tools | | Emerging |
| 4 | NTMC-Community/awesome-neural-models-for-semantic-match<br>A curated list of papers dedicated to neural text (semantic) matching. | | Emerging |
| 5 | haiker2011/awesome-nlp-sentiment-analysis<br>📖 A collection of NLP datasets, papers, and open-source implementations, especially for sentiment analysis, emotion cause identification, and opinion target and opinion word extraction. | | Emerging |
| 6 | maastrichtlawtech/awesome-legal-nlp<br>📖 A curated list of LegalNLP resources from all around the web. | | Emerging |
| 7 | ml4code/ml4code.github.io<br>Website for "A Survey of Machine Learning for Big Code and Naturalness" | | Emerging |
| 8 | Jamie-Cui/paper-pulse<br>Automatically fetch, filter, and summarize research papers from arXiv & IACR... | | Emerging |
| 9 | Huffon/NLP101<br>NLP 101: a resource repository for Deep Learning and Natural Language Processing | | Emerging |
| 10 | coteries/cedille-ai<br>✒️ Cedille is a large French language model (6B), released under an... | | Emerging |
| 11 | vandroogenbroeckmarc/doi2bib<br>Tool to convert a DOI to a BiBTeX entry (mainly "adapted" for the computer... | | Emerging |
| 12 | enochkan/awesome-gans-and-deepfakes<br>A curated list of GAN & Deepfake papers and repositories. | | Emerging |
| 13 | MEgooneh/awesome-Iran-datasets<br>Iranian/Persian Datasets. | | Emerging |
| 14 | tushartushar/ML4SCA<br>Machine Learning for Source Code Analysis | | Emerging |
| 15 | bdqnghi/awesome-ai4code<br>A collection of recent papers, benchmarks and datasets of AI4Code domain. | | Experimental |
| 16 | sciknoworg/ald-ale-orkg-review<br>The repository contains code to automate extraction of review tables from... | | Experimental |
| 17 | nlx-group/study-of-commonsense-reasoning<br>Code and data for Masters Dissertation "A Study of Commonsense Reasoning... | | Experimental |