LLM Learning Resources NLP Tools

20 LLM learning-resource tools are tracked. Two score above 50 (Established tier). The highest-rated is PaddlePaddle/ERNIE at 56/100 with 7,693 stars.

Get all 20 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=llm-learning-resources&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
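The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented on this page, the decoded JSON is returned as-is rather than parsed into fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="llm-learning-resources", limit=20):
    """Build the query URL with the same parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Request the URL and decode the body as JSON.

    The structure of the result is an assumption left to the caller:
    inspect it before relying on any particular key.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request and counts against the daily limit, so it is worth caching the result locally when experimenting.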

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | PaddlePaddle/ERNIE | The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade... | 56 | Established |
| 2 | eyurtsev/kor | LLM(😽) | 52 | Established |
| 3 | NiuTrans/NLPBook | A comprehensive book on neural networks and large language models in NLP | 46 | Emerging |
| 4 | bigscience-workshop/data-preparation | Code used for sourcing and cleaning the BigScience ROOTS corpus | 44 | Emerging |
| 5 | allenai/TOPICAL | :magic_wand::page_facing_up: TOPICAL: TOPIC pages AutomagicaLly | 41 | Emerging |
| 6 | ditto-assistant/nlp_server | NLP server housing intent and NER models as well as an LLM agent with long... | 32 | Emerging |
| 7 | guntas-13/CS613-NLP-Telugu-Team1 | Collecting data for Telugu LLM. Group Project in Natural Language Processing... | 31 | Emerging |
| 8 | mattbarreto/procesamiento-habla-nlp | Lab for Introduction to Natural Language Processing and LLMs -... | 28 | Experimental |
| 9 | veezbo/akkadian_english_corpus | Cleaned Akkadian English Corpus for LLMs | 20 | Experimental |
| 10 | MDFahimAnjum/LiPCoT | Linear Predictive Coding based Tokenizer for self-supervised learning of... | 20 | Experimental |
| 11 | bjam24/agh-natural-language-processing | This repository contains projects made for the NLP course at the AGH UST in... | 17 | Experimental |
| 12 | PeytonCleveland/Fair-Use | Collection of Python scripts for gathering, generating, cleaning and... | 17 | Experimental |
| 13 | hanguyenai/sudo-code-nlp | A Jupyter-based repository for exploring key concepts and techniques in... | 14 | Experimental |
| 14 | varun-suresh/experiments-with-gpt2 | Experimenting with GPT-2 and BERT | 13 | Experimental |
| 15 | Ryan-Iacovone/Database-Resources-LLM | Use natural language to connect patrons with library databases/resources | 13 | Experimental |
| 16 | r-kovalch/omnigec-data | End‑to‑end pipelines, notebooks and configs for assembling the multilingual... | 12 | Experimental |
| 17 | AliAtaollahi/NLP-Course-Projects | Projects of NLP Course at the University of Tehran; Spring 2024 | 12 | Experimental |
| 18 | progsi/YTUnCoverLLM | An NER dataset and LLM benchmark of music entities in user-generated content. | 11 | Experimental |
| 19 | aravind-selvam/NLP-notebooks | Repository for storing my NLP practice notebooks | 11 | Experimental |
| 20 | nourmorsy/PremioLLM | Investigates the effectiveness of various tokenization strategies in Arabic language | 11 | Experimental |