ajdavidl/Portuguese-NLP
List of resources and tools developed with a focus on Portuguese.
This project is a curated catalog of datasets, tools, and resources for working with the Portuguese language. It brings together text and speech data ranging from news articles and social media posts to medical texts and court decisions. It is aimed at linguists, researchers, and data scientists working on Portuguese natural language processing who need suitable data for their projects.
311 stars. No commits in the last 6 months.
Use this if you need to find specialized Portuguese language datasets for research, sentiment analysis, essay scoring, speech recognition, or other text-based applications.
Not ideal if you are looking for general-purpose language models or tools that are not specifically focused on the Portuguese language.
Stars: 311
Forks: 32
Language: not listed
License: not listed
Category:
Last pushed: Jun 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ajdavidl/Portuguese-NLP"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
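The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, so inspect the raw response before relying on any particular fields.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not,
    json.loads raises a ValueError.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url("ajdavidl", "Portuguese-NLP"))
```

Calling `fetch_quality("ajdavidl", "Portuguese-NLP")` performs the network request and counts against the daily request limit mentioned above.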
Higher-rated alternatives
thalesbertaglia/enelvo
A flexible normalizer for user-generated content
meedan/alegre
A text and media analysis service for Meedan Check, a collaborative media annotation platform
alan-barzilay/NLPortugues
NLPortuguês - Learn NLP in Portuguese! This repository contains the materials and exercises of the...
ulysses-camara/ulysses-segmenter
Pretrained segmenter models for Portuguese legislative text.
elenderg/PAL-1000
Integrated Development Environment (IDE) with a file explorer, an editor for...