Text Mining Fundamentals NLP Tools
Introductory courses, tutorials, and practical guides covering core text mining techniques, workflows, and applications. Includes repositories focused on teaching text processing, analysis methods, and statistical approaches to text data. Does NOT include domain-specific applications (sentiment analysis, fake news detection, etc.) or advanced specialized tools already categorized elsewhere.
77 text mining fundamentals tools are tracked. One scores above 50 (established tier). The highest-rated is dipanjanS/text-analytics-with-python at 51/100 with 1,690 stars.
Get all 77 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=text-mining-fundamentals&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
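The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented on this page, so the sketch only decodes the body without assuming any fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="text-mining-fundamentals", limit=20):
    """Build the query URL with the same parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response structure is an unknown here, so the decoded object
    is returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request and counts against the daily limit):
#   data = fetch_projects(build_url())
#   print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to change the `limit` parameter or inspect the URL without spending a request.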
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | dipanjanS/text-analytics-with-python: Learn how to process, classify, cluster, summarize, understand syntax,... | 51 | Established |
| 2 | jonathandunn/text_analytics: Basic text analytics and natural language processing in Python | | Emerging |
| 3 | IBM/watson-document-co-relation: Correlate text content across documents using Watson NLU, Python NLTK and... | | Emerging |
| 4 | Clarifai/clarifai-pyspark: Interfaces for Unstructured data and ML pipelines with Databricks and Clarifai | | Emerging |
| 5 | umer7/Applied-Text-Mining-in-Python: Repo for Applied Text Mining in Python (Coursera) by University of Michigan | | Emerging |
| 6 | EudaLabs/nlp: A repository for Natural Language Processing (NLP) projects, tools, and experiments. | | Emerging |
| 7 | itrummer/NaturalMiner: Mine data for patterns described in natural language | | Emerging |
| 8 | fingeredman/teanaps: An open-source Python library for natural language processing and text analysis. | | Emerging |
| 9 | algonell/ipo-miner: IPO Investment via Text Mining. | | Emerging |
| 10 | mchesterkadwell/intro-to-text-mining-with-python: Cambridge Digital Humanities 'Introduction to Text-Mining with Python'... | | Emerging |
| 11 | mchesterkadwell/intro-to-text-mining-with-python-2020: Cambridge Digital Humanities Learning, Methods Workshop: "Introduction to... | | Emerging |
| 12 | zaratsian/Spark: Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References | | Emerging |
| 13 | blanchefort/text_mining: A set of notebooks that solve various problems in the processing of natural... | | Emerging |
| 14 | oroszgy/hungarian-text-mining-workshop: Materials for the Text Mining workshop held in the HuNLP meetup, June 2017 | | Emerging |
| 15 | malares/STeM-Scientifc-Paper-Mining-Tool: STeM is a text mining tool to help scientists and researchers evaluate new... | | Emerging |
| 16 | fingeredman/text-mining-for-practice: Covers how to perform text analysis using Python libraries. | | Emerging |
| 17 | hhaoyan/awesome-textmining-materials-science: Collection of papers on text mining for materials science | | Emerging |
| 18 | QData/TextAttack-WebDemo: TextAttack Web Demo | | Emerging |
| 19 | JohnSnowLabs/spark-nlp-conda: Build and publish Spark NLP to Anaconda Cloud | | Emerging |
| 20 | argilla-io/biome-text: Custom Natural Language Processing with big and small models 🌲🌱 | | Emerging |
| 21 | mb010/Text2Tag: Code base for the analysis presented in Bowles et al. 2022: "Radio Galaxy... | | Emerging |
| 22 | arshren/MachineLearning: Machine Learning documents | | Emerging |
| 23 | remrama/krank: Fetch curated dream reports. | | Emerging |
| 24 | buomsoo-kim/Introduction-to-text-mining-with-Python: Lectures in Urban Data Science Lab, Seoul | | Emerging |
| 25 | lorenzoscottb/DReAMy: DReAMy: a library for dream-reports annotation methods with Python, NLP, and LLMs | | Emerging |
| 26 | thatguy1104/NLP-Data-Mining-Engine: Our main project goals include trying to achieve a way for all researchers... | | Emerging |
| 27 | MrpYA45/github-text-mining-tfg: We're aiming to create a tool which lets us experiment with text mining and... | | Emerging |
| 28 | DmitrySerg/open-data: Collecting and analysing open data stuff | | Emerging |
| 29 | aeleraqi/Text-Mining: Text mining techniques and workflows in Python | | Emerging |
| 30 | ycatsh/connor: Organize and classify files based on their content using NLP | | Emerging |
| 31 | SAP-samples/github-pull-analyzer: The GitHub Pull Request Analyzer (with SAP AI Core) automates the task of... | | Emerging |
| 32 | Vaibhavabhaysharma/Applied-Text-Mining-in-Python: This repository contains solutions of the course-... | | Experimental |
| 33 | prestondunton/marvel-dialogue-nlp: A machine learning project that will use Natural Language Processing (NLP)... | | Experimental |
| 34 | juliasilge/ibm-ai-day: Presentation for IBM Community Day AI | | Experimental |
| 35 | SciCrunch/Antibody-Watch: Antibody Watch: Text Mining Antibody Specificity from the Literature | | Experimental |
| 36 | HimanshuMittal01/bagmodels: Various bag-of-words ML algorithms like BM25 | | Experimental |
| 37 | StabRise/ScaleDP-Tutorials: Tutorials for ScaleDP library. ScaleDP is an Open-Source Library for... | | Experimental |
| 38 | park1997/Industrial_safety_and_health_law-visualization: Visualization of the Occupational Safety and Health Act regulations; a similarity network among laws built through text mining | | Experimental |
| 39 | analyticalmonk/pyspark_nlp_workshop: Instructions and code for the workshop "From Big Data to NLP Insights:... | | Experimental |
| 40 | cyidhn/texto: 📚 The Python library for textometry. | | Experimental |
| 41 | fingeredman/text-mining-for-beginner: Covers everything from basic Python syntax to performing simple text analysis. | | Experimental |
| 42 | AsadiAhmad/Ngram-Spark-Wikipedia: Calculating n-grams with PySpark for Wikipedia text | | Experimental |
| 43 | AsadiAhmad/Word-Counter-Spark: Word counter with Spark | | Experimental |
| 44 | AsadiAhmad/Edit-Distance-Spark: Calculating Edit Distance with PySpark | | Experimental |
| 45 | fredriko/draviz: A method for assessing the data readiness of NLP projects, as well as the... | | Experimental |
| 46 | Achint08/tech-diffusion: Patents data analysis on PySpark | | Experimental |
| 47 | sudheera96/pyspark-textprocessing: Project on word count using PySpark in the Databricks cloud environment. | | Experimental |
| 48 | paulbricman/memnav: Expanding propositional memory through text mining. | | Experimental |
| 49 | fingeredman/advanced-text-mining: Covers natural language processing and text analysis methodology using the TEANAPS library. | | Experimental |
| 50 | mucahidozcelik/NLP: Text Mining and Natural Language Processing | | Experimental |
| 51 | thukg/AMinerOpen: An open source community that focuses on developing and publishing elegant... | | Experimental |
| 52 | tkachuksergiy/aws-spark-nlp: Works related to recent project on the use of Apache Spark and AWS cloud for... | | Experimental |
| 53 | ReAlex1902/Hawk: German documents analysis | | Experimental |
| 54 | yashmanne/an_analysis_of_nothing: Exploring character occurrences and NLP with Seinfeld scripts. | | Experimental |
| 55 | manmeetkaurbaxi/Analyzing-ACL-and-EMNLP-papers: Analyzing paper details of ACL and EMNLP from 2016-2021. | | Experimental |
| 56 | Doubtable-Steves-Linguistics/MinecraftNLP: Natural Language Processing (NLP) project built to predict GitHub repository... | | Experimental |
| 57 | Robin1999Stark/Recipe_Tagger: NLP Project for Auto Labeling Recipes | | Experimental |
| 58 | exaiatech/cymo-tutorial: CYMO is a next-generation text mining and analytics software developed by Exaia | | Experimental |
| 59 | MuzamilSaiq/toy-to-theory-bag-of-words: Pedagogical walkthrough of Bag of Words | | Experimental |
| 60 | minven/nlp-lt: Natural Language Processing for Lithuanian language | | Experimental |
| 61 | frances-ai/frances-api: frances is an advanced cloud-based text mining digital platform that... | | Experimental |
| 62 | peetceenatoo/my-first-keyword-extractor: First steps into natural language processing | | Experimental |
| 63 | VirtualRoyalty/spark-nlp-project: Micro project on big data technologies via Spark | | Experimental |
| 64 | YukiChen-yuxin/proj_NLPbrl_DATA534: The NLPbrl wrapper API is a package for wrapping The Rosette Text Analytics... | | Experimental |
| 65 | ekardatos/TextAnalysisAndStatisticalTesting: Statistical hypothesis testing applied to linguistic text data. | | Experimental |
| 66 | prishanmu/She-Ra: Text analysis of scripts from the recent reboot of She-Ra and the Princesses of Power | | Experimental |
| 67 | hza2002/WordWise: A file manager that organizes text files by keyword-based topics. | | Experimental |
| 68 | soumyajit4419/Advance-NLP-Text_Mining: Natural Language Processing, Text Mining, Natural Language Understanding | | Experimental |
| 69 | bright1993ff66/Text-Data-Analysis: Projects, classes, and functions for my text mining | | Experimental |
| 70 | minailkhani/Two-Information-Retrieval-projects: Some projects | | Experimental |
| 71 | KatyaZeross/TrekPredict: An NLP analysis on the impact of Star Trek: The Next Generation's character... | | Experimental |
| 72 | lambdaofgod/niph: Tools for searching and text mining transcribed podcasts | | Experimental |
| 73 | N-y-c-t-o/Gutenberg-scribe-main: A Python-based project that processes and analyzes public-domain books from... | | Experimental |
| 74 | djc06048/python-textmining: python-textmining | | Experimental |
| 75 | golecalicja/thats-what-she-said: The Office NLP data analysis | | Experimental |
| 76 | MouadEttali/NLP-and-Text_Mining: NLP and text mining tasks, starting off with the basics and then moving on... | | Experimental |
| 77 | Stacy067/Text-mining-and-Natural-Language-Process: Text-mining and NLP | | Experimental |