Text Mining Fundamentals NLP Tools

Introductory courses, tutorials, and practical guides covering core text mining techniques, workflows, and applications. Includes repositories focused on teaching text processing, analysis methods, and statistical approaches to text data. Does NOT include domain-specific applications (sentiment analysis, fake news detection, etc.) or advanced specialized tools already categorized elsewhere.

77 text mining fundamentals tools are tracked. One scores above 50 (Established tier). The highest-rated is dipanjanS/text-analytics-with-python at 51/100 with 1,690 stars.

Get all 77 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=text-mining-fundamentals&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
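The same request can be made from Python. The sketch below uses only the standard library and reuses the URL and query parameters from the curl command above; the function names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the code returns whatever the endpoint sends without assuming any field names. The curl example passes `limit=20`; whether a larger value returns all 77 projects in one call is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command in this listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="text-mining-fundamentals", limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=10):
    """Send the GET request and decode the JSON body.

    The response schema is not described in this listing, so the decoded
    object is returned as-is.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` performs one request against the keyless quota of 100 requests/day, so it is worth caching the result locally when experimenting.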

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | dipanjanS/text-analytics-with-python | Learn how to process, classify, cluster, summarize, understand syntax,... | 51 | Established |
| 2 | jonathandunn/text_analytics | Basic text analytics and natural language processing in Python | 48 | Emerging |
| 3 | IBM/watson-document-co-relation | Correlate text content across documents using Watson NLU, Python NLTK and... | 44 | Emerging |
| 4 | Clarifai/clarifai-pyspark | Interfaces for Unstructured data and ML pipelines with Databricks and Clarifai | 42 | Emerging |
| 5 | umer7/Applied-Text-Mining-in-Python | Repo for Applied Text Mining in Python (Coursera) by University of Michigan | 41 | Emerging |
| 6 | EudaLabs/nlp | A repository for Natural Language Processing (NLP) projects, tools, and experiments. | 40 | Emerging |
| 7 | itrummer/NaturalMiner | Mine data for patterns described in natural language | 40 | Emerging |
| 8 | fingeredman/teanaps | An open-source Python library for natural language processing and text analysis. | 39 | Emerging |
| 9 | algonell/ipo-miner | IPO Investment via Text Mining. | 38 | Emerging |
| 10 | mchesterkadwell/intro-to-text-mining-with-python | Cambridge Digital Humanities 'Introduction to Text-Mining with Python'... | 38 | Emerging |
| 11 | mchesterkadwell/intro-to-text-mining-with-python-2020 | Cambridge Digital Humanities Learning, Methods Workshop: "Introduction to... | 37 | Emerging |
| 12 | zaratsian/Spark | Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References | 37 | Emerging |
| 13 | blanchefort/text_mining | A set of notebooks that tackle various tasks in the processing of natural... | 37 | Emerging |
| 14 | oroszgy/hungarian-text-mining-workshop | Materials for the Text Mining workshop held in the HuNLP meetup, June 2017 | 37 | Emerging |
| 15 | malares/STeM-Scientifc-Paper-Mining-Tool | STeM is a text mining tool to help scientists and researchers evaluate new... | 37 | Emerging |
| 16 | fingeredman/text-mining-for-practice | Covers how to perform text analysis using Python libraries. | 35 | Emerging |
| 17 | hhaoyan/awesome-textmining-materials-science | Collection of papers on text mining for materials science | 35 | Emerging |
| 18 | QData/TextAttack-WebDemo | TextAttack Web Demo | 35 | Emerging |
| 19 | JohnSnowLabs/spark-nlp-conda | Build and publish Spark NLP to Anaconda Cloud | 35 | Emerging |
| 20 | argilla-io/biome-text | Custom Natural Language Processing with big and small models 🌲🌱 | 35 | Emerging |
| 21 | mb010/Text2Tag | Code base for the analysis presented in Bowles et al. 2022: "Radio Galaxy... | 33 | Emerging |
| 22 | arshren/MachineLearning | Machine Learning documents | 33 | Emerging |
| 23 | remrama/krank | Fetch curated dream reports. | 32 | Emerging |
| 24 | buomsoo-kim/Introduction-to-text-mining-with-Python | Lectures in Urban Data Science Lab, Seoul | 31 | Emerging |
| 25 | lorenzoscottb/DReAMy | DReAMy: a library for dream-reports annotation methods with Python, NLP, and LLMs | 31 | Emerging |
| 26 | thatguy1104/NLP-Data-Mining-Engine | Our main project goals include trying to achieve a way for all researchers... | 31 | Emerging |
| 27 | MrpYA45/github-text-mining-tfg | We're aiming to create a tool which lets us experiment with text mining and... | 31 | Emerging |
| 28 | DmitrySerg/open-data | Collecting and analysing open data stuff | 31 | Emerging |
| 29 | aeleraqi/Text-Mining | Text mining techniques and workflows in Python | 30 | Emerging |
| 30 | ycatsh/connor | Organize and classify files based on their content using NLP | 30 | Emerging |
| 31 | SAP-samples/github-pull-analyzer | The GitHub Pull Request Analyzer (with SAP AI Core) automates the task of... | 30 | Emerging |
| 32 | Vaibhavabhaysharma/Applied-Text-Mining-in-Python | This repository contains solutions of the course-... | 29 | Experimental |
| 33 | prestondunton/marvel-dialogue-nlp | A machine learning project that will use Natural Language Processing (NLP)... | 29 | Experimental |
| 34 | juliasilge/ibm-ai-day | Presentation for IBM Community Day AI | 28 | Experimental |
| 35 | SciCrunch/Antibody-Watch | Antibody Watch: Text Mining Antibody Specificity from the Literature | 28 | Experimental |
| 36 | HimanshuMittal01/bagmodels | Various bag-of-words ML algorithms like BM25 | 27 | Experimental |
| 37 | StabRise/ScaleDP-Tutorials | Tutorials for ScaleDP library. ScaleDP is an Open-Source Library for... | 26 | Experimental |
| 38 | park1997/Industrial_safety_and_health_law-visualization | Visualization of the Occupational Safety and Health Act regulations; builds a similarity network among laws through text mining | 25 | Experimental |
| 39 | analyticalmonk/pyspark_nlp_workshop | Instructions and code for the workshop "From Big Data to NLP Insights:... | 24 | Experimental |
| 40 | cyidhn/texto | 📚 The Python library for textometry. | 24 | Experimental |
| 41 | fingeredman/text-mining-for-beginner | Covers everything from basic Python syntax to performing simple text analysis. | 23 | Experimental |
| 42 | AsadiAhmad/Ngram-Spark-Wikipedia | Calculating n-grams with PySpark for Wikipedia text | 23 | Experimental |
| 43 | AsadiAhmad/Word-Counter-Spark | Word counter with Spark | 23 | Experimental |
| 44 | AsadiAhmad/Edit-Distance-Spark | Calculating Edit Distance with PySpark | 23 | Experimental |
| 45 | fredriko/draviz | A method for assessing the data readiness of NLP projects, as well as the... | 23 | Experimental |
| 46 | Achint08/tech-diffusion | Patents data analysis on PySpark | 23 | Experimental |
| 47 | sudheera96/pyspark-textprocessing | Project on word count using PySpark in the Databricks cloud environment. | 23 | Experimental |
| 48 | paulbricman/memnav | Expanding propositional memory through text mining. | 22 | Experimental |
| 49 | fingeredman/advanced-text-mining | Covers natural language processing and text analysis methodology using the TEANAPS library. | 22 | Experimental |
| 50 | mucahidozcelik/NLP | Text Mining and Natural Language Processing | 22 | Experimental |
| 51 | thukg/AMinerOpen | An open source community that focuses on developing and publishing elegant... | 21 | Experimental |
| 52 | tkachuksergiy/aws-spark-nlp | Works related to recent project on the use of Apache Spark and AWS cloud for... | 20 | Experimental |
| 53 | ReAlex1902/Hawk | German documents analysis | 18 | Experimental |
| 54 | yashmanne/an_analysis_of_nothing | Exploring character occurrences and NLP with Seinfeld scripts. | 18 | Experimental |
| 55 | manmeetkaurbaxi/Analyzing-ACL-and-EMNLP-papers | Analyzing paper details of ACL and EMNLP from 2016-2021. | 18 | Experimental |
| 56 | Doubtable-Steves-Linguistics/MinecraftNLP | Natural Language Processing (NLP) project built to predict GitHub repository... | 17 | Experimental |
| 57 | Robin1999Stark/Recipe_Tagger | NLP Project for Auto Labeling Recipes | 17 | Experimental |
| 58 | exaiatech/cymo-tutorial | CYMO is a next-generation text mining and analytics software developed by Exaia | 17 | Experimental |
| 59 | MuzamilSaiq/toy-to-theory-bag-of-words | Pedagogical walkthrough of Bag of Words | 15 | Experimental |
| 60 | minven/nlp-lt | Natural Language Processing for Lithuanian language | 14 | Experimental |
| 61 | frances-ai/frances-api | frances is an advanced cloud-based text mining digital platform that... | 13 | Experimental |
| 62 | peetceenatoo/my-first-keyword-extractor | First steps into natural language processing | 13 | Experimental |
| 63 | VirtualRoyalty/spark-nlp-project | Micro project on big data technologies via Spark | 11 | Experimental |
| 64 | YukiChen-yuxin/proj_NLPbrl_DATA534 | The NLPbrl wrapper API is a package for wrapping The Rosette Text Analytics... | 11 | Experimental |
| 65 | ekardatos/TextAnalysisAndStatisticalTesting | Statistical hypothesis testing applied to linguistic text data. | 11 | Experimental |
| 66 | prishanmu/She-Ra | Text analysis of scripts from the recent reboot of She-Ra and the Princesses of Power | 11 | Experimental |
| 67 | hza2002/WordWise | A file manager that organizes text files by keyword-based topics. | 11 | Experimental |
| 68 | soumyajit4419/Advance-NLP-Text_Mining | Natural Language Processing, Text Mining, Natural Language Understanding | 11 | Experimental |
| 69 | bright1993ff66/Text-Data-Analysis | Projects, classes, and functions for my text mining | 11 | Experimental |
| 70 | minailkhani/Two-Information-Retrieval-projects | Some projects | 11 | Experimental |
| 71 | KatyaZeross/TrekPredict | An NLP analysis on the impact of Star Trek: The Next Generation's character... | 11 | Experimental |
| 72 | lambdaofgod/niph | Tools for searching and text mining transcribed podcasts | 11 | Experimental |
| 73 | N-y-c-t-o/Gutenberg-scribe-main | A Python-based project that processes and analyzes public-domain books from... | 11 | Experimental |
| 74 | djc06048/python-textmining | python-textmining | 10 | Experimental |
| 75 | golecalicja/thats-what-she-said | The Office NLP data analysis | 10 | Experimental |
| 76 | MouadEttali/NLP-and-Text_Mining | NLP and text mining tasks, starting off with the basics and then moving on... | 10 | Experimental |
| 77 | Stacy067/Text-mining-and-Natural-Language-Process | Text-mining and NLP | 10 | Experimental |