ieg-dhr/NLP-Course4Humanities_2024

This repository is part of an NLP course for humanities and cultural studies. The course uses historical newspapers as its source material and applies NLP methods to them. Tasks covered: tokenization, lemmatization, TF-IDF, part-of-speech tagging, semantic search with transformers, article extraction and OCR post-correction with LLMs, named entity recognition (NER), and text classification.

/ 100

Emerging

This project helps humanities scholars and cultural studies researchers analyze large collections of historical newspaper texts. It takes raw historical newspaper data, often with OCR errors, and applies natural language processing techniques to extract insights. Researchers can identify key themes, recognize entities like people and places, and semantically search through articles.
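As an illustration of one of the listed techniques, the sketch below computes TF-IDF weights over a handful of short texts using only the Python standard library. It is not taken from the course notebooks: the toy corpus, the `tokenize` and `tf_idf` helper names, and the plain `tf * log(N / df)` weighting are all assumptions made for this example, and a real pipeline would likely add lemmatization and OCR cleanup first.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for digitized newspaper articles.
ARTICLES = [
    "The harbour strike ended on Monday.",
    "The strike spread to the railway workers.",
    "A new railway bridge opened on Monday.",
]


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens."""
    # [^\W\d_] matches any Unicode letter, so umlauts and accents survive.
    return re.findall(r"[^\W\d_]+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


scores = tf_idf(ARTICLES)
for article, weights in zip(ARTICLES, scores):
    ranked = sorted(weights, key=weights.get, reverse=True)
    print(article, "->", ranked[:3])
```

Terms that occur in only one article (such as "harbour") receive a higher weight than terms shared across articles (such as "strike"), which is the property that makes TF-IDF useful for surfacing the distinctive vocabulary of each article.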

No commits in the last 6 months.

Use this if you are a humanities or cultural studies researcher looking to apply computational methods to large historical text datasets, especially digitized newspapers.

Not ideal if you are a developer looking for an NLP library or a practitioner outside of humanities and cultural studies.

digital-humanities cultural-studies historical-research text-analysis newspaper-archives

No License, Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 16 / 25


Language: Jupyter Notebook

License: none

Higher-rated alternatives

natasha/natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

monikkinom/ner-lstm

Named Entity Recognition using multilayered bidirectional LSTM

ancatmara/data-science-nlp

NLP Section of the Data Science course, NRU HSE

mhbashari/awesome-persian-nlp-ir

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

soheil-mp/Natural-Language-Processing-Tutorials

NLP Webinars Created for Udacity's Mentorship Program (2019).
