juletx/corpus-linguistics

Corpus Linguistics slides, labs, assignments and data

Experimental

This course material helps linguists and language researchers learn how to analyze large collections of text, known as corpora. You'll input raw text data and learn methods to extract insights like common word pairings (collocations) or significant terms (keywords). It's designed for anyone studying language who wants to use computational methods to understand how language is used in real-world contexts.
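As a rough illustration of the collocation analysis mentioned above, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI), one common association measure. It is not taken from the course materials: the function names `tokenize` and `bigram_pmi`, the `min_count` threshold, and the sample text are all assumptions chosen for this example, and it uses only the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(x, y) = log2(P(x, y) / (P(x) * P(y))), estimated from raw counts.
    Pairs seen fewer than min_count times are skipped, because PMI
    estimates are unreliable for rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy corpus; a real study would load a much larger text.
sample = (
    "Strong tea is popular. She drinks strong tea every day. "
    "Powerful computers are popular. He buys powerful computers."
)
ranked = bigram_pmi(tokenize(sample))
for pair, score in ranked:
    print(pair, round(score, 2))
```

On this toy text only the pairs that occur at least twice are scored. With a real corpus, a higher `min_count` and a proper tokenizer would be sensible, and measures such as log-likelihood are often preferred for keyword extraction.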

No commits in the last 6 months.

Use this if you are a linguistics student, researcher, or language enthusiast looking to understand and apply computational techniques to analyze large text datasets.

Not ideal if you are looking for a plug-and-play software tool for corpus analysis without learning the underlying methods and theory.

corpus-linguistics language-analysis text-mining linguistic-research computational-linguistics

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 8 / 25

Community 0 / 25

Higher-rated alternatives

Helsinki-NLP/OpusFilter

OpusFilter - Parallel corpus processing toolkit

natasha/corus

Links to Russian corpora + Python functions for loading and parsing

darija-open-dataset/dataset

darija <-> english dataset

omicsNLP/Auto-CORPus

Auto-CORPus pipeline developed by a University of Nottingham and Imperial College London...

SergeyShk/ruTS

Library for extracting statistics from Russian-language texts.
