EmilHvitfeldt/R-text-data

List of textual data sources to be used for text mining in R

31 / 100 (Emerging)

This is a curated list of textual data sources, primarily for those working with R, to practice text analysis and natural language processing. It provides readily available datasets ranging from classic literature to religious texts and TV show scripts, formatted for easy use. Text analysts, researchers, and students can use this to quickly obtain diverse text data for their projects without extensive data wrangling.

150 stars. No commits in the last 6 months.

Use this if you need various types of pre-processed text data to jumpstart your text mining or NLP project in R.

Not ideal if you're looking for a guide on how to perform text analysis or if you need to build custom datasets from raw web sources.

text analysis, natural language processing, R programming, data collection, linguistics research
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 13 / 25


Stars 150
Forks 15
Language (not listed)
License (not listed)
Last pushed Aug 17, 2021
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/EmilHvitfeldt/R-text-data"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
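
The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the URL pattern in the curl example generalizes to category/owner/repo, and that the endpoint returns a JSON body; neither is confirmed by this page. The names quality_url and fetch_quality are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the response.

    Assumption: the response body is JSON. No API key is sent, so the
    keyless daily limit applies.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("nlp", "EmilHvitfeldt", "R-text-data") issues the same request as the curl command. The function raises urllib.error.HTTPError on a non-2xx response, which is also how an exceeded rate limit would most likely surface.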