brandonleekramer/tidyorgs

A tidy package that detects and standardizes organizations in unstructured text data


Experimental

tidyorgs helps researchers, analysts, and policymakers detect and standardize organization names in messy text data and classify them by sector (academia, business, government, or nonprofit). You provide unstructured text or email domains, and it returns standardized organization names with their sector classification. It is suited to analyzing affiliations in large datasets with inconsistent free-text entries.
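The input and output described above can be illustrated with a minimal sketch. This is not the tidyorgs API (the package itself is written in R); it is a standalone Python illustration of dictionary-based matching, and the lookup tables `ORG_PATTERNS` and `DOMAIN_TABLE` and the functions `standardize_text` and `standardize_email` are hypothetical names invented for this example.

```python
import re

# Hypothetical lookup tables for illustration only. A real tool would
# ship much larger, curated dictionaries of names and aliases.
ORG_PATTERNS = [
    (re.compile(r"\b(univ(ersity)? of virginia|uva)\b", re.I),
     "University of Virginia", "academic"),
    (re.compile(r"\b(google|alphabet)\b", re.I), "Google", "business"),
    (re.compile(r"\b(nasa|national aeronautics and space administration)\b", re.I),
     "NASA", "government"),
    (re.compile(r"\bmozilla\b", re.I), "Mozilla Foundation", "nonprofit"),
]

DOMAIN_TABLE = {
    "virginia.edu": ("University of Virginia", "academic"),
    "google.com": ("Google", "business"),
    "nasa.gov": ("NASA", "government"),
}


def standardize_text(text):
    """Return (standard_name, sector) for the first alias found, else None."""
    for pattern, name, sector in ORG_PATTERNS:
        if pattern.search(text):
            return name, sector
    return None


def standardize_email(email):
    """Match an email's domain, falling back to parent domains, else None."""
    domain = email.rsplit("@", 1)[-1].lower()
    parts = domain.split(".")
    # Try "cs.virginia.edu", then "virginia.edu", and so on.
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_TABLE:
            return DOMAIN_TABLE[candidate]
    return None


if __name__ == "__main__":
    print(standardize_text("PhD student @ UVA"))
    print(standardize_email("jane@cs.virginia.edu"))
    print(standardize_text("freelance"))
```

Entries that match no alias or domain return `None`, so unmatched rows can be reviewed separately rather than being forced into a sector.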

No commits in the last 6 months.

Use this if you need to clean and categorize organization names from raw text fields or email addresses for social, economic, or policy analysis.

Not ideal if your data is already perfectly standardized or if you only need to extract organizations without categorizing them by sector.

organizational-analysis social-research economic-analysis policy-analysis data-standardization

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 9 / 25

License: MIT

Higher-rated alternatives

quanteda/quanteda

An R package for the Quantitative Analysis of Textual Data

juliasilge/tidytext

Text mining using tidy tools

massimoaria/tall

Text Analysis for aLL

keyATM/keyATM

An R package for Keyword Assisted Topic Models

gagolews/stringi

Fast and Portable Character String Processing in R (with the Unicode ICU)
