lgomezt/tidyX

Python package to clean raw tweets for ML applications.

27 / 100 (Experimental)

This tool helps researchers, marketers, and analysts turn messy raw text from social media platforms such as Twitter, particularly text in Spanish, into clean, structured data ready for analysis. It takes tweets and other short-form text and strips out noise such as URLs, hashtags, and emojis, producing output suited to natural language processing tasks such as sentiment analysis and topic modeling.
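The kind of cleaning described above can be sketched with the standard library alone. This is a minimal illustration and not tidyX's own API: the function name `clean_tweet` and the regular expressions are assumptions chosen for the example, and the package's actual behavior may differ.

```python
import re

# Hypothetical helper, NOT part of tidyX: a rough regex-based tweet cleaner.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # @user mentions
HASHTAG_RE = re.compile(r"#\w+")                # #hashtags
# Anything that is not a lowercase letter (including Spanish accented
# vowels, ü and ñ) or whitespace: punctuation, digits, emojis.
NON_LETTER_RE = re.compile(r"[^a-záéíóúüñ\s]")


def clean_tweet(text: str) -> str:
    """Lowercase the text and drop URLs, mentions, hashtags, and symbols."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    # Collapse the runs of spaces left behind by the substitutions.
    return " ".join(text.split())
```

Calling `clean_tweet("¡Qué día tan bonito! 😍 https://t.co/abc #Feliz @amiga")` returns `"qué día tan bonito"`. Note that this sketch discards hashtags entirely; keeping the hashtag word while dropping only the `#` is an equally reasonable choice depending on the analysis.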

No commits in the last 6 months.

Use this if you need to clean social media text, especially Spanish tweets, before feeding it to machine learning or other analytical tasks.

Not ideal if you mainly need deep linguistic analysis or are processing formal, highly structured text from outside social media.

social-media-analytics text-mining sentiment-analysis market-research public-opinion-analysis
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 4 / 25

Stars: 26
Forks: 1
Language: Python
License: MIT
Last pushed: Feb 20, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/lgomezt/tidyX"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
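For scripted use, the same request can be made from Python with the standard library. This is a rough sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single URL shown above, and the response is assumed (not confirmed here) to be JSON.

```python
import json
import urllib.request

# Base path taken from the curl example; the general path layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record, assuming the endpoint returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "lgomezt", "tidyX")
```

With the keyless limit of 100 requests/day, caching the result locally is advisable when looping over many repositories.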