cbaziotis/ekphrasis

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora: English Wikipedia and a Twitter corpus of 330 million English tweets.

58 / 100 (Established)

This tool helps you clean up and standardize social media text, such as tweets or Facebook posts, to prepare it for analysis. It takes raw, messy social text containing hashtags, emojis, and slang, and outputs a more readable, corrected, and structured version. It is aimed at social media analysts, researchers, and anyone else who needs to extract insights or build models from online conversations.

675 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to process large volumes of user-generated text from platforms like Twitter or Facebook, and require accurate tokenization, hashtag splitting, and spell correction.

Not ideal if your primary text source is formal documents or traditional articles, as it's specifically optimized for the unique challenges of social media language.
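The hashtag splitting mentioned above relies on word statistics. The sketch below is not ekphrasis's own code; it is a minimal, generic illustration of unigram-based word segmentation, assuming a tiny made-up frequency table (`COUNTS`) in place of the real Wikipedia and Twitter statistics. The names `word_logprob`, `segment`, and `split_hashtag` are hypothetical.

```python
import math
from functools import lru_cache

# Toy unigram counts, invented for illustration only. A real system would
# load counts derived from large corpora instead.
COUNTS = {
    "best": 50, "day": 80, "ever": 40, "be": 100,
    "stay": 30, "home": 60, "a": 150, "at": 120,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # upper bound on the length of a candidate word


def word_logprob(word):
    """Log-probability of a single word under the toy unigram model."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    # Unknown words get a penalty that grows with their length, so the
    # search prefers splits made of known words.
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Split `text` into the most probable sequence of words."""
    text = text.lower()

    @lru_cache(maxsize=None)
    def best(i):
        # Best (score, words) for the suffix text[i:].
        if i == len(text):
            return (0.0, ())
        candidates = []
        for j in range(i + 1, min(len(text), i + MAX_WORD_LEN) + 1):
            word = text[i:j]
            tail_score, tail_words = best(j)
            candidates.append((word_logprob(word) + tail_score, (word,) + tail_words))
        return max(candidates)

    return list(best(0)[1])


def split_hashtag(tag):
    """Strip the leading '#' and segment the remainder."""
    return segment(tag.lstrip("#"))
```

With the toy counts above, `split_hashtag("#bestdayever")` returns `['best', 'day', 'ever']`. The quality of such a splitter depends almost entirely on the word statistics behind it, which is why a tool trained on social media text is likely to handle hashtags better than one built on formal prose.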

social-media-analysis sentiment-analysis text-mining natural-language-processing data-preparation
Stale 6m
Maintenance 2 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 21 / 25


Stars: 675
Forks: 95
Language: Python
License: MIT
Last pushed: Jun 02, 2025
Commits (30d): 0
Dependencies: 8

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/cbaziotis/ekphrasis"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
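For scripted access, the same request can be made from Python using only the standard library. This is a rough sketch: the helper names `build_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single curl example above, and the response is assumed to be JSON, which the page does not state.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, repo):
    """Build the endpoint URL, e.g. category='nlp', repo='cbaziotis/ekphrasis'.

    Assumption: other projects follow the same category/owner/repo pattern.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the quality data for one repository.

    Assumption: the endpoint returns a JSON body. No API key is sent,
    so the keyless daily limit applies.
    """
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "cbaziotis/ekphrasis")` issues the same GET request as the curl command. Since the response schema is not documented here, inspect the returned object before relying on any particular field.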