cbaziotis/ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora: English Wikipedia and a Twitter corpus of 330 million English tweets.
This tool helps you clean up and standardize social media text, like tweets or Facebook posts, to prepare it for analysis. It takes raw, messy social text with hashtags, emojis, and slang, and outputs a more readable, corrected, and structured version. This is for anyone, like social media analysts or researchers, who needs to extract insights or build models from online conversations.
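To make the hashtag-splitting step concrete, here is a minimal, self-contained sketch of word segmentation driven by word statistics. It is not the ekphrasis implementation or API: the `COUNTS` table, the unknown-word penalty, and the function names `word_logprob`, `segment`, and `split_hashtag` are all assumptions made for illustration, and the toy counts stand in for the large-corpus statistics the tool is described as using.

```python
import math
from functools import lru_cache

# Hypothetical toy unigram counts; a real tool would load statistics
# computed from large corpora instead of this hand-written table.
COUNTS = {"the": 500, "best": 80, "day": 120, "ever": 60,
          "be": 200, "st": 1, "a": 400, "in": 300}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # assumed upper bound on the length of a single word


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    count = COUNTS.get(word)
    if count:
        return math.log(count / TOTAL)
    # Unknown words get a penalty that grows with their length (an assumption).
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Split a run of letters into the most probable sequence of words."""
    text = text.lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Returns (score, words) for the best segmentation of text[start:].
        if start == len(text):
            return 0.0, ()
        candidates = []
        for end in range(start + 1, min(len(text), start + MAX_WORD_LEN) + 1):
            word = text[start:end]
            tail_score, tail_words = best(end)
            candidates.append((word_logprob(word) + tail_score,
                               (word,) + tail_words))
        return max(candidates)

    return list(best(0)[1])


def split_hashtag(hashtag):
    """Drop the leading '#' and segment the remaining characters."""
    return segment(hashtag.lstrip("#"))


print(split_hashtag("#TheBestDayEver"))
```

With these toy counts, the known-word split scores higher than alternatives such as "the be st day ever", because the rare entry "st" lowers that candidate's total log-probability. Memoizing `best` keeps the search from re-scoring the same suffix repeatedly.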
675 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to process large volumes of user-generated text from platforms like Twitter or Facebook, and require accurate tokenization, hashtag splitting, and spell correction.
Not ideal if your primary text source is formal documents or traditional articles, as it's specifically optimized for the unique challenges of social media language.
Stars: 675
Forks: 95
Language: Python
License: MIT
Category:
Last pushed: Jun 02, 2025
Commits (30d): 0
Dependencies: 8
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/cbaziotis/ekphrasis"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
vi3k6i5/flashtext
Extract Keywords from sentence or Replace keywords in sentences.
alirezatheh/perke
A keyphrase extractor for Persian
andrewtavis/kwx
BERT, LDA, and TFIDF based keyword extraction in Python
lovit/KR-WordRank
A library that automatically extracts words and keywords from Korean text using unsupervised learning.
gagan3012/keytotext
Keywords to Sentences