gidim/Babler

Data Collection System For NLP/Speech Recognition

41 / 100 (Emerging)

Babler helps you gather relevant text conversations from Twitter, blogs, and forums in over 500 languages. You provide a list of keywords or topics you're interested in, and Babler automatically collects and cleans the corresponding posts. This is ideal for researchers, data scientists, or anyone needing real-world conversational data to train language models, perform sentiment analysis, or improve keyword search.

No commits in the last 6 months.

Use this if you need large amounts of clean, conversational text data for natural language processing or speech recognition tasks, especially in less common languages.

Not ideal if you need data from sources other than Twitter, blogs, or forums, or if you prefer a system with a graphical user interface.

natural-language-processing speech-recognition sentiment-analysis language-modeling data-collection
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 25
Forks: 12
Language: Java
License: Apache-2.0
Last pushed: Apr 20, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/gidim/Babler"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
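
The same request can be made from a script. The following is a minimal Python sketch, not an official client: it assumes the path segments after `/quality/` are a category (`nlp` here), the repository owner, and the repository name, as the curl example suggests, and it assumes the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical, and the response fields are not documented on this page, so the sketch does not rely on any of them.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the layout /quality/<category>/<owner>/<repo>, inferred
    from the single curl example; it is not a documented contract.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it (hypothetical helper).

    Assumes a JSON response body; the schema is unknown here, so the
    decoded object is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url("nlp", "gidim", "Babler"))
```

With the keyless limit of 100 requests/day, it is worth caching the result of `fetch_quality` locally rather than calling it on every run.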