gidim/Babler

Data Collection System For NLP/Speech Recognition

Emerging

Babler helps you gather relevant text conversations from Twitter, blogs, and forums in over 500 languages. You provide a list of keywords or topics you're interested in, and Babler automatically collects and cleans the corresponding posts. This is ideal for researchers, data scientists, or anyone needing real-world conversational data to train language models, perform sentiment analysis, or improve keyword search.
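
As a rough illustration of the keyword-driven collect-and-clean flow described above, the following Java sketch filters a list of already-fetched posts by keyword and strips URLs, @mentions, and extra whitespace. It is not Babler's actual API: the class `KeywordCollectorSketch`, its methods, and the cleaning rules are hypothetical, and fetching from Twitter, blogs, or forums is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Babler: keyword filtering plus basic cleaning.
public class KeywordCollectorSketch {
    private static final Pattern URL = Pattern.compile("https?://\\S+");
    private static final Pattern MENTION = Pattern.compile("@\\w+");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Remove URLs and @mentions, then collapse runs of whitespace.
    static String clean(String post) {
        String noUrls = URL.matcher(post).replaceAll(" ");
        String noMentions = MENTION.matcher(noUrls).replaceAll(" ");
        return WHITESPACE.matcher(noMentions).replaceAll(" ").trim();
    }

    // Case-insensitive substring match against any keyword.
    static boolean matchesAny(String text, Set<String> keywords) {
        String lower = text.toLowerCase(Locale.ROOT);
        for (String keyword : keywords) {
            if (lower.contains(keyword.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Keep only cleaned, non-empty posts that mention at least one keyword.
    static List<String> collect(List<String> posts, Set<String> keywords) {
        List<String> kept = new ArrayList<>();
        for (String post : posts) {
            String cleaned = clean(post);
            if (!cleaned.isEmpty() && matchesAny(cleaned, keywords)) {
                kept.add(cleaned);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> posts = List.of(
            "@sam check the weather today http://example.com/x",
            "nothing relevant here");
        System.out.println(collect(posts, Set.of("weather")));
    }
}
```

Substring matching is used here only to keep the sketch short; languages without whitespace word boundaries, or with rich morphology, would likely need a language-aware matcher.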

No commits in the last 6 months.

Use this if you need large amounts of clean, conversational text data for natural language processing or speech recognition tasks, especially in less common languages.

Not ideal if you need data from sources other than Twitter, blogs, or forums, or if you prefer a system with a graphical user interface.

natural-language-processing speech-recognition sentiment-analysis language-modeling data-collection

Stale (6 months), no package, no dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 18 / 25

Language: Java

License: Apache-2.0

Higher-rated alternatives

rosette-api/java

Babel Street Analytics Client Library for Java

kermitt2/entity-fishing

A machine learning tool for fishing entities

vinhkhuc/JFastText

Java interface for fastText

CeON/CERMINE

Content ExtRactor and MINEr

vinhkhuc/jcrfsuite

Java interface for CRFsuite: http://www.chokkan.org/software/crfsuite/
