hamelsmu/ktext
Utilities for preprocessing text for deep learning with Keras
This tool helps developers prepare raw text data for deep learning models, particularly when using the Keras framework. It takes unstructured text, cleans it by removing unwanted elements like phone numbers or HTML, breaks it into individual words, and then converts these words into numerical sequences that deep learning models can understand. The primary users are machine learning engineers or data scientists working with text-based AI applications.
180 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to quickly pre-process text data for a Keras deep learning model, especially if you have a large dataset that can fit into memory and you want to leverage parallel processing for speed.
Not ideal if your text data is too large to fit into a single computer's memory, or if you prefer using the more modern and maintained text processing layers built directly into Keras.
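The clean, tokenize, and convert-to-sequences steps described above can be sketched with the Python standard library alone. This is an illustration of the general technique, not ktext's API: the regular expressions, the `_phone_` placeholder, the reserved ids, and the function names (`clean`, `tokenize`, `build_vocab`, `vectorize`) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; ktext's own cleaning rules may differ.
HTML_TAG = re.compile(r"<[^>]+>")
PHONE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
TOKEN = re.compile(r"[a-z0-9_']+")

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary (assumed convention)


def clean(text):
    """Strip HTML tags, replace phone numbers with a placeholder, lowercase."""
    text = HTML_TAG.sub(" ", text)
    text = PHONE.sub(" _phone_ ", text)
    return text.lower()


def tokenize(text):
    """Split cleaned text into word tokens."""
    return TOKEN.findall(clean(text))


def build_vocab(docs, keep_n):
    """Map the keep_n most frequent tokens to integer ids starting at 2."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(keep_n))}


def vectorize(docs, vocab, maxlen):
    """Convert each document to a fixed-length list of ids, left-padded with PAD."""
    rows = []
    for doc in docs:
        ids = [vocab.get(tok, UNK) for tok in tokenize(doc)][:maxlen]
        rows.append([PAD] * (maxlen - len(ids)) + ids)
    return rows


if __name__ == "__main__":
    docs = ["<p>Call me at 555-123-4567 today</p>", "Call me today"]
    vocab = build_vocab(docs, keep_n=3)
    print(vectorize(docs, vocab, maxlen=6))
```

The resulting equal-length integer lists are the kind of input an embedding layer expects. Keeping only the most frequent tokens and mapping the rest to a single unknown id is one common way to bound vocabulary size; the whole corpus is held in memory here, which matches the constraint noted above.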
Stars: 180
Forks: 29
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Dec 08, 2022
Commits (30d): 0
Dependencies: 14
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/hamelsmu/ktext"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
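The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, reading the path as category/owner/repo is an assumption based on the single URL shown above, and the response is returned as raw text because its format is not documented here.

```python
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout assumed from the one example shown)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the endpoint and return the response body as text (format not assumed)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("nlp", "hamelsmu", "ktext")` issues the same GET request as the curl command and counts against the daily request limit.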
Related tools
chakki-works/seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Hironsan/anago
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.
jbesomi/texthero
Text preprocessing, representation and visualization from zero to hero.
asahi417/tner
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An...
Franck-Dernoncourt/NeuroNER
Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.