towhee-io/towhee
Towhee is a framework dedicated to making neural data processing pipelines simple and fast.
This framework helps developers quickly build and optimize data processing pipelines for unstructured data like text, images, audio, and video. It takes in various raw data types and outputs transformed data, such as text, images, or numerical embeddings, ready for storage in systems like vector databases. It is aimed at developers and machine learning engineers who need to extract insights from large amounts of diverse unstructured data.
3,458 stars. No commits in the last 6 months. Available on PyPI.
Use this if you are a developer building applications that process and extract features from large volumes of unstructured data (text, images, video, audio) using advanced AI models, and you want to accelerate development and deployment.
Not ideal if you are an end-user without programming experience, as this is a developer tool requiring Python knowledge to implement and customize data pipelines.
Stars: 3,458
Forks: 262
Language: Python
License: Apache-2.0
Last pushed: Oct 18, 2024
Commits (30d): 0
Dependencies: 9
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/towhee-io/towhee"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
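The same request can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single example URL above, and the response format is not documented here, so the body is returned as raw text (assumed to be UTF-8) rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the three trailing path segments are category, owner,
    and repository name, as the single example URL suggests.
    """
    # Percent-encode each segment so unusual characters cannot alter the path.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body without a key (keyless tier).

    Assumption: the body is UTF-8 text; its structure is not specified
    in this listing, so no parsing is attempted.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("embeddings", "towhee-io", "towhee")` issues the same GET request as the curl command. Each call counts against the daily request limit.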
Related tools
deepset-ai/haystack-tutorials
Here you can find all the Tutorials for Haystack 📓
aryn-ai/sycamore
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
MaartenGr/PolyFuzz
Fuzzy string matching, grouping, and evaluation.
unum-cloud/USearch
Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C,...
pingcap/pytidb
TiDB AI SDK: Unified Multi-Modal Data Platform for AI Apps & Agents - https://pingcap.github.io/ai/