AstraBert/ingest-anything
From data to vector database effortlessly
This tool prepares files beyond PDFs, such as documents, code, or web content, for use in AI applications. It takes raw files or URLs, converts them into embeddings, and stores those embeddings in a vector database, which is the foundation for search and question-answering systems. It is aimed at AI developers and data scientists who need to populate vector databases with varied data types.
No commits in the last 6 months. Available on PyPI.
Use this if you are building an AI application and need a streamlined way to get various types of data (beyond just PDFs and Markdown) into a vector database for tasks like retrieval-augmented generation (RAG).
Not ideal if you primarily work with existing PDFs or Markdown files, or if you don't need to use a vector database for your application.
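The flow described above (raw text in, chunks embedded, vectors stored, similarity search out) can be illustrated with a minimal, standard-library-only sketch. This is not the ingest-anything API: `chunk_text`, `toy_embed`, and `InMemoryVectorStore` are hypothetical names, and the sparse bag-of-words vector only stands in for a real embedding model and a real vector database.

```python
import math
from collections import Counter


def chunk_text(text, max_words=50):
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text):
    """Stand-in embedding (assumption): an L2-normalized sparse bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a, b):
    """Dot product of two normalized sparse vectors, i.e. their cosine similarity."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database: a list of (vector, source, chunk)."""

    def __init__(self):
        self.records = []

    def ingest(self, source, text, max_words=50):
        # Chunk the document, embed each chunk, and store it with its source name.
        for chunk in chunk_text(text, max_words):
            self.records.append((toy_embed(chunk), source, chunk))

    def search(self, query, k=3):
        # Rank stored chunks by cosine similarity to the query vector.
        q = toy_embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[0]), reverse=True)
        return [(source, chunk) for _, source, chunk in ranked[:k]]
```

A real pipeline would swap `toy_embed` for a trained embedding model and `InMemoryVectorStore` for an actual vector database client; the chunk, embed, store, and search steps keep the same shape.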
Stars: 89
Forks: 12
Language: Python
License: MIT
Category:
Last pushed: May 17, 2025
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/AstraBert/ingest-anything"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
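The same request can be made from Python using only the standard library. This is a sketch under assumptions: the listing does not document the response format, so JSON is assumed here, and the helper names `build_quality_url` and `fetch_quality` as well as the reading of `vector-db` as a category segment are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL from its three path segments (segment meaning is assumed)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the listing data; assumes the endpoint returns a JSON body."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("vector-db", "AstraBert", "ingest-anything")
# print(data)
```

Keyless access is limited to 100 requests per day, so caching the result locally is sensible when polling several repositories.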
Related tools
pixeltable/pixeltable
Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.
activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store,...
superlinked/VectorHub
VectorHub is a free, open-source learning website for people (software developers to senior ML...
hhblaze/DBreeze
C# .NET NOSQL ( key value, object store embedded TextSearch SemanticSearch Vector layer ) ACID...
TileDB-Inc/TileDB-Vector-Search
Cloud-native vector similarity search and storage with efficient, serverless scale-out