jaschadub/VectorSmuggle
Testing platform for covert data exfiltration techniques in which sensitive documents are embedded into vector representations and tunneled out under the guise of legitimate RAG operations, bypassing traditional security controls and evading detection through semantic obfuscation.
This project helps security researchers and defenders understand how sensitive information can be covertly hidden in, and later extracted from, AI systems, especially those using Retrieval-Augmented Generation (RAG). It takes documents in various formats as input, embeds hidden data into their vector representations, and then demonstrates how that data can be queried and reconstructed, bypassing standard security measures. Red teamers, security architects, and AI/ML security engineers can use it to identify and mitigate such vulnerabilities.
Use this if you are an AI/ML security professional needing to test and understand how covert data exfiltration works in RAG-based systems.
Not ideal if you are looking for a general-purpose data encryption tool or a standard RAG system for legitimate data retrieval.
Stars: 67
Forks: 3
Language: Python
License: MIT
Category:
Last pushed: Feb 25, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/jaschadub/VectorSmuggle"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
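The same request can be made from Python. The following is a minimal sketch using only the standard library. The base URL and path are copied from the curl command above. The helper names `quality_url` and `fetch_quality` are hypothetical, reading the `vector-db` path segment as a category is an assumption, and so is the idea that the response body is JSON, since the response format is not documented here.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository (hypothetical helper).

    Assumption: the path is <category>/<owner>/<name>, as in the curl example.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the body.

    Assumption: the endpoint returns JSON; json.loads raises a ValueError
    subclass if it does not.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("vector-db", "jaschadub/VectorSmuggle")` would issue the same keyless request as the curl command, so it counts against the 100 requests/day limit.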
Higher-rated alternatives
pixeltable/pixeltable
Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.
activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store,...
superlinked/VectorHub
VectorHub is a free, open-source learning website for people (software developers to senior ML...
hhblaze/DBreeze
C# .NET NOSQL ( key value, object store embedded TextSearch SemanticSearch Vector layer ) ACID...
TileDB-Inc/TileDB-Vector-Search
Cloud-native vector similarity search and storage with efficient, serverless scale-out