unmonoqueteclea/voilib
🎧 Podcast Search Engine. Try it now for free or run your own instance.
Archived
Implements semantic search over podcast transcripts by dividing episodes into ~40-word fragments and storing their embeddings (384-dimensional vectors) in Qdrant. The pipeline chains OpenAI's Whisper for transcription, embedding generation for semantic indexing, and vector similarity search, and it supports both RSS-sourced podcasts and custom audio files. Deployable entirely self-hosted via Docker Compose with no external paid dependencies.
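The fragment-and-search idea in the description can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the project's code: `split_into_fragments`, `toy_embed`, and `search` are hypothetical names, the hashed bag-of-words vector stands in for a real embedding model, and a plain in-memory list stands in for Qdrant. Only the ~40-word fragment size and the 384 dimensions come from the description above.

```python
import hashlib
import math

FRAGMENT_WORDS = 40   # approximate fragment size from the description
VECTOR_SIZE = 384     # embedding dimensionality from the description


def split_into_fragments(transcript, size=FRAGMENT_WORDS):
    """Split a transcript into consecutive fragments of at most `size` words."""
    words = transcript.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embed(text, dim=VECTOR_SIZE):
    """Hypothetical stand-in for an embedding model (assumption, not the real one).

    Each word is hashed into one of `dim` buckets and the resulting count
    vector is scaled to unit length.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vec[index] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def search(query, fragments, top_k=3):
    """Return the top_k (score, fragment) pairs ranked by cosine similarity.

    Vectors are unit length, so the dot product equals the cosine similarity.
    A vector database would replace this linear scan.
    """
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, toy_embed(f))), f) for f in fragments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Because the toy vectors only capture shared words, this sketch matches on vocabulary overlap and not on meaning; a real embedding model is what makes the search semantic.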
No commits in the last 6 months.
Stars: 75
Forks: 6
Language: Python
License: GPL-3.0
Category
Last pushed: Oct 11, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/unmonoqueteclea/voilib"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
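The same request can be made from Python with the standard library. This is a minimal sketch: `quality_url` and `fetch_quality` are hypothetical helper names, the owner/repo path layout is inferred only from the single URL shown above, and the response format is not documented here, so the body is returned as raw text.

```python
import urllib.request

# Base path taken from the curl example above; treating the last two
# segments as owner and repository is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def quality_url(owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the endpoint and return the response body as text (needs network access)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("unmonoqueteclea", "voilib")` would issue the same GET request as the curl command.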
Higher-rated alternatives
DiceTechJobs/VectorsInSearch
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the...
IuriiD/pinecone-faiss-pgvector
Comparing vector DBs Pinecone, FAISS & pgvector in combination with OpenAI Embeddings for semantic search
lukovicaleksa/semantic-search-mongodb-fastapi
This project demonstrates how you can enhance standard CRUD operations in your application using...
DrRuin/Personalized-Real-Estate-Agent
In an industry where personalization is key to customer satisfaction, your company wants to...
nmdra/Semantic-Search
A semantic search system built with PostgreSQL and pgvector, powered by Gemini for generating...