jeswanthmukesh20/VocalText-Contrastive-Embedding
This repository features a CLIP-inspired contrastive model that aligns audio signals and their transcripts in a shared embedding space, enabling bidirectional retrieval (Audio→Text and Text→Audio). It uses a frozen Whisper-large encoder with a lightweight trainable adapter for audio, and a trainable Nomic-embed-text-v1.5 encoder for text.
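As a rough illustration of what "CLIP-inspired contrastive" alignment usually means, the sketch below computes a symmetric cross-entropy loss over a temperature-scaled cosine-similarity matrix, then uses the same similarity for retrieval. It is not taken from the repository: the function names, the temperature of 0.07, and the plain Python lists standing in for encoder outputs are all assumptions made for this example, and the repository's actual loss and hyperparameters may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similarity_logits(audio_embs, text_embs, temperature=0.07):
    """Return logits[i][j] = cos(audio_i, text_j) / temperature.

    The temperature value is an assumption for this sketch, not a repo setting.
    """
    audio = [l2_normalize(v) for v in audio_embs]
    text = [l2_normalize(v) for v in text_embs]
    return [
        [sum(a * t for a, t in zip(a_vec, t_vec)) / temperature for t_vec in text]
        for a_vec in audio
    ]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(audio_embs, text_embs, temperature=0.07):
    """Average of the Audio→Text and Text→Audio cross-entropy terms."""
    logits = similarity_logits(audio_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


def retrieve(query_emb, candidate_embs):
    """Return the index of the candidate with the highest cosine similarity."""
    query = l2_normalize(query_emb)
    scores = [
        sum(q * c for q, c in zip(query, l2_normalize(cand)))
        for cand in candidate_embs
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Row `i` of the audio batch and row `i` of the text batch are treated as the matching pair, so the loss is low when each clip is most similar to its own transcript and high otherwise. Because both directions share one similarity matrix, `retrieve` works unchanged for Audio→Text and Text→Audio queries.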
20 / 100
Experimental
No commits in the last 6 months.
Stale 6m
No Package
No Dependents
Maintenance
2 / 25
Adoption
3 / 25
Maturity
15 / 25
Community
0 / 25
Stars
3
Forks
—
Language
Python
License
MIT
Category
Last pushed
Jul 22, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/jeswanthmukesh20/VocalText-Contrastive-Embedding"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
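The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is inferred from the curl example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is JSON, which this page does not state.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "jeswanthmukesh20", "VocalText-Contrastive-Embedding")` issues the same GET request as the curl command and counts against the daily request limit.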