seehiong/voicedoc-agent

🎙️ Voice-native document intelligence using Gemini, ElevenLabs STT/TTS, and Datadog observability — turning text documents into spoken conversations.

33 / 100 (Emerging)

This project helps professionals deeply understand a single text document entirely through natural voice conversation. You upload a document (like a legal contract, financial report, or academic paper), and the system responds verbally to your questions, even adapting its tone and pace to match the document's subject matter. It's designed for anyone who needs to quickly grasp complex information from documents without typing or reading extensively.

Use this if you need to quickly and thoroughly comprehend a single complex document through spoken conversation, receiving nuanced, context-aware verbal explanations.

Not ideal if you need to compare or summarize information across many documents at once, as its focus is on deep interaction with one file.

document-analysis hands-free-research verbal-briefing information-retrieval policy-review
No Package · No Dependents
Maintenance 6 / 25
Adoption 7 / 25
Maturity 13 / 25
Community 7 / 25

Stars: 25
Forks: 2
Language: TypeScript
License: MIT
Last pushed: Dec 27, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/seehiong/voicedoc-agent"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
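
The same request can be made from TypeScript. The following is a minimal sketch, not an official client: only the URL comes from the curl example above, while the helper names `buildQualityUrl` and `fetchQuality` are hypothetical, and the response is assumed to be JSON with an undocumented shape, so it is typed as `unknown`. It also assumes a runtime with a global `fetch` (for example Node 18+ or a browser).

```typescript
// Base URL copied from the curl example; everything else here is illustrative.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/rag";

// Hypothetical helper: builds the endpoint URL for an owner/repo pair.
function buildQualityUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: requests the record and returns the parsed body.
// Assumption: the endpoint responds with JSON. Its schema is not documented
// in this listing, so the result is left as `unknown` for the caller to narrow.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(buildQualityUrl(owner, repo));
  if (!response.ok) {
    // Covers rate limiting (100 requests/day without a key) and other failures.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `fetchQuality("seehiong", "voicedoc-agent")` would request the same URL as the curl command. Inspect the returned object before relying on any particular field, since the field names are not specified here.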