semaj87/image-to-text-to-speech

An app that uses Hugging Face AI models together with OpenAI and LangChain to generate text from an image, then generate audio from that text.

32 / 100 (Emerging)

This tool helps content creators and educators transform static images into engaging audio stories. You provide an image, and it first describes the scene in text, then crafts a short narrative from that description, and finally converts the story into spoken audio. It's ideal for anyone looking to quickly add a verbal dimension to their visual content.
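The three stages described above (image to caption, caption to story, story to audio) can be sketched as a simple chain of functions. This is a minimal illustration and not the repository's actual code: `image_to_audio_story` and the three `fake_*` stages are hypothetical names introduced here. In a real setup each stage would presumably wrap a Hugging Face captioning model, an LLM call made through LangChain and OpenAI, and a text-to-speech model.

```python
from typing import Callable, Tuple


def image_to_audio_story(
    image_path: str,
    caption_image: Callable[[str], str],
    write_story: Callable[[str], str],
    synthesize_speech: Callable[[str], bytes],
) -> Tuple[str, str, bytes]:
    """Run the three stages in order and return (caption, story, audio)."""
    caption = caption_image(image_path)   # stage 1: describe the image
    story = write_story(caption)          # stage 2: turn the caption into a story
    audio = synthesize_speech(story)      # stage 3: turn the story into audio
    return caption, story, audio


# Hypothetical stand-in stages so the sketch runs without any model or API key.
def fake_caption(image_path: str) -> str:
    return f"a photo loaded from {image_path}"


def fake_story(caption: str) -> str:
    return f"Once upon a time, there was {caption}."


def fake_tts(story: str) -> bytes:
    # A real stage would return encoded audio; here we just return bytes.
    return story.encode("utf-8")


if __name__ == "__main__":
    caption, story, audio = image_to_audio_story(
        "beach.jpg", fake_caption, fake_story, fake_tts
    )
    print(caption)
    print(story)
    print(len(audio), "bytes of placeholder audio")
```

Passing the stages in as callables keeps the ordering logic separate from any particular model, so a stage can be swapped or stubbed without touching the others.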

No commits in the last 6 months.

Use this if you need to generate spoken narratives or audio descriptions from images for social media, educational materials, or presentations without manual writing or voice recording.

Not ideal if you require precise control over the story's plot, specific character dialogue, or professional-grade voice acting, as the AI generates the narrative and voice automatically.

content-creation digital-storytelling educational-content social-media-marketing accessibility-tools
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 10 / 25
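As a quick consistency check, the four sub-scores listed above add up to the overall 32 / 100. The snippet below assumes the overall score is a plain sum of the four categories; the listing does not state the formula, so treat that as an assumption that happens to match the numbers shown.

```python
# Sub-scores copied from the listing; each category is out of 25.
subscores = {"Maintenance": 0, "Adoption": 6, "Maturity": 16, "Community": 10}

total = sum(subscores.values())        # assumed: overall score is a plain sum
max_total = 25 * len(subscores)        # four categories of 25 points each

print(f"{total} / {max_total}")  # 32 / 100
```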


Stars: 15
Forks: 2
Language: Python
License: MIT
Last pushed: Nov 28, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/semaj87/image-to-text-to-speech"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
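The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: `build_quality_url` and `fetch_quality` are hypothetical helper names, the meaning of the path segments (category, owner, repository) is inferred from the curl example above, and the response is returned as raw text because its format is not documented here.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment meanings are an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the body as text (UTF-8 assumed)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network request, which counts against the daily limit.
    print(fetch_quality("prompt-engineering", "semaj87", "image-to-text-to-speech"))
```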