TencentARC/AudioStory
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
AudioStory helps content creators, animators, and video editors generate long, coherent soundscapes and narratives for their projects. You provide text descriptions or existing video captions, and it produces high-quality, continuous audio, complete with sound effects, music, and spoken elements. It's ideal for anyone needing to create immersive sound without extensive audio production skills.
299 stars. No commits in the last 6 months.
Use this if you need to generate detailed, lengthy audio tracks, such as background sounds for animations, dubbing for videos, or natural sound narratives from text descriptions.
Not ideal if you primarily need to generate short, isolated sound effects or individual voice clips without a complex, evolving narrative.
Stars: 299
Forks: 22
Language: Jupyter Notebook
License: not listed
Category:
Last pushed: Sep 21, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/TencentARC/AudioStory"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
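The same request can be made from Python. The sketch below is a minimal illustration, not an official client: it assumes the endpoint path follows the category/owner/repo pattern seen in the curl command and that the response body is JSON, neither of which is confirmed here. The helper names build_quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command; everything after it is assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper, pattern inferred from one example)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumption: the server returns a JSON body. If it does not,
    json.loads raises a ValueError.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("diffusion", "TencentARC", "AudioStory"))
```

Without a key, requests count against the 100 requests/day limit, so a script that polls many repositories may want to cache responses locally.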
Higher-rated alternatives
PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster,...
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...