FareedKhan-dev/train-text2video-scratch
This repository provides a PyTorch implementation of a video diffusion model, similar to OpenAI's Sora, allowing you to train and generate videos from text prompts using a configurable architecture and diffusion process.
This project helps developers train custom models that generate short videos from text descriptions. You provide a dataset of short video clips paired with captions, and the system learns to produce new clips from fresh text prompts (e.g. "A person holding a camera" or "Spaceship crossing the bridge"). It is aimed at machine learning engineers, AI researchers, and data scientists working on generative AI applications.
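To make the training setup concrete, here is a minimal sketch of the DDPM-style forward noising step that text-to-video diffusion models of this kind are trained against. This is an illustration under assumed hyperparameters (1,000 steps, linear beta schedule), not this repository's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000                               # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule (assumed)
alphas_bar = np.cumprod(1.0 - betas)   # cumulative signal-retention factor

def add_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) = sqrt(abar_t)*x0 + sqrt(1 - abar_t)*eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return xt, eps  # the model learns to predict eps from (xt, t, text embedding)

# Tiny stand-in for a video clip: (frames, height, width, channels).
x0 = rng.standard_normal((8, 16, 16, 3))
xt, eps = add_noise(x0, t=T - 1)  # at the last step, xt is nearly pure noise
```

During training, random timesteps are drawn per clip and the denoising network's prediction of `eps` is regressed against the true noise; generation then runs this process in reverse from pure noise, conditioned on the text prompt.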
No commits in the last 6 months.
Use this if you need to train a text-to-video model from scratch or fine-tune an existing one using your own specialized video datasets.
Not ideal if you're looking for a user-friendly application to simply generate videos without deep technical involvement in model training and architecture.
Stars: 8
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jan 29, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/FareedKhan-dev/train-text2video-scratch"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
hao-ai-lab/FastVideo
A unified inference and post-training framework for accelerated video generation.
ModelTC/LightX2V
Light Image Video Generation Inference Framework
thu-ml/TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
PKU-YuanGroup/Helios
Helios: Real-Time Long Video Generation Model
PKU-YuanGroup/MagicTime
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators