Gen1

My implementation of "Structure and Content-Guided Video Synthesis with Diffusion Models" by RunwayML. The paper describes the method as follows:

"Input videos x are encoded to z0 with a fixed encoder E and diffused to zt. We extract a structure representation s by encoding depth maps obtained with MiDaS, and a content representation c by encoding one of the frames with CLIP. The model then learns to reverse the diffusion process in the latent space, with the help of s, which gets concatenated to zt, as well as c, which is provided via cross-attention blocks. During inference (right), the structure s of an input video is provided in the same manner. To specify content via text, we convert CLIP text embeddings to image embeddings via a prior."
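The two conditioning paths in the description above (structure s concatenated to zt, content c supplied through cross-attention) can be illustrated with a small PyTorch sketch. This is not the code in this repository and not the paper's architecture: the class name `ConditionedBlock`, all channel sizes, and the single-frame treatment (no temporal layers, no timestep embedding) are assumptions made only to keep the example short.

```python
import torch
import torch.nn as nn


class ConditionedBlock(nn.Module):
    """Hypothetical toy denoiser block: s via channel concat, c via cross-attention."""

    def __init__(self, latent_ch=4, struct_ch=4, content_dim=32, hidden=32, heads=4):
        super().__init__()
        # Structure s is concatenated to z_t along the channel axis.
        self.in_proj = nn.Conv2d(latent_ch + struct_ch, hidden, kernel_size=3, padding=1)
        # Content c enters via cross-attention: queries come from the latents,
        # keys and values come from the content embedding.
        self.attn = nn.MultiheadAttention(
            hidden, heads, kdim=content_dim, vdim=content_dim, batch_first=True
        )
        self.out_proj = nn.Conv2d(hidden, latent_ch, kernel_size=3, padding=1)

    def forward(self, z_t, s, c):
        h = self.in_proj(torch.cat([z_t, s], dim=1))   # (B, hidden, H, W)
        b, ch, height, width = h.shape
        tokens = h.flatten(2).transpose(1, 2)          # (B, H*W, hidden)
        attended, _ = self.attn(tokens, c, c)          # c: (B, N, content_dim)
        tokens = tokens + attended                     # residual connection
        h = tokens.transpose(1, 2).reshape(b, ch, height, width)
        return self.out_proj(h)                        # same shape as z_t


# Random stand-ins for the noisy latent, the depth-derived structure
# representation, and a single CLIP-style content embedding.
z_t = torch.randn(2, 4, 16, 16)
s = torch.randn(2, 4, 16, 16)
c = torch.randn(2, 1, 32)

eps_hat = ConditionedBlock()(z_t, s, c)
print(eps_hat.shape)  # torch.Size([2, 4, 16, 16])
```

The output has the same shape as zt, as a noise prediction in latent space would. For a video, frames could be folded into the batch dimension before calling a block like this, although that ignores the temporal modeling the paper relies on.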

Install

pip3 install gen1

Usage

import torch
from gen1.model import Gen1

# Create an instance of the Gen1 model
model = Gen1()

# Generate random input images and video tensors
images = torch.randn(1, 3, 128, 128)
video = torch.randn(1, 3, 16, 128, 128)

# Pass the input images and video through the model's forward method
run_out = model.forward(images, video)

Datasets

Here is a summary table of the datasets used in the Structure and Content-Guided Video Synthesis with Diffusion Models paper:

Dataset	Type	Size	Domain	Description	Source
Internal dataset	Images	240M	General	Uncaptioned images	Private
Custom video dataset	Videos	6.4M clips	General	Uncaptioned short video clips	Private
DAVIS	Videos	-	General	Video object segmentation	Link
Stock footage	Videos	-	General	Diverse video clips	-

Citation

@misc{2302.03011,
Author = {Patrick Esser and Johnathan Chiu and Parmida Atighehchian and Jonathan Granskog and Anastasis Germanidis},
Title = {Structure and Content-Guided Video Synthesis with Diffusion Models},
Year = {2023},
Eprint = {arXiv:2302.03011},
}

Todo

Add training script
Add a conditional text parameter so that text can be passed in, not just images or other videos

