This project combines the capabilities of VQGAN (Vector Quantized Generative Adversarial Networks) and CLIP (Contrastive Language-Image Pre-training) to generate video frames based on textual descriptions. The synthesis system applies a series of transformations to create dynamic, artistic representations of textual prompts.
- Text-to-image synthesis using VQGAN+CLIP.
- Image transformations (zoom, rotate, translate) to add visual motion between frames (see the sketch after this list).
- Output frames saved as images which can be compiled into videos.
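The zoom, rotate, and translate steps can be pictured as simple per-frame image operations. Below is a minimal illustrative sketch using Pillow; the function name and default values are assumptions, not this project's actual implementation, which may apply equivalent operations on tensors instead.

```python
# A minimal sketch of per-frame zoom/rotate/translate using Pillow.
# The function name and default values are illustrative assumptions.
from PIL import Image

def transform_frame(frame: Image.Image, zoom: float = 1.01,
                    angle: float = 0.5, shift: tuple = (2, 0)) -> Image.Image:
    w, h = frame.size
    # Zoom in slightly: crop a smaller centered region, then scale it back up.
    cw, ch = int(w / zoom), int(h / zoom)
    left, top = (w - cw) // 2, (h - ch) // 2
    frame = frame.crop((left, top, left + cw, top + ch)).resize((w, h), Image.LANCZOS)
    # Rotate around the center and shift by a few pixels per frame.
    return frame.rotate(angle, resample=Image.BILINEAR, translate=shift)
```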
- Python 3.8 or higher
- pip package manager
- Access to a CUDA-compatible GPU for faster processing (optional but recommended).
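To confirm that PyTorch can see your GPU (run this after installing the dependencies below), a quick check is:

```python
# Prints True if a CUDA-compatible GPU is visible to PyTorch.
import torch
print(torch.cuda.is_available())
```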
- Clone the repository
```bash
git clone https://github.com/your-repository/Text-to-Video-VQGAN-CLIP.git
cd Text-to-Video-VQGAN-CLIP
```
- Install dependencies
```bash
pip install -r requirements.txt
```
- Download the model files: fetch the necessary VQGAN model configuration and checkpoint files and place them in the directories specified in the main script (a scripted sketch follows this list).
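If you prefer to script the download, something like the following works. Both URLs and both file names below are placeholders only; substitute the actual model links and the paths src/main.py expects.

```python
# Sketch only: fetch the VQGAN config and checkpoint into ./checkpoints.
# Both URLs are placeholders -- replace them with the real model links,
# and adjust the paths to whatever src/main.py actually expects.
import os
import urllib.request

FILES = {
    "checkpoints/model.yaml": "https://example.com/vqgan_config.yaml",      # placeholder
    "checkpoints/model.ckpt": "https://example.com/vqgan_checkpoint.ckpt",  # placeholder
}

os.makedirs("checkpoints", exist_ok=True)
for path, url in FILES.items():
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
```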
Run the main script to generate frames based on predefined text prompts. Each frame is saved as an image in the output directory.
```bash
python src/main.py
```
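Once frames exist, they can be compiled into a video. One way is with imageio (install it separately if it is not in requirements.txt); the output directory and frame naming below are assumptions, so adjust them to match what the script actually writes.

```python
# A rough way to compile saved frames into an MP4 with imageio
# (requires: pip install imageio imageio-ffmpeg).
# Assumes frames land in ./output with sortable names, e.g. frame_0001.png.
import glob
import imageio.v2 as imageio

frame_paths = sorted(glob.glob("output/*.png"))
with imageio.get_writer("video.mp4", fps=24) as writer:
    for path in frame_paths:
        writer.append_data(imageio.imread(path))
```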
You can modify the text prompts directly in the main.py file to create different images.
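For example, the prompt definition might look something like this; the variable name and structure are hypothetical and may differ in the actual script.

```python
# Hypothetical excerpt from src/main.py -- the variable name and structure
# are assumptions; match them to whatever the script actually defines.
text_prompts = [
    "a watercolor painting of a city skyline at dusk",
    "the same skyline dissolving into a field of stars",
]
```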
Feel free to add more functionalities, such as:
- Real-time text input for generating images on the fly (a rough sketch follows this list).
- Integration with web frameworks for an interactive user interface.
- More complex transformations and effects.
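As a starting point for the real-time idea, a loop like the following could drive generation interactively. The generate() helper imported here is hypothetical: main.py would first need its generation code factored into a callable that takes a prompt and an output directory.

```python
# Rough interactive loop for the real-time idea. The generate() helper is
# hypothetical: main.py would first need its generation code factored into
# a callable that takes a prompt and an output directory.
from src.main import generate  # hypothetical import

while True:
    prompt = input("prompt> ").strip()
    if not prompt:
        break
    generate(prompt, out_dir="output")
```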