# Multi-modal video seq2seq model

This model is a cascade of multiple networks for predicting video frames. The input can be an early fusion of different visual modalities (depth and RGB).

*An example of the model's architecture.*

> **Note:** Please refer to my bachelor's thesis for details: *Evaluating multi-stream networks for self-supervised representation learning*.
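The README does not spell out tensor formats, but early fusion simply means stacking the modalities along the channel axis before the first network of the cascade sees them. Below is a minimal NumPy sketch, assuming illustrative shapes; the model's actual input format is defined by the dataset pipeline:

```python
import numpy as np

# Hypothetical inputs: a clip of 16 RGB frames and the matching depth maps.
# Shapes are illustrative assumptions, not the model's actual input format.
rgb = np.random.rand(16, 128, 128, 3).astype(np.float32)    # (T, H, W, 3)
depth = np.random.rand(16, 128, 128, 1).astype(np.float32)  # (T, H, W, 1)

# Early fusion: concatenate the modalities along the channel axis,
# so the first network of the cascade sees a single 4-channel stream.
fused = np.concatenate([rgb, depth], axis=-1)                # (T, H, W, 4)
print(fused.shape)  # (16, 128, 128, 4)
```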

## Run experiments

1. Install YASSMLTK from https://git.tu-berlin.de/cvrs/mltk
2. Clone this project
3. Go to the experiment folder, e.g.

   ```sh
   cd <PATH_TO_THIS_REPO>/example
   ```

4. Edit the `config.yml` accordingly
5. Create the `DATA_DIR` if necessary, and go into it:

   ```sh
   cd <DATA_DIR>
   ```

6. Make sure you have at least 35 GB of free disk space
7. Download the example dataset `carla/default3_small`. This is a small dataset similar to `carla/default4`, which was used in the thesis.

   ```sh
   mkdir -p downloads/manual/carla/default3_small
   cd downloads/manual/carla/default3_small/

   # download the files manually or with e.g. wget (8 GB)
   wget https://tubcloud.tu-berlin.de/s/BMp8ZmZi3S3mxbq/download -O params.zip
   wget https://tubcloud.tu-berlin.de/s/mzGJB8wZRDCYTwa/download -O Town01_Opt.zip
   wget https://tubcloud.tu-berlin.de/s/J73sPnacQKFgttt/download -O Town10HD_Opt.zip
   ```

8. Go back to the experiment folder, e.g.

   ```sh
   cd <PATH_TO_THIS_REPO>/example
   ```

9. Run the experiment with YASSMLTK, e.g. (see the YASSMLTK documentation for parameters):

   ```sh
   # this downloads the Docker images (12 GB), extracts the downloaded zips (14 GB),
   # and trains and evaluates the experiment
   python -m yassmltk.run <PATH_TO_THIS_REPO>/example
   ```

10. All evaluation results are saved in `<PATH_TO_THIS_REPO>/example/eval`, e.g. the evaluation metrics in `metrics.yml` (see the sketch after this list for reading them programmatically)
11. Use `tensorboard --logdir .` to view the training curves and to track the training progress
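Since `metrics.yml` (step 10) is plain YAML, the evaluation results can also be read programmatically. A minimal sketch, assuming PyYAML is installed; which keys the file contains depends on your experiment config:

```python
import yaml  # pip install pyyaml

# Path from step 10; substitute <PATH_TO_THIS_REPO> with your checkout.
with open("<PATH_TO_THIS_REPO>/example/eval/metrics.yml") as f:
    metrics = yaml.safe_load(f)

# Which metrics appear here depends on the experiment configuration.
for name, value in metrics.items():
    print(f"{name}: {value}")
```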

If you want to use `carla/default4` (112 GB raw data, 44 GB TFRecords) instead, you need to generate it first with `lnschroeder/carla-dataset-generator`.

## Entry point to code

The YASSMLTK tool first calls `train()`, then `evaluate()`, in the `src.models.srivastava` module.
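The exact signatures are determined by YASSMLTK; the following is only an assumed sketch of the call order, with a hypothetical `load_config` helper and `config` argument:

```python
# Assumed call order, for orientation only; the real signatures are defined
# by YASSMLTK and src/models/srivastava.py.
from src.models import srivastava

config = load_config("config.yml")  # hypothetical helper; YASSMLTK parses the config
srivastava.train(config)            # trains the cascade and writes checkpoints
srivastava.evaluate(config)         # writes the metrics to eval/metrics.yml
```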

As a side note: Srivastava is the first author of the composite model on which our model is based. See: https://arxiv.org/abs/1502.04681.