Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters

This is the PyTorch implementation for our paper:

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
Federico Landi, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara
British Machine Vision Conference (BMVC), 2019
Oral Presentation

Visit the main website for more details.

Reference

If you use our code for your research, please cite our paper (BMVC 2019 oral):

Bibtex:

@inproceedings{landi2019embodied,
      title={Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters},
      author={Landi, Federico and Baraldi, Lorenzo and Corsini, Massimiliano and Cucchiara, Rita},
      booktitle={Proceedings of the British Machine Vision Conference},
      year={2019}
    }

Installation

Clone Repo

Clone the repository:

# Make sure to clone with --recursive
git clone --recursive https://github.com/fdlandi/DynamicConv-agent.git
cd DynamicConv-agent

If you didn't clone with the --recursive flag, then you'll need to manually clone the pybind submodule from the top-level directory:

git submodule update --init --recursive

Python setup

Python 3.6 is required to run our code. You can install the other modules via:

cd speaksee
pip install -e .
cd ..
pip install -r requirements.txt

Building with Docker

Please follow the instructions on the Matterport3DSimulator to install the simulator via Docker.

Bulding without Docker

The simulator can be built outside of a docker container using the cmake build commands described above. However, this is not the recommended approach, as all dependencies will need to be installed locally and may conflict with existing libraries. The main requirements are:

Ubuntu >= 14.04
Nvidia-driver with CUDA installed
C++ compiler with C++11 support
CMake >= 3.10
OpenCV >= 2.4 including 3.x
OpenGL
GLM
Numpy

Optional dependences (depending on the cmake rendering options):

OSMesa for OSMesa backend support
epoxy for EGL backend support

Build and Test

Build the simulator and run the unit tests:

cd DynamicConv-agent
mkdir build && cd build
cmake -DEGL_RENDERING=ON ..
make
cd ../
./build/tests ~Timing

If you use a conda environment for your experiments, you should specify the python path in the cmake options:

cmake -DEGL_RENDERING=ON -DPYTHON_EXECUTABLE:FILEPATH='path_to_your_python_bin' ..

Precomputing ResNet Image Features

Alternatively, skip the generation and just download and extract our tsv files into the img_features directory:

Training and Testing

You can train our agent by running:

python tasks/R2R/main.py

The number of dynamic filters can be set with the --num_heads parameter:

python tasks/R2R/main.py --num_heads=4

Reproducibility Note

Results in our paper were obtained with version v0.1 of the Matterport3DSimulator. Due to this difference, results could vary from the one in the paper. Using different GPUs for training, as well as different random seeds, may also affect results.

We provide the weights obtained with our training. To reproduce results from the paper, run:

python tasks/R2R/main.py --name=normal_data --num_heads=4 --eval_only

or:

python tasks/R2R/main.py --name=data_augmentation --num_heads=4 --eval_only

License

The Matterport3D dataset, and data derived from it, is released under the Matterport3D Terms of Use. Our code is released under the MIT license.

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
cmake/Modules		cmake/Modules
connectivity		connectivity
include		include
models		models
pybind11 @ 34c2281		pybind11 @ 34c2281
scripts		scripts
speaksee @ e2fd8bb		speaksee @ e2fd8bb
src		src
tasks/R2R		tasks/R2R
web		web
webgl_imgs		webgl_imgs
.gitmodules		.gitmodules
CMakeLists.txt		CMakeLists.txt
Dockerfile		Dockerfile
Doxyfile		Doxyfile
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters

Reference

Bibtex:

Installation

Clone Repo

Python setup

Building with Docker

Bulding without Docker

Build and Test

Precomputing ResNet Image Features

Training and Testing

Reproducibility Note

License

About

Releases

Packages

Languages

License

aimagelab/DynamicConv-agent

Folders and files

Latest commit

History

Repository files navigation

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters

Reference

Bibtex:

Installation

Clone Repo

Python setup

Building with Docker

Bulding without Docker

Build and Test

Precomputing ResNet Image Features

Training and Testing

Reproducibility Note

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages