Since this code is based on ScanRefer, you can use the same 3D features. Please also refer to the ScanRefer data preparation.
- Download the ScanQA dataset under `data/qa/`. The dataset entries have the following format:

  ```json
  "scene_id": [ScanNet scene id, e.g. "scene0000_00"],
  "object_id": [ScanNet object ids (corresponds to "objectId" in the ScanNet aggregation file), e.g. "[8]"],
  "object_names": [ScanNet object names (corresponds to "label" in the ScanNet aggregation file), e.g. ["cabinet"]],
  "question_id": [...],
  "question": [...],
  "answers": [...],
  ```
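As a quick sanity check after downloading, you can validate that each entry carries the fields listed above. The snippet below is a minimal sketch against a made-up sample entry (the concrete values are illustrative, not taken from the released files):

```python
# Illustrative ScanQA entry with the fields listed above;
# the concrete values here are made-up examples.
sample = {
    "scene_id": "scene0000_00",
    "object_id": [8],
    "object_names": ["cabinet"],
    "question_id": "train-scene0000_00-0",
    "question": "What is next to the bed?",
    "answers": ["cabinet"],
}

REQUIRED_FIELDS = {"scene_id", "object_id", "object_names",
                   "question_id", "question", "answers"}

def check_entry(entry):
    """Return the required fields missing from one ScanQA entry."""
    return sorted(REQUIRED_FIELDS - entry.keys())

missing = check_entry(sample)  # [] when the entry is complete
```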
- Download the preprocessed GloVe embeddings and put them under `data/`.
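The embedding download is typically a pickled dict mapping each token to a 300-dimensional vector. The snippet below sketches the expected access pattern against a tiny stand-in table; the exact file layout is an assumption, so verify it against the actual download:

```python
import pickle

# Stand-in for the downloaded embedding table: a pickled dict of
# token -> 300-d vector (assumed layout; verify against the real file).
toy_table = {"cabinet": [0.1] * 300, "bed": [0.2] * 300}
blob = pickle.dumps(toy_table)  # the real table would be read from disk
glove = pickle.loads(blob)

def embed(token, table, dim=300):
    """Look up a token, falling back to a zero vector for OOV words."""
    return table.get(token, [0.0] * dim)
```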
- Download the ScanNetV2 dataset and put (or link) `scans/` under (or to) `data/scannet/scans/` (please follow the ScanNet instructions for downloading the ScanNet dataset).
- Pre-process ScanNet data. A folder named `scannet_data/` will be generated under `data/scannet/` after running the following commands:

  ```shell
  cd data/scannet/
  python batch_load_scannet_data.py
  ```
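After the script finishes, you can sanity-check that the per-scene arrays were written. The suffixes below are an assumption based on ScanRefer-style preprocessing output, so adjust them if your checkout writes different files:

```python
from pathlib import Path

# Expected per-scene outputs (assumed from ScanRefer-style preprocessing;
# adjust the suffixes if your version writes different files).
SUFFIXES = ("_vert.npy", "_ins_label.npy", "_sem_label.npy", "_bbox.npy")

def missing_outputs(scannet_data_dir, scene_id):
    """List the expected files that are absent for one scene."""
    root = Path(scannet_data_dir)
    return [s for s in SUFFIXES if not (root / f"{scene_id}{s}").exists()]
```

For example, `missing_outputs("data/scannet/scannet_data", "scene0000_00")` should return `[]` once preprocessing has succeeded for that scene.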
- (Optional) Pre-process the multiview features from ENet.

  a. Download the ENet pretrained weights and put them under `data/`.

  b. Download and unzip the extracted ScanNet frames under `data/`.

  c. Change the data paths in `config.py` marked with TODO accordingly.

  d. Extract the ENet features:

     ```shell
     python scripts/compute_multiview_features.py
     ```

  e. Project ENet features from ScanNet frames to point clouds:

     ```shell
     python scripts/project_multiview_features.py --maxpool
     ```
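To make the `--maxpool` flag concrete: a 3D point is usually visible in several frames, and max-pooling fuses the per-frame features into one vector by taking the element-wise maximum. The toy illustration below shows only that aggregation step; the real script additionally projects ENet feature maps onto the point cloud using the camera poses:

```python
def maxpool_features(per_frame_feats):
    """Element-wise max over equal-length per-frame feature vectors."""
    return [max(vals) for vals in zip(*per_frame_feats)]

# A point seen in two frames, with toy 3-d features per frame:
fused = maxpool_features([[0.1, 0.9, 0.3],
                          [0.4, 0.2, 0.8]])
# fused == [0.4, 0.9, 0.8]
```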