Embodied Family Code Base

We will update the instructions for this codebase as soon as possible.

Installation

See INSTALLATION.md

Data Preparation

Download the EgoCOT dataset.
Download the COCO-2017 dataset.

Download the Pretrained Model

Download the testing model Embodied_family_7btiny.

Prepare the Text Data Paired with Video and Image

Unzip datasets_share.zip, which contains the text part of the multi-modal dataset, to the ./datasets/ directory.
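The unzip step can also be scripted. Below is a minimal sketch using Python's standard zipfile module. The helper name extract_archive is hypothetical (it is not part of this repo), and the commented call assumes datasets_share.zip has been downloaded to the current working directory.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the target directory if missing
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Example call (assumes the archive is in the current working directory):
# extract_archive("datasets_share.zip", "./datasets/")
```

Whether the archive already contains a top-level folder is not stated here, so check the resulting layout under ./datasets/ after extraction.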

🏠 Overview

🎁 Major Features

Usage

This repo can be used in conjunction with PyTorch's Dataset and DataLoader for training models on heterogeneous data. Here's a brief overview of the classes and their functionalities:

BaseDataset

The BaseDataset class extends PyTorch's Dataset and is designed to handle different media types (images, videos, and text). It applies a transformation step that standardizes the input data and uses a processor to handle the task-specific parts of each sample.

Example

from torch.utils.data import DataLoader

from robohusky.base_dataset_uni import BaseDataset

# `dataset` and `processor` are placeholders: define them before this call.
# Initialize the dataset with the required parameters
dataset = BaseDataset(
    dataset,  # Your dataset here
    processor,  # Your processor here
    image_path="path/to/images",
    input_size=224,
    num_segments=8,
    norm_type="openai",
    media_type="image",
)

# Use the dataset with a PyTorch DataLoader
data_loader = DataLoader(dataset, batch_size=32, shuffle=True)
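The num_segments argument is not explained in this README. One common reading, assumed for this sketch, is that a video is split into num_segments equal spans and one frame is taken from the middle of each. The function sample_frame_indices below is hypothetical and is not part of robohusky; check base_dataset_uni for the actual sampling logic.

```python
def sample_frame_indices(num_frames, num_segments=8):
    """Return one frame index from the middle of each of num_segments equal spans.

    Assumption: this mirrors a typical segment-based sampler, not necessarily
    the one used by BaseDataset.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    span = num_frames / num_segments  # span length, may be fractional
    # Clamp to the last valid frame index to stay in range.
    return [min(num_frames - 1, int(span * i + span / 2)) for i in range(num_segments)]
```

When a clip has fewer frames than segments, this sketch repeats indices rather than failing, so the output always has num_segments entries.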

WeightedConcatDataset

The WeightedConcatDataset class extends PyTorch's ConcatDataset and allows for the creation of a unified dataset by concatenating multiple datasets with specified weights.

Example

from torch.utils.data import DataLoader

from robohusky.base_dataset_uni import BaseDataset, WeightedConcatDataset

# Assume we have multiple datasets for different tasks
# (fill in the constructor arguments as in the BaseDataset example)
dataset1 = BaseDataset(...)
dataset2 = BaseDataset(...)
dataset3 = BaseDataset(...)

# Define the weights for each dataset (one weight per dataset, in the same order)
weights = [0.5, 0.3, 0.2]

# Create a weighted concatenated dataset
weighted_dataset = WeightedConcatDataset([dataset1, dataset2, dataset3], weights=weights)

# Use the weighted dataset with a PyTorch DataLoader
data_loader = DataLoader(weighted_dataset, batch_size=32, shuffle=True)
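How WeightedConcatDataset applies its weights internally is not described here. As an illustration only, the sketch below shows one way dataset-level weights could be turned into per-sample sampling weights, plus the global-to-local index lookup that PyTorch's ConcatDataset performs with cumulative sizes. Both function names are hypothetical and the code is plain Python, independent of robohusky.

```python
import bisect
from itertools import accumulate


def per_sample_weights(lengths, weights):
    """Spread each dataset's weight evenly over its samples."""
    if len(lengths) != len(weights):
        raise ValueError("need exactly one weight per dataset")
    expanded = []
    for n, w in zip(lengths, weights):
        if n == 0:
            continue  # an empty dataset contributes no samples
        expanded.extend([w / n] * n)
    return expanded


def locate(index, lengths):
    """Map a global index to (dataset_index, local_index)."""
    cumulative = list(accumulate(lengths))  # e.g. [2, 3, 5] -> [2, 5, 10]
    if not 0 <= index < cumulative[-1]:
        raise IndexError("index out of range")
    dataset_index = bisect.bisect_right(cumulative, index)
    offset = cumulative[dataset_index - 1] if dataset_index > 0 else 0
    return dataset_index, index - offset
```

A list shaped like the output of per_sample_weights (one entry per sample) is what torch.utils.data.WeightedRandomSampler takes as its weights argument, which is one way to draw from each dataset in proportion to its weight regardless of dataset size.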

Customization

The package is designed to be flexible and customizable. You can implement your own transformation and processing logic by subclassing BaseDataset and overriding the necessary methods.
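The subclass-and-override pattern can be sketched as follows. Because the methods BaseDataset exposes are not listed in this README, the example uses a stand-in base class; StandInDataset, preprocess, and LowercaseTextDataset are all hypothetical names chosen for illustration, so consult base_dataset_uni for the real hooks before adapting it.

```python
class StandInDataset:
    """Stand-in for BaseDataset (hypothetical; the real constructor differs)."""

    def __init__(self, records):
        self.records = list(records)

    def __len__(self):
        return len(self.records)

    def preprocess(self, record):
        """Hypothetical hook: subclasses override this to change processing."""
        return record

    def __getitem__(self, index):
        return self.preprocess(self.records[index])


class LowercaseTextDataset(StandInDataset):
    """Overrides only the hook; indexing and length are inherited."""

    def preprocess(self, record):
        # Return a new dict so the stored record is left unchanged.
        return {**record, "text": record["text"].strip().lower()}
```

Keeping the custom logic in a single overridden method means the subclass still works anywhere the base class does, including inside a DataLoader.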

🎫 License

This project is released under the Apache 2.0 license.

🖊️ Citation

If you find this project useful in your research, please consider citing:

@article{mu2024embodiedgpt,
  title={Embodiedgpt: Vision-language pre-training via embodied chain of thought},
  author={Mu, Yao and Zhang, Qinglong and Hu, Mengkang and Wang, Wenhai and Ding, Mingyu and Jin, Jun and Wang, Bin and Dai, Jifeng and Qiao, Yu and Luo, Ping},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}
