GitHub - smarter-vlm/smarter: next gen smart vlm reasoner

Code and requirements

git clone https://github.com/D-Roberts/smarter.git
cd smarter
conda create --name smarter python=3.10
conda activate smarter
pip install -r requirements.txt

If conda is not installed, Miniconda is a lightweight way to get it.

Small Runs

To allow a small experiment without downloading the full dataset, one math puzzle is committed to the repo. See the SMART101 Data section of the README for how to download the full dataset needed to train and evaluate the final models.

To train and evaluate smaller models on this small subset, which gives initial insight into the deep learning training of vision-language reasoners, run the following from the repo root:

unzip small-data.zip

python main_reasoner.py \
--model_name fused_dinov2_siglip \
--log \
--word_embed siglip \
--save_root small-runs \
--data_root small-data/SMART101-Data \
--lr 0.0001 \
--wd 0.2 \
--batch_size 4 \
--num_heads 2 \
--repr_size 128 \
--qf_layer \
--eps 1e-8 \
--beta2 0.98 \
--pdrop 0.2 \
--ln_eps 1e-6 \
--h_sz 64 \
--seed 0 \
--num_workers 1 \
--data_tot 128 \
--train_diff easy \
--num_epochs 2 \
--puzzles 58

rm -rf small-data
rm -rf small-runs

The small dataset contains a single puzzle, 58 (illustrated in the article), which belongs to the math skill and has a multiple-choice answer. Skill-based accuracies will not be calculated in this run because they require at least 3 puzzles.

Args are described in main_reasoner.py.
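
As a rough illustration of how flags like those in the small-run command could be parsed, here is a minimal argparse sketch. The function name build_parser, the types, and the defaults are assumptions inferred from the example command, not copied from main_reasoner.py, which remains the authoritative reference. Only a subset of the flags is shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the small-run flags."""
    parser = argparse.ArgumentParser(
        description="Sketch of a reasoner CLI (assumed, not the real one)"
    )
    parser.add_argument("--model_name", type=str, default="fused_dinov2_siglip")
    parser.add_argument("--word_embed", type=str, default="siglip")
    parser.add_argument("--save_root", type=str, default="small-runs")
    parser.add_argument("--data_root", type=str, default="small-data/SMART101-Data")
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--wd", type=float, default=0.2)
    parser.add_argument("--batch_size", type=int, default=4)
    parser.add_argument("--num_epochs", type=int, default=2)
    parser.add_argument("--seed", type=int, default=0)
    # Passed without a value in the command, so modeled here as boolean switches.
    parser.add_argument("--log", action="store_true")
    parser.add_argument("--qf_layer", action="store_true")
    # Takes a puzzle id such as "58" or the literal "all", so kept as a string.
    parser.add_argument("--puzzles", type=str, default="all")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--lr", "0.0001", "--log", "--puzzles", "58"])
print(vars(args))
```

Keeping --puzzles as a string lets one flag cover both the single-puzzle small run and the "all" setting used for the final models.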

Machine learning experiment tracking with CometML

Experiments are tracked in CometML. A public account is available for trying out the code, and the experiment panels (loss and accuracy curves) can be viewed at https://www.comet.com/ai-daor/smarter/view/new/panels.

To create your own experiments, create a Comet API key and place it in the smarter/.comet_token file, then write your Comet account username to smarter/.comet_workspace, replacing the public one.
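
A minimal sketch of how the two credential files could be read before starting an experiment. The helper name read_comet_credentials is hypothetical, and the repo may load these files differently; the sketch only assumes each file holds a single value as plain text.

```python
from pathlib import Path


def read_comet_credentials(repo_root: str = ".") -> tuple[str, str]:
    """Return (api_key, workspace) read from .comet_token and .comet_workspace."""
    root = Path(repo_root)
    # strip() drops the trailing newline most editors add to the file.
    api_key = (root / ".comet_token").read_text(encoding="utf-8").strip()
    workspace = (root / ".comet_workspace").read_text(encoding="utf-8").strip()
    if not api_key or not workspace:
        raise ValueError("Both .comet_token and .comet_workspace must be non-empty")
    return api_key, workspace
```

The returned values could then be passed to the Comet SDK, for example as the api_key and workspace arguments of comet_ml.Experiment.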

SMART101 Data

To download the full SMART101 dataset (from MERL), run the get_SMART_data.sh script in the repository. Depending on the internet connection, the download can take 1 to 5 hours.

Final Models

To train and evaluate the final models, run the following from the repo root. This requires at least 40 GB of memory, at least 16 cores, and a V100 GPU (or an A100 or H100):

python main_reasoner.py \
--model_name fused_dinov2_siglip \
--log \
--word_embed siglip \
--save_root final-runs \
--data_root data/smart-data/SMART101-release-v1/SMART101-Data \
--lr 0.0003 \
--wd 0.2 \
--batch_size 128 \
--num_heads 2 \
--repr_size 128 \
--qf_layer \
--eps 1e-8 \
--beta2 0.98 \
--pdrop 0.2 \
--ln_eps 1e-6 \
--h_sz 256 \
--seed 0 \
--num_workers 16 \
--data_tot 1000 \
--train_diff easy \
--num_epochs 3 \
--puzzles all

Tested on Ubuntu 20.04 LTS, Mac (x86_64 and M1) CPU-only, V100, A100, and H100.
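
Before launching a final run, it can help to check the machine against the stated minimums of 16 cores and 40 GB of memory. The sketch below is not part of the repo: the function names are hypothetical, memory detection relies on os.sysconf (available on Unix-like systems only), and the GPU check is left out because it would require a deep learning framework.

```python
import os


def meets_minimums(cpu_cores: int, mem_gb: float,
                   min_cores: int = 16, min_mem_gb: float = 40.0) -> bool:
    """Compare detected resources with the minimums stated for final runs."""
    return cpu_cores >= min_cores and mem_gb >= min_mem_gb


def detect_resources() -> tuple[int, float]:
    """Return (cpu_cores, mem_gb); memory is 0.0 where it cannot be detected."""
    cores = os.cpu_count() or 0
    try:
        # Physical memory = page size * number of physical pages.
        mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        mem_bytes = 0
    return cores, mem_bytes / 1e9


cores, mem_gb = detect_resources()
print(f"cores={cores}, mem_gb={mem_gb:.1f}, ok={meets_minimums(cores, mem_gb)}")
```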
