We provide ImageNet-1K training, ImageNet-22K pre-training, and ImageNet-1K fine-tuning commands here. Please check INSTALL.md for installation instructions first.
We use multi-node training on a SLURM cluster with submitit to produce the results and models in the paper. Please install it first:
pip install submitit
We will give example commands for both multi-node and single-machine training below.
ConvNeXt-T training on ImageNet-1K with 4 8-GPU nodes:
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_tiny --drop_path 0.1 \
--batch_size 128 --lr 4e-3 --update_freq 1 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
- You may need to change cluster-specific arguments in `run_with_submitit.py`.
- You can add `--use_amp true` to train in PyTorch's Automatic Mixed Precision (AMP).
- Use `--resume /path_or_url/to/checkpoint.pth` to resume training from a previous checkpoint; use `--auto_resume true` to auto-resume from the latest checkpoint in the specified output folder. A combined example is sketched after this list.
- `--batch_size`: batch size per GPU; `--update_freq`: gradient accumulation steps.
- The effective batch size = `--nodes` * `--ngpus` * `--batch_size` * `--update_freq`. In the example above, the effective batch size is 4*8*128*1 = 4096. You can adjust these four arguments together to keep the effective batch size at 4096 and avoid OOM issues, based on the model size, the number of nodes, and GPU memory.
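For example, to enable AMP and auto-resuming in the ConvNeXt-T run above, the command might look like the following. This is a sketch combining the flags documented above; cluster-specific submitit arguments are left at their defaults.

```bash
# Sketch: ConvNeXt-T on ImageNet-1K with AMP enabled, auto-resuming from the
# latest checkpoint in --job_dir if the job is interrupted and requeued.
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_tiny --drop_path 0.1 \
--batch_size 128 --lr 4e-3 --update_freq 1 \
--model_ema true --model_ema_eval true \
--use_amp true --auto_resume true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
```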
You can use the following command to run this experiment on a single machine:
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_tiny --drop_path 0.1 \
--batch_size 128 --lr 4e-3 --update_freq 4 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
- Here, the effective batch size = `--nproc_per_node` * `--batch_size` * `--update_freq`. In the example above, the effective batch size is 8*128*4 = 4096. Running on one machine, we increased `update_freq` so that the total batch size is unchanged (see the sketch below).
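The same arithmetic lets you adapt the command to other GPU counts: keep `--nproc_per_node` * `--batch_size` * `--update_freq` at 4096 by raising `--update_freq` (or lowering `--batch_size` if memory is tight). A sketch for a hypothetical 4-GPU machine:

```bash
# Sketch: 4 GPUs instead of 8, so update_freq doubles to keep the
# effective batch size at 4 * 128 * 8 = 4096.
python -m torch.distributed.launch --nproc_per_node=4 main.py \
--model convnext_tiny --drop_path 0.1 \
--batch_size 128 --lr 4e-3 --update_freq 8 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
```

Note that newer PyTorch releases deprecate `torch.distributed.launch` in favor of `torchrun`; `torchrun --nproc_per_node=8 main.py ...` should be a drop-in replacement, though the commands in this document are written for the former.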
To train other ConvNeXt variants, `--model` and `--drop_path` need to be changed. Examples are given below, each with both multi-node and single-machine commands:
ConvNeXt-S
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_small --drop_path 0.4 \
--batch_size 128 --lr 4e-3 --update_freq 1 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_small --drop_path 0.4 \
--batch_size 128 --lr 4e-3 --update_freq 4 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-B
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_base --drop_path 0.5 \
--batch_size 128 --lr 4e-3 --update_freq 1 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_base --drop_path 0.5 \
--batch_size 128 --lr 4e-3 --update_freq 4 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-L
Multi-node
python run_with_submitit.py --nodes 8 --ngpus 8 \
--model convnext_large --drop_path 0.5 \
--batch_size 64 --lr 4e-3 --update_freq 1 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_large --drop_path 0.5 \
--batch_size 64 --lr 4e-3 --update_freq 8 \
--model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-S (isotropic)
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_isotropic_small --drop_path 0.1 \
--batch_size 128 --lr 4e-3 --update_freq 1 \
--layer_scale_init_value 0 \
--warmup_epochs 50 --model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_isotropic_small --drop_path 0.1 \
--batch_size 128 --lr 4e-3 --update_freq 4 \
--layer_scale_init_value 0 \
--warmup_epochs 50 --model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-B (isotropic)
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_isotropic_base --drop_path 0.2 \
--batch_size 128 --lr 4e-3 --update_freq 1 \
--layer_scale_init_value 0 \
--warmup_epochs 50 --model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_isotropic_base --drop_path 0.2 \
--batch_size 128 --lr 4e-3 --update_freq 4 \
--layer_scale_init_value 0 \
--warmup_epochs 50 --model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-L (isotropic)
Multi-node
python run_with_submitit.py --nodes 8 --ngpus 8 \
--model convnext_isotropic_large --drop_path 0.5 \
--batch_size 64 --lr 4e-3 --update_freq 1 \
--warmup_epochs 50 --model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_isotropic_large --drop_path 0.5 \
--batch_size 64 --lr 4e-3 --update_freq 8 \
--warmup_epochs 50 --model_ema true --model_ema_eval true \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ImageNet-22K is significantly larger than ImageNet-1K, so we use 16 8-GPU nodes for ImageNet-22K pre-training.
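`--data_set image_folder` in the commands below expects a standard class-per-subdirectory layout (in the style of torchvision's `ImageFolder`), and `--nb_classes 21841` matches the number of ImageNet-22K synsets. A sketch of the assumed layout (synset names are illustrative):

```bash
# Assumed layout: one subdirectory of images per synset, 21841 folders in total.
ls /path/to/imagenet-22k
# n00004475  n00005787  n00005930  ...   (each containing *.JPEG files)
```

Since `--disable_eval true` skips evaluation during pre-training, no validation split is required under this path.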
ConvNeXt-B pre-training on ImageNet-22K:
Multi-node
python run_with_submitit.py --nodes 16 --ngpus 8 \
--model convnext_base --drop_path 0.1 \
--batch_size 32 --lr 4e-3 --update_freq 1 \
--warmup_epochs 5 --epochs 90 \
--data_set image_folder --nb_classes 21841 --disable_eval true \
--data_path /path/to/imagenet-22k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_base --drop_path 0.1 \
--batch_size 32 --lr 4e-3 --update_freq 16 \
--warmup_epochs 5 --epochs 90 \
--data_set image_folder --nb_classes 21841 --disable_eval true \
--data_path /path/to/imagenet-22k \
--output_dir /path/to/save_results
ConvNeXt-L
Multi-node
python run_with_submitit.py --nodes 16 --ngpus 8 \
--model convnext_large --drop_path 0.1 \
--batch_size 32 --lr 4e-3 --update_freq 1 \
--warmup_epochs 5 --epochs 90 \
--data_set image_folder --nb_classes 21841 --disable_eval true \
--data_path /path/to/imagenet-22k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_large --drop_path 0.1 \
--batch_size 32 --lr 4e-3 --update_freq 16 \
--warmup_epochs 5 --epochs 90 \
--data_set image_folder --nb_classes 21841 --disable_eval true \
--data_path /path/to/imagenet-22k \
--output_dir /path/to/save_results
ConvNeXt-XL
Multi-node
python run_with_submitit.py --nodes 16 --ngpus 8 \
--model convnext_xlarge --drop_path 0.2 \
--batch_size 32 --lr 4e-3 --update_freq 1 \
--warmup_epochs 5 --epochs 90 \
--data_set image_folder --nb_classes 21841 --disable_eval true \
--data_path /path/to/imagenet-22k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_xlarge --drop_path 0.2 \
--batch_size 32 --lr 4e-3 --update_freq 16 \
--warmup_epochs 5 --epochs 90 \
--data_set image_folder --nb_classes 21841 --disable_eval true \
--data_path /path/to/imagenet-22k \
--output_dir /path/to/save_results
The training commands given above for ImageNet-1K use the default resolution (224). We also fine-tune these trained models at a larger resolution (384). Please specify the path or URL to the checkpoint in `--finetune`.
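The recipes below also use `--layer_decay`, i.e. layer-wise learning-rate decay: parameter groups closer to the input receive geometrically smaller learning rates than the head. As a rough illustration only (the actual layer grouping is defined in the code; the group count of 12 below is an assumed value, not the repository's exact grouping):

```bash
# Illustrative arithmetic for --lr 5e-5 with --layer_decay 0.7: a group at
# depth i of N (head = N) is scaled by 0.7 ** (N - i).
python -c "
base_lr, decay, N = 5e-5, 0.7, 12  # N = 12 groups is an assumption
for i in (1, 6, 12):
    print(f'group {i:2d}: lr = {base_lr * decay ** (N - i):.2e}')
"
```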
ConvNeXt-B fine-tuning on ImageNet-1K (384x384):
Multi-node
python run_with_submitit.py --nodes 2 --ngpus 8 \
--model convnext_base --drop_path 0.8 --input_size 384 \
--batch_size 32 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.7 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_base --drop_path 0.8 --input_size 384 \
--batch_size 32 --lr 5e-5 --update_freq 2 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.7 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-L (384x384)
Multi-node
python run_with_submitit.py --nodes 2 --ngpus 8 \
--model convnext_large --drop_path 0.95 --input_size 384 \
--batch_size 32 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.7 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_large --drop_path 0.95 --input_size 384 \
--batch_size 32 --lr 5e-5 --update_freq 2 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.7 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
- The fine-tuning of ImageNet-1K pre-trained ConvNeXt-L starts from the best EMA weights obtained during pre-training. You can add `--model_key model_ema` to load the EMA weights from a saved checkpoint that has `model_ema` as a key (e.g., one obtained by training with `--model_ema true`). Note that our provided pre-trained checkpoints have `model` as their only key; a quick way to inspect the keys is shown below.
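To check which keys a given checkpoint actually contains before choosing `--model_key`, a quick inspection (illustrative; adjust the path):

```bash
# Prints the checkpoint's top-level keys, e.g. dict_keys(['model']) or
# dict_keys(['model', 'model_ema', 'optimizer', ...]) depending on how it was saved.
python -c "import torch; print(torch.load('/path/to/checkpoint.pth', map_location='cpu').keys())"
```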
We fine-tune from ImageNet-22K pre-trained models at both 224 and 384 resolutions.
ConvNeXt-B fine-tuning on ImageNet-1K (224x224)
Multi-node
python run_with_submitit.py --nodes 2 --ngpus 8 \
--model convnext_base --drop_path 0.2 --input_size 224 \
--batch_size 32 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_base --drop_path 0.2 --input_size 224 \
--batch_size 32 --lr 5e-5 --update_freq 2 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-L (224x224)
Multi-node
python run_with_submitit.py --nodes 2 --ngpus 8 \
--model convnext_large --drop_path 0.3 --input_size 224 \
--batch_size 32 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_large --drop_path 0.3 --input_size 224 \
--batch_size 32 --lr 5e-5 --update_freq 2 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-XL (224x224)
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_xlarge --drop_path 0.4 --input_size 224 \
--batch_size 16 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results \
--model_ema true --model_ema_eval true
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_xlarge --drop_path 0.4 --input_size 224 \
--batch_size 16 --lr 5e-5 --update_freq 4 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results \
--model_ema true --model_ema_eval true
ConvNeXt-B (384x384)
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_base --drop_path 0.2 --input_size 384 \
--batch_size 16 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_base --drop_path 0.2 --input_size 384 \
--batch_size 16 --lr 5e-5 --update_freq 4 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-L (384x384)
Multi-node
python run_with_submitit.py --nodes 4 --ngpus 8 \
--model convnext_large --drop_path 0.3 --input_size 384 \
--batch_size 16 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_large --drop_path 0.3 --input_size 384 \
--batch_size 16 --lr 5e-5 --update_freq 4 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results
ConvNeXt-XL (384x384)
Multi-node
python run_with_submitit.py --nodes 8 --ngpus 8 \
--model convnext_xlarge --drop_path 0.4 --input_size 384 \
--batch_size 8 --lr 5e-5 --update_freq 1 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--job_dir /path/to/save_results \
--model_ema true --model_ema_eval true
Single-machine
python -m torch.distributed.launch --nproc_per_node=8 main.py \
--model convnext_xlarge --drop_path 0.4 --input_size 384 \
--batch_size 8 --lr 5e-5 --update_freq 8 \
--warmup_epochs 0 --epochs 30 --weight_decay 1e-8 \
--layer_decay 0.8 --head_init_scale 0.001 --cutmix 0 --mixup 0 \
--finetune /path/to/checkpoint.pth \
--data_path /path/to/imagenet-1k \
--output_dir /path/to/save_results \
--model_ema true --model_ema_eval true