This repository contains the source code and the Stanford Online Products dataset for the paper "Deep Metric Learning via Lifted Structured Feature Embedding" (CVPR16). The paper is available on cv-foundation. If you just need the Caffe code, check out the Submodule. For the loss layer implementation, look here.
If you find this work useful in your research, please consider citing:
@inproceedings{songCVPR16,
Author = {Hyun Oh Song and Yu Xiang and Stefanie Jegelka and Silvio Savarese},
Title = {Deep Metric Learning via Lifted Structured Feature Embedding},
Booktitle = {Computer Vision and Pattern Recognition (CVPR)},
Year = {2016}
}
- Install the prerequisites for `Caffe` (see: Caffe installation instructions).
- Compile the `Caffe-Deep-Metric-Learning-CVPR16` GitHub submodule.
- Download the pretrained GoogLeNet model from here.
- Download the ILSVRC12 ImageNet mean file for mean subtraction. Refer to the Caffe ImageNet examples here.
- Modify and run `code/gen_splits.m` to create the train/test split.
- Modify and run `code/gen_images.m` to prepare the preprocessed images.
- Generate the LMDB file to convert the training images to the DB format. Example scripts are in the `code/` directory.
- Modify and run `code/compile.m` to mex compile the cpp files used for LMDB generation.
- Modify `code/config.m` to set the save paths.
- Run `code/gen_caffe_dataset_multilabel_m128.m` to start the LMDB generation process.
- Create the `model/train*.prototxt` and `model/solver*.prototxt` files. Please refer to the included `*.prototxt` files in the `model/` directory for examples. You also need to provide the path to the ImageNet mean file (usually called `imagenet_mean.binaryproto`) that you downloaded in the earlier step.
- Inside the caffe submodule, launch the Caffe training procedure:
caffe/build/tools/caffe train -solver [path-to-solver-prototxt-file] -weights [path-to-pretrained-googlenet] -gpu [gpuid]
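The training invocation above can also be assembled from a script, which may help when sweeping over solver files or GPUs. The following is a minimal sketch and not part of this repository: the function names `build_train_command` and `run_training` are hypothetical, and the sketch only uses the `train` subcommand and the `-solver`, `-weights`, and `-gpu` flags that appear in the command above.

```python
import subprocess


def build_train_command(solver, weights, gpu_id,
                        caffe_bin="caffe/build/tools/caffe"):
    """Return the argument list for the Caffe training command shown above.

    All arguments are placeholders supplied by the caller; this function
    does not check that the files exist.
    """
    return [
        caffe_bin, "train",
        "-solver", str(solver),
        "-weights", str(weights),
        "-gpu", str(gpu_id),
    ]


def run_training(solver, weights, gpu_id):
    """Launch training; raises CalledProcessError if Caffe exits non-zero."""
    cmd = build_train_command(solver, weights, gpu_id)
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.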
- Modify and run `code/gen_caffe_validation_imageset.m` to convert the test images to LMDB format.
- Modify the test set path in `model/extract_googlenet*.prototxt`.
- Modify the model and test set path and run `code/compute_googlenet_distance_matrix_cuda_embeddings_liftedstructsim_softmax_pair_m128.py`.
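To illustrate what a distance matrix over extracted embeddings looks like, here is a small pure-Python sketch. It is an assumption-laden illustration rather than the repository's script, which runs through Caffe and, judging by its file name, CUDA. The function name `pairwise_squared_distances` and the choice of squared Euclidean distance are assumptions made for this example.

```python
def pairwise_squared_distances(embeddings):
    """Return the n x n matrix of squared Euclidean distances.

    `embeddings` is a list of equal-length lists of floats, one per image.
    """
    n = len(embeddings)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2
                    for a, b in zip(embeddings[i], embeddings[j]))
            dist[i][j] = d
            dist[j][i] = d  # the matrix is symmetric
    return dist
```

This loop is quadratic in the number of images, so it is only practical for small sanity checks.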
- Use `code/evaluation/evaluate_clustering.m` to evaluate the clustering performance.
- Use `code/evaluation/evaluate_recall.m` to evaluate recall@K for image retrieval.
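For readers unfamiliar with the metric, the sketch below shows one common way to compute recall@K from a distance matrix: a query counts as a hit if at least one of its K nearest neighbours (excluding the query itself) shares its class label. This is an assumption about the definition, written in Python for illustration; `recall_at_k` is a hypothetical name, and `evaluate_recall.m` may differ in details such as tie handling.

```python
def recall_at_k(dist, labels, k):
    """Fraction of queries with at least one same-label item among
    their k nearest neighbours (the query itself is excluded).

    `dist` is an n x n matrix (list of lists); `labels` has length n.
    """
    n = len(labels)
    hits = 0
    for i in range(n):
        # Rank every other item by its distance to query i.
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: dist[i][j])
        if any(labels[j] == labels[i] for j in neighbours[:k]):
            hits += 1
    return hits / n
```

Recall@K computed this way is non-decreasing in K, which makes a quick consistency check on reported numbers.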
You can download the Stanford Online Products dataset (2.9G) from ftp://cs.stanford.edu/cs/cvgl/Stanford_Online_Products.zip or https://drive.google.com/uc?export=download&id=1TclrpQOF_ullUP99wk_gjGN8pKvtErG8
- We also have text metadata for each product image. Please let us know if you are interested in using it.
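The download can be scripted as well. The following is a minimal sketch using only the Python standard library; it assumes the FTP server listed above is reachable from your machine, and the helper names `archive_name` and `download_and_extract` as well as the `data` target directory are hypothetical.

```python
import os
import urllib.request
import zipfile
from urllib.parse import urlparse

# URL taken from the download instructions above.
DATASET_URL = "ftp://cs.stanford.edu/cs/cvgl/Stanford_Online_Products.zip"


def archive_name(url):
    """Return the file name at the end of the URL path."""
    return os.path.basename(urlparse(url).path)


def download_and_extract(url=DATASET_URL, dest_dir="data"):
    """Download the archive into dest_dir (skipped if present) and unzip it."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, archive_name(url))
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(url, archive_path)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
    return archive_path
```

Calling `download_and_extract()` fetches roughly 2.9G, so make sure the target directory has enough free space.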
You can download our pre-trained models on the Cars196 dataset, the CUB200 dataset and the Online Products dataset (265M) from ftp://cs.stanford.edu/cs/cvgl/pretrained_models.zip
MIT Licence