Implement the RBM layer to learn binary codes for large scale image retrieval #274

kloudkl · 2014-03-29T20:18:26Z

Using the Caffe reference ImageNet model, the features extracted for the training images of the ILSVRC 2013 image classification task take up more than 25GB when stored in leveldb. This footprint must be reduced for efficient and scalable large-scale image retrieval. In recent years, Restricted Boltzmann Machines (RBMs) have been successfully used to embed real-valued features into binary codes for computation- and storage-efficient document retrieval [1] and image retrieval [2][3][4]. Deep Boltzmann Machines (DBMs) consisting of stacked RBMs are able to map the floating-point features of similar images or texts into neighborhoods in Hamming space. Replacing each 32-bit float with a single bit shrinks storage by at least 32 times while preserving the similarities among the original features. The final binary features of the ILSVRC 2013 training images would be less than 1GB and fit comfortably in the memory of a single CPU host or GPU card.
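To make the size argument concrete, here is a rough back-of-the-envelope sketch. The image count and feature length below are assumptions chosen for illustration (the thread does not state them), and the estimate ignores leveldb and protobuf overhead.

```python
# Back-of-the-envelope storage estimate.
# NUM_IMAGES and FEATURE_DIM are assumptions, not values from this thread.
NUM_IMAGES = 1280000       # assumed: roughly 1.28 million training images
FEATURE_DIM = 4096         # assumed: length of one extracted feature vector
BYTES_PER_FLOAT32 = 4


def float_storage_bytes(num_images, dim):
    """Raw size of float32 features, without any database overhead."""
    return num_images * dim * BYTES_PER_FLOAT32


def binary_storage_bytes(num_images, dim):
    """One bit per dimension, rounded up to whole bytes per image."""
    return num_images * ((dim + 7) // 8)


float_bytes = float_storage_bytes(NUM_IMAGES, FEATURE_DIM)
binary_bytes = binary_storage_bytes(NUM_IMAGES, FEATURE_DIM)

print("float32 features: %.1f GB" % (float_bytes / 1e9))
print("binary codes:     %.2f GB" % (binary_bytes / 1e9))
print("reduction factor: %d" % (float_bytes // binary_bytes))
```

Under these assumptions the raw float32 features come to roughly 21 GB and the binary codes to well under 1 GB. The factor of 32 comes purely from replacing each 32-bit float with one bit; an RBM with fewer hidden units than input dimensions would shrink the codes further.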

This work is in progress.

[1] Salakhutdinov, R. R. and Hinton, G. E. Semantic Hashing. Proceedings of the SIGIR Workshop on Information Retrieval and Applications of Graphical Models, Amsterdam. 2007.
[2] Torralba, A., Fergus, R. and Weiss, Y. Small codes and large image databases for recognition. Proc. of CVPR, pages 1–8. 2008.
[3] Ranzato, M., Krizhevsky, A. and Hinton, G. E. Factored 3-way restricted Boltzmann machines for modeling natural images. Proc. Thirteenth International Conference on Artificial Intelligence and Statistics. 2010.
[4] Krizhevsky, A. and Hinton, G. E. Using Very Deep Autoencoders for Content-Based Image Retrieval. European Symposium on Artificial Neural Networks ESANN-2011, Bruges, Belgium. 2011.

shelhamer · 2014-04-04T04:21:16Z

Are you sure you want to do this as a Caffe layer instead of training RBMs on Caffe-extracted features with an existing package like Theano?

kloudkl · 2014-04-04T08:54:40Z

Good idea! But the trained model still needs to be converted to Caffe format to facilitate online applications.

kloudkl · 2014-04-08T06:18:37Z

It is very straightforward to train a Gaussian Binary RBM using pylearn2 to convert the real-valued Caffe features into binary codes.
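For reference, one common parameterisation of the hidden-unit conditional in a Gaussian-Binary RBM is sketched below. The symbols (weights W, hidden biases b, per-dimension standard deviations sigma, code bits c) are notation introduced here, not taken from the thread, and pylearn2's implementation may differ in details.

```latex
% Probability that hidden unit j is on, given real-valued visible vector v
p(h_j = 1 \mid \mathbf{v})
  = \operatorname{sigm}\!\Bigl( b_j + \sum_i W_{ij} \frac{v_i}{\sigma_i} \Bigr),
\qquad
\operatorname{sigm}(x) = \frac{1}{1 + e^{-x}}

% Binary code bit: threshold the probability at 1/2,
% which is the same as testing the sign of the pre-activation
c_j = \Bigl[\, p(h_j = 1 \mid \mathbf{v}) > \tfrac{1}{2} \,\Bigr]
    = \Bigl[\, b_j + \sum_i W_{ij} \frac{v_i}{\sigma_i} > 0 \,\Bigr]
```

If the features are standardised to unit variance, the sigma terms drop out and the code depends only on the sign of the biased matrix product.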

Convert Caffe features into pylearn2 dataset

import sys

import leveldb
import numpy as np
from pylearn2.datasets.dense_design_matrix import DenseDesignMatrix
from pylearn2.utils import serial

# Make the generated protobuf module caffe_pb2 importable
sys.path.append('../../python/caffe/proto')
from caffe_pb2 import Datum

db = leveldb.LevelDB('features_leveldb')
max_num_features = 1290
features = []
# Keys are assumed to be the decimal strings "0", "1", ...
for index in range(max_num_features):
    try:
        value = db.Get(str(index))
    except KeyError:
        # Skip keys that are missing from the database
        continue
    datum = Datum.FromString(value)
    features.append(list(datum.float_data))

# One row per image, one column per feature dimension
matrix = DenseDesignMatrix(X=np.array(features, dtype=np.float32))
serial.save('features.joblib', matrix)

Train a Gaussian Binary RBM model

PYLEARN_ROOT=/path/to/pylearn2
sudo PATH=/usr/local/cuda/bin:$PATH THEANO_FLAGS="device=gpu,floatX=float32" $PYLEARN_ROOT/pylearn2/scripts/train.py features_grbm_smd.yaml

Use the learned weight matrix

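model = serial.load(model_file)
# get_weights() returns a numpy array with the shape (input_dim, output_dim)
weights = model.get_weights()
# Matrix product, not elementwise `*`; hidden biases are ignored here for brevity
activations = np.dot(np.asarray(features, dtype=np.float32), weights)
# sigmoid(a) > 0.5 is equivalent to a > 0
binary_features = (activations > 0).astype(np.float32)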

The binary_features are still stored as np.float32 and need to be further compacted into bits if you want to save more space.
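A minimal sketch of that compaction step, assuming binary_features is an (n, d) array of 0/1 values. The helper names pack_codes and hamming_distances are hypothetical, and the 16-bit toy codes are made up for illustration.

```python
import numpy as np


def pack_codes(binary_features):
    """Pack an (n, d) array of 0/1 values into an (n, ceil(d / 8)) uint8 array."""
    bits = np.asarray(binary_features) > 0.5
    return np.packbits(bits, axis=1)


# Lookup table: number of set bits in each of the 256 possible byte values
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def hamming_distances(query_code, database_codes):
    """Hamming distance from one packed code to every row of packed codes."""
    xor = np.bitwise_xor(database_codes, query_code)
    return POPCOUNT[xor].sum(axis=1)


# Toy example with hypothetical 16-bit codes: the second row differs from
# the first in one bit, the third row is the bitwise complement of the first.
codes = np.array(
    [[1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0],
     [1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1],
     [0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1]],
    dtype=np.float32)

packed = pack_codes(codes)                      # shape (3, 2), dtype uint8
distances = hamming_distances(packed[0], packed)
print(distances.tolist())
```

Packing stores eight code bits per byte, and nearest neighbours can then be ranked with XOR plus a bit count instead of floating-point distance computations. The byte lookup table is one simple way to count bits; faster popcount routines exist but are not shown here.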

shelhamer · 2014-04-09T17:34:33Z

@kloudkl now that you have done this in pylearn2, it would be nice to compare it with a Caffe implementation. Will you resume your work on this?

kloudkl · 2014-04-10T00:50:22Z

Since #308 has almost solved #189, which is one of the issues in milestone 1.0, I will first finish #244 to solve #148 as soon as possible.

kloudkl added 3 commits March 30, 2014 03:37

Implement caffe CPU and GPU sigmoid math functions

cf5adb3

Add Restricted Boltzmann Machine (RBM) layer

7f43120

Add the RBMLayer to the layer factory

1420980

kloudkl mentioned this pull request Apr 4, 2014

Image retrieval example #243

Closed

kloudkl closed this Apr 8, 2014

shelhamer added enhancement labels Apr 9, 2014

shelhamer mentioned this pull request Oct 4, 2014

RBM layer ? (+DBN) #1207

Closed
