Embed layer #2032
Conversation
(force-pushed from d6eb72a to 9271abf)
Rebased and ready for review. (Previously depended on gradient accumulation PR #1663.)
This layer works as a lookup table and could be renamed to LookupTable.
(force-pushed from 287d2c1 to 69b0e8c)
Atomic add for doubles (double impl from NVIDIA dev docs; float impl included in CUDA as "atomicAdd")
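For reference, the double-precision atomic add mentioned above is built on atomicCAS, following the workaround in the NVIDIA CUDA programming guide (hardware-native double atomicAdd only arrived with compute capability 6.0, well after this PR). A minimal sketch of that approach; the function name here is illustrative, not necessarily the helper this PR defines:

```cuda
// Sketch of a double-precision atomic add via atomicCAS, as in the NVIDIA
// CUDA C Programming Guide. Function name is illustrative only.
__device__ double atomic_add_double(double* address, double val) {
  unsigned long long int* address_as_ull =
      reinterpret_cast<unsigned long long int*>(address);
  unsigned long long int old = *address_as_ull;
  unsigned long long int assumed;
  do {
    assumed = old;
    // Reinterpret the bits, add, and try to swap in the new value;
    // retry if another thread changed *address in the meantime.
    old = atomicCAS(address_as_ull, assumed,
                    __double_as_longlong(val + __longlong_as_double(assumed)));
  } while (assumed != old);
  return __longlong_as_double(old);
}
```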
(force-pushed from 69b0e8c to ac9e29f)
Embed layer for lookup table of one-hot encodings
Here is an example of a typical …
Understood, of course the padding is to fix the input sequence length.
(Replaces #1872)
Based on #1977 (parameter gradient accumulation). This adds EmbedLayer (which should probably be renamed EmbeddingLayer for consistency with PoolingLayer etc.), which essentially learns a lookup table for integer inputs, useful for language modeling and the like. Its computation is equivalent to an InnerProductLayer with "one-hot" vector inputs, but instead of explicitly representing the one-hot vectors (which wastes a lot of memory), it assumes each input value is the index of the "hot" entry of the corresponding one-hot vector (like the label inputs to the categorical losses). This should probably be replaced with SparseInnerProduct (#937) once that's merged, assuming it's faster -- this is a more lightweight change that continues the unfortunate trend of casting floats to ints as labels.
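To make the lookup-table semantics concrete, here is a minimal, self-contained CUDA sketch of the computation described above (these are not the PR's actual kernels; the names, shapes, and float-only backward are illustrative assumptions). The forward pass gathers rows of an N x K weight table according to the integer-valued inputs; the weight gradient is a scatter-add, which is where the atomic add from the earlier commit comes in, since several input positions can select the same row.

```cuda
#include <cstdio>
#include <vector>

// Forward: output row m is a copy of weight row bottom[m] (a gather).
// bottom: M indices stored as floats, weight: N x K table, top: M x K output.
__global__ void embed_forward_sketch(int M, int K, const float* bottom,
                                     const float* weight, float* top) {
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < M * K;
       i += blockDim.x * gridDim.x) {
    int m = i / K;                            // which input position
    int k = i % K;                            // which embedding dimension
    int index = static_cast<int>(bottom[m]);  // "hot" row of the table
    top[i] = weight[index * K + k];
  }
}

// Backward w.r.t. the weights: a scatter-add. Different input positions may
// reference the same row, so accumulation must be atomic. For double, this
// would rely on an atomicCAS-based helper like the one sketched earlier.
__global__ void embed_backward_sketch(int M, int K, const float* bottom,
                                      const float* top_diff,
                                      float* weight_diff) {
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < M * K;
       i += blockDim.x * gridDim.x) {
    int m = i / K;
    int k = i % K;
    int index = static_cast<int>(bottom[m]);
    atomicAdd(&weight_diff[index * K + k], top_diff[i]);
  }
}

int main() {
  const int M = 4, N = 5, K = 3;  // 4 inputs, 5-row table, 3-dim embeddings
  std::vector<float> h_bottom = {0.f, 2.f, 2.f, 4.f};  // indices as floats
  std::vector<float> h_weight(N * K);
  for (int i = 0; i < N * K; ++i) h_weight[i] = 0.1f * i;

  float *d_bottom, *d_weight, *d_top;
  cudaMalloc(&d_bottom, M * sizeof(float));
  cudaMalloc(&d_weight, N * K * sizeof(float));
  cudaMalloc(&d_top, M * K * sizeof(float));
  cudaMemcpy(d_bottom, h_bottom.data(), M * sizeof(float),
             cudaMemcpyHostToDevice);
  cudaMemcpy(d_weight, h_weight.data(), N * K * sizeof(float),
             cudaMemcpyHostToDevice);

  embed_forward_sketch<<<1, 128>>>(M, K, d_bottom, d_weight, d_top);

  std::vector<float> h_top(M * K);
  cudaMemcpy(h_top.data(), d_top, M * K * sizeof(float),
             cudaMemcpyDeviceToHost);
  for (int m = 0; m < M; ++m) {
    printf("input %d -> row %d:", m, static_cast<int>(h_bottom[m]));
    for (int k = 0; k < K; ++k) printf(" %.1f", h_top[m * K + k]);
    printf("\n");
  }
  cudaFree(d_bottom);
  cudaFree(d_weight);
  cudaFree(d_top);
  return 0;
}
```

Built with nvcc, this prints each output row as a copy of the weight row selected by the corresponding input index; note that inputs 1 and 2 both select row 2, which is exactly why the backward weight accumulation has to be atomic (or serialized) on the GPU.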