I keep getting this "contiguous tensor" error running this practical. Any ideas?
```
~/torch$ th train.lua -vocabfile vocab.t7 -datafile train.t7
-- ignore option print_every
-- ignore option save_every
-- ignore option savefile
-- ignore option vocabfile
-- ignore option datafile
loading data files...
cutting off end of data so that the batches/sequences divide evenly
reshaping tensor...
/home/mark/torch/install/bin/luajit: /home/mark/torch/install/share/lua/5.1/torch/Tensor.lua:450: expecting a contiguous tensor
stack traceback:
    [C]: in function 'assert'
    /home/mark/torch/install/share/lua/5.1/torch/Tensor.lua:450: in function 'view'
    ./data/CharLMMinibatchLoader.lua:40: in function 'create'
    train.lua:37: in main chunk
    [C]: in function 'dofile'
    ...mark/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
    [C]: at 0x0804d650
```
What is your OS and CPU architecture? Make sure you also update torch, in case you have an old installation. There might also be issues from having too new of an installation, but that doesn't seem to be the case.
The OS is Ubuntu 14.04 on an older 32-bit Intel CPU.
In your CharLMMinibatchLoader.create(), after the code for cutting off the end so that it divides evenly, I added the following, which fixed the contiguous problem. I still do not know why it is happening.
```
-- FIX: added check if contiguous
if not data:isContiguous() then
data = data:contiguous()
end
```
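(Side note: in Torch7, tensor:contiguous() returns the tensor itself when it is already contiguous and a contiguous copy otherwise, so the guard above can be collapsed to the one-liner data = data:contiguous().)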
An easy one for convenience's sake: add a semicolon at the end of this line:
ydata:sub(1,-2):copy(data:sub(2,-1));
I copy/paste code into the console while in th> to quickly check things, and without the trailing semicolon the interpreter echoes the returned tensor, which gave me a memory overflow.
That's it.
No more problems so far.
Found another problem. In train.lua, within the for-loop of the forward pass, this line:
embeddings[t] = clones.embed[t]:forward(x[{{}, t}])
throws an error:
```
/home/mark/torch/install/bin/luajit: bad argument #2 to '?' (out of range)
stack traceback:
    [C]: at 0xb7341ea0
    [C]: in function '__index'
    ./Embedding.lua:29: in function 'forward'
    train.lua:91: in function 'opfunc'
    /home/mark/torch/install/share/lua/5.1/optim/adagrad.lua:30: in function 'adagrad'
    train.lua:141: in main chunk
    [C]: in function 'dofile'
    ...mark/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
    [C]: at 0x0804d650
```
It's related to this line in Embedding:updateOutput(input):
self.output[i]:copy(self.weight[input[i]])
The value of input[i] should be a character index, since the input array is a slice of x. This makes no sense??
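To localize the garbage values, a throwaway bounds check could be dropped in just before that copy. This is only a debugging sketch, assuming self.weight is vocab_size x embedding_dim and input is a 1-D index tensor:
```
-- debugging sketch: validate every index before it reaches self.weight[...]
for i = 1, input:size(1) do
  local idx = input[i]
  assert(idx >= 1 and idx <= self.weight:size(1),
    string.format('bad vocab index %s at position %d', tostring(idx), i))
end
```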
It sounds like the issue is a differing binary format between x86 and x86-64. Perhaps a different size for long ints, or the type used to store offsets and strides in the tensors. That would cause the second error, as well.
If you like, I can provide the data in another format, or you can use your own data (see data/ and data_preparation/ for the relevant scripts). Then generate your own vocab.t7 and train.t7. Any large (~50MB or larger) text file will do.
Thanks, yup, that was the suspicion.
When you say "second error", do you mean my last post about Embedding:updateOutput?
Yes, please provide it in a different format. The Embedding:updateOutput issue seemed more like a logic issue, so another data format would help me resolve it. Thank you!
Here is my email: mszlazak@aol.com
Yes, I meant both errors were stemming from garbage data being loaded in, due to alignment issues from the datatype sizes.
In the meantime, could you please check if vocab.t7 loads correctly? Just do print(torch.load('vocab.t7')) and see if you get a table mapping roughly 30 characters to ints.
Yes, I get 35 characters. What range of values should I expect? The things I get are arrays with 80s, 96s, etc. The funny thing is that these arrays are each filled with just one number.
Another side issue while we're at it: if one is not going to flatten everything out like you do, then couldn't you just use the clone() function Torch provides instead of the reader() ... etc.?
Download and extract these two replacement files, now in torch's text-based format instead of the binary format:
http://www.cs.ox.ac.uk/people/brendan.shillingford/teaching/practical6-data-ascii/index.html
And change all occurrences of torch.load(something) to torch.load(something, 'ascii').
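For example (a sketch; the exact load sites and variable names in the practical's code may differ):
```
-- pass 'ascii' so torch.load expects the text-based serialization format
local vocab = torch.load('vocab.t7', 'ascii')
local data = torch.load('train.t7', 'ascii')
```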
What do you mean by "flatten everything out"?
```
th> print(torch.load('vocab.t7'))
{
  q : 26
  . : 6
  c : 12
  b : 11
  s : 28
  d : 13
  ( : 3
  e : 14
  t : 29
  n : 23
  g : 16
  w : 32
  ; : 8
  h : 17
  , : 5
  i : 18
  f : 15
  y : 34
  v : 31
  k : 20
  j : 19
    : 1
  ? : 9
  l : 21
  m : 22
  ! : 2
  u : 30
  r : 27
  o : 24
  : : 7
  z : 35
  x : 33
  ) : 4
  a : 10
  p : 25
}
```
I just want to keep things as simple as possible at first, without gathering things up into one long storage. It's just for teaching/self-learning purposes.
More simply: why not just use clone() provided by Torch instead of reader()?
BTW, I saw that auto-differentiation script you just made. Thanks ... again. LOL
Why not use something like the following instead of model_utils.clone_many_times?
```
function makeClones(model, numClones)
  local clones = {}
  for name, module in pairs(model) do
    if module.getParameters then
      clones[name] = {}
      for i = 1, numClones do
        clones[name][i] = module:clone()
      end
      print('Cloning\t'..#clones[name]..' '..name..' modules.')
    end
  end
  return clones
end
```
Thank you, I will take a look.
BTW, here is a shorter version of Wojciech Zaremba's code:
https://github.com/wojzaremba/lstm
Thanks, Mr. Shillingford!
Reading in the files as ascii fixed all the problems.
I have to see whether they fixed the cloning issues with shared parameters ... that was a while back.
Fixing the cloning stuff plus your autobw will make things a lot easier. https://github.com/bshillingford/autobw.torch
Parameters are not shared by default in clone(), but there are optional arguments naming the parameters to share. AFAIK this doesn't work with nngraph modules though, since you need to specify the names of the parameters, but I could be wrong. If you look at the clone() implementation in nn.Module, it's very similar to clone_many_times anyway.
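For illustration, here is a minimal sketch of that optional-argument form on a plain nn module (not the practical's code):
```
require 'nn'
local m = nn.Linear(10, 10)
-- extra string arguments name the tensors the clone shares with the
-- original instead of copying:
local c = m:clone('weight', 'bias', 'gradWeight', 'gradBias')
m.weight:fill(1)
print(c.weight[1][1])  -- prints 1: both modules point at the same storage
```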
Yup! nngraph does not work with the usual parameter names like 'bias', 'weight', etc.
Curious about why you created a getParameters function for multiple modules:
model_utils.combine_all_parameters(...)
If I put protos.embed, protos.lstm, protos.softmax in one module of nngraph, then I can get the same thing with getParameters().
Mostly for the purposes of encapsulation/isolation of operations into classes/nngraph factory methods that do one thing and one thing only. IMO it's easier to keep things cleaner when experimenting.
OK, that is a good reason. However, it does get a bit more complicated in a teaching scenario when learning to use Torch 7. Too many "new moving parts" to understand at first, and if simpler or familiar ones work, then I tend to go that way first.
Also, you used the Embedding package. Does this have the same function as the following in create_network() of https://github.com/wojzaremba/lstm/blob/master/main.lua ?
LookupTable = nn.LookupTable
local i = {[0] = LookupTable(params.vocab_size, params.rnn_size)(x)}
That's true; personally I think separating parameters and activations into separate OOP entities would be cleaner for parameter sharing and RNNs.
Yes, in that code LookupTable is an alias for either the torch one or the fbcunn-optimized LookupTableGPU, which serve the same purpose as Google's Embedding class from the learning-to-execute code.
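As a quick sketch of that equivalence (the sizes here are made up):
```
require 'nn'
local vocab_size, rnn_size = 35, 128
local embed = nn.LookupTable(vocab_size, rnn_size)
-- a batch of character indices, like one column of x:
local idx = torch.LongTensor{3, 14, 29}
local vecs = embed:forward(idx)  -- 3 x 128: one embedding row per index
print(vecs:size())
```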
Btw, Alex Wiltschko of Harvard is working on a package that is the equivalent of your autobw for all torch tensor operations, something similar to https://github.com/HIPS/autograd/
Yup, I sent the autobw link to smth chntla and he informed me of Alex working on something similar.
Benjamin Maréchal helped with sharing and getting at nngraph modules/parameters:
https://groups.google.com/forum/#!topic/torch7/4f_wMQ6G2io
https://groups.google.com/forum/#!topic/torch7/KheG-Rlfa9k
```
gModules = {}
for indexNode, node in ipairs(gmod.forwardnodes) do
  if node.data.module then
    print(node.data.module)
    gModules[#gModules+1] = node.data.module
  end
end
gModules[1].bias
gModules[1].weight
p, gradP = gModules[1]:getParameters()
```
I do something like that manually while debugging, but actually it'd be
cool to have a simplistic query mechanism, analogous to simple CSS
selectors where you can select based on type name and query annotations. NN
containers already have a find method that searches by type or something.
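The method being half-remembered there is probably findModules on nn.Module; a minimal sketch of selecting by type name:
```
require 'nn'
local net = nn.Sequential()
net:add(nn.Linear(10, 20))
net:add(nn.Tanh())
net:add(nn.Linear(20, 5))
-- returns every submodule whose torch type name matches:
local linears = net:findModules('nn.Linear')
print(#linears)  -- 2
```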