main_train does still nothing #40

lBlitzdl · 2020-06-18T17:31:39Z

This is essentially the same problem as #38.
Since no solution was ever provided there, I am reopening it.

I am using Google Colab's free GPU to generate data and train models. Data generation and data conversion both work, but I can't get the model to train.
Whenever I run the training code, the script stops at line 60 of Training/train.lua, at this call:
M.network(inputs)
The function never returns and gives no error or warning.
I have tried different batch sizes, on both CPU and GPU.

airdine · 2020-06-28T20:52:15Z

Hello,

Take a look at this issue: aikupoker#15

I solved it with a fresh OS install.
By the way, if you use CUDA > 9.2, I recommend cloning the Torch source from this repo:
https://github.com/nagadomi/distro.git

Hope it helps!

lBlitzdl · 2020-06-30T17:48:43Z

Thank you. Can you give me more information on how you did it?
Ubuntu Version?
Torch Version?
Lua Version?
CUDA Version?

Thank you very much :)

airdine · 2020-07-01T08:00:56Z

The simplest way was to use this Docker image:
nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04

You can take inspiration from the aikupoker repo.
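
For anyone who wants to try this, here is a rough sketch of how the image could be started and the Torch fork fetched. Only the image tag and the repository URL come from this thread. The function names, the /workspace mount and the /root/torch path are made up for illustration, and the Torch build itself is left out, so follow the fork's README for that part.

```shell
# Sketch only: the image tag and repository URL come from this thread;
# the function names, /workspace and /root/torch are hypothetical.

IMAGE="nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04"
TORCH_REPO="https://github.com/nagadomi/distro.git"

# Start an interactive container with the GPUs exposed and the current
# directory (assumed to hold the project checkout) mounted at /workspace.
# `--gpus all` needs Docker 19.03+ and the NVIDIA Container Toolkit on the host.
start_container() {
    docker run --gpus all -it --rm \
        -v "$PWD":/workspace -w /workspace \
        "$IMAGE" bash
}

# Run inside the container: install git and fetch the Torch fork
# mentioned earlier in this thread. Build steps are deliberately omitted.
fetch_torch() {
    apt-get update && apt-get install -y git
    git clone --recursive "$TORCH_REPO" /root/torch
}
```

Source the file, call `start_container` on the host, then call `fetch_torch` inside the container. Nothing runs until one of the functions is called.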

Good luck

Sohaibb98 · 2020-07-05T16:12:55Z

For this, does the host OS need to be Ubuntu 16.04 or Ubuntu 18? On Ubuntu 18, it gives a CMake error.
