
Nan loss after few epochs #9

Open
yassienshaalan opened this issue Jul 4, 2018 · 5 comments


yassienshaalan commented Jul 4, 2018

Hey,

I got the code a while ago and have been trying to run it. After setting everything up, I kept getting NaN loss after the first epoch. I tried many things: changing optimizers, checking for null values in the inputs, changing learning rates. Only after your latest change hard-setting batches_per_epoch to 1000 did things improve, but starting from epoch 6 it still produces NaN loss values. What could the problem be? Also, I couldn't reproduce the paper's precision and recall values; the best I could get, and only after 5 epochs, is:
This is the restaurant dataset
              precision    recall   f1-score   support

     Food        0.786     0.183      0.296       887
    Staff        0.527     0.281      0.367       352
 Ambience        0.327     0.131      0.188       251
ruidan (Owner) commented Jul 4, 2018

Hi,

I am not sure why you got the NaN loss, since I didn't encounter this issue with the current code or with the code before batches_per_epoch was changed to 1000. So far, no one else has reported this issue. Could it be something specific to the machine you are running on?

As for evaluation, did you manually assign the cluster_map yourself in evaluation.py? The cluster_map I provided in evaluation.py is only valid for the uploaded trained model. If you train a model again, you need to manually assign the mapping before evaluation.

You can try evaluating the uploaded trained restaurant model by running evaluation.py directly. This will give you results similar to those reported in the paper. You can also see how I assigned the aspect label to each cluster by looking at the cluster_map and the aspect.log in pre_trained_model/restaurant/.
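To make the mapping step concrete, here is a minimal sketch of what assigning a cluster_map and then scoring against gold labels looks like. The cluster ids, label assignments, and data below are all made up for illustration; in practice the mapping is decided by inspecting the top words per cluster in aspect.log.

```python
# Hypothetical mapping from inferred cluster id -> gold aspect label.
# Several clusters may map to the same gold label.
cluster_map = {0: 'Food', 1: 'Staff', 2: 'Ambience', 3: 'Food'}

# Toy predictions (cluster id per test sentence) and gold labels.
predicted_clusters = [0, 1, 3, 2, 1]
gold_labels = ['Food', 'Staff', 'Food', 'Ambience', 'Ambience']

# Translate clusters to labels via the manually assigned map.
predicted_labels = [cluster_map[c] for c in predicted_clusters]

def precision_recall(label):
    """Per-label precision and recall over the toy data."""
    tp = sum(p == label == g for p, g in zip(predicted_labels, gold_labels))
    pred = predicted_labels.count(label)
    gold = gold_labels.count(label)
    return (tp / pred if pred else 0.0, tp / gold if gold else 0.0)

for label in ('Food', 'Staff', 'Ambience'):
    print(label, precision_recall(label))
```

If the mapping is left at the values shipped for the pre-trained model, a freshly trained model's cluster ids will generally not line up with it, which by itself is enough to produce low precision/recall numbers.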

yassienshaalan (Author) commented Jul 4, 2018 via email

SericWong commented

I had the same problem, since I used a large dataset (3 million). Have you solved the problem yet?

agarnitin86 commented

I am also getting NaN values:

Aspect 0:
welsh:nan staff:nan bitches:nan sick":nan stolen:nan christmas:nan edward:nan genius:nan selena:nan emily:nan socks:nan 21st:nan kings:nan roof:nan incredibly:nan walmart:nan bein:nan ga:nan luckily:nan gud:nan cricket:nan reunion:nan accidentally:nan kobe:nan steak:nan fridays:nan disneyland:nan snap:nan involved:nan carry:nan security:nan delivery:nan police:nan theatre:nan prince:nan iranelection:nan sounded:nan

Aspect 1:
welsh:nan staff:nan bitches:nan sick":nan stolen:nan christmas:nan edward:nan genius:nan selena:nan emily:nan socks:nan 21st:nan kings:nan roof:nan incredibly:nan walmart:nan bein:nan ga:nan luckily:nan gud:nan cricket:nan reunion:nan accidentally:nan kobe:nan steak:nan fridays:nan disneyland:nan snap:nan involved:nan carry:nan security:nan delivery:nan police:nan theatre:nan prince:nan iranelection:nan sounded:nan

and so on....
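When every aspect word score comes out as NaN like this, it usually means some weight matrix went non-finite at an earlier training step. A quick way to locate the first offending parameter, assuming you can pull the weight arrays out of the model (e.g. via `model.get_weights()` in Keras; the helper name here is our own):

```python
import numpy as np

def first_nan_weight(weights):
    """Return the index of the first weight array containing NaN or Inf,
    or None if all arrays are finite."""
    for i, w in enumerate(weights):
        if not np.all(np.isfinite(w)):
            return i
    return None

# Toy usage: the second array contains a NaN, so index 1 is reported.
weights = [np.zeros((3, 2)), np.array([1.0, float('nan')])]
print(first_nan_weight(weights))
```

Running this check after each epoch (or inside a callback) narrows down whether the aspect embedding matrix itself or an earlier layer is the first to blow up.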


pbabvey commented Aug 15, 2019

I encountered this problem with some data. What happens is that the vector representations of the aspects become exactly the same, and as a result the ortho_reg loss is NaN.
Here is an edited version of the original code with some fixes and modifications; I tried it and it worked.
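The NaN typically enters through the row normalization inside the orthogonality regularizer: if aspect rows collapse to identical or zero vectors, an unguarded division can produce 0/0. Below is a minimal NumPy sketch of such a regularizer with an epsilon guard; the function name and epsilon value are our own, not necessarily the repo's exact code.

```python
import numpy as np

def ortho_reg(aspect_matrix, eps=1e-7):
    """Orthogonality penalty over aspect embedding rows.

    Rows are L2-normalized with an epsilon in the denominator so that
    zero-norm rows do not trigger a 0/0 division and hence a NaN loss.
    """
    norms = np.sqrt((aspect_matrix ** 2).sum(axis=1, keepdims=True))
    normalized = aspect_matrix / (norms + eps)       # eps guards against /0
    gram = normalized @ normalized.T                 # pairwise cosine similarities
    identity = np.eye(aspect_matrix.shape[0])
    return np.abs(gram - identity).sum()             # ~0 when rows are orthonormal

# Identical rows: the penalty is large but finite, never NaN.
print(ortho_reg(np.ones((3, 10))))
```

Note that the guard only keeps the loss finite; if all aspect rows have genuinely collapsed to the same vector, the penalty stays high and the underlying training problem (learning rate, initialization, data scale) still needs to be addressed.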
