
Nan loss after few epochs #9

Open
yassienshaalan opened this issue Jul 4, 2018 · 5 comments


yassienshaalan commented Jul 4, 2018

Hey,

I got the code a while ago and have been trying to run it. After setting everything up, I kept getting NaN loss after the first epoch. I tried many things: changing optimizers, checking for null values in the inputs, changing learning rates. Only after your latest change hard-setting batches_per_epoch to 1000 did things improve, but starting from epoch 6 it still produces NaN loss values. What could the problem be? Also, I couldn't reproduce the paper's precision and recall values; the best I could get, and only after 5 epochs, is:
This is the restaurant dataset
              precision    recall   f1-score   support

     Food        0.786     0.183      0.296       887
    Staff        0.527     0.281      0.367       352
 Ambience        0.327     0.131      0.188       251
ruidan (Owner) commented Jul 4, 2018

Hi,

I am not sure why you got the NaN loss, since I didn't encounter this issue with the current code or with the code before batches_per_epoch was changed to 1000. So far, no one else has reported this issue. Could it be something specific to the machine you are running on?

As for evaluation, did you manually assign the cluster_map yourself in evaluation.py? The cluster_map I provided in evaluation.py is only valid for the uploaded trained model. If you train a model again, you need to manually assign the mapping before evaluation.

You can try evaluating the uploaded trained restaurant model by running evaluation.py directly. This will give you results similar to those reported in the paper. You can also see how I assigned the aspect label to each cluster by looking at the cluster_map and the aspect.log in pre_trained_model/restaurant/.
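To make the mapping step concrete, here is a minimal sketch of what assigning a cluster_map and then scoring against gold labels looks like. The cluster ids, label assignments, and data below are all made up for illustration; in practice the mapping is decided by inspecting the top words per cluster in aspect.log.

```python
# Hypothetical mapping from inferred cluster id -> gold aspect label.
# Several clusters may map to the same gold label.
cluster_map = {0: 'Food', 1: 'Staff', 2: 'Ambience', 3: 'Food'}

# Toy predictions (cluster id per test sentence) and gold labels.
predicted_clusters = [0, 1, 3, 2, 1]
gold_labels = ['Food', 'Staff', 'Food', 'Ambience', 'Ambience']

# Translate clusters to labels via the manually assigned map.
predicted_labels = [cluster_map[c] for c in predicted_clusters]

def precision_recall(label):
    """Per-label precision and recall over the toy data."""
    tp = sum(p == label == g for p, g in zip(predicted_labels, gold_labels))
    pred = predicted_labels.count(label)
    gold = gold_labels.count(label)
    return (tp / pred if pred else 0.0, tp / gold if gold else 0.0)

for label in ('Food', 'Staff', 'Ambience'):
    print(label, precision_recall(label))
```

If the mapping is left at the values shipped for the pre-trained model, a freshly trained model's cluster ids will generally not line up with it, which by itself is enough to produce low precision/recall numbers.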

yassienshaalan (Author) commented Jul 4, 2018 via email

SericWong commented

I had the same problem, since I used a large dataset (3 million). Have you solved the problem yet?

agarnitin86 commented

I am also getting NaN values:

Aspect 0:
welsh:nan staff:nan bitches:nan sick":nan stolen:nan christmas:nan edward:nan genius:nan selena:nan emily:nan socks:nan 21st:nan kings:nan roof:nan incredibly:nan walmart:nan bein:nan ga:nan luckily:nan gud:nan cricket:nan reunion:nan accidentally:nan kobe:nan steak:nan fridays:nan disneyland:nan snap:nan involved:nan carry:nan security:nan delivery:nan police:nan theatre:nan prince:nan iranelection:nan sounded:nan

Aspect 1:
welsh:nan staff:nan bitches:nan sick":nan stolen:nan christmas:nan edward:nan genius:nan selena:nan emily:nan socks:nan 21st:nan kings:nan roof:nan incredibly:nan walmart:nan bein:nan ga:nan luckily:nan gud:nan cricket:nan reunion:nan accidentally:nan kobe:nan steak:nan fridays:nan disneyland:nan snap:nan involved:nan carry:nan security:nan delivery:nan police:nan theatre:nan prince:nan iranelection:nan sounded:nan

and so on....
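When every aspect word score comes out as NaN like this, it usually means some weight matrix went non-finite at an earlier training step. A quick way to locate the first offending parameter, assuming you can pull the weight arrays out of the model (e.g. via `model.get_weights()` in Keras; the helper name here is our own):

```python
import numpy as np

def first_nan_weight(weights):
    """Return the index of the first weight array containing NaN or Inf,
    or None if all arrays are finite."""
    for i, w in enumerate(weights):
        if not np.all(np.isfinite(w)):
            return i
    return None

# Toy usage: the second array contains a NaN, so index 1 is reported.
weights = [np.zeros((3, 2)), np.array([1.0, float('nan')])]
print(first_nan_weight(weights))
```

Running this check after each epoch (or inside a callback) narrows down whether the aspect embedding matrix itself or an earlier layer is the first to blow up.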


pbabvey commented Aug 15, 2019

I encountered this problem with some data. What happens is that the vector representations of the aspects become exactly the same, and as a result the ortho_reg loss is NaN.
Here is an edited version of the original code with some fixes and modifications; I tried it and it worked.
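The NaN typically enters through the row normalization inside the orthogonality regularizer: if aspect rows collapse to identical or zero vectors, an unguarded division can produce 0/0. Below is a minimal NumPy sketch of such a regularizer with an epsilon guard; the function name and epsilon value are our own, not necessarily the repo's exact code.

```python
import numpy as np

def ortho_reg(aspect_matrix, eps=1e-7):
    """Orthogonality penalty over aspect embedding rows.

    Rows are L2-normalized with an epsilon in the denominator so that
    zero-norm rows do not trigger a 0/0 division and hence a NaN loss.
    """
    norms = np.sqrt((aspect_matrix ** 2).sum(axis=1, keepdims=True))
    normalized = aspect_matrix / (norms + eps)       # eps guards against /0
    gram = normalized @ normalized.T                 # pairwise cosine similarities
    identity = np.eye(aspect_matrix.shape[0])
    return np.abs(gram - identity).sum()             # ~0 when rows are orthonormal

# Identical rows: the penalty is large but finite, never NaN.
print(ortho_reg(np.ones((3, 10))))
```

Note that the guard only keeps the loss finite; if all aspect rows have genuinely collapsed to the same vector, the penalty stays high and the underlying training problem (learning rate, initialization, data scale) still needs to be addressed.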
