-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nan loss after few epochs #9
Comments
Hi, I am not sure why you got the NaN loss, since I didn't encounter this issue with the current code or the code before changing batches_per_epoch to 1000. So far, no other people has reported this issue. Maybe you use another machine to run? As for evaluation, did you manually assign the cluster_map by yourself in evaluation.py? As the cluster_map I provided in the evaluation.py is only used for the uploaded trained model. If you train a model again, you need to manually assign the mapping first before evaluation. You can try to evaluate the uploaded trained restaurant model by running evaluation.py directly. This will give you similar results as reported in paper. And you can have a look at how I assigned the aspect label to each cluster by looking at the cluster_map and the aspect.log in pre_trained_model/restaurant/. |
Thanks I will look into the evaluation part, however, there still some correlation between the gradient explosion (causing the nan loss) and the number of batches per epoch (may be a certain training set out of the dataset) cause the problem. I will still look into it and update you when I get something new.
Regards
From: Ruidan He
Sent: Wednesday, 4 July 2018 3:06 PM
To: ruidan/Unsupervised-Aspect-Extraction
Cc: yassienshaalan; Author
Subject: Re: [ruidan/Unsupervised-Aspect-Extraction] Nan loss after few epochs(#9)
Hi,
I am not sure why you got the NaN loss, since I didn't encounter this issue with the current code or the code before changing batches_per_epoch to 1000. So far, no other people has reported this issue. Maybe you use another machine to run?
As for evaluation, did you manually assign the cluster_map by yourself in evaluation.py? As the cluster_map I provided in the evaluation.py is only used for the uploaded trained model. If you train a model again, you need to manually assign the mapping first before evaluation.
You can try to evaluate the uploaded trained restaurant model by running evaluation.py directly. This will give you similar results as reported in paper. And you can have a look at how I assigned the aspect label to each cluster by looking at the cluster_map and the aspect.log in pre_trained_model/restaurant/.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I I had the same problem. since I used a large data(3million). Have you solved the problem yet? |
I am also getting nan values Aspect 0: Aspect 1: and so on.... |
I encountered this problem for some data. Actually, vector representation of aspects is exactly the same. Thus, ortho_reg loss is nan. |
Hey,
I got the code a while ago and trying to run it. However, after setting everything up I used to get Nan loss after the first epoch. I tried so many things, from changing optimizers to checking for null values in inputs, to changing learning rates. Only after your latest change of hard setting the batches_per_epoch to 1000 things became better, however still starting from epoch 6 it gives back nan loss values. What seems to be the problem? also, I couldn't reproduce the paper precision and recall values the best I could get after which is only after 5 epochs are:
This is the restaurant dataset
precision-recall f1-score support
The text was updated successfully, but these errors were encountered: