Add Organic Materials Database band gap dataset (OMDB-GAP1) #26
Conversation
Great work!
Anyway, I would expect that you can get below an MAE of 0.38 eV with the new PyTorch setup compared to the TensorFlow one.
Thank you! I would like to complete the training session (and check the performance on the test set) just to make sure everything is OK; maybe there are some issues later on in this script. After that, if everything is OK, it can be merged. I will post the results here as soon as I have them.
I trained it for 8 hours on a K80 GPU. The validation loss / MAE / RMSE seems to flatten out too early, and the learning rate is not reduced even though perhaps it should be? I will check why and try some other parameters, such as the number of interaction blocks.
The patience for reducing the learning rate is set to 50 epochs. Perhaps this is too much? I used these settings recently, but this of course depends on your dataset:
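For reference, the patience mechanism discussed here (as implemented, for example, by PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau`) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the actual trainer code from this PR; the function name and defaults are made up for the example:

```python
# Minimal sketch of plateau-based learning-rate reduction, assuming the
# semantics discussed above: after `patience` epochs without improvement
# in the validation loss, multiply the learning rate by `factor`.
def reduce_lr_on_plateau(val_losses, lr=1e-3, factor=0.1, patience=50):
    """Return the final learning rate after scanning a loss history."""
    best = float("inf")
    bad_epochs = 0
    for loss in val_losses:
        if loss < best:
            best = loss
            bad_epochs = 0  # improvement resets the patience counter
        else:
            bad_epochs += 1
            if bad_epochs > patience:
                lr *= factor
                bad_epochs = 0
    return lr

# A validation loss that flattens out for 60 epochs triggers one reduction
# with patience=50, but none with a larger patience.
print(reduce_lr_on_plateau([1.0] * 60, patience=50))
print(reduce_lr_on_plateau([1.0] * 60, patience=100))
```

With a patience of 50 epochs, a run that plateaus early spends a long time at a stale learning rate, which matches the observation above that the rate was never reduced during the 8-hour session.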
The evaluation/test results are strangely off compared to the validation results. All the details:
It ran for 24 hours and was then killed by a timeout on our cluster, but the best_model files were saved anyway, so this is OK. The output: https://gist.github.com/bartolsthoorn/3cca1b366287ccefed5610801aad54e9 For evaluation:
The output:
So the test MAE is around 0.62 eV, but it should be around 0.42 eV (judging from the validation MAE). This would still be worse than the 0.38 eV I reached with the TensorFlow SchNet, but at least quite close. Any ideas why this is happening? (Table 2 lists the TensorFlow SchNet results: https://arxiv.org/pdf/1810.12814.pdf)
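For anyone comparing the numbers above: the two metrics in this thread are standard mean absolute error and root mean squared error over the predicted band gaps (in eV). A self-contained sketch of the definitions, with made-up toy values rather than the actual OMDB-GAP1 predictions:

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error between predictions and targets."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.mean(np.abs(pred - target)))

def rmse(pred, target):
    """Root mean squared error between predictions and targets."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Toy values for illustration only (not from the OMDB-GAP1 runs).
pred = [1.0, 2.0, 3.0]
target = [1.5, 2.0, 2.0]
print(mae(pred, target))   # 0.5
print(rmse(pred, target))
```

Because RMSE weights large errors more heavily than MAE, a test MAE that is much worse than the validation MAE (0.62 vs. 0.42 eV here) points at a handful of badly predicted structures or a split mismatch rather than a uniform shift.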
I have no idea, the models should be identical. Perhaps an unlucky split? Or you can try with a larger cutoff?
Did you figure out what was wrong?
I did not have time to work on this last week unfortunately, but I will probably start some training sessions with different splits and different cutoffs today.
Yes, setting the cutoff to 5.0 I am starting to get good test results 🎉. Now I will just do some small further tuning to make the learning a bit faster, and then I will commit the best set of default parameters here.
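In SchNet-style models the cutoff determines which atom pairs enter the interaction blocks, so enlarging it changes which neighbors the network can see at all. A minimal sketch of the neighbor-pair selection, using made-up positions and plain NumPy rather than the actual SchNetPack neighbor-list code:

```python
import numpy as np

def pairs_within_cutoff(positions, cutoff=5.0):
    """Return index pairs (i, j), i < j, whose distance is below `cutoff`.

    Illustrative only: no periodic boundary conditions, which a real
    crystal-structure neighbor list (as needed for OMDB-GAP1) would require.
    """
    positions = np.asarray(positions, dtype=float)
    # Pairwise difference vectors via broadcasting, then Euclidean norms.
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Upper triangle (k=1) keeps each pair once and drops self-pairs.
    i, j = np.where(np.triu(dists < cutoff, k=1))
    return list(zip(i.tolist(), j.tolist()))

# Three collinear atoms at 0, 3, and 7 Å: only the 3 Å and 4 Å pairs
# fall inside a 5.0 Å cutoff.
atoms = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [7.0, 0.0, 0.0]]
print(pairs_within_cutoff(atoms, cutoff=5.0))  # [(0, 1), (1, 2)]
```

A larger cutoff admits more pairs (and more compute per interaction block), which is consistent with 5.0 Å working better here than a smaller default.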
Anything new?
Now the PR is ready for merging. I made a small change to We will release an improved version 2 of the OMDB-GAP1 dataset soon, submit the update to arXiv, and submit the results obtained with SchNetPack to a journal. But anyway, this code is complete and ready.
This PR adds a script and dataset that make it easy to train for band gap prediction with the OMDB-GAP1 dataset (organic crystal structures and their PBE band gaps).
The paper of the dataset: https://arxiv.org/abs/1810.12814
I originally used the TensorFlow SchNet implementation, where I added my own XYZ loading script, but I see that this new PyTorch implementation is better. Great work!
I am still training the model right now and will let you know how it performs. I expect to reach an MAE of 0.38 eV, as I did with the old implementation.