Maybe rename "Coverage" to "Rediscovery" #38
Aren't coverage and rediscovery different things (?):
The notion of coverage came from the CDVAE paper:
"Rediscovery", based on the word itself, seems applicable since the metric implemented in
From the GuacaMol paper:
GuacaMol also uses what they call similarity metrics:
I think this is also only used in the context of goal-directed generation.
Some excerpts from the paper you linked:
Interesting that it says KL divergence provides the same information as Pielou's evenness (the balance (B) metric), since KL divergence is one of the distribution metrics used by GuacaMol. Not sure I understand what "spread" means in the context of the disparity (D) metric. If I'm understanding correctly, a more reliable metric would be computing the concave hull in high-dimensional space (i.e. approximating the hypervolume of the sampled points in some sense), but they do it in a low-dimensional projection for simplicity. Variety (V) seems similar to what I've been calling uniqueness, i.e. measuring the dissimilarity of the generated compounds among themselves. I think
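To make the remark about KL divergence and Pielou's evenness concrete, here is a minimal sketch, assuming the reference distribution is uniform over the same S bins. In that case KL(p || uniform) = ln S - H(p), so evenness J = H(p) / ln S = 1 - KL / ln S and the two carry the same information. The function names are hypothetical and not taken from any of the packages discussed here. As far as I understand, GuacaMol's KL divergence compares generated and training-set property distributions rather than a uniform reference, so the equivalence would not carry over to that benchmark directly.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a histogram given as raw bin counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S).

    Assumption: S is the number of bins passed in, including empty ones.
    """
    s = len(counts)
    if s < 2:
        raise ValueError("evenness needs at least two bins")
    return shannon_entropy(counts) / math.log(s)


def kl_to_uniform(counts):
    """KL(p || u), where u is the uniform distribution over the same bins."""
    s = len(counts)
    total = sum(counts)
    return sum(
        (c / total) * math.log((c / total) * s) for c in counts if c > 0
    )


balanced = [5, 5, 5, 5]
skewed = [8, 1, 1, 0]

for counts in (balanced, skewed):
    j = pielou_evenness(counts)
    kl = kl_to_uniform(counts)
    # The identity J = 1 - KL / ln(S) holds for any histogram over S bins.
    print(counts, round(j, 4), round(1 - kl / math.log(len(counts)), 4))
```

Both values depend on how the bins are chosen, so changing the binning changes the result.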
From the following article:
They use the term "recovery rate":
Yeah, I know that Mohammad played a bit with the bins for those metrics (and one would need to check for convergence). This is the reason I do not like them too much.
https://www.benevolent.com/guacamol