MDS: amount of explained variance #6070

Closed
ajdapretnar opened this issue Jul 25, 2022 · 2 comments · Fixed by #6309
Labels
snack This will take an hour or two wish

Comments

@ajdapretnar
Contributor

What's your use case?

I got a comment from a reviewer that it would be useful if MDS showed the amount of explained variance. Given that most MDS projections start with PCA initialization, this could perhaps be useful.

What's your proposed solution?

Add info on amount of explained variance. Perhaps to the status bar or to the info box below parameter settings.

Are there any alternative solutions?
Look for the same info with PCA widget?

@ajdapretnar
Contributor Author

Here's what I found as well:

The amount of generalized variance explained by the MDS solution can be expressed as the P2 or the Mardia criterion. P2 is the ratio of the sum of the eigenvalues of the retained dimensions to the total sum of the eigenvalues. The Mardia criterion squares the eigenvalues in both the numerator and the denominator of P2. Both range from 0 to 1, with values closer to 1.0 indicating a better fit.
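
For illustration, here is a minimal sketch of how the two criteria quoted above could be computed for classical (eigendecomposition-based) MDS. It rests on several assumptions: `distances` is a symmetric distance matrix, "the eigenvalues" are those of the double-centred squared-distance matrix, the numerator covers only the retained dimensions, and the Mardia variant squares each eigenvalue. The function name `mds_eigenvalue_fit` is hypothetical and is not part of Orange or scikit-learn.

```python
import numpy as np


def mds_eigenvalue_fit(distances, n_components=2):
    """Return (p2, mardia) for a classical-MDS solution (illustrative sketch)."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # eigvalsh returns eigenvalues in ascending order; reverse to descending
    eigvals = np.linalg.eigvalsh(b)[::-1]
    top = eigvals[:n_components]
    # Assumption: absolute values in the denominator, because non-Euclidean
    # distances can produce negative eigenvalues
    p2 = top.sum() / np.abs(eigvals).sum()
    # Assumption: the Mardia variant uses squared eigenvalues throughout
    mardia = (top ** 2).sum() / (eigvals ** 2).sum()
    return p2, mardia
```

This only applies when the embedding comes from an eigendecomposition. An iterative stress-minimizing MDS does not produce eigenvalues, so these ratios would not describe its result directly.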

@janezd janezd added wish snack This will take an hour or two labels Jan 10, 2023
@janezd
Contributor

janezd commented Jan 21, 2023

MDS starts with distances: even if the widget is given a table, it first computes distances. The projection is non-linear, so even if the original data was table-based, one can't "map" the original coordinates to the projection. I don't see how one could define explained variance. But then, I may not be an expert in MDS.

Mardia (which I hadn't heard of before) measures the goodness of fit. Scikit doesn't have it, but we could display stress instead. I am not sure, though, that Scikit computes stress when using stress majorization. Computing it during updates could be too slow, but we could compute it once at the end.
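
As a rough sketch of the "compute it at the end" idea, stress can be evaluated once from the final embedding without touching the optimization loop. The helper names `pairwise_euclidean` and `normalized_stress` are hypothetical. Normalizing by the sum of squared input distances is an assumption; other stress definitions normalize differently.

```python
import numpy as np


def pairwise_euclidean(points):
    """Euclidean distance matrix between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def normalized_stress(distances, embedding):
    """Stress of a finished embedding, computed once after optimization.

    Assumption: the residual is normalized by the sum of squared input
    distances; 0 means the embedding reproduces the distances exactly.
    """
    d_orig = np.asarray(distances, dtype=float)
    d_emb = pairwise_euclidean(np.asarray(embedding, dtype=float))
    # Use each unordered pair once (upper triangle, no diagonal)
    iu = np.triu_indices_from(d_orig, k=1)
    residual = ((d_orig[iu] - d_emb[iu]) ** 2).sum()
    scale = (d_orig[iu] ** 2).sum()
    return np.sqrt(residual / scale)
```

A single evaluation costs one pass over all point pairs, which is why doing it only after the final update should be much cheaper than doing it on every iteration.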

@janezd janezd self-assigned this Jan 23, 2023