Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BH collapse #235

Merged
merged 2 commits into from
Feb 15, 2023
Merged

Fix BH collapse #235

merged 2 commits into from
Feb 15, 2023

Conversation

pavlin-policar
Copy link
Owner

@pavlin-policar pavlin-policar commented Feb 15, 2023

Issue

Fixes #223.

The is_duplicate function in the BH implementation had a threshold set to 1e-6, likely taken from other implementations, perhaps scikit-learn. This meant that during the EE phase, where everything was squished together, this condition was triggered a lot, meaning BH wasn't splitting up the points into different quadrants, but was pretending it was dealing with a single point. This mean that there were no repulsive forces being applied to the points at all, leading to an increasing collapse of all the points.

Description of changes

I simply changed the threshold for duplicate detection to machine precision, and the issue goes away.
I also copied over the test by @dkobak from #234, but I've also let it run the standard phase. We want to make sure that the embedding isn't collapsed at the end of the optimization, and we don't really care if it is super compressed at the end of the EE phase.

EDIT: Not machine precision, because then, duplicate detection actually doesn't work, and the tests fail on iris. Setting the threshold to 1e-16 resolves both issues.

Includes
  • Code changes
  • Tests
  • Documentation

@pavlin-policar pavlin-policar merged commit 75a0a03 into master Feb 15, 2023
@pavlin-policar pavlin-policar deleted the fix-bh-collapse branch February 15, 2023 10:51
@dkobak
Copy link
Contributor

dkobak commented Feb 15, 2023

With all the recent changes it would be great to release a new version (0.6.3 or 0.7), what do you think? Some of the changes (like new learning rate defaults) affect the outcome with default params.

@pavlin-policar
Copy link
Owner Author

Yes, there have been a few changes since 0.6.2. We've added jittering which actually does change the resulting embeddings a bit, so this probably warrants an 0.7.0 release. The rest was mainly bugfixes I think.

I'd like to get #226 solved first, and there are some issues on conda, but once these get resolved, I'll make a new release.

@dkobak
Copy link
Contributor

dkobak commented Feb 15, 2023

Yes, init jittering, but also learning rate that is now set to n/exag in each epoch, and momentum fixed to 0.8 throughout.

Cool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Switching spectral initialization to sklean.manifold.SpectralEmbeddings
2 participants