-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] OWTSNE: Offload computation to separate thread #3604
Conversation
d5097c9
to
db0a98e
Compare
Codecov Report
@@ Coverage Diff @@
## master #3604 +/- ##
==========================================
+ Coverage 84.9% 84.95% +0.05%
==========================================
Files 374 374
Lines 68852 69084 +232
==========================================
+ Hits 58457 58692 +235
+ Misses 10395 10392 -3 |
dc1f337
to
a329ccd
Compare
|
||
def resume(self): | ||
self.__set_update_loop(self.tsne_iterator) | ||
self.run_button.setText("Stop") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line is not needed, since the text is already set in start()
.
Maybe you could write some tests for the TSNERunner. Each its method is easily testable. And it would also be good to check that the embedding is not modified inplace. |
e628b78
to
077fc02
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some observations:
- OWFile (Iris) -> OWtSNE (Check Preserve global structure -> Info message appears) -> OWFile (Reload file) -> OWtSNE (Info message has disappeared)
- OWFile (Iris) -> OWtSNE
-> OWDataTable (Select 10 rows) -> OWtSNE (the same widget as above (input subset)) -> OWtSNE (Check Preserve global structure) -> OWDataTable (the same widget as above - reselect 10 rows) -> Crash
Traceback (most recent call last):
File "/Users/vesna/orange3/Orange/canvas/scheme/widgetsscheme.py", line 1082, in process_signals_for_widget
widget.handleNewSignals()
File "/Users/vesna/orange3/Orange/widgets/visualize/utils/widget.py", line 461, in handleNewSignals
self.graph.update_point_props()
File "/Users/vesna/orange3/Orange/widgets/visualize/owscatterplotgraph.py", line 554, in update_point_props
self.update_colors()
File "/Users/vesna/orange3/Orange/widgets/visualize/owscatterplotgraph.py", line 932, in update_colors
pen_data, brush_data = self.get_colors()
File "/Users/vesna/orange3/Orange/widgets/visualize/owscatterplotgraph.py", line 826, in get_colors
subset = self.master.get_subset_mask()
File "/Users/vesna/orange3/Orange/widgets/visualize/utils/widget.py", line 482, in get_subset_mask
dtype=np.bool, count=len(valid_data))
File "/Users/vesna/orange3/Orange/widgets/visualize/utils/widget.py", line 481, in <genexpr>
return np.fromiter((ex.id in self.subset_indices for ex in valid_data),
TypeError: unhashable type: 'numpy.ndarray'
- When running tsne with several interruptions I get slightly different results than without interrupting calculation:
OWFile (Iris) -> OWtSNE (Click Stop/Resume several times)
-> OWtSNE
-> Output Data vary
The following still happens:
|
Remove this fix because OWMDS fails. I've opened biolab#3723 instead.
6b38318
to
7c1dcea
Compare
7c1dcea
to
f7dbb03
Compare
6cba6f3
to
c712620
Compare
@VesnaT I fixed the two issues you found above. I've fixed them in a very simple and hacky way. I tried to find an elegant solution, but any of these would require touching the base widget and then fixing the other widgets. |
Issue
The t-SNE widget previously performed all computation inside the main thread, while simulating a thread i.e. it would pause every once in a while and handle user action. For larger data sets, this meant the widget would freeze for a while until the widget received control again.
Description of changes
The widget now offloads all of the computation into a separate thread, and the widget is only responsible for preparing a
Task
specification, which is then handled by the other thread.Other changes include:
openTSNE
implementation, but I think it's justified because 1. in case we ever opt for a different implementation, removing this from the widget requires little effort and 2. finding the nearest neighbors can take a very long time (up to half of all the computation time for larger data sets) therefore the widget should be smart enough to cache results wherever possible.Includes