[ENH] Widget for Self-Organizing Maps #3928

janezd · 2019-07-08T19:26:29Z

Implements #2800.

To do:

Includes

Code changes
Tests

ajdapretnar · 2019-07-09T07:23:30Z

I was waiting for this widget for 2 years! 😍

codecov · 2019-07-09T11:39:23Z

Codecov Report

❗ No coverage uploaded for pull request base (master@be34e1d).
The diff coverage is 1.61%.

@@            Coverage Diff            @@
##             master    #3928   +/-   ##
=========================================
  Coverage          ?   84.24%           
=========================================
  Files             ?      372           
  Lines             ?    65301           
  Branches          ?        0           
=========================================
  Hits              ?    55012           
  Misses            ?    10289           
  Partials          ?        0

codecov · 2019-07-09T11:39:23Z

Codecov Report

Merging #3928 into master will increase coverage by 0.02%.
The diff coverage is 86.86%.

@@            Coverage Diff             @@
##           master    #3928      +/-   ##
==========================================
+ Coverage   85.23%   85.26%   +0.02%     
==========================================
  Files         382      385       +3     
  Lines       67670    68759    +1089     
==========================================
+ Hits        57680    58624     +944     
- Misses       9990    10135     +145

janezd · 2019-07-13T21:31:27Z

Ready for review.

Reports still fail, but this is due to a problem in report_plot, which uses the scene's size as the png size. This widget's scene coordinates are small, like 10x10, with a legend that does not resize.

Help (in form of a local fix or general solution) appreciated.

janezd · 2019-07-13T21:51:24Z

Correction: Reports don't fail, they produce a 10x10 png.

Pylint fails because the widget has too many attributes (24/20). This is a common problem that we'll have to discuss.

Also, I fear tests for this widget may sometimes fail because of threading problems. I suspect that onDeleteWidget finishes the optimization thread, but some signals are still being processed and may trigger redrawing. Is there a way to block/remove pending signals?

Stop / Restart button also doesn't work properly. @ales-erjavec could you check what I did wrong this time?

ajdapretnar · 2019-07-15T11:29:45Z

As always, I love to share my comments with you. 😆

Selection is strange. It is currently a square, which I could survive, but it goes behind the grid and I can't see what I am selecting. (select some data, then try to select a part of the current selection) If one could select just the hexagons, that would be even nicer! What I have in mind is something like Data Table, where selection would just be colored blue.
The legend is an issue. Use SOM on iris, then change the data to heart_disease. The legend is placed under the visualization for some reason.
Is the widget intentionally missing the Apply button? I think it would be nice to have it, no?
The widget should color by class by default.
Nothing major, but should we not think a bit about the first box? FreeViz has initialization then the button, MDS the button, then the steps, then initialization (+ jittering) buttons, t-SNE has parameters then start button, SOM has start button then the initialization... Probably a more unified interface would be beneficial here.
Perhaps @larazupan could think of a better icon? This one looks more like it should be in the Geo add-on... 😬

ales-erjavec · 2019-07-16T13:35:08Z

Orange/widgets/unsupervised/owsom.py

+        def update(_progress, som):
+            from AnyQt.QtWidgets import qApp
+            progressbar.advance()
+            qApp.processEvents()  # This is apparently needed to advance the bar


This is apparently needed to advance the bar

And to invoke a MaximumRecursion error if NITERATIONS is ever increased to ~1000.
Instead, avoid doing superfluous work in _assign_instances, _redraw.

Never call processEvents() from queued signal connections (and these are queued). It can/will recurse when the signals are emitted faster than they are processed on the receiver side. As a consequence the order of calls to self._assign_instances(som) can be inverted (as they are called after the recursion point).
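The recursion and the inverted call order described above can be reproduced without Qt. The following is only a toy sketch: `ToyEventLoop`, `simulate` and `on_update` are hypothetical names invented for illustration, and the queue is a plain `deque`, not Qt's event loop. It assumes all updates are already queued before the receiver gets to run, which is the "emitted faster than processed" case.

```python
from collections import deque


class ToyEventLoop:
    """Hypothetical stand-in for a GUI event queue (not Qt's API)."""

    def __init__(self):
        self.pending = deque()

    def post(self, callback, arg):
        # Queue a call instead of running it immediately
        self.pending.append((callback, arg))

    def process_events(self):
        # Dispatch everything that is currently queued
        while self.pending:
            callback, arg = self.pending.popleft()
            callback(arg)


def simulate(n_updates, reentrant):
    """Return (order in which updates were handled, maximal handler nesting)."""
    loop = ToyEventLoop()
    handled = []
    depth = 0
    max_depth = 0

    def on_update(step):
        nonlocal depth, max_depth
        depth += 1
        max_depth = max(max_depth, depth)
        if reentrant:
            # Re-entering the loop from inside a handler dispatches the
            # next queued update before this one has done its work
            loop.process_events()
        handled.append(step)  # the "real work" happens after the recursion point
        depth -= 1

    for step in range(n_updates):
        loop.post(on_update, step)
    loop.process_events()
    return handled, max_depth
```

With `reentrant=True` the nesting depth grows with the number of queued updates and the work is done in reverse order; with `reentrant=False` each update is handled in turn at depth 1.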

In fact it seems like som.winners call in _assign_instances uses the som.weights while they are still being mutated by the continuing optimization.

Thanks. I think I made the same mistakes in the network analysis widget. I removed processEvents, and som.winners now gets a copy of the data.

I can't remove much from _assign_instances and _redraw, but it seems to work alright now.
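The "copy of the data" fix can be sketched in plain Python, assuming a worker that mutates its weights in place. The names `optimize` and `run` and the list-of-lists weights are hypothetical, not taken from owsom.py; the point is only that the consumer receives a snapshot made by the worker at a known step rather than a reference to an object that keeps changing.

```python
import copy
import queue
import threading


def optimize(weights, n_iterations, updates):
    """Hypothetical optimization loop that mutates `weights` in place."""
    for step in range(n_iterations):
        for row in weights:
            for i in range(len(row)):
                row[i] += 1.0
        # Hand the consumer a snapshot, never the live object
        updates.put((step, copy.deepcopy(weights)))
    updates.put(None)  # sentinel: optimization finished


def run(n_iterations=5):
    """Run the worker thread and collect the snapshots it produced."""
    weights = [[0.0, 0.0], [0.0, 0.0]]
    updates = queue.Queue()
    worker = threading.Thread(
        target=optimize, args=(weights, n_iterations, updates))
    worker.start()
    snapshots = []
    while True:
        item = updates.get()
        if item is None:
            break
        snapshots.append(item)
    worker.join()
    return snapshots
```

Each snapshot reflects the weights exactly as they were at its own step, regardless of how far the worker has progressed by the time the consumer reads it.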

janezd · 2019-07-16T16:07:10Z

Square selection: I was lazy. But now I found a better way; it looks nice and its implementation is simpler. Try it, I think you'll like it.
Legend under the visualization: I had already fixed it, but then unfixed it. Now it's refixed.
Apply button: it had one. I didn't like it. I somehow feel the selection has to be auto-applied. The only other settings that require "application" are grid shape and dimensions. I can add the button if you think I should.
Default color: I don't know why the class variable was not chosen. I changed something and now it seems to work.
Order of buttons: I think this makes sense. Initialization decides how SOM is run. FreeViz and PCA are different because the user can manually manipulate the image (at least for FreeViz), so the optimization starts from some position. In case of SOM, the user would always have to press two buttons if we separated them. I don't have a hard stance here and can change it, but we can have discussions about unifying all GUIs later.
Icon: I drew a different icon first, but I thought it was boring. I'm attaching it. What do you think?

SOM copy.svg.zip

janezd · 2019-07-16T19:14:28Z

@ajdapretnar, after playing with Adult data I agree about the Apply button.

But I'm not sure what it should do. Should changing the grid size (with auto apply off) clear the visualization until Apply is pressed? Or should the visualization stay as it is, although the values in the controls wouldn't match the visualization? Both will be annoying to implement, so I also welcome any other suggestions.

I also think it would be cleaner if auto apply only blocked grid shape and size change, while selection would always propagate. Also because these are two different types of "applies": one restarts the optimization and the other is about output.

ajdapretnar · 2019-07-17T08:14:52Z

Yes, Adult is exactly the data set I had in mind. When things get big, Auto apply starts to make sense. And I agree, what should changing the parameters do? Looking at other visualization widgets, they block the selection. But that is because other parameters only change the size and color of the points, not the actual visualization. In t-SNE there is a special section for visualization parameters and once the user presses Start, the changes are applied. Perhaps instead of Auto apply, there should be a simple button called Update visualization? Or something along these lines... Nothing is ideal, I agree.

ajdapretnar · 2019-07-17T08:17:39Z

p.s. I like the beehive icon (big fan of bees 🐝 here). But I already asked Blaz to ask Lara to design them... 🙊

janezd · 2019-07-17T16:48:34Z

@ajdapretnar, like this? Not finished, cleaned up, and may crash. I just want your opinion.

You'll notice that it runs optimization immediately after receiving new data, which should usually be OK. But you can stop it, change parameters, rerun. It can easily handle Adult on 15x15 (works on 30x30, too, but it's slow).

Initializations are now in a combo instead of radios, like in some other widget(s). It looks nicer in this rearrangement, too.

BlazZupan · 2019-07-19T10:17:53Z

This is a great reincarnation of the SOM widget. I have some comments, questions, and a request:

When I use the widget (say, on Iris data), some circles are hollow (there is a thick border with a color of higher intensity), while the majority of the other circles are filled. Why the distinction?

On the Iris data set, the legend does not look right when displaying real-valued features.

Selection works great but is a bit different from, say, Scatterplot widget. There, use of a shift modifier would introduce classes with selected items. This feature is actually great because it supports the definition of different groups of items and subsequent analysis of differences. Would it be possible to implement this functionality in SOM as well?
Can we have an option to remove the legend?
The widget's current name is "Self-organizing Maps". I have googled this name, and it looks like people capitalize "organizing" as well, to have "Self-Organizing Maps".
I will ask Lara to render the icon for the widget.
Could the widget implement an automatic start of computation? For smaller data sets, I, for instance, change the dimensions, but then have to press Start after every change, whereas the algorithm is fast enough to just run the computation automatically.
Throughout Orange, we need to decide what to do with categorical features. Some widgets automatically continuize such features. An example is PCA. If I take the zoo data set (all features are categorical), PCA would work, but SOM would complain that there are no real-valued features. I would vote for automatic and default continuization in cases where input data includes any categorical features.

janezd · 2019-07-26T10:34:55Z

some circles are hollow

The intensity of interior color shows the proportion of majority class. It was the same as the border color if the cell was pure. I made the border thinner (it was already thinner on my screen, compared to your screenshot) and the interior color is now always a bit lighter than the border.
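As a rough illustration of the quantity described here, the sketch below computes the share of the majority class among the instances in one cell. How the widget turns that share into a color is not stated in this thread, so `fill_color`, its linear blend toward white and the `max_strength` parameter are assumptions made up for this example.

```python
from collections import Counter


def majority_share(labels):
    """Return (majority label, its proportion) for one cell's instances."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)


def fill_color(border_rgb, share, max_strength=0.9):
    """Blend the border color toward white (assumed mapping, not the widget's).

    A pure cell (share == 1.0) still uses max_strength < 1, so its interior
    stays slightly lighter than the border.
    """
    strength = share * max_strength
    return tuple(int(255 - strength * (255 - c)) for c in border_rgb)
```

For example, `majority_share(["a", "a", "b", "a"])` gives `("a", 0.75)`, and a cell with a lower share gets a lighter interior than a pure one.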

the legend does not look right when displaying real-valued features

It worked until I optimized some code. :) Also the output data was wrong (normalized) due to this bug.

Can we have an option to remove the legend?

Ask @ajdapretnar how to (re)move the legend. :) I can add a checkbox... but no other widget can hide the legend, because we decided against it at some point and removed these checkboxes.

capitalize "organizing" as well, to have "Self-Organizing Maps".

OK.

use of a shift modifier would introduce classes with selected items

I thought SOM doesn't need this, but it makes sense even just for consistency. I added it.

Could the widget implement an automatic start of computation?

It annoys me, too, but I'm not able to implement it (not for lack of trying). When the optimization is running, changing the control should terminate and restart it, but it doesn't, and sometimes it even crashes. In short: no, I can't do this.

Throughout Orange, we need to decide what to do with categorical features. Some widgets automatically continuize such features.

Maybe write a separate issue, just so that we don't forget to discuss it?

janezd · 2019-07-26T11:19:34Z

Fails on pylint; one problem is not mine, the other will stay and we'll discuss it later. So it's ready for rereview.

ajdapretnar · 2019-08-01T09:37:15Z

Documentation provided in #3956.

ajdapretnar · 2019-08-05T10:22:40Z

Two problems remain, which I think should be addressed within this PR.

1.) Use brown-selected. SOM silently ignores instances with missing values. It should not. I prefer we use imputation as in other widgets and let the user know it happened.

2.) Use heart_disease. It is unclear why some instances are lighter than others. I suggest tooltips.
For discrete colors: absolute and relative numbers of each value of the coloring attribute.
For continuous colors: mean value of the instance group of the coloring attribute.

janezd · 2019-08-17T11:09:37Z

1.) Use brown-selected. SOM silently ignores instances with missing values. It should not. I prefer we use imputation as in other widgets and let the user know it happened.

Information about skipped instances was actually shown in the tooltip at the "input status" icon. But you're right, this was obscure.

Visualization widgets (from projections to mosaic and sieve) do not impute. I added a proper warning instead.
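The behavior described (skip incomplete instances, then warn rather than impute) might look roughly like the sketch below. It works on plain lists of floats with NaN marking a missing value; `drop_incomplete`, `missing_warning` and the warning wording are hypothetical and not taken from the widget.

```python
import math


def drop_incomplete(rows):
    """Return the rows without missing values and the number of rows skipped."""
    complete = [row for row in rows if not any(math.isnan(x) for x in row)]
    return complete, len(rows) - len(complete)


def missing_warning(n_skipped):
    """Hypothetical warning text; the widget's actual wording may differ."""
    if n_skipped == 0:
        return None
    noun = "instance" if n_skipped == 1 else "instances"
    return f"{n_skipped} {noun} with undefined values ignored."
```

Returning the count alongside the filtered rows keeps the decision to warn separate from the filtering itself, so the message can be shown wherever the interface finds appropriate.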

2.) Use heart_disease. It is unclear why some instances are lighter than others. I suggest tooltips.
For discrete colors: absolute and relative numbers of each value of the coloring attribute.
For continuous colors: mean value of the instance group of the coloring attribute.

Done, except that numeric variables are binned and colors correspond to bins (as the legend shows). Instead of showing just the mean, the widget shows the whole distribution (by bins). This works better for pie charts, as well as for single-color circles, where the color corresponds to the majority bin (and not to the bin that contains the average value).
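The difference between the majority bin and the bin that contains the average can be shown with a small sketch. The thresholds and values are invented for illustration, and `bin_index` and `cell_summary` are hypothetical helpers, not the widget's code.

```python
from bisect import bisect_right
from collections import Counter


def bin_index(value, thresholds):
    """Index of the bin holding `value`; thresholds are the inner bin edges."""
    return bisect_right(thresholds, value)


def cell_summary(values, thresholds):
    """Distribution over bins, the majority bin, and the bin containing the mean."""
    counts = Counter(bin_index(v, thresholds) for v in values)
    distribution = [counts.get(i, 0) for i in range(len(thresholds) + 1)]
    majority_bin = max(range(len(distribution)), key=distribution.__getitem__)
    mean_bin = bin_index(sum(values) / len(values), thresholds)
    return distribution, majority_bin, mean_bin


# Hypothetical cell with three low values and two high ones
thresholds = [2, 4, 6, 8]  # bins: <2, 2-4, 4-6, 6-8, >=8
distribution, majority_bin, mean_bin = cell_summary([1, 1, 1, 9, 9], thresholds)
```

Here the distribution is `[3, 0, 0, 0, 2]` and the majority bin is the first one, while the mean (4.2) falls into the third bin, which contains no instances at all. Coloring by the mean's bin would therefore show a color that matches nothing in the cell.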

lanzagar · 2019-08-23T09:03:43Z

There seem to be some problems with tooltips. E.g. check iris on the default 8x8 hex grid and hover over tiles in the bottom row. I often get shown the tooltip for the tile to the right of the one I have my mouse over.
Choose a 5x10 grid. Clicking right of the grid correctly deselects, while clicking left of the grid makes a strange selection in that row (a couple of tiles, depending on how far left/right you click). Not a big problem, but maybe an indicator of something fishy :)

lanzagar · 2019-08-23T09:15:29Z

Orange/projection/setup.py

+
+
+def configuration(parent_package='', top_path=None):
+    from numpy.distutils.misc_util import Configuration


Is there a reason this is imported inside the function instead of at the top? I see that the setup.py files in other dirs do the same, but I don't know whether there is a reason behind it or we are just copy-pasting and propagating this.

I ask because moving it up and changing import numpy to from numpy import get_include (and using that below) might be nice enough and make lint pass as well and avoid the ugly red cross on travis :)
(it is complaining that numpy is not used, which is a bit strange anyway)

Your suspicion is correct, I copied this from other setups. :) If we're sure this import doesn't need to be local, we should fix all setups. Which implies it doesn't belong to this PR. :)

janezd · 2019-08-23T12:48:25Z

There seem to be some problems with tooltips.

Tooltips were not wrong but appeared at wrong positions: you could also get a tooltip to the left of the (ellipse) object and below it. I wasn't able to discover the reason but implemented the whole thing differently (and, perhaps, better).

Choose a 5x10 grid. Clicking right of the grid correctly deselects, while clicking left ...

Fixed.

janezd · 2019-08-23T13:51:01Z

I fixed lint, but I won't attempt to improve the coverage - too much user interaction code.

janezd changed the title ~~[ENH] Widget for Self-Organizing Maps~~ [WIP] [ENH] Widget for Self-Organizing Maps Jul 8, 2019

janezd force-pushed the som branch from b17534e to 9bff586 Compare July 9, 2019 11:39

janezd force-pushed the som branch 3 times, most recently from 8329382 to c2aad32 Compare July 13, 2019 11:46

janezd changed the title ~~[WIP] [ENH] Widget for Self-Organizing Maps~~ [ENH] Widget for Self-Organizing Maps Jul 13, 2019

ales-erjavec reviewed Jul 16, 2019

lanzagar assigned BlazZupan Jul 19, 2019

janezd force-pushed the som branch from f57c7d5 to 0593aa1 Compare July 26, 2019 10:30

janezd force-pushed the som branch from 0593aa1 to be714f1 Compare July 26, 2019 10:52

ales-erjavec reviewed Jul 29, 2019

ajdapretnar mentioned this pull request Aug 1, 2019

OWSOM - documentation #3956

Merged

janezd force-pushed the som branch from 05d84b4 to f195b8f Compare August 17, 2019 12:22

janezd added 19 commits August 23, 2019 10:19

OWSOM: Refactor grid drawing

3046980

OWSOM: Random fixes

1cefbf3

OWSOM: Add annotated output

abf3b62

OWSOM: Add legend

a6da992

OWSOM: Move single selection with keyboard

85d2e43

OWSOM: Coloring by numeric features

c04a7e5

OWSOM: Icon

0ef409b

OWSOM: Add report (but it doesn't work)

e7ff1f8

OWSOM: Pylint and refactoring

7e075c5

OWSOM: Use decimal binning

619f06e

OSOM: Add tests. Clean up.

dd3bb04

OWSOM: Nicer selection of hexagons

d5d36af

OWSOM: Fix legend position

c10ff35

OWSOM: Remove recursive call to processEvents; copy data from thread

76a9593

OWSOM: Reorganize gui

0b34e80

OWSOM: Selection groups and minor fixes

1c00d19

OWSOM: Increase thread stack size

ccc8a2f

OWSOM: Add warning about ignoring instances with undefined values

04095d4

OWSOM: Add tooltips

1ea983b

janezd force-pushed the som branch from ff79e74 to 1ea983b Compare August 23, 2019 08:19

lanzagar reviewed Aug 23, 2019

janezd force-pushed the som branch from 692e669 to e5ed1e4 Compare August 23, 2019 13:08

janezd added 2 commits August 23, 2019 15:16

OWSOM: Fix misaligned tooltips and selection

49eb342

Orange.setup: pylint

11a93b0

janezd force-pushed the som branch from e5ed1e4 to 11a93b0 Compare August 23, 2019 13:16

lanzagar merged commit e218f68 into biolab:master Aug 23, 2019

janezd mentioned this pull request Aug 23, 2019

Add Self-Organizing Maps (SOM) widget #2800

Closed
