[FIX] Round bhattacharayya #4340

AndrejaKovacic · 2020-01-16T18:56:03Z

Issue

Due to floating-point precision, the Bhattacharyya distance would sometimes come out slightly negative.

Description of changes

Rounding the result to the 13th digit. I think this is precise enough for our purposes; otherwise, we could use the Decimal class instead of floats.
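
To make the failure mode concrete, here is a minimal sketch (not the Orange code itself) of how a sum of square roots that overshoots 1 by a single rounding step gives a negative distance, and how rounding to 13 digits masks it. The name `bhattacharyya_unclipped` and the sample values are illustrative assumptions.

```python
import numpy as np

def bhattacharyya_unclipped(a, b):
    """-ln(sum_i sqrt(a_i * b_i)), with no guard against rounding error."""
    return -np.log(np.sum(np.sqrt(a * b)))

# For probability vectors the coefficient sum_i sqrt(a_i * b_i) is at most 1
# in exact arithmetic. Simulate a floating-point overshoot of one ulp.
coefficient = np.nextafter(1.0, 2.0)  # smallest double greater than 1
distance = -np.log(coefficient)       # a tiny negative number
rounded = np.round(distance, 13)      # rounding to 13 digits hides it

# Clearly different distributions are far from the problematic range.
p = np.array([0.9, 0.1])
q = np.array([0.1, 0.9])
far = bhattacharyya_unclipped(p, q)   # -ln(0.6)
```

The negative value only appears when the two inputs are (nearly) identical, which is why the error is tiny but still enough to break code that assumes non-negative distances.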

Includes

Code changes

codecov · 2020-01-16T19:08:52Z

Codecov Report

Merging #4340 into master will increase coverage by 0.06%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4340      +/-   ##
==========================================
+ Coverage   86.85%   86.91%   +0.06%     
==========================================
  Files         396      396              
  Lines       71828    71991     +163     
==========================================
+ Hits        62383    62571     +188     
+ Misses       9445     9420      -25

janezd

I think this should be solved by clipping, not rounding.
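
The thread does not spell out why clipping is preferable, so the following is only one plausible reading: rounding slightly changes every result and still lets through negative errors larger than the rounding step, whereas clipping touches only values below zero. A small sketch with made-up values:

```python
import numpy as np

tiny_negative = -1.1e-16     # typical rounding noise (hypothetical value)
larger_negative = -2e-13     # a hypothetical larger negative error
legit = 0.12345678901234568  # an ordinary positive distance

# Rounding to 13 digits: removes the tiny error, keeps the larger one,
# and truncates the legitimate value.
round_tiny = np.round(tiny_negative, 13)
round_large = np.round(larger_negative, 13)
round_legit = np.round(legit, 13)

# Clipping at 0: both negative values become 0, the legitimate one is untouched.
clip_tiny = np.clip(tiny_negative, 0, None)
clip_large = np.clip(larger_negative, 0, None)
clip_legit = np.clip(legit, 0, None)
```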

janezd · 2020-01-17T10:24:59Z

Orange/distance/distance.py

-        return -np.log(np.sum(np.sqrt(a.multiply(b))))
-    return -np.log(np.sum(np.sqrt(a * b)))
+        return np.clip(-np.log(np.sum(np.sqrt(a.multiply(b)))), 0, None)
+    return np.clip(-np.log(np.sum(np.sqrt(a * b))), 0, None)


I know I'm being an a..., I mean, annoying, but I would find it much nicer if you changed the code to something like

    if sp.issparse(a):
        prod = a.multiply(b)
    else:
        prod = a * b
    return np.clip(-np.log(np.sum(np.sqrt(prod))), 0, None)

Or even prod = a.multiply(b) if sp.issparse(a) else a * b.

It does look nicer; I changed it.

Hmmm, is the condition even needed? Wouldn't a.multiply(b) work with both sparse and dense matrices?

You seem to think I haven't tried. :)

    >>> import numpy as np
    >>> a = np.arange(20).reshape(4, 5)
    >>> a.multiply(a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'numpy.ndarray' object has no attribute 'multiply'

There is np.multiply, but it doesn't work for sparse matrices.

No, numpy arrays don't have a multiply method; only the numpy module has a multiply function.
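
Putting the thread together, a self-contained sketch of the refactored helper might look like the following. The function name `bhattacharyya` and the sample rows are assumptions for illustration; the actual function in Orange/distance/distance.py may differ in its name and surrounding code.

```python
import numpy as np
import scipy.sparse as sp

def bhattacharyya(a, b):
    # scipy sparse matrices multiply element-wise via .multiply() (for the
    # *_matrix classes, * means matrix multiplication); dense numpy arrays
    # have no .multiply() method and use * for the element-wise product.
    prod = a.multiply(b) if sp.issparse(a) else a * b
    # Clip at 0 so rounding error cannot produce a negative distance.
    return np.clip(-np.log(np.sum(np.sqrt(prod))), 0, None)

dense_a = np.array([[0.2, 0.3, 0.5]])
dense_b = np.array([[0.1, 0.6, 0.3]])
sparse_a = sp.csr_matrix(dense_a)
sparse_b = sp.csr_matrix(dense_b)

d_dense = bhattacharyya(dense_a, dense_b)     # dense code path
d_sparse = bhattacharyya(sparse_a, sparse_b)  # sparse code path
d_same = bhattacharyya(dense_a, dense_a)      # identical inputs, never below 0
```

Both code paths should agree up to rounding, and the distance of a row to itself is never reported as negative.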

janezd requested changes Jan 17, 2020

AndrejaKovacic force-pushed the rounding_bhatt branch from fceb2b3 to 72103dc on January 17, 2020 10:10

janezd reviewed Jan 17, 2020

Clip minimum to 0

3204e42

AndrejaKovacic force-pushed the rounding_bhatt branch from 72103dc to 3204e42 on January 17, 2020 10:40

janezd approved these changes Jan 17, 2020

janezd merged commit 31cc232 into biolab:master Jan 17, 2020

janezd left a comment

janezd Jan 17, 2020

AndrejaKovacic Jan 17, 2020

markotoplak Jan 17, 2020

janezd Jan 17, 2020

AndrejaKovacic Jan 17, 2020
