WIP classify to rgba #211

knaaptime · 2024-05-14T00:52:21Z

this adds a function to generate an array of colors from an array of values. I've been toying with pydeck lately, which, basically, requires a column of colors on the df. Looking a bit closer, that's how most (all?) of the viz libraries work, so this is a shot at refactoring geopandas's logic for generating colors.

I can add tests, but wanted to open first to see if folks find this valuable and if it should live here

codecov · 2024-05-14T06:54:30Z

Codecov Report

Attention: Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.6%. Comparing base (af76ab3) to head (0f64631).

Additional details and impacted files

@@          Coverage Diff          @@
##            main    #211   +/-   ##
=====================================
  Coverage   89.6%   89.6%           
=====================================
  Files          8       9    +1     
  Lines       1101    1125   +24     
=====================================
+ Hits         986    1008   +22     
- Misses       115     117    +2

Files	Coverage Δ
mapclassify/__init__.py	`100.0% <100.0%> (ø)`
mapclassify/util.py	`91.3% <91.3%> (ø)`

martinfleis · 2024-05-14T08:46:25Z

I am wondering whether this shouldn't be a class or a method on MapClassifier (probably the latter), because to use it with a legend you will normally be interested in other information apart from the array of colours.

knaaptime · 2024-05-14T17:12:47Z

i was curious about that, but it seemed like classify was doing the heavy lifting in geopandas, even with the legend?

probably no real reason we couldn't do both. The difference is a classifier class doesn't require a cmap (edit: though, I guess if it's a method it just consumes the cmap)

knaaptime · 2024-05-14T17:16:58Z

ah. but classify gives back the fitted classifier whereas this gives back the array.

Agree this makes more sense as a method, will move around

knaaptime · 2024-05-14T17:51:17Z

nevermind. If this gets pushed inside the class, there's nowhere to include the nan logic

> I am wondering whether this shouldn't be a class or a method on MapClassifier (probably the latter), because to use it with a legend you will normally be interested in other information apart from the array of colours.

so I think I would just create a classifier and the rgbas if you need both at the moment

martinfleis · 2024-05-15T14:11:03Z

That is a bit unfortunate to do the classification twice. Especially given some of them can be quite costly. Maybe include NaN tracking in the class? Like your legit_indices here?

knaaptime · 2024-05-15T15:16:48Z

yeah, agree. I think the nan-handle logic is only like those three lines. Without touching any of the classification code, we could maybe sneak it into the first step of the binning function so nans are ignored from the outset?
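The NaN handling discussed above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in the PR: `equal_interval_bins` is a hypothetical stand-in for a real mapclassify classifier, and `bin_ignoring_nan` and `nan_label` are names invented for the example. The idea is to remember which positions hold usable values, classify only those, and write the labels back into a full-length result.

```python
import math


def equal_interval_bins(values, k):
    """Stand-in classifier: assign each value to one of k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 when all values are equal
    # clamp so the maximum value lands in the last class (k - 1)
    return [min(int((v - lo) / width), k - 1) for v in values]


def bin_ignoring_nan(values, k, nan_label=-1):
    """Classify only the non-NaN entries and return full-length labels."""
    # positions of usable observations (the role legit_indices plays in the PR)
    legit = [i for i, v in enumerate(values) if not math.isnan(v)]
    labels = [nan_label] * len(values)
    if not legit:
        return labels
    fitted = equal_interval_bins([values[i] for i in legit], k)
    for i, label in zip(legit, fitted):
        labels[i] = label
    return labels


values = [1.0, float("nan"), 5.0, 10.0, float("nan"), 2.5]
labels = bin_ignoring_nan(values, k=3)
print(labels)
```

Because the missing positions keep a sentinel label, a caller can later give them a dedicated "no data" color without running the classification a second time.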

sjsrey · 2024-05-16T08:09:05Z

mapclassify/_classify_API.py

+    bins = classify(v.dropna().values, scheme=classifier, k=k).yb
+
+    # create a normalizer using the data's range (not strictly 1-k...)
+    norm = Normalize(min(bins), max(bins))


I'm not clear why there is a classifier called prior to here.
It seems to me this is trying to do a classless choropleth map [1].
If that is true, then the first classifier can be omitted and line 258 could become
norm = Normalize(v.min(), v.max())

But maybe I'm missing something here in my understanding?

[1] Tobler, W. (1973). Choropleth maps without class intervals. Geographical Analysis, 5:262-265
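For contrast, the classless mapping described in this comment can be sketched without any classifier at all. The snippet below is an illustration under stated assumptions: `normalize` mirrors the linear rescaling that a `Normalize(vmin, vmax)` object performs, while `lerp_color` and its two endpoint colors are hypothetical stand-ins for sampling a continuous colormap.

```python
def normalize(value, vmin, vmax):
    """Rescale value linearly to [0, 1], as Normalize(vmin, vmax) would."""
    return (value - vmin) / (vmax - vmin)


def lerp_color(t, start=(255, 255, 204), end=(8, 48, 107)):
    """Blend two RGB colors; a stand-in for sampling a continuous colormap."""
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))


values = [2.0, 4.0, 6.0, 10.0]
vmin, vmax = min(values), max(values)
classless = [lerp_color(normalize(v, vmin, vmax)) for v in values]
print(classless)
```

Every distinct value gets its own color here, which is exactly what separates this approach from a classed map where all values in a bin share one color.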

knaaptime · 2024-05-16

No, the whole idea is to use the classifier to get colors. It might not need that call to Normalize, but we're taking the value, discretizing it to the class bins, then using the bins to get colors from a matplotlib colormap.

sjsrey · 2024-05-16

So do I have this right?

This was classified k=8, fisher jenks. So I'm thinking there should only be 8 colors in the figure, but the additional z axis allows for differentiation between units in the same bin but with different values. The color of the hex fill would be whatever class the hex value is placed into.

knaaptime · 2024-05-16

Yeah exactly. Z and color are disconnected there. Color is based on Jenks but height is continuous

(i.e. all this does is take a cmap and translate bins --> colors)

so if you remove the z-dimension, this function gives you the same choropleth geopandas would give you. But then you can also use extrusion to encode the same or a different variable. A bit more at the end of this
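The split between color and height described in this exchange can be sketched as below. Everything here is illustrative: the class bounds and palette are made up (three classes instead of the eight Fisher-Jenks classes mentioned above), and the `fill_color` and `elevation` keys are hypothetical field names, not a pydeck requirement. The color comes from the class a value falls into, while the elevation keeps the raw value.

```python
from bisect import bisect_left

# hypothetical upper bounds of k=3 classes, as a fitted classifier might report
upper_bounds = [10.0, 20.0, 30.0]
# one RGBA color per class; a stand-in for sampling a matplotlib colormap
palette = [(254, 232, 200, 255), (253, 187, 132, 255), (227, 74, 51, 255)]


def to_record(value):
    """Color comes from the class, elevation from the raw value."""
    class_id = bisect_left(upper_bounds, value)  # first bound >= value
    return {"fill_color": palette[class_id], "elevation": value}


values = [3.0, 9.5, 12.0, 20.0, 28.0]
records = [to_record(v) for v in values]
for record in records:
    print(record)
```

The first two records share a fill color because both values fall in the lowest class, yet they keep different elevations, so the extrusion still distinguishes them.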

sjsrey · 2024-05-17T09:26:24Z

> this adds a function to generate an array of colors from an array of values. I've been toying with pydeck lately, which, basically, requires a column of colors on the df. Looking a bit closer, that's how most (all?) of the viz libraries work, so this is a shot at refactoring geopandas's logic for generating colors.
>
> I can add tests, but wanted to open first to see if folks find this valuable and if it should live here

This is definitely useful functionality.

I'm not sure that mc is the place for this, for a couple of reasons. First, it makes matplotlib a hard dependency. Second, it seems to consume mc, rather than extend mc, for a particular use case.

I can see a few options:

1. Keep it in mc, but move it into a utility module that makes matplotlib a soft dependency.
2. Keep it as suggested with the hard dependency.
3. Don't include it in mc but have it be part of a downstream package.
4. Other?

knaaptime · 2024-05-17T15:40:45Z

yeah agree. I'm happy to stick it downstream, but when @martinfleis and I discussed briefly on the last pysal call, we thought it might be useful more broadly (e.g. for geopandas to consume, since its coloring logic is currently a bit arcane).

would be easy to make matplotlib a module-level import if we want to go that route

martinfleis · 2024-06-25T20:58:38Z

I've discussed this with @sjsrey in Basel and we came to the conclusion that it won't help geopandas at this point. We are planning on refactoring (or at least having a deep dive into) the plotting code for 1.1 or 1.2 though, so it may become more useful then. The issue is that we will always have mapclassify as an optional dependency only, and when no binning is applied we need to deal with missing values anyway.

knaaptime · 2024-06-25T21:06:15Z

cool. nice work on 1.0 btw!

i still think this has value here, e.g. for legendgram and such (would also give a logical place to live alongside #173), but lmk what you think. Happy to put elsewhere if you prefer

martinfleis · 2024-06-25T21:17:12Z

I am perfectly fine with keeping it here as I can see it can be useful. Just wanted to close the discussion about usage of this in geopandas.

knaaptime · 2024-07-01T23:39:20Z

I vote to put this into a util module with matplotlib as a soft dep. It's not central to classification, but often useful alongside classification.

So it's really useful to keep around with the same group of tools, but shouldn't induce a new dependency

martinfleis · 2024-07-03T15:05:32Z

mapclassify/util.py

+    legit_indices = v[~v.isna()].index.values
+
+    # transform (non-NaN) values into class bins
+    bins = classify(v.dropna().values, scheme=classifier, k=k).yb


Shall we not require k and pass through all kwargs to support all classifiers and their options?

knaaptime · 2024-07-03

oh, dur. good point
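The suggestion to drop the required `k` and forward all keyword arguments can be sketched like this. The two classifiers and the `SCHEMES` registry are hypothetical stand-ins written for the example; they only illustrate that some schemes take a class count while others take different options, such as explicit bin edges.

```python
def quantile_labels(values, k=5):
    """Stand-in classifier with a k option (rank-based, equal-count classes)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels


def user_defined_labels(values, bins):
    """Stand-in classifier that takes explicit upper bounds instead of k."""
    return [sum(v > b for b in bins) for v in values]


SCHEMES = {"quantiles": quantile_labels, "userdefined": user_defined_labels}


def classify_labels(values, scheme="quantiles", **kwargs):
    """Forward every extra keyword to the chosen classifier untouched."""
    return SCHEMES[scheme](values, **kwargs)


data = [1, 4, 7, 10]
print(classify_labels(data, k=2))
print(classify_labels(data, scheme="userdefined", bins=[3, 8]))
```

Because the wrapper does not name `k` itself, it works unchanged for schemes whose options differ, and each classifier keeps its own defaults.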

martinfleis

If you make pre-commit happy lgtm.

martinfleis · 2024-07-03T15:34:48Z

mapclassify/util.py

+):
+    """Convert array of values into RGBA colors using a colormap and classifier.
+
+    Parameters


kwargs should probably be mentioned here.

knaaptime · 2024-07-03

cool, I'm always unsure what to do in the docstrings when there's a catchall

martinfleis · 2024-07-03

I always mention where they are passed to.
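A numpydoc-style entry for a catch-all argument might look like the sketch below. The function name, signature, and wording are illustrative assumptions rather than the docstring that was merged; the point is only that the `**kwargs` entry names the callable the arguments are forwarded to.

```python
def values_to_rgba(values, cmap="viridis", scheme="quantiles", **kwargs):
    """Convert an array of values into RGBA colors (illustrative signature).

    Parameters
    ----------
    values : sequence of float
        Observations to classify and color.
    cmap : str, default "viridis"
        Name of the colormap to sample.
    scheme : str, default "quantiles"
        Name of the classification scheme.
    **kwargs : dict
        Additional keyword arguments passed to the classifier selected by
        ``scheme`` (for example ``k`` for schemes that take a class count).
    """
    raise NotImplementedError("docstring-only sketch")


print(values_to_rgba.__doc__)
```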

add classify rgb function

e36a113

knaaptime requested review from jGaboardi, sjsrey, ljwolf and martinfleis May 14, 2024 02:14

knaaptime added the enhancement label May 14, 2024

sjsrey reviewed May 16, 2024

sjsrey mentioned this pull request May 17, 2024

Consider handling NANs #212

Open

alpha

bec1919

knaaptime mentioned this pull request Jun 19, 2024

plot histogram with class bins #216

Merged

knaaptime and others added 4 commits July 1, 2024 20:43

move to util; add test

6070e8c

CI: ensure 3.9 envs are compatible (pysal#218)

1f7434e

* CI: ensure 3.9 envs are compatible * pin geopandas

CI: doctest only on ubuntu latest (pysal#219)

4de91bb

* CI: doctest only on ubuntu latest * try including coverage * doctestplus * Apply suggestions from code review Co-authored-by: James Gaboardi <jgaboardi@gmail.com> --------- Co-authored-by: James Gaboardi <jgaboardi@gmail.com>

CI: test against Python 3.12 (pysal#220)

0fad7cd

* CI: test against Python 3.12 * pytest-doctestplus

knaaptime added the pre-commit.ci autofix label Jul 2, 2024

pre-commit-ci bot removed the pre-commit.ci autofix label Jul 2, 2024

[pre-commit.ci] auto fixes from pre-commit.com hooks

aab5ddd

for more information, see https://pre-commit.ci

pre-commit-ci bot and others added 15 commits July 2, 2024 18:54

plot histogram

f1f4add

reorder args

d780b2f

despine optional; add comments

ce219a0

warn not raise if sns missing

171a566

martins improvements

394bd67

simplify args

918ff1c

mpl not pandas in docstring

815a75f

linewidth

e668c8e

add tests

447df21

pytestmpl

c89ef62

[pre-commit.ci] auto fixes from pre-commit.com hooks

5ff49da

for more information, see https://pre-commit.ci

rm unused kwarg logic

46fee1e

forgot color

bfefddf

Merge branch 'main' of github.com:pysal/mapclassify into classify

e85b810

martinfleis reviewed Jul 3, 2024

kwargs no k

d8805db

martinfleis approved these changes Jul 3, 2024

document kwargs

cb9d014

knaaptime added the pre-commit.ci autofix label Jul 3, 2024

pre-commit-ci bot removed the pre-commit.ci autofix label Jul 3, 2024

[pre-commit.ci] auto fixes from pre-commit.com hooks

0f64631

for more information, see https://pre-commit.ci

knaaptime merged commit af62513 into pysal:main Jul 3, 2024
17 checks passed

knaaptime deleted the classify branch July 3, 2024 21:46
