-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP classify to rgba #211
WIP classify to rgba #211
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #211 +/- ##
=====================================
Coverage 89.6% 89.6%
=====================================
Files 8 9 +1
Lines 1101 1125 +24
=====================================
+ Hits 986 1008 +22
- Misses 115 117 +2
|
I am think whether this should not be a class or a method on |
i was curious about that, but it seemed like classify was doing the heavy lifting in geopandas, even with the legend? probably no real reason we couldn't do both. The difference is a classifier class doesn't require a camp (edit: though, I guess if its a method it just consumes the cmap) |
ah. but Agree this makes more sense as a method, will move around |
nevermind. If this gets pushed inside the class, there's nowhere to include the nan logic
so I think I would just create a classifier and the rgbas if you need both at the moment |
That is a bit unfortunate to do the classification twice. Especially given some of them can be quite costly. Maybe include NaN tracking in the class? Like your |
yeah, agree. I think the nan-handle logic is only like those three lines. Without touching any of the classification code, we could maybe sneak it into the first step of the binning function so nans are ignored from the outset? |
mapclassify/_classify_API.py
Outdated
bins = classify(v.dropna().values, scheme=classifier, k=k).yb | ||
|
||
# create a normalizer using the data's range (not strictly 1-k...) | ||
norm = Normalize(min(bins), max(bins)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear why there is a classifier called prior to here.
It seems to me this is trying to do a classless choropleth map [1].
If that is true, then the first classifier can be omitted and line 258 could become
norm = Normalize(v.min(), v.max())
But maybe I'm missing something here in my understanding?
[1] Tobler, W. (1973). Choropleth maps without class intervals. Geographical
Analysis, 5:262-265
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no the whole idea is to use the classifier to get colors. It might not need that call to Normalize, but we're taking the value, discretizing it to the class bins, then using the bins to get colors from a matplotlib colormap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was classified k=8, fisher jenks. So I'm thinking there should only be 8 colors in the figure, but the additional z axis allows for differentiation between units in the same bin but with different values. The color of the hex fill would be whatever class the hex value is placed into.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah exactly. Z and color are disconnnected there. Color is based on Jenks but height is continuous
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(i.e. all this does is take a cmap and translate bins --> colors)
so if you remove the z-dimension, this function gives you the same choropleth geopandas would give you. But then you can also use extrusion to encode the same or a different variable. A bit more at the end of this
This is definitely useful functionality. I'm not sure that mc is the place for this, for a couple of reasons. First, it makes matplotlib a hard dependency. Second, it seems to consume mc, rather than extend mc , for a particular use case. I can see a few options: Keep in it mc, but move it into a utility module that makes matplotlib a soft dependency. Keep it as suggested with the hard dependency Don't include it in mc but have it be part of a downstream package. Other? |
yeah agree. I'm happy to stick it downstream, but when @martinfleis and I discussed briefly on the last pysal call, we thought it might be useful more broadly (e.g. for geopandas to consume, since its coloring logic is currently a bit arcane). would be easy to make matplotlib a module-level import if we want to go that route |
I've discussed this with @sjsrey in Basel and came to a conclusion that it won't help geopandas at this point. We are planning on refactoring (or at least having a deep dive) of the plotting code for 1.1 or 1.2 though, so it may become more useful then. The issue is that we will always have mapclassify as optional dependency only and when no binning is applied we need to deal with missing values anyway. |
cool. nice work on 1.0 btw! i still think this has value here, e.g. for legendgram and such (would also give a logical place to live alongside #173), but lmk what you think. Happy to put elsewhere if you prefer |
I am perfectly fine with keeping it here as I can see it can be useful. Just wanted to close the discussion about usage of this in geopandas. |
I vote to put this into a util module with mc as a soft dep. It's not central to classification, but often useful alongside classification. So it's really useful to keep around with the same group of tools, but shouldn't induce a new dependency |
* CI: ensure 3.9 envs are compatible * pin geopandas
* CI: doctest only on ubuntu latest * try including coverage * doctestplus * Apply suggestions from code review Co-authored-by: James Gaboardi <jgaboardi@gmail.com> --------- Co-authored-by: James Gaboardi <jgaboardi@gmail.com>
* CI: test against Python 3.12 * pytest-doctestplus
for more information, see https://pre-commit.ci
updates: - [github.com/psf/black: 24.3.0 → 24.4.2](psf/black@24.3.0...24.4.2) - [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.5.0](astral-sh/ruff-pre-commit@v0.3.5...v0.5.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
for more information, see https://pre-commit.ci
mapclassify/util.py
Outdated
legit_indices = v[~v.isna()].index.values | ||
|
||
# transform (non-NaN) values into class bins | ||
bins = classify(v.dropna().values, scheme=classifier, k=k).yb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we not require k
and pass through all kwargs
to support all classifiers and their options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, dur. good point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you make pre-commit happy lgtm.
): | ||
"""Convert array of values into RGBA colors using a colormap and classifier. | ||
|
||
Parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kwargs should probably be mentioned here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, i'm always unsure what to do in the docstrings when there's a catchall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always mention where are they passed to.
for more information, see https://pre-commit.ci
this adds a function to generate an array of colors from an array of values. I've been toying with pydeck lately, which, basically, requires a column of colors on the df. Looking a bit closer, that's how most (all?) of the viz libraries work, so this is a shot at refactoring geopandas's logic for generating colors.
I can add tests, but wanted to open first to see if folks find this valuable and if it should live here