-
Notifications
You must be signed in to change notification settings - Fork 793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix type checker errors for libraries such as Polars where Altair uses dataframe interchange protocol #3297
Conversation
…e parts of the Protocol which we need for our codebase to make the type hint more flexible
… keep the ones which are relevant for our codebase. get_chunks is not directly used in Altair codebase but is relevant for pi.from_dataframe
Hmm, pyright is not yet happy. I'll see if I can figure out why. |
…n as Pyright doesn't like that in Polars it is specified as a property which is semantically not the same as an attribute (the later can be set, properties are read-only by default)
Now, pyright is happy as well :) The last issue I fixed comes actually more from the Polars codebase where they define I also had to add a mroe explicit signature for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me. Thanks for digging through this!
Thanks @binste, can we add a note in the release notes? Not sure if it counts as an enhancement or bug fix. |
Sounds good. I'd say it's a bug fix as the expectation was that the previous type hints would work for Polars. For reference, here the code I used to test the type hints: test_types.py: import polars as pl
import numpy as np
import altair as alt
import pyarrow as pa
import pandas as pd
content = {
"a": range(5),
"b": np.random.rand(5),
}
df_pl = pl.DataFrame(content)
df_pd = pd.DataFrame(content)
df_pa = pa.Table.from_pydict(content)
alt.Chart(df_pl).mark_point().encode(
x="a",
y="b",
)
alt.Chart(df_pd).mark_point().encode(
x="a",
y="b",
)
alt.Chart(df_pa).mark_point().encode(
x="a",
y="b",
) mypy test_types.py
pyright test_types.py |
Closes #3294 . As discussed in that issue, I switched to Protocols over ABC classes as other libraries such as Polars of course don't use our own classes but instead have their own implementations -> ABC classes won't type check but Protocols work.
After that change Polars dataframes still did not satisfy mypy so I digged a bit into the interchange protocol. There are minor differences in how Pandas, Polars, etc. have implemented type hints for the protocol and sometimes even the protocol itself. For example, in Polars
CategoricalDescription.is_dictionary
has typeLiteral[True]
as it is indeed always True in the case of Polars. However, the interchange protocol only requiresbool
. AsTypedDict
s are invariant, this will not satisfy a type checker as the types would need to be exactly the same. As an example:To not continuously have to deal with such minor differences, I removed all type hints including methods and attributes from the interchange protocols which are not relevant for Altair, e.g. methods which we do not use anywhere in our code base.
With the changes here, all examples in #3294 now pass mypy. If you see errors while testing, try deleting the
.mypy_cache
folder before running mypy.