Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Generic Collection Types #164

Closed
wants to merge 1 commit into from
Closed

Use Generic Collection Types #164

wants to merge 1 commit into from

Conversation

rgant
Copy link

@rgant rgant commented Apr 5, 2022

Because TypedDict and Dict are not compatible. And Tuples are not Lists
https://mypy.readthedocs.io/en/stable/builtin_types.html#generic-types
python/mypy#4976

@zkrolikowski-vl
Copy link
Collaborator

zkrolikowski-vl commented Apr 5, 2022

@rgant This one of the places in the base code for which typing was updated recently (using the new 3.10 generic syntax - this wouldn't work for stubs):
https://github.com/pandas-dev/pandas/blob/v1.4.2/pandas/io/json/_normalize.py#L241-L536

Although I understand that technically the function will accept broader types - in this case TypedDict and Tuple, I'm not sure this aligns with what pandas devs have in mind for this function. Technically they could make the function incompatible with those broader types in 1.4.3 and that would perfectly sensible because they've specified the exact interface.

I'm not convinced that type stubs should be broader than what pandas tells us - even if practically accepted. Could you describe your use-case a bit? Code snippet would be nice.

@rgant
Copy link
Author

rgant commented Apr 5, 2022

import typing

UserData = dict[str, typing.Any]
User = typing.TypedDict(
    'User',
    {
        'uid': str,
        'email': str,
        'provider_id': str,
        'created_at': str,
        'last_sign_in': typing.Optional[str],
        'data': UserData,
    },
)

def get_csv(users: list[User]) -> str:
    output = io.StringIO(newline='')
    # Cast could be removed if https://github.com/VirtusLab/pandas-stubs/pull/164 is merged
    normalized = json_normalize(typing.cast(list[dict[str, typing.Any]], users))
    normalized.replace(r'\n', '  ', regex=True, inplace=True)
    normalized.to_csv(output, index=False, columns=sorted(normalized.columns))
    result = output.getvalue()
    output.close()
    return result

People frequently get the types wrong in python/mypy. 99% of the time they use Dict when they should use Mapping/MutableMapping because they know about Dicts and they don't know about the intricacies of python type/collections systems.

If you review the conversation (esp this reference: https://mypy.readthedocs.io/en/latest/common_issues.html#invariance-vs-covariance) in the linked ticket you will see that Dict should hardly ever be used in a types for python libraries.

Dict is an end type, but many people make other things that are Mappings but are not Dicts. Believe me, I think mypy is doing the wrong thing here, but the decision was made in 2019 and I was overruled.

@zkrolikowski-vl
Copy link
Collaborator

@rgant Sorry for not responding for a good while. I've got good reason's for it as pandas-stubs has moved to another repository and will now be managed alongside pandas itself: https://github.com/pandas-dev/pandas-stubs

You're absolutely right about how Dict is used wrongly for typing in Python.

As for this change in particular the new repository still got it wrong so I suggest moving this PR there: https://github.com/pandas-dev/pandas-stubs/blob/main/pandas-stubs/io/json/_normalize.pyi

You'd probably want to use copy-pasty because the repositories don't have common history so you can't just change upstream.

Closing this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants