fix(s3.list): type hinting for chunked arg #1113
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Neat, I like it, although I must say it's quite verbose |
Agreed, it is extremely verbose. I could not figure out a better way to implement it, though. If someone can, please do. I guess these lines would save maybe several hundred lines on the user side. @diegodebrito @victoriarouton grep for it |
I agree that it's a bit too verbose, I'm afraid, especially since other methods have even larger argument lists. Since we are only interested in the return type, I wonder if there is a way to specify *args, **kwargs instead? |
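A rough sketch of what the *args, **kwargs idea could look like, not what this PR implements. A type checker picks an overload variant from the argument types, so chunked still has to be spelled out with a Literal type; only the remaining arguments can be collapsed. The name list_keys and its fixed key list are hypothetical stand-ins, not the awswrangler API.

```python
from typing import Any, Iterator, List, Literal, Union, overload


# Variant used when chunked is omitted or passed as the literal False.
@overload
def list_keys(*args: Any, chunked: Literal[False] = ..., **kwargs: Any) -> List[str]: ...
# Variant used when chunked is passed as the literal True.
@overload
def list_keys(*args: Any, chunked: Literal[True], **kwargs: Any) -> Iterator[List[str]]: ...
def list_keys(
    *args: Any, chunked: bool = False, **kwargs: Any
) -> Union[List[str], Iterator[List[str]]]:
    # Hypothetical body: a fixed key list stands in for a real S3 listing.
    keys = ["a.csv", "b.csv", "c.csv"]
    if chunked:
        # Yield pages of two keys to mimic a chunked listing.
        return iter([keys[i:i + 2] for i in range(0, len(keys), 2)])
    return keys


whole = list_keys("s3://bucket/")
pages = list(list_keys("s3://bucket/", chunked=True))
```

The trade-off in this sketch is that chunked becomes keyword-only (it follows *args) and every other argument loses its type checking, so it is shorter but weaker than explicit signatures.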
It might be possible to only include the
from typing import Optional, Union, overload
# ClickEvent and DragEvent are assumed to be defined elsewhere.
# Overload *variants* for 'mouse_event'.
# These variants give extra information to the type checker.
# They are ignored at runtime.
@overload
def mouse_event(x1: int, y1: int) -> ClickEvent: ...
@overload
def mouse_event(x1: int, y1: int, x2: int, y2: int) -> DragEvent: ...
# The actual *implementation* of 'mouse_event'.
# The implementation contains the actual runtime logic.
#
# It may or may not have type hints. If it does, mypy
# will check the body of the implementation against the
# type hints.
#
# Mypy will also check and make sure the signature is
# consistent with the provided variants.
def mouse_event(x1: int,
                y1: int,
                x2: Optional[int] = None,
                y2: Optional[int] = None) -> Union[ClickEvent, DragEvent]:
    if x2 is None and y2 is None:
        return ClickEvent(x1, y1)
    elif x2 is not None and y2 is not None:
        return DragEvent(x1, y1, x2, y2)
    else:
        raise TypeError("Bad arguments")
from typing import overload
class M: ...
@overload
def get_model(model_or_pk: M, flag: bool = ...) -> M: ...
@overload
def get_model(model_or_pk: int, flag: bool = ...) -> M | None: ...
|
I tried a few more things. A possible alternative is converting chunked to a keyword-only arg. That is potentially more verbose and would impact the API. |
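For reference, a minimal sketch of the keyword-only alternative. With chunked placed after a bare *, it can no longer be passed positionally, so no extra variant is needed for the positional form. A third variant taking a plain bool is still included so that a non-literal flag has something to match. The name list_paths and its parameters are hypothetical, not the real s3.list_objects signature.

```python
from typing import Iterator, List, Literal, Optional, Union, overload


@overload
def list_paths(path: str, suffix: Optional[str] = ..., *, chunked: Literal[False] = ...) -> List[str]: ...
@overload
def list_paths(path: str, suffix: Optional[str] = ..., *, chunked: Literal[True]) -> Iterator[List[str]]: ...
@overload
def list_paths(path: str, suffix: Optional[str] = ..., *, chunked: bool) -> Union[List[str], Iterator[List[str]]]: ...
def list_paths(
    path: str, suffix: Optional[str] = None, *, chunked: bool = False
) -> Union[List[str], Iterator[List[str]]]:
    # Hypothetical body: two fixed object names stand in for a real listing.
    names = ["x.csv", "y.json"]
    found = [path + name for name in names if suffix is None or name.endswith(suffix)]
    if chunked:
        # One path per chunk, purely for illustration.
        return iter([[p] for p in found])
    return found


csv_only = list_paths("s3://b/", ".csv")
chunks = list(list_paths("s3://b/", chunked=True))

# Passing chunked positionally is now a runtime error, which is the API impact mentioned above.
try:
    list_paths("s3://b/", None, True)  # type: ignore[misc]
    positional_rejected = False
except TypeError:
    positional_rejected = True
```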
IMO better type hints >> LOC so I'm ok to merge this
That'd be splendid! s3 list is probably the one users (at least in our case) are most likely to be indexing that doesn't resolve to The one thing I'd like to validate a bit more is if the overloads should be using the defaults i.e. |
Did some testing. It works either way but the I updated to Happy to rebase this if needed since there were some merges into it. Here's what I used for testing.
# Imports assumed by the snippet
import datetime
from typing import List, cast
import boto3
import awswrangler as wr
# Redundant cast error -- this is a good thing
thing = cast(List[str], wr.s3.list_objects("thing"))[0]
thing = wr.s3.list_objects("thing", "*", chunked=True)
# These are all valid calls
wr.s3.list_objects("thing", ["*"], ["^"], datetime.datetime(2022, 1, 1), chunked=True)
wr.s3.list_objects("thing", ["*"], ["^"], datetime.datetime(2022, 1, 1), chunked=False)
wr.s3.list_objects(
    path="thing",
    suffix="*",
    ignore_suffix="^",
    last_modified_begin=datetime.datetime(2022, 1, 1),
    last_modified_end=datetime.datetime(2022, 1, 1),
    ignore_empty=True,
    chunked=False,
    s3_additional_kwargs={"foo": "bar"},
    boto3_session=boto3.Session(),
)
wr.s3.list_objects(
    "thing",
    "*",
    "^",
    last_modified_begin=datetime.datetime(2022, 1, 1),
    last_modified_end=datetime.datetime(2022, 1, 1),
    ignore_empty=True,
    chunked=False,
    s3_additional_kwargs={"foo": "bar"},
    boto3_session=boto3.Session(),
)
wr.s3.list_objects(
    "thing",
    "*",
    "^",
    last_modified_begin=datetime.datetime(2022, 1, 1),
    last_modified_end=datetime.datetime(2022, 1, 1),
    ignore_empty=True,
    chunked=True,
    s3_additional_kwargs={"foo": "bar"},
    boto3_session=boto3.Session(),
)
wr.s3.list_objects(
    "thing",
    "*",
    "^",
    last_modified_begin=datetime.datetime(2022, 1, 1),
    last_modified_end=datetime.datetime(2022, 1, 1),
    chunked=True,
    s3_additional_kwargs={"foo": "bar"},
    boto3_session=boto3.Session(),
)
|
511b471 to df49487
Issue #, if available:
N/A
Description of changes:
Fixing the return type of s3.list_objects to account for the chunked arg by using typing.overload.
Currently, I have to use cast in a lot of places (40+ and growing). cast is confusing to data scientists who consume the code base, as they think it is actually casting the value or haven't seen it before. This addition allows the correct return type to be reported based on the args passed. It will save quite a bit of overhead on the user end.
Tested with mypy==0.910 and mypy==0.931.
If this is agreeable, I would be happy to take on some of the other functions that have chunking.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
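On the point in the description about cast being confusing: typing.cast only informs the type checker and returns its argument unchanged at runtime. A minimal sketch, using a hypothetical list_names helper rather than the awswrangler API:

```python
from typing import Iterator, List, Union, cast


def list_names(chunked: bool = False) -> Union[List[str], Iterator[List[str]]]:
    # Hypothetical stand-in for a listing call whose return type depends on chunked.
    names = ["a", "b"]
    return iter([names]) if chunked else names


# cast() tells the type checker to treat the result as a list so it can be indexed...
first = cast(List[str], list_names())[0]

# ...but it converts nothing: the iterator below is still an iterator.
still_iterator = cast(List[str], list_names(chunked=True))
is_list = isinstance(still_iterator, list)
```

With overloads keyed on chunked, the checker can infer the list type directly and the cast call becomes unnecessary.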