implemented pad with new-indexes #4974
Hello @fujiisoup! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2021-06-29 11:56:41 UTC |
xarray/core/dataset.py
@@ -6638,8 +6653,23 @@ def pad(
coord_pad_options = {}

variables = {}

# standarize pad_width
pad_width_standarized = {}  # type: Mapping[Hashable, Tuple[int, int]]
I need help correcting the typing.
pad_width_standarized should now be Mapping[Hashable, Tuple[int, int]] instead of Mapping[Hashable, Union[int, Tuple[Union[int, Iterable], Union[int, Iterable]]]], but mypy is complaining...
That's not trivial to type correctly... The main change I made was to use Sequence (https://docs.python.org/3/library/collections.abc.html), as Iterable has no __len__. E.g. numpy uses Sequence for ArrayLike (https://github.com/numpy/numpy/blob/master/numpy/typing/_array_like.py).
Disclaimer: I am by no means a specialist here - I am making this up as I go.
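The Sequence-versus-Iterable point can be illustrated with a minimal sketch. The names standardize_pad_width, PadWidth and _as_count are hypothetical and not taken from the PR; the sketch assumes each entry is an int, an (int, int) pair, or a pair of sequences of new index labels, and it makes no attempt to reject mixed pairs.

```python
from typing import Dict, Hashable, Mapping, Sequence, Tuple, Union

# Hypothetical alias for the accepted inputs; not part of xarray.
PadWidth = Union[int, Tuple[int, int], Tuple[Sequence, Sequence]]


def _as_count(side: Union[int, Sequence]) -> int:
    """Turn one side of a pad spec into a number of padded elements."""
    if isinstance(side, int):
        return side
    # Sequence guarantees __len__; a bare Iterable does not.
    return len(side)


def standardize_pad_width(
    pad_width: Mapping[Hashable, PadWidth],
) -> Dict[Hashable, Tuple[int, int]]:
    """Reduce every entry to a plain (before, after) pair of ints."""
    standardized: Dict[Hashable, Tuple[int, int]] = {}
    for dim, width in pad_width.items():
        if isinstance(width, int):
            # A single int pads both sides by the same amount.
            standardized[dim] = (width, width)
        else:
            start, end = width
            standardized[dim] = (_as_count(start), _as_count(end))
    return standardized


widths = standardize_pad_width({"x": 2, "y": (1, 3), "z": ([0, 1, 2], [7, 8])})
print(widths)  # {'x': (2, 2), 'y': (1, 3), 'z': (3, 2)}
```

Keeping the input alias separate from the Dict[Hashable, Tuple[int, int]] result is one way to give the standardized mapping the narrower type asked for above.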
xarray/core/dataset.py
if not isinstance(v, int):
    # if pad_width is a tuple of iterable, we use its length for
    # pad_width_standarized
    pad_width_standarized[k] = [
Suggested change:
- pad_width_standarized[k] = [
+ pad_width_standardized[k] = [
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
I fixed the typo ( |
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
Not sure why the doctest is failing. The same tests in test_dataset.py do not fail...
I think only the indentation is wrong now - e.g. https://github.com/pydata/xarray/pull/4974/checks?check_run_id=2020140730#step:8:82
Thank you @mathause for your suggestion.
xarray/core/dataset.py
    w1_ = IndexVariable(name, [fill_value_ind] * w1)
else:
    w1_ = IndexVariable(name, w1)
variables[name] = var.concat([w0_, var, w1_], dim=name)
I wonder whether this series of logic should be moved into IndexVariable.pad.
Yes I think that would make sense.
Thanks. Done.
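The logic discussed in this thread, building the padded index by concatenating new labels before and after the existing ones, can be sketched without xarray. The function pad_index_labels and its parameters are hypothetical; plain lists stand in for IndexVariable, and fill_value plays the role of fill_value_ind in the snippet above (its default of None is only a placeholder assumption).

```python
from typing import Any, List, Sequence, Union


def pad_index_labels(
    labels: Sequence[Any],
    pad_start: Union[int, Sequence[Any]],
    pad_end: Union[int, Sequence[Any]],
    fill_value: Any = None,
) -> List[Any]:
    """Return the labels with padding prepended and appended."""

    def expand(pad: Union[int, Sequence[Any]]) -> List[Any]:
        if isinstance(pad, int):
            # An int only gives a width, so repeat the fill value.
            return [fill_value] * pad
        # A sequence supplies the new labels explicitly.
        return list(pad)

    return expand(pad_start) + list(labels) + expand(pad_end)


print(pad_index_labels([10, 20, 30], [0, 5], [40]))  # [0, 5, 10, 20, 30, 40]
print(pad_index_labels([10, 20, 30], 1, 2, fill_value=-1))  # [-1, 10, 20, 30, -1, -1]
```

Collecting this in a single helper mirrors the suggestion to keep the concatenation in one method rather than inline in Dataset.pad.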
disallow [Sequence, int] as an argument.
if isinstance(pad_start, int) or isinstance(pad_end, int):
    # do not allow [Sequence, int] as pad_width
    raise TypeError(
        "({}, {}) is used for pad_width[{}]. Must be either (int, int) or (Sequence, Sequence).".format(
I noticed that a mixture of sequence and int for pad_width, like da.pad(dim2=([0, 1, 2], 3)), makes the implementation very complex (as we may need to support many options, such as pad_mode).
I just disallowed the mixed argument.
👍
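A standalone sketch of the restriction described in this thread is shown below. The helper check_pad_pair is hypothetical, and the arguments passed to format are an assumption, since the snippet above is cut off before them. It also assumes "sequence" means collections.abc.Sequence, so array types that do not register as Sequence would be rejected by this version.

```python
from collections.abc import Sequence
from typing import Any, Hashable


def check_pad_pair(dim: Hashable, pad_start: Any, pad_end: Any) -> None:
    """Raise TypeError unless both sides are ints or both are sequences."""
    both_int = isinstance(pad_start, int) and isinstance(pad_end, int)
    both_seq = isinstance(pad_start, Sequence) and isinstance(pad_end, Sequence)
    if not (both_int or both_seq):
        # Mixed pairs such as ([0, 1, 2], 3) end up here.
        raise TypeError(
            "({}, {}) is used for pad_width[{}]. Must be either (int, int) "
            "or (Sequence, Sequence).".format(pad_start, pad_end, dim)
        )


check_pad_pair("dim2", 1, 3)  # accepted: (int, int)
check_pad_pair("dim2", [0, 1, 2], [9])  # accepted: (Sequence, Sequence)
try:
    check_pad_pair("dim2", [0, 1, 2], 3)  # the mixed case from the comment
except TypeError as err:
    print(err)
```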
@@ -1355,6 +1355,15 @@ def pad(

    return type(self)(self.dims, array)

def pad_indexes(self, pad_start: Sequence, pad_end: Sequence):
Make this IndexVariable.pad? And add a test in test_variable.py?
if isinstance(pad_start, int) or isinstance(pad_end, int):
    # do not allow [Sequence, int] as pad_width
    raise TypeError(
        "({}, {}) is used for pad_width[{}]. Must be either (int, int) or (Sequence, Sequence).".format(
👍
I was wondering what the status of this PR is. @fujiisoup, do you still plan to work on it? Or would it be wiser to wait for the indexes refactoring (#5692), i.e., #3868 (comment), and revisit the approach a bit? More specifically, the approach here (i.e., passing indexes via the What about another parameter
Hi.
I think we can just discard this PR if this does not fit with the index refactoring.
Closing this per the last comments and label added to this PR last year.
pre-commit run --all-files
whats-new.rst
Now we use a tuple of indexes for DataArray.pad and Dataset.pad.