-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle the character array dim name #2896
Conversation
…decode and reapply in encode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation looks good to me, but this needs documentation & tests.
Could you kindly add a note to the documentation about this? Maybe somewhere in this section?
http://xarray.pydata.org/en/stable/io.html#string-encoding
Tests could probably go somewhere in this file:
https://github.com/pydata/xarray/blob/master/xarray/tests/test_coding_strings.py
thanks, @shoyer. I will add the documentation and tests now that the first hurdle is cleared and update the PR. |
Hello @jmccreight! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-04-19 15:13:29 UTC |
@shoyer Added test and documentation. I did not build documentation, wasnt sure if that was necessary. |
xarray/tests/test_coding_strings.py
Outdated
Variable(('x',), [b'ab', b'cdef'], encoding={'char_dim_name': 'foo'}) | ||
] | ||
) | ||
def test_CharacterArrayCoder_char_dim_name(original): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is better and getting warmer. The test improved the underlying code here.
But I still dont see how to eliminate this logic. Putting an or
in the assert does not seem like a better alternative, nor does only testing just one of the two possible input scenarios.... let me know if you have a better idea that I'm too blind to see.
Thanks,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clean way to do this would be to add a second parametrized argument for the expected value, e.g.,
@pytest.mark.parametrize(
['original', 'expected_char_dim_name'],
[
(Variable(('x',), [b'ab', b'cdef']), 'string4'),
(Variable(('x',), [b'ab', b'cdef'], encoding={'char_dim_name': 'foo'}), 'char_dim_name')
]
)
def test_CharacterArrayCoder_char_dim_name(original, expected_char_dim_name):
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the first case of 2-variable usage in that test file... thanks for pointing it out.
xarray/coding/strings.py
Outdated
else: | ||
default_char_dim_name = 'string%s' % data.shape[-1] | ||
dims = dims + (default_char_dim_name,) | ||
encoding['char_dim_name'] = default_char_dim_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line is causing a set of failures. I think this line is desirable. I could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 111, to be clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention we have (which is apparently enforced by tests) is that when you use an encoding you remove it from the encoding
dict. That way we catch cases where you have a typo, e.g., char_dimension_name
instead of char_dim_name
.
So this section should instead probably be something like:
char_dim_name = encoding.pop(
'char_dim_name', 'string%s' % data.shape[-1])
dims = dims + (char_dim_name,)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great reason, this solves all the errors I was seeing locally. THanks
I'm uncertain why travis is failing. Two of them look http-related and the other maybe be docs-related (but dont trust me). Running pytest locallin in the xarray/tests/ dir
|
One is a lint failure: #2896 (comment) The docs failure looks like a cartopy install problem |
🎊 |
Could you please add a brief note to |
🤦♂️ with that formatting in the whats-new.rst. (a reminder to squash) |
Thanks @jmccreight |
* master: (29 commits) Handle the character array dim name (pydata#2896) Partial fix for pydata#2841 to improve formatting. (pydata#2906) docs: Move quick overview one level up (pydata#2890) Manually specify chunks in open_zarr (pydata#2530) Minor improvement of docstring for Dataset (pydata#2904) Fix minor typos in docstrings (pydata#2903) Added docs example for `xarray.Dataset.get()` (pydata#2894) Bugfix for docs build instructions (pydata#2897) Return correct count for scalar datetime64 arrays (pydata#2892) Indexing with an empty array (pydata#2883) BUG: Fix pydata#2864 by adding the missing vrt parameters (pydata#2865) Reduce length of cftime resample tests (pydata#2879) WIP: type annotations (pydata#2877) decreased pytest verbosity (pydata#2881) Fix mypy typing error in cftime_offsets.py (pydata#2878) update links to https (pydata#2872) revert to 0.12.2 dev 0.12.1 release Various fixes for explicit Dataset.indexes (pydata#2858) Fix minor typo in docstring (pydata#2860) ...
Perfect timing, I just needed that! Thanks @jmccreight et al |
See #2895