Add sel_points for point-wise indexing by label #507

Merged 3 commits on Aug 5, 2015

@shoyer shoyer commented Aug 1, 2015

xref #475

Example usage:

In [1]: da = xray.DataArray(np.arange(56).reshape((7, 8)),
   ...:                     coords={'x': list('abcdefg'),
   ...:                             'y': 10 * np.arange(8)},
   ...:                     dims=['x', 'y'])
   ...:

In [2]: da
Out[2]:
<xray.DataArray (x: 7, y: 8)>
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31],
       [32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47],
       [48, 49, 50, 51, 52, 53, 54, 55]])
Coordinates:
* y        (y) int64 0 10 20 30 40 50 60 70
* x        (x) |S1 'a' 'b' 'c' 'd' 'e' 'f' 'g'

# we can index by position along each dimension
In [3]: da.isel_points(x=[0, 1, 6], y=[0, 1, 0], dim='points')
Out[3]:
<xray.DataArray (points: 3)>
array([ 0,  9, 48])
Coordinates:
    y        (points) int64 0 10 0
    x        (points) |S1 'a' 'b' 'g'
  * points   (points) int64 0 1 2

# or equivalently by label
In [4]: da.sel_points(x=['a', 'b', 'g'], y=[0, 10, 0], dim='points')
Out[4]:
<xray.DataArray (points: 3)>
array([ 0,  9, 48])
Coordinates:
    y        (points) int64 0 10 0
    x        (points) |S1 'a' 'b' 'g'
  * points   (points) int64 0 1 2
Bug fixes

cc @jhamman

method : {None, 'nearest', 'pad'/'ffill', 'backfill'/'bfill'}, optional
Method to use for inexact matches (requires pandas>=0.16):

* default: only exact matches
Member

I know these bullets are straight from pandas but "default" isn't a valid keyword here. I think this may be clearer as * None (default): only exact matches.

Member Author

good point, I'll update.
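For readers unfamiliar with the `method` options discussed in this review thread, here is a minimal sketch of how they behave in `pandas.Index.get_indexer`, which is the call mentioned later in the conversation. The `y` coordinate is taken from the PR description; the query labels `12` and `27` are made up for illustration and do not come from the PR.

```python
import numpy as np
import pandas as pd

# Same y coordinate as in the PR description: 0, 10, ..., 70.
y_index = pd.Index(10 * np.arange(8))

# Hypothetical query labels that fall between coordinate values.
targets = [12, 27]

# method=None (the default): exact matches only; a miss is reported as -1.
exact = y_index.get_indexer(targets)

# 'nearest': position of the closest index value.
nearest = y_index.get_indexer(targets, method='nearest')

# 'pad': position of the last index value <= the label.
pad = y_index.get_indexer(targets, method='pad')

# 'backfill': position of the first index value >= the label.
backfill = y_index.get_indexer(targets, method='backfill')

print(exact, nearest, pad, backfill)
```

With `method=None` both labels miss, which is why describing that case as "None (default): only exact matches" matches the behavior.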

jhamman commented Aug 1, 2015

This looks pretty good to me. Are you basically letting pandas do all the testing in terms of the method and range of the indexers? I'm wondering if we need to target any corner cases that extend beyond how pandas does the indexer mapping.

shoyer commented Aug 2, 2015

> Are you basically letting pandas do all the testing in terms of the method and range of the indexers? I'm wondering if we need to target any corner cases that extend beyond how pandas does the indexer mapping.

This has been my strategy. Pandas has lots of tests for the exact behavior of get_indexer.
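To make the division of labor concrete, here is a rough sketch of label-based point-wise lookup built on `get_indexer` plus NumPy integer-array indexing. It is not xray's actual implementation: the helper name `sel_points_sketch`, its signature, and the choice to raise `KeyError` on a missing label are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Data and coordinates from the PR description.
data = np.arange(56).reshape((7, 8))
x_index = pd.Index(list('abcdefg'))
y_index = pd.Index(10 * np.arange(8))


def sel_points_sketch(data, x_index, y_index, x_labels, y_labels, method=None):
    """Hypothetical helper: map labels to positions, then index point-wise."""
    # pandas does the label -> position mapping (exact match for x here).
    x_pos = x_index.get_indexer(x_labels)
    y_pos = y_index.get_indexer(y_labels, method=method)
    # get_indexer reports a missing label as -1; treat that as an error
    # (an assumption of this sketch, not documented xray behavior).
    if (x_pos < 0).any() or (y_pos < 0).any():
        raise KeyError('not all labels were found in the index')
    # Paired integer arrays select one element per (x, y) pair.
    return data[x_pos, y_pos]


exact = sel_points_sketch(data, x_index, y_index, ['a', 'b', 'g'], [0, 10, 0])
nearest = sel_points_sketch(data, x_index, y_index,
                            ['a', 'b', 'g'], [1, 12, 2], method='nearest')
print(exact, nearest)
```

Both calls select the values 0, 9 and 48, matching the `sel_points` output in the PR description. Everything label-related is handled by pandas, which is why relying on its `get_indexer` tests is a plausible strategy.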

shoyer commented Aug 5, 2015

@jhamman any other comments? If not, I'll merge this shortly.

jhamman commented Aug 5, 2015

No, I think this is good to go.

jhamman commented Aug 5, 2015

Are there going to be merge conflicts with #512? Maybe merge this then update that PR accordingly?

shoyer added a commit that referenced this pull request Aug 5, 2015
Add sel_points for point-wise indexing by label
@shoyer shoyer merged commit 68438a1 into pydata:master Aug 5, 2015
@shoyer shoyer deleted the sel_points branch August 5, 2015 03:51
@@ -442,6 +442,16 @@ def test_isel_points_method(self):
        actual = da.isel_points(y=[1, 2], x=[1, 2], dim=['A', 'B'])
        assert 'points' in actual.coords

    def test_isel_points(self):
Contributor

you have two test methods with the same name
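The reason this matters: in a Python class body, a second `def` with the same name rebinds the attribute, so the earlier test is silently dropped rather than reported as a conflict. A small self-contained sketch follows; the class name and test bodies are invented for illustration and are not the PR's tests.

```python
import unittest


class TestDuplicateNames(unittest.TestCase):
    def test_isel_points(self):
        # Never runs: this method is replaced by the definition below.
        self.fail('first definition executed')

    def test_isel_points(self):  # same name, so it overrides the one above
        self.assertTrue(True)


loader = unittest.TestLoader()
names = loader.getTestCaseNames(TestDuplicateNames)
suite = loader.loadTestsFromTestCase(TestDuplicateNames)

result = unittest.TestResult()
suite.run(result)

# Only one test is collected, and the failing first definition is never seen.
print(names, result.testsRun, result.wasSuccessful())
```

The run is reported as successful even though the first test would have failed, so renaming one of the two methods is the only way to get both exercised.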
