_indexes of DataArray are not deep copied #3899

Closed
toddrjen opened this issue Mar 27, 2020 · 4 comments

Comments

@toddrjen
Contributor

In DataArray.copy, the _indexes attribute is not deep copied. Since pull request #3840, this means that deleting a coordinate from a copy also deletes that coordinate's index from the original, even for deep copies.

MCVE Code Sample

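import numpy as np
import xarray as xr
import xarray.tests  # the tests subpackage is not loaded by "import xarray"

a0 = xr.DataArray(
    np.array([[1, 2, 3], [4, 5, 6]]),
    dims=["y", "x"],
    coords={"x": ["a", "b", "c"], "y": [-1, 1]},
)

a1 = a0.copy()  # deep=True is the default
del a1.coords["y"]

# Comparing a0 with itself still fails, because the internal invariant checks run on a0
xr.tests.assert_identical(a0, a0)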

The result is:

xarray/testing.py:272: in _assert_internal_invariants
    _assert_dataarray_invariants(xarray_obj)
xarray/testing.py:222: in _assert_dataarray_invariants
    _assert_indexes_invariants_checks(da._indexes, da._coords, da.dims)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

indexes = {'x': Index(['a', 'b', 'c'], dtype='object', name='x')}, possible_coord_variables = {'x': <xarray.IndexVariable 'x' (x: 3)>
array(['a', 'b', 'c'], dtype='<U1'), 'y': <xarray.IndexVariable 'y' (y: 2)>
array([-1,  1])}
dims = ('y', 'x')

    def _assert_indexes_invariants_checks(indexes, possible_coord_variables, dims):
        assert isinstance(indexes, dict), indexes
        assert all(isinstance(v, pd.Index) for v in indexes.values()), {
            k: type(v) for k, v in indexes.items()
        }
    
        index_vars = {
            k for k, v in possible_coord_variables.items() if isinstance(v, IndexVariable)
        }
        assert indexes.keys() <= index_vars, (set(indexes), index_vars)
    
        # Note: when we support non-default indexes, these checks should be opt-in
        # only!
        defaults = default_indexes(possible_coord_variables, dims)
>       assert indexes.keys() == defaults.keys(), (set(indexes), set(defaults))
E       AssertionError: ({'x'}, {'y', 'x'})

xarray/testing.py:185: AssertionError

Expected Output

The assertion should pass: deleting a coordinate from a1 should leave a0 and its indexes intact.

Problem Description

Doing a deep copy should make a copy of everything. Changing a deep copy should not alter the original in any way.
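
To make the failure mode concrete, here is a minimal pure-Python sketch of the aliasing pattern described above. `ToyArray`, `copy_sharing_indexes`, and `delete_coord` are hypothetical names invented for illustration; this is a toy model, not xarray's implementation. It assumes only that the copy gets its own coordinates but reuses the original's index mapping.

```python
import copy


class ToyArray:
    """Toy stand-in for a labelled array; NOT xarray's real implementation."""

    def __init__(self, coords, indexes):
        self.coords = coords    # name -> coordinate values
        self.indexes = indexes  # name -> index built from that coordinate

    def copy_sharing_indexes(self):
        # Deep copies the coordinates but reuses the same indexes dict.
        # This is the aliasing pattern the issue describes.
        return ToyArray(copy.deepcopy(self.coords), self.indexes)

    def delete_coord(self, name):
        # Removing a coordinate also removes its index.
        del self.coords[name]
        del self.indexes[name]


a0 = ToyArray(
    coords={"x": ["a", "b", "c"], "y": [-1, 1]},
    indexes={"x": ("a", "b", "c"), "y": (-1, 1)},
)
a1 = a0.copy_sharing_indexes()
a1.delete_coord("y")

# The original keeps its "y" coordinate but has lost its "y" index.
original_coord_names = sorted(a0.coords)
original_index_names = sorted(a0.indexes)
print(original_coord_names, original_index_names)
```

In this toy model the original ends up with coordinates for both dimensions but an index for only one of them, which is the same mismatch the AssertionError in the traceback reports.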

toddrjen added a commit to toddrjen/xarray that referenced this issue Mar 27, 2020
@max-sixty
Collaborator

Great spot, @toddrjen.

@dcherian, any ideas? I may have some spare time tomorrow to help fix this.

@dcherian
Contributor

From a quick look, maybe we could call copy.deepcopy(indexes) in Dataset.copy and pass the result to Dataset._replace?
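
The suggestion above could look roughly like the following sketch. `ToyDataset` is a hypothetical class that only mimics a copy/_replace split; it is not xarray's code and not necessarily what the eventual fix does. The shallow-copy branch is an assumption made for the sake of a complete example.

```python
import copy


class ToyDataset:
    """Hypothetical container; only mimics a copy/_replace split."""

    def __init__(self, variables, indexes):
        self.variables = variables  # name -> values
        self.indexes = indexes      # name -> index

    def _replace(self, variables, indexes):
        # Build a new object from the given parts without copying them again.
        return ToyDataset(variables, indexes)

    def copy(self, deep=True):
        if deep:
            variables = copy.deepcopy(self.variables)
            # Deep copy the index mapping so the copy never aliases the original.
            indexes = copy.deepcopy(self.indexes)
        else:
            # Assumed shallow behaviour: new dicts that share the stored values.
            variables = dict(self.variables)
            indexes = dict(self.indexes)
        return self._replace(variables, indexes)


ds = ToyDataset({"y": [-1, 1]}, {"y": (-1, 1)})
ds_copy = ds.copy(deep=True)
del ds_copy.indexes["y"]

# The original keeps its index because the mappings are independent.
original_still_indexed = "y" in ds.indexes
print(original_still_indexed)
```

Copying the mapping inside copy and handing the result to _replace keeps _replace itself cheap, since it only assembles the new object from the parts it is given.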

@toddrjen
Contributor Author

My pull request #3871 has a fix already.

toddrjen added a commit to toddrjen/xarray that referenced this issue Mar 29, 2020
max-sixty pushed a commit that referenced this issue Mar 29, 2020
* drop numpy 1.12 compat code that can hide other errors

* deep copy _indexes (#3899)

* implement idxmax and idxmin
@dcherian
Contributor

Closed by #3871
