Asking for Clarification on Flexible Indexes Roadmap (GH Project 1) : Will this project enable hierarchical indexing along single axis as well as non-MECE indexes?
#6737
I am not sure what the Flexible Indexes Roadmap will allow in terms of flexible labeling within Xarray in the future. Will this project enable hierarchical indexing along a single axis, as well as potentially non-MECE (not mutually exclusive, collectively exhaustive) indexes along a single axis? Since I am unsure whether I am using the right terminology, I have presented an abstract example and a coded example in pandas below:
For example, imagine a 1-D array where each value belongs to a category (say, a part of the economy: agriculture, manufacturing, ..., government), but you also want to index the columns by type of account (exogenous, endogenous), or you even have a category that extends across two indexes. An example is shown below:
I'm not sure I'm doing a great job of describing this, but essentially I am looking for a way of having more complex indexes that can overlap one another and be non-exhaustive.
The reason this might be useful is that it allows easy labeling of values or vectors within a matrix, and easy access to vectors or sub-matrices within it. A sub-matrix or vector often has significance of its own, and labeling it improves code readability and explainability.
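To make the idea concrete, here is a minimal sketch in plain pandas of labels that overlap and do not cover every entry. It is only an illustration under stated assumptions, not something the roadmap is known to provide: the `groups` dictionary, the `PRODUCTION` group, and the `sub_matrix` helper are all hypothetical names invented for this example.

```python
import numpy as np
import pandas as pd

# Sector labels, matching the ones used in the pandas example in this post.
sectors = ["A", "C", "F", "H", "G", "I", "R"]

# Hypothetical group definitions (assumption, for illustration only).
# "PRODUCTION" overlaps both ENDO and EXO and does not cover every sector,
# so the groups taken together are non-MECE.
groups = {
    "ENDO": ["A", "C", "F", "H"],
    "EXO": ["G", "I", "R"],
    "PRODUCTION": ["A", "C", "G"],
}

rng = np.random.default_rng(0)
matrix = pd.DataFrame(rng.random((7, 7)), index=sectors, columns=sectors)


def sub_matrix(frame, row_group, col_group):
    """Return the block of `frame` whose rows and columns belong to the named groups."""
    return frame.loc[groups[row_group], groups[col_group]]


# A named block: ENDO rows by EXO columns.
endo_exo = sub_matrix(matrix, "ENDO", "EXO")

# A block defined by a group that cuts across ENDO and EXO.
production = sub_matrix(matrix, "PRODUCTION", "PRODUCTION")
```

Keeping the groups outside the index is what lets them overlap freely; the cost is that the labels do not travel with the data, which is the gap this question is about.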
Apologies if all of this is already covered by the Flexible Indexes Roadmap. I read the historical discussions from 2017 but wasn't sure whether the proposal would allow for the behavior described above.
To provide a code example, I can create something close to the desired result (clunkily) in pandas:
import pandas as pd
import numpy as np
import xarray as xr

# Future column index
expenditures = pd.DataFrame(dict(
    expenditures=["A", "C", "F", "H", "G", "I", "R"],
    expenditures_type=["ENDO", "ENDO", "ENDO", "ENDO", "EXO", "EXO", "EXO"]))

# Future row index
incomes = pd.DataFrame(dict(
    incomes=["A", "C", "F", "H", "G", "I", "R"],
    incomes_type=["ENDO", "ENDO", "ENDO", "ENDO", "EXO", "EXO", "EXO"]))

# Create flat table that describes every variable in matrix
incomes["key"] = 1
expenditures["key"] = 1
long = pd.merge(expenditures, incomes, on=["key"])

# Generate random variables for example
long["values"] = np.random.rand(long.shape[0])

# Desired result is pivot table where columns and row are multi-indexed
desired_result = pd.pivot_table(long,
                                values=["values"],
                                index=["incomes_type", "incomes"],
                                columns=["expenditures_type", "expenditures"])

# Converting to xarray Dataset creates something, whereas the desired representation
# would most likely be some type of xarray DataArray, however not possible to convert
# dataframe to DataArray obviously.
test = xr.Dataset.from_dataframe(desired_result)  # yields something but not desired data object
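For comparison, one possible approximation with xarray as it exists today is to attach the account types as extra (non-index) coordinates on the same dimensions and select with boolean masks. This is a sketch under assumptions, not the behavior the Flexible Indexes project is confirmed to deliver. It builds the `DataArray` directly instead of converting `desired_result`, and the variable names `da` and `block` are chosen here for illustration.

```python
import numpy as np
import xarray as xr

labels = ["A", "C", "F", "H", "G", "I", "R"]
types = ["ENDO", "ENDO", "ENDO", "ENDO", "EXO", "EXO", "EXO"]

da = xr.DataArray(
    np.random.rand(7, 7),
    dims=("incomes", "expenditures"),
    coords={
        "incomes": labels,
        "expenditures": labels,
        # Extra coordinates that share a dimension with the labels above.
        # They describe each row/column but are not used as the index.
        "incomes_type": ("incomes", types),
        "expenditures_type": ("expenditures", types),
    },
)

# Select the ENDO-income by EXO-expenditure block with a boolean mask.
# The two 1-D conditions broadcast to a 2-D mask; drop=True trims the
# rows and columns where the mask is entirely False.
block = da.where(
    (da.incomes_type == "ENDO") & (da.expenditures_type == "EXO"),
    drop=True,
)

# Ordinary label-based selection still works on the indexed coordinates.
single = da.sel(incomes="A", expenditures="G")
```

This keeps the data as a single `DataArray` and lets several labelings coexist on one axis, but the selection goes through masks rather than `.sel()` on the type coordinates, so it is a workaround rather than true hierarchical or overlapping indexing.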
This may not be the intention of the Xarray project, but I personally was hoping to find some functionality like this, and was wondering if the Flexible Indexes GH Project 1 was working towards something like this. Appreciate any comments, and happy to help if there is something like this planned.
Closest I found in the docs is: https://docs.xarray.dev/en/latest/user-guide/indexing.html#multi-level-indexing