Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an Iamslice #657

Merged
merged 14 commits into from
May 12, 2022
Merged

Add an Iamslice #657

merged 14 commits into from
May 12, 2022

Conversation

danielhuppmann
Copy link
Member

@danielhuppmann danielhuppmann commented May 6, 2022

Please confirm that this PR has done the following:

  • Tests Added
  • Documentation Added
  • Name of contributors Added to AUTHORS.rst
  • Description in RELEASE_NOTES.md Added

Description of PR

This PR implements an IamSlice based on the initial work by @coroa in #637. I rebased the branch and added basic documentation.

One question: in the initial work, __repr__ combined a summary similar to an IamDataFrame and a representation of the pd.Series. In the usual test case filtered for `scenario="scen_b", this yields.

<class 'pyam.slice.IamSlice'>
Index dimensions and data coordinates:
   model    : model_a (1)
   scenario : scen_b (1)
   region   : World (1)
   variable : Primary Energy (1)
   unit     : EJ/yr (1)
   year     : 2005, 2010 (2)

model    scenario  region  variable             unit   year
model_a  scen_a    World   Primary Energy       EJ/yr  2005    False
                                                       2010    False
                           Primary Energy|Coal  EJ/yr  2005    False
                                                       2010    False
         scen_b    World   Primary Energy       EJ/yr  2005     True
                                                       2010     True
dtype: bool

I find that a bit confusing, because the list of variables shows only "Primary Energy" but the series also shows "Primary Energy|Coal" (as False, but still). I therefore removed the second part, so that __repr__ only shows the summary overview.

If you think that having the slice as pd.Series, we could add a method as_series().

@danielhuppmann danielhuppmann requested a review from coroa May 6, 2022 05:54
@danielhuppmann danielhuppmann self-assigned this May 6, 2022
@codecov
Copy link

codecov bot commented May 6, 2022

Codecov Report

Merging #657 (9063787) into main (764a85e) will decrease coverage by 0.0%.
The diff coverage is 95.0%.

@@           Coverage Diff           @@
##            main    #657     +/-   ##
=======================================
- Coverage   94.5%   94.5%   -0.1%     
=======================================
  Files         57      59      +2     
  Lines       5662    5742     +80     
=======================================
+ Hits        5356    5431     +75     
- Misses       306     311      +5     
Impacted Files Coverage Δ
pyam/slice.py 91.4% <91.4%> (ø)
pyam/core.py 94.5% <91.6%> (-0.1%) ⬇️
pyam/__init__.py 68.4% <100.0%> (+0.8%) ⬆️
tests/conftest.py 100.0% <100.0%> (ø)
tests/test_filter.py 100.0% <100.0%> (ø)
tests/test_slice.py 100.0% <100.0%> (ø)
tests/test_time.py 100.0% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 764a85e...9063787. Read the comment docs.

@danielhuppmann danielhuppmann marked this pull request as ready for review May 6, 2022 07:12
@coroa
Copy link
Collaborator

coroa commented May 12, 2022

One question: in the initial work, repr combined a summary similar to an IamDataFrame and a representation of the pd.Series. In the usual test case filtered for `scenario="scen_b", this yields.

Well it is a pd.Series, ie. isinstance(..., pd.Series) will be True, so there is no point to a separate method (pd.Series(a_slice) will work fine).

My main argument for showing the Series representation is that it being a Series opens up several interesting computations, that people might want to know about, which don't make so much sense, if you don't know it is a boolean series.

The most interesting one is that you can do boolean operations, ie if you want from your example

s_scen_a = test_df.slice(scenario="scen_a")
s_coal = test_df.slice(variable="*|Coal")

s_scen_a & ~ s_coal

will give you

<class 'pyam.slice.IamSlice'>
Index dimensions and data coordinates:
   model    : model_a (1)
   scenario : scen_a (1)
   region   : World (1)
   variable : Primary Energy (1)
   unit     : EJ/yr (1)
   year     : 2005, 2010 (2)

model    scenario  region  variable             unit   year
model_a  scen_a    World   Primary Energy       EJ/yr  2005    True
                                                       2010    True
                           Primary Energy|Coal  EJ/yr  2005    False
                                                       2010    False
         scen_b    World   Primary Energy       EJ/yr  2005    False
                                                       2010    False
dtype: bool

@danielhuppmann
Copy link
Member Author

I agree with the implementation and behavior. I'm just worried that implementing __repr__ as combining the IamDataFrame-style overview plus the Series-print can be confusing.

Options:
0. Leave as is, ignore my concern because this will be an expert-only feature anyway

  1. Only show the Series-representation
  2. Only show the IamDataFrame-style-overview

@coroa
Copy link
Collaborator

coroa commented May 12, 2022

I guess the overview is what is going to be the most helpful. So, i'd vote rather for showing the overview, ie option 2 (although, i like seeing both). ie. let's keep it as it is here now. Then we can merge!

What this PR is missing and what is leftover as a new PR would be to refactor the second half of IamDataFrame.filter into a new method (could be called subset for instance) or generic helper and use that from __getitem__ and filter; so that idf[idf.slice(**kwargs)] is equivalent to idf.filter(**kwargs).

(Currently __getitem__ is dropping meta and does not do the year<->time swapping correctly)

@danielhuppmann danielhuppmann merged commit c5e8f9c into IAMconsortium:main May 12, 2022
@danielhuppmann danielhuppmann deleted the iamslice branch June 2, 2022 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants