adding stack_all #3115

Closed
wants to merge 8 commits
19 changes: 12 additions & 7 deletions doc/conf.py
@@ -69,18 +69,23 @@
'numpydoc',
'IPython.sphinxext.ipython_directive',
'IPython.sphinxext.ipython_console_highlighting',
'sphinx_gallery.gen_gallery',
'nbsphinx'
]

extlinks = {'issue': ('https://github.com/pydata/xarray/issues/%s', 'GH'),
'pull': ('https://github.com/pydata/xarray/pull/%s', 'PR'),
}

sphinx_gallery_conf = {'examples_dirs': 'gallery',
'gallery_dirs': 'auto_gallery',
'backreferences_dir': False,
'expected_failing_examples': list(allowed_failures)
}
nbsphinx_timeout = 600
nbsphinx_execute = "always"
nbsphinx_prolog = """
{% set docname = env.doc2path(env.docname, base=None) %}

You can run this notebook in a `live session <https://mybinder.org/v2/gh/pydata/xarray/doc/examples/master?urlpath=lab/tree/doc/{{ docname }}>`_ |Binder| or view it `on Github <https://github.com/pydata/xarray/blob/master/doc/{{ docname }}>`_.

.. |Binder| image:: https://mybinder.org/badge.svg
:target: https://mybinder.org/v2/gh/pydata/xarray/master?urlpath=lab/tree/doc/{{ docname }}
"""

autosummary_generate = True

@@ -124,7 +129,7 @@

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build']
exclude_patterns = ['_build', '**.ipynb_checkpoints']

# The reST default role (used for this markup: `text`) to use for all
# documents.
3 changes: 3 additions & 0 deletions doc/environment.yml
@@ -24,4 +24,7 @@ dependencies:
- pillow=5.4.1
- sphinx_rtd_theme=0.4.2
- mock=2.0.0
- nbsphinx=0.4.2
- jupyter_client=5.3.1
- ipykernel=5.1.1
- pip
2 changes: 1 addition & 1 deletion doc/examples.rst
@@ -7,4 +7,4 @@ Examples
examples/weather-data
examples/monthly-means
examples/multidimensional-coords
auto_gallery/index
examples/visualization_gallery
325 changes: 325 additions & 0 deletions doc/examples/monthly-means.ipynb
@@ -0,0 +1,325 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calculating Seasonal Averages from Timeseries of Monthly Means \n",
"=====\n",
"\n",
"Author: [Joe Hamman](https://github.com/jhamman/)\n",
"\n",
"The data used for this example can be found in the [xarray-data](https://github.com/pydata/xarray-data) repository. You may need to change the path to `rasm.nc` below.\n",
"\n",
"Suppose we have a netCDF or `xarray.Dataset` of monthly mean data and we want to calculate the seasonal average. To do this properly, we need to calculate the weighted average considering that each month has a different number of days."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:35.958210Z",
"start_time": "2018-11-28T20:51:35.936966Z"
}
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import pandas as pd\n",
"import xarray as xr\n",
"from netCDF4 import num2date\n",
"import matplotlib.pyplot as plt "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Some calendar information so we can support any netCDF calendar. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:35.991620Z",
"start_time": "2018-11-28T20:51:35.960336Z"
}
},
"outputs": [],
"source": [
"dpm = {'noleap': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" '365_day': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" 'standard': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" 'gregorian': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" 'proleptic_gregorian': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" 'all_leap': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" '366_day': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n",
" '360_day': [0, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30]} "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A few calendar functions to determine the number of days in each month\n",
"If you were just using the standard calendar, it would be easy to use the `calendar.monthrange` function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:36.015151Z",
"start_time": "2018-11-28T20:51:35.994079Z"
}
},
"outputs": [],
"source": [
"def leap_year(year, calendar='standard'):\n",
" \"\"\"Determine if year is a leap year\"\"\"\n",
" leap = False\n",
" if ((calendar in ['standard', 'gregorian',\n",
" 'proleptic_gregorian', 'julian']) and\n",
" (year % 4 == 0)):\n",
" leap = True\n",
" if ((calendar == 'proleptic_gregorian') and\n",
" (year % 100 == 0) and\n",
" (year % 400 != 0)):\n",
" leap = False\n",
" elif ((calendar in ['standard', 'gregorian']) and\n",
" (year % 100 == 0) and (year % 400 != 0) and\n",
" (year < 1583)):\n",
" leap = False\n",
" return leap\n",
"\n",
"def get_dpm(time, calendar='standard'):\n",
" \"\"\"\n",
"    Return an array of days per month corresponding to the months in `time`\n",
" \"\"\"\n",
"    month_length = np.zeros(len(time), dtype=int)\n",
" \n",
" cal_days = dpm[calendar]\n",
" \n",
" for i, (month, year) in enumerate(zip(time.month, time.year)):\n",
" month_length[i] = cal_days[month]\n",
" if leap_year(year, calendar=calendar):\n",
" month_length[i] += 1\n",
" return month_length"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Open the `Dataset`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:36.072316Z",
"start_time": "2018-11-28T20:51:36.016594Z"
}
},
"outputs": [],
"source": [
"ds = xr.tutorial.open_dataset('rasm').load()\n",
"print(ds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Now for the heavy lifting:\n",
"We first have to come up with the weights:\n",
"- calculate the month lengths for each monthly data record\n",
"- calculate weights using `groupby('time.season')`\n",
"\n",
"Finally, we just need to multiply our weights by the `Dataset` and sum along the time dimension. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:36.132413Z",
"start_time": "2018-11-28T20:51:36.073708Z"
}
},
"outputs": [],
"source": [
"# Make a DataArray with the number of days in each month, size = len(time)\n",
"month_length = xr.DataArray(get_dpm(ds.time.to_index(), calendar='noleap'),\n",
" coords=[ds.time], name='month_length')\n",
"\n",
"# Calculate the weights by grouping by 'time.season'.\n",
"# Conversion to float type ('astype(float)') only necessary for Python 2.x\n",
"weights = month_length.groupby('time.season') / month_length.astype(float).groupby('time.season').sum()\n",
"\n",
"# Test that the sum of the weights for each season is 1.0\n",
"np.testing.assert_allclose(weights.groupby('time.season').sum().values, np.ones(4))\n",
"\n",
"# Calculate the weighted average\n",
"ds_weighted = (ds * weights).groupby('time.season').sum(dim='time')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:36.152913Z",
"start_time": "2018-11-28T20:51:36.133997Z"
}
},
"outputs": [],
"source": [
"print(ds_weighted)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:36.190765Z",
"start_time": "2018-11-28T20:51:36.154416Z"
}
},
"outputs": [],
"source": [
"# only used for comparisons\n",
"ds_unweighted = ds.groupby('time.season').mean('time')\n",
"ds_diff = ds_weighted - ds_unweighted"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:40.264871Z",
"start_time": "2018-11-28T20:51:36.192467Z"
}
},
"outputs": [],
"source": [
"# Quick plot to show the results\n",
"notnull = pd.notnull(ds_unweighted['Tair'][0])\n",
"\n",
"fig, axes = plt.subplots(nrows=4, ncols=3, figsize=(14,12))\n",
"for i, season in enumerate(('DJF', 'MAM', 'JJA', 'SON')):\n",
" ds_weighted['Tair'].sel(season=season).where(notnull).plot.pcolormesh(\n",
" ax=axes[i, 0], vmin=-30, vmax=30, cmap='Spectral_r', \n",
" add_colorbar=True, extend='both')\n",
" \n",
" ds_unweighted['Tair'].sel(season=season).where(notnull).plot.pcolormesh(\n",
" ax=axes[i, 1], vmin=-30, vmax=30, cmap='Spectral_r', \n",
" add_colorbar=True, extend='both')\n",
"\n",
" ds_diff['Tair'].sel(season=season).where(notnull).plot.pcolormesh(\n",
" ax=axes[i, 2], vmin=-0.1, vmax=.1, cmap='RdBu_r',\n",
" add_colorbar=True, extend='both')\n",
"\n",
" axes[i, 0].set_ylabel(season)\n",
" axes[i, 1].set_ylabel('')\n",
" axes[i, 2].set_ylabel('')\n",
"\n",
"for ax in axes.flat:\n",
" ax.axes.get_xaxis().set_ticklabels([])\n",
" ax.axes.get_yaxis().set_ticklabels([])\n",
" ax.axes.axis('tight')\n",
" ax.set_xlabel('')\n",
" \n",
"axes[0, 0].set_title('Weighted by DPM')\n",
"axes[0, 1].set_title('Equal Weighting')\n",
"axes[0, 2].set_title('Difference')\n",
" \n",
"plt.tight_layout()\n",
"\n",
"fig.suptitle('Seasonal Surface Air Temperature', fontsize=16, y=1.02)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2018-11-28T20:51:40.284898Z",
"start_time": "2018-11-28T20:51:40.266406Z"
}
},
"outputs": [],
"source": [
"# Wrap it into a simple function\n",
"def season_mean(ds, calendar='standard'):\n",
" # Make a DataArray of season/year groups\n",
" year_season = xr.DataArray(ds.time.to_index().to_period(freq='Q-NOV').to_timestamp(how='E'),\n",
" coords=[ds.time], name='year_season')\n",
"\n",
" # Make a DataArray with the number of days in each month, size = len(time)\n",
" month_length = xr.DataArray(get_dpm(ds.time.to_index(), calendar=calendar),\n",
" coords=[ds.time], name='month_length')\n",
" # Calculate the weights by grouping by 'time.season'\n",
" weights = month_length.groupby('time.season') / month_length.groupby('time.season').sum()\n",
"\n",
" # Test that the sum of the weights for each season is 1.0\n",
" np.testing.assert_allclose(weights.groupby('time.season').sum().values, np.ones(4))\n",
"\n",
" # Calculate the weighted average\n",
" return (ds * weights).groupby('time.season').sum(dim='time')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": true,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 2
}
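The core trick in the notebook above — normalizing month lengths within each season so the weights sum to one — can be sketched without xarray or the `rasm` dataset. This standalone version, using only numpy and pandas over a hypothetical two-year monthly index, is an illustration of the technique and is not part of the PR's diff.

```python
import numpy as np
import pandas as pd

# Days per month for a 'noleap' calendar; index 0 is unused so dpm[month] works.
dpm_noleap = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# Two years of monthly timestamps (hypothetical stand-in for the dataset's time axis).
time = pd.date_range("2000-01-01", periods=24, freq="MS")
month_length = np.array([dpm_noleap[m] for m in time.month], dtype=float)

# Map each month to its meteorological season, mirroring 'time.season'.
season = pd.Series(time.month).map(
    {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
     6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"})

# Normalize month lengths within each season so each season's weights sum to 1.
weights = pd.Series(month_length).groupby(season).transform(lambda x: x / x.sum())

# A weighted sum over time is then a proper per-season average.
for s in ("DJF", "MAM", "JJA", "SON"):
    assert abs(weights[season == s].sum() - 1.0) < 1e-12
```

This is exactly what the notebook's `month_length.groupby('time.season') / ...sum()` expression does, just spelled out with a pandas `groupby(...).transform`.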