chore: Release v1.2.0 #91

Merged
merged 44 commits into from
Oct 24, 2023
44 commits
0033359
refactor: add class for MultiIndex convertion
nicrie Aug 14, 2023
79b43cf
refactor: remove dim checks
nicrie Aug 16, 2023
983e0ad
refactor: generalize hilbert transform over dims
nicrie Aug 16, 2023
39ee3fb
refactor: add MultIndexConverter to preprocessor
nicrie Aug 16, 2023
4d521f8
feat: choose name of sample and feature dim
nicrie Aug 16, 2023
a737127
test: provide method to create synthetic data
nicrie Sep 16, 2023
b51f36a
style: add and streamline type hints
nicrie Sep 24, 2023
15f0194
test: add more flexible data generation classes
nicrie Sep 24, 2023
52935a0
refactor: add MultiIndexConvert & tests
nicrie Sep 24, 2023
508d742
refactor: add Sanitizer & tests
nicrie Sep 24, 2023
9f39f8f
refactor: Stacker focuses on stacking
nicrie Sep 24, 2023
7b5068b
refactor: Sanitizer removes NaNs
nicrie Sep 24, 2023
a62fc5a
refactor: streamline Scaler
nicrie Sep 24, 2023
4a43ce9
refactor: adapt Preprocessor to refactoring
nicrie Sep 24, 2023
0ce02f0
refactor: reflect refactoring in Factory
nicrie Sep 24, 2023
fe9bd46
refactor: generalize preprocessing in MCA
nicrie Sep 24, 2023
2bbcb69
fix: PCA preprocessing before Hilbert transform
nicrie Sep 24, 2023
79e82cc
feat: add CCA support
nicrie Sep 26, 2023
a7df1b8
build: Merge branch 'main' into cca
nicrie Sep 26, 2023
12cfc97
refactor(BaseModel): create algorithm methods
nicrie Oct 9, 2023
324c06b
style: update typings
nicrie Oct 9, 2023
166e957
refactor: create hilbert_transform.py
nicrie Oct 9, 2023
09415a3
refactor: simplify model data structure
nicrie Oct 12, 2023
ebadaa2
refactor(Preprocessor): enforce input as list
nicrie Oct 16, 2023
183779a
refactor(BaseModel): move input check to utils
nicrie Oct 16, 2023
2307169
refactor(GenericListTransformer): remove inherit
nicrie Oct 16, 2023
6f8a9cf
style(decomposer): use structural pattern matching
nicrie Oct 16, 2023
731eec2
fix: score coordinates in T-mode EOF analysis
nicrie Oct 16, 2023
4080143
test(EOF): fix random seed
nicrie Oct 16, 2023
d4b5a77
test: isolated NaNs are invalid test cases
nicrie Oct 16, 2023
ffd11fc
fix(Bootstrapper): avoid pertubating sample coords
nicrie Oct 16, 2023
58c2dbe
feat: add GWPCA support
nicrie Oct 12, 2023
f0dcb34
feat: parameter normalized scores
nicrie Oct 22, 2023
2db736c
feat: add Extended EOF Analysis
nicrie Aug 14, 2023
807b7e8
perf(dask): compute SVD result immediately
nicrie Oct 23, 2023
43e9274
fix(GWPCA): raise error in scores
nicrie Oct 23, 2023
98c5cef
fix: streamline model attribute types
nicrie Oct 23, 2023
cf8b641
feat: provide standard kwarg for random_state
nicrie Oct 23, 2023
6be8923
docs: provide top-level type hints
nicrie Oct 23, 2023
76c0b1b
fix(CCA): add checks for edge cases
nicrie Oct 23, 2023
10faa03
build: py3.10 requires typing_extension for Self
nicrie Oct 23, 2023
47e6909
build: add cca-zoo as dev dependency
nicrie Oct 23, 2023
6ed06f0
test: remove cca-zoo test
nicrie Oct 23, 2023
fb85abe
build: add typing-extensions
nicrie Oct 23, 2023
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,5 +1,7 @@
# Personal
.vscode/
# Test related to CCA
tests/models/test_cca_solution.py

# Byte-compiled / optimized / DLL files
__pycache__/
1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.ComplexEOF.rst
@@ -24,6 +24,7 @@
~ComplexEOF.explained_variance
~ComplexEOF.explained_variance_ratio
~ComplexEOF.fit
~ComplexEOF.fit_transform
~ComplexEOF.get_params
~ComplexEOF.inverse_transform
~ComplexEOF.scores
1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.ComplexEOFRotator.rst
@@ -24,6 +24,7 @@
~ComplexEOFRotator.explained_variance
~ComplexEOFRotator.explained_variance_ratio
~ComplexEOFRotator.fit
~ComplexEOFRotator.fit_transform
~ComplexEOFRotator.get_params
~ComplexEOFRotator.inverse_transform
~ComplexEOFRotator.scores
1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.ComplexMCA.rst
@@ -33,6 +33,7 @@
~ComplexMCA.singular_values
~ComplexMCA.squared_covariance
~ComplexMCA.squared_covariance_fraction
~ComplexMCA.total_covariance
~ComplexMCA.transform


1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.ComplexMCARotator.rst
@@ -33,6 +33,7 @@
~ComplexMCARotator.singular_values
~ComplexMCARotator.squared_covariance
~ComplexMCARotator.squared_covariance_fraction
~ComplexMCARotator.total_covariance
~ComplexMCARotator.transform


1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.EOF.rst
@@ -22,6 +22,7 @@
~EOF.explained_variance
~EOF.explained_variance_ratio
~EOF.fit
~EOF.fit_transform
~EOF.get_params
~EOF.inverse_transform
~EOF.scores
1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.EOFRotator.rst
@@ -22,6 +22,7 @@
~EOFRotator.explained_variance
~EOFRotator.explained_variance_ratio
~EOFRotator.fit
~EOFRotator.fit_transform
~EOFRotator.get_params
~EOFRotator.inverse_transform
~EOFRotator.scores
1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.MCA.rst
@@ -29,6 +29,7 @@
~MCA.singular_values
~MCA.squared_covariance
~MCA.squared_covariance_fraction
~MCA.total_covariance
~MCA.transform


1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.MCARotator.rst
@@ -29,6 +29,7 @@
~MCARotator.singular_values
~MCARotator.squared_covariance
~MCARotator.squared_covariance_fraction
~MCARotator.total_covariance
~MCARotator.transform


1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.models.OPA.rst
@@ -22,6 +22,7 @@
~OPA.decorrelation_time
~OPA.filter_patterns
~OPA.fit
~OPA.fit_transform
~OPA.get_params
~OPA.inverse_transform
~OPA.scores
1 change: 1 addition & 0 deletions docs/_autosummary/xeofs.validation.EOFBootstrapper.rst
@@ -22,6 +22,7 @@
~EOFBootstrapper.explained_variance
~EOFBootstrapper.explained_variance_ratio
~EOFBootstrapper.fit
~EOFBootstrapper.fit_transform
~EOFBootstrapper.get_params
~EOFBootstrapper.inverse_transform
~EOFBootstrapper.scores
36 changes: 36 additions & 0 deletions docs/auto_examples/1eof/index.rst
@@ -12,6 +12,23 @@
<div class="sphx-glr-thumbnails">


.. raw:: html

<div class="sphx-glr-thumbcontainer" tooltip="This example demonstrates Extended EOF (EEOF) analysis on xarray tutorial data. EEOF analysis,...">

.. only:: html

.. image:: /auto_examples/1eof/images/thumb/sphx_glr_plot_eeof_thumb.png
:alt:

:ref:`sphx_glr_auto_examples_1eof_plot_eeof.py`

.. raw:: html

<div class="sphx-glr-thumbnail-title">Extended EOF analysis</div>
</div>


.. raw:: html

<div class="sphx-glr-thumbcontainer" tooltip="EOF analysis in T-mode maximises the spatial variance.">
@@ -114,6 +131,23 @@
</div>


.. raw:: html

<div class="sphx-glr-thumbcontainer" tooltip="In this demonstration, we&#x27;ll apply GWPCA to a dataset detailing the chemical compositions of s...">

.. only:: html

.. image:: /auto_examples/1eof/images/thumb/sphx_glr_plot_gwpca_thumb.png
:alt:

:ref:`sphx_glr_auto_examples_1eof_plot_gwpca.py`

.. raw:: html

<div class="sphx-glr-thumbnail-title">Geographically weighted PCA</div>
</div>


.. raw:: html

</div>
@@ -122,10 +156,12 @@
.. toctree::
:hidden:

/auto_examples/1eof/plot_eeof
/auto_examples/1eof/plot_eof-tmode
/auto_examples/1eof/plot_eof-smode
/auto_examples/1eof/plot_multivariate-eof
/auto_examples/1eof/plot_mreof
/auto_examples/1eof/plot_rotated_eof
/auto_examples/1eof/plot_weighted-eof
/auto_examples/1eof/plot_gwpca

151 changes: 151 additions & 0 deletions docs/auto_examples/1eof/plot_eeof.ipynb
@@ -0,0 +1,151 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Extended EOF analysis\n\nThis example demonstrates Extended EOF (EEOF) analysis on ``xarray`` tutorial \ndata. EEOF analysis, also known as Multivariate/Multichannel Singular \nSpectrum Analysis, advances traditional EOF analysis to capture propagating \nsignals or oscillations in multivariate datasets. At its core, this \ninvolves the formulation of a lagged covariance matrix that encapsulates \nboth spatial and temporal correlations. Subsequently, this matrix is \ndecomposed to yield its eigenvectors (components) and eigenvalues (explained variance).\n\nLet's begin by setting up the required packages and fetching the data:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import xarray as xr\nimport xeofs as xe\nimport matplotlib.pyplot as plt\n\nxr.set_options(display_expand_data=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load the tutorial data.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"t2m = xr.tutorial.load_dataset(\"air_temperature\").air"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Prior to conducting the EEOF analysis, it's essential to determine the\nstructure of the lagged covariance matrix. This entails defining the time\ndelay ``tau`` and the ``embedding`` dimension. The former signifies the\ninterval between the original and lagged time series, while the latter\ndictates the number of time-lagged copies in the delay-coordinate space,\nrepresenting the system's dynamics.\nFor illustration, using ``tau=4`` and ``embedding=40``, we generate 40\ndelayed versions of the time series, each offset by 4 time steps, resulting\nin a maximum shift of ``tau x embedding = 160``. Given our dataset's\n6-hour intervals, tau = 4 translates to a 24-hour shift.\nIt's obvious that this way of constructing the lagged covariance matrix\nand subsequently decomposing it can be computationally expensive. For example,\ngiven our dataset's dimensions,\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"t2m.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"the extended dataset would have 40 x 25 x 53 = 53000 features\nwhich is much larger than the original dataset's 1325 features.\nTo mitigate this, we can first preprocess the data using PCA / EOF analysis\nand then perform EEOF analysis on the resulting PCA / EOF scores. Here,\nwe'll use ``n_pca_modes=50`` to retain the first 50 PCA modes, so we end\nup with 40 x 50 = 200 (latent) features.\nWith these parameters set, we proceed to instantiate the ``ExtendedEOF``\nmodel and fit our data.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"model = xe.models.ExtendedEOF(\n n_modes=10, tau=4, embedding=40, n_pca_modes=50, use_coslat=True\n)\nmodel.fit(t2m, dim=\"time\")\nscores = model.scores()\ncomponents = model.components()\ncomponents"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A notable distinction from standard EOF analysis is the incorporation of an\nextra ``embedding`` dimension in the components. Nonetheless, the\noverarching methodology mirrors traditional EOF practices. The results,\nfor instance, can be assessed by examining the explained variance ratio.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"model.explained_variance_ratio().plot()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Additionally, we can look into the scores; let's spotlight mode 4.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"scores.sel(mode=4).plot()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In wrapping up, we visualize the corresponding EEOF component of mode 4.\nFor visualization purposes, we'll focus on the component at a specific\nlatitude, in this instance, 60 degrees north.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"components.sel(mode=4, lat=60).plot()\nplt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
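The delay embedding described in this example can be sketched in plain NumPy. This is a minimal illustration and not the xeofs implementation: `delay_embed` is a hypothetical helper, the input is synthetic, and the sketch truncates the series so that every lag is available, whereas the library may treat the edges differently (for example by padding).

```python
import numpy as np


def delay_embed(x: np.ndarray, tau: int, embedding: int) -> np.ndarray:
    """Stack `embedding` lagged copies of x, each shifted by `tau` steps.

    x has shape (n_time, n_features). The result has shape
    (n_valid, embedding * n_features), where
    n_valid = n_time - (embedding - 1) * tau is the number of time steps
    for which every lagged copy is available.
    """
    n_time, n_features = x.shape
    n_valid = n_time - (embedding - 1) * tau
    if n_valid <= 0:
        raise ValueError("time series too short for this tau/embedding")
    # Copy k starts k * tau steps later than the unshifted series.
    lagged = [x[k * tau : k * tau + n_valid] for k in range(embedding)]
    return np.concatenate(lagged, axis=1)


# Synthetic stand-in data: 500 time steps, 3 features (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((500, 3))
X_ext = delay_embed(x, tau=4, embedding=40)
print(X_ext.shape)  # (344, 120)
```

Each block of three columns in `X_ext` is the original series shifted by a further 4 steps, so the feature count grows by a factor of `embedding` while the usable sample count shrinks by `(embedding - 1) * tau`.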
83 changes: 83 additions & 0 deletions docs/auto_examples/1eof/plot_eeof.py
@@ -0,0 +1,83 @@
"""
Extended EOF analysis
=====================

This example demonstrates Extended EOF (EEOF) analysis on ``xarray`` tutorial
data. EEOF analysis, also known as Multivariate/Multichannel Singular
Spectrum Analysis, advances traditional EOF analysis to capture propagating
signals or oscillations in multivariate datasets. At its core, this
involves the formulation of a lagged covariance matrix that encapsulates
both spatial and temporal correlations. Subsequently, this matrix is
decomposed to yield its eigenvectors (components) and eigenvalues (explained variance).

Let's begin by setting up the required packages and fetching the data:
"""

import xarray as xr
import xeofs as xe
import matplotlib.pyplot as plt

xr.set_options(display_expand_data=False)

# %%
# Load the tutorial data.
t2m = xr.tutorial.load_dataset("air_temperature").air


# %%
# Prior to conducting the EEOF analysis, it's essential to determine the
# structure of the lagged covariance matrix. This entails defining the time
# delay ``tau`` and the ``embedding`` dimension. The former signifies the
# interval between the original and lagged time series, while the latter
# dictates the number of time-lagged copies in the delay-coordinate space,
# representing the system's dynamics.
# For illustration, using ``tau=4`` and ``embedding=40``, we generate 40
# delayed versions of the time series, each offset by 4 time steps, resulting
# in a maximum shift of ``tau x embedding = 160``. Given our dataset's
# 6-hour intervals, tau = 4 translates to a 24-hour shift.
# It's obvious that this way of constructing the lagged covariance matrix
# and subsequently decomposing it can be computationally expensive. For example,
# given our dataset's dimensions,

t2m.shape

# %%
# the extended dataset would have 40 x 25 x 53 = 53000 features
# which is much larger than the original dataset's 1325 features.
# To mitigate this, we can first preprocess the data using PCA / EOF analysis
# and then perform EEOF analysis on the resulting PCA / EOF scores. Here,
# we'll use ``n_pca_modes=50`` to retain the first 50 PCA modes, so we end
# up with 40 x 50 = 2000 (latent) features.
# With these parameters set, we proceed to instantiate the ``ExtendedEOF``
# model and fit our data.

model = xe.models.ExtendedEOF(
    n_modes=10, tau=4, embedding=40, n_pca_modes=50, use_coslat=True
)
model.fit(t2m, dim="time")
scores = model.scores()
components = model.components()
components

# %%
# A notable distinction from standard EOF analysis is the incorporation of an
# extra ``embedding`` dimension in the components. Nonetheless, the
# overarching methodology mirrors traditional EOF practices. The results,
# for instance, can be assessed by examining the explained variance ratio.

model.explained_variance_ratio().plot()
plt.show()

# %%
# Additionally, we can look into the scores; let's spotlight mode 4.

scores.sel(mode=4).plot()
plt.show()

# %%
# In wrapping up, we visualize the corresponding EEOF component of mode 4.
# For visualization purposes, we'll focus on the component at a specific
# latitude, in this instance, 60 degrees north.

components.sel(mode=4, lat=60).plot()
plt.show()
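The feature-count arithmetic in this example, and the effect of compressing with PCA before embedding, can be checked with a short NumPy sketch. The data are synthetic random numbers standing in for the temperature anomalies, the sample count of 300 is arbitrary, and the SVD-based compression is an assumption about what the PCA preprocessing amounts to, not the code path xeofs takes.

```python
import numpy as np

# Grid and model sizes taken from the example text.
n_lat, n_lon = 25, 53
embedding, n_pca_modes = 40, 50

n_features = n_lat * n_lon            # 1325 grid points
n_extended = embedding * n_features   # 53000 features after embedding raw data
n_latent = embedding * n_pca_modes    # 2000 features after embedding PCA scores

# PCA compression via SVD on synthetic, centred data (assumed stand-in).
rng = np.random.default_rng(42)
X = rng.standard_normal((300, n_features))
X -= X.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pcs = U[:, :n_pca_modes] * s[:n_pca_modes]  # scores of the leading modes

print(n_features, n_extended, n_latent)  # 1325 53000 2000
print(pcs.shape)  # (300, 50)
```

Embedding the 50 score columns instead of the 1325 grid points is what keeps the lagged problem at 2000 columns rather than 53000.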
1 change: 1 addition & 0 deletions docs/auto_examples/1eof/plot_eeof.py.md5
@@ -0,0 +1 @@
7f3b66c7aec555c78dde9031213be3ad