Add helpful pre-commit checks and CI Documentation build step #229

Merged
merged 11 commits into from
Aug 31, 2022
7 changes: 7 additions & 0 deletions .github/workflows/main.yml
@@ -35,6 +35,13 @@ jobs:
      run: |
        pip install -r requirements-dev.txt
        pip install sympy # needed for notebook 9, but not required for pysindy
    - name: Build the docs
      # Not exactly how RTD does it, but close.
      run: |
        sudo apt-get install pandoc
        cd docs
        python -m sphinx -T -E -W -b html -d _build/doctrees . _build/html
        cd ..
    - name: Test with pytest
      run: |
        pytest test --cov=pysindy --cov-report=xml
2 changes: 1 addition & 1 deletion .gitignore
@@ -37,4 +37,4 @@ docs/_build

*.sublime*

.hypothesis/
.hypothesis/
28 changes: 28 additions & 0 deletions .pre-commit-config.yaml
@@ -19,3 +19,31 @@ repos:
    hooks:
      - id: flake8
        args: ["--config=setup.cfg"]
  - repo: https://github.com/pre-commit/pygrep-hooks
    rev: v1.9.0
    hooks:
      - id: rst-backticks
      - id: rst-directive-colons
        types: [text] # overwrite types: [rst]
        types_or: [python, rst]
      - id: rst-inline-touching-normal
        types: [text] # overwrite types: [rst]
        types_or: [python, rst]
  - repo: https://github.com/sphinx-contrib/sphinx-lint
    rev: v0.6.1
    hooks:
      - id: sphinx-lint
  - repo: https://github.com/codespell-project/codespell
    rev: v2.1.0
    hooks:
      - id: codespell
        types_or: [python, rst, markdown]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: end-of-file-fixer
        exclude: (.txt|^docs/JOSS1|^docs/JOSS2|^examples/data/)
        stages: [commit, merge-commit, push, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge, post-rewrite]
      - id: trailing-whitespace
        stages: [commit, merge-commit, push, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge, post-rewrite]
        exclude: (.txt|^docs/JOSS1|^docs/JOSS2|^examples/data/)
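
A note for readers checking the `exclude` pattern above: pre-commit documents `files` and `exclude` as Python regular expressions matched with `re.search`, so the unescaped dot in `.txt` matches any character, not only a literal period. The sketch below is an illustration only. The `strict` variant and every path except `requirements-dev.txt` are hypothetical and are not part of this PR.

```python
import re

# Pattern copied from the `exclude:` keys in the hunk above.
exclude = re.compile(r"(.txt|^docs/JOSS1|^docs/JOSS2|^examples/data/)")
# Hypothetical stricter variant: escape the dot and anchor the extension.
strict = re.compile(r"(\.txt$|^docs/JOSS1|^docs/JOSS2|^examples/data/)")

paths = [
    "requirements-dev.txt",       # named in the workflow file
    "docs/JOSS1/paper.md",        # hypothetical path
    "examples/data/burgers.mat",  # hypothetical path
    "pysindy/mytxt_helpers.py",   # hypothetical path containing "txt"
    "README.rst",
]

# Map each path to (excluded by PR pattern, excluded by stricter variant).
results = {
    path: (bool(exclude.search(path)), bool(strict.search(path)))
    for path in paths
}
for path, flags in results.items():
    print(path, flags)
```

Under these assumptions, the only path on which the two patterns disagree is the hypothetical Python file whose name happens to contain `txt` preceded by another character. Whether that matters depends on the file names actually present in the repository.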
32 changes: 16 additions & 16 deletions README.rst
@@ -148,7 +148,7 @@ PySINDy implements a lot of advanced functionality that may be overwhelming for

.. image:: https://github.com/dynamicslab/pysindy/blob/master/docs/JOSS2/Fig3.png

This flow chart summarizes how `PySINDy` users can start with a dataset and systematically choose the proper candidate library and sparse regression optimizer that are tailored for a specific scientific task. The `GeneralizedLibrary` class allows for tensoring, concatenating, and otherwise combining many different candidate libraries.
This flow chart summarizes how ``PySINDy`` users can start with a dataset and systematically choose the proper candidate library and sparse regression optimizer that are tailored for a specific scientific task. The ``GeneralizedLibrary`` class allows for tensoring, concatenating, and otherwise combining many different candidate libraries.

Community guidelines
--------------------
@@ -209,7 +209,7 @@ There are a number of SINDy variants and advanced functionality that would be gr

4. Integration of PySINDy with a Python model-predictive control (MPC) code.

5. The PySINDy weak formulation is based on the work in Reinbold, Patrick AK, Daniel R. Gurevich, and Roman O. Grigoriev. "Using noisy or incomplete data to discover models of spatiotemporal dynamics." Physical Review E 101.1 (2020): 010203. It might be useful to additionally implement the weak formulation from Messenger, Daniel A., and David M. Bortz. "Weak SINDy for partial differential equations." Journal of Computational Physics (2021): 110525. The weak formulation in PySINDy is also fairly slow and computationally intensive, so finding ways to speed up the code would be great.
5. The PySINDy weak formulation is based on the work in Reinbold, Patrick AK, Daniel R. Gurevich, and Roman O. Grigoriev. "Using noisy or incomplete data to discover models of spatiotemporal dynamics." Physical Review E 101.1 (2020): 010203. It might be useful to additionally implement the weak formulation from Messenger, Daniel A., and David M. Bortz. "Weak SINDy for partial differential equations." Journal of Computational Physics (2021): 110525. The weak formulation in PySINDy is also fairly slow and computationally intensive, so finding ways to speed up the code would be great.

6. The blended conditional gradients (BCG) algorithm for solving the constrained LASSO problem, Carderera, Alejandro, et al. "CINDy: Conditional gradient-based Identification of Non-linear Dynamics--Noise-robust recovery." arXiv preprint arXiv:2101.02630 (2021).

@@ -252,18 +252,18 @@ Bibtex:

.. code-block:: text

@article{Kaptanoglu2022,
doi = {10.21105/joss.03994},
url = {https://doi.org/10.21105/joss.03994},
year = {2022},
publisher = {The Open Journal},
volume = {7},
number = {69},
pages = {3994},
author = {Alan A. Kaptanoglu and Brian M. de Silva and Urban Fasel and Kadierdan Kaheman and Andy J. Goldschmidt and Jared Callaham and Charles B. Delahunt and Zachary G. Nicolaou and Kathleen Champion and Jean-Christophe Loiseau and J. Nathan Kutz and Steven L. Brunton},
title = {PySINDy: A comprehensive Python package for robust sparse system identification},
journal = {Journal of Open Source Software}
}
@article{Kaptanoglu2022,
doi = {10.21105/joss.03994},
url = {https://doi.org/10.21105/joss.03994},
year = {2022},
publisher = {The Open Journal},
volume = {7},
number = {69},
pages = {3994},
author = {Alan A. Kaptanoglu and Brian M. de Silva and Urban Fasel and Kadierdan Kaheman and Andy J. Goldschmidt and Jared Callaham and Charles B. Delahunt and Zachary G. Nicolaou and Kathleen Champion and Jean-Christophe Loiseau and J. Nathan Kutz and Steven L. Brunton},
title = {PySINDy: A comprehensive Python package for robust sparse system identification},
journal = {Journal of Open Source Software}
}


References
@@ -275,7 +275,7 @@ References
`[arXiv] <https://arxiv.org/abs/2004.08424>`__

- Kaptanoglu, Alan A., Brian M. de Silva, Urban Fasel, Kadierdan Kaheman, Andy J. Goldschmidt
Jared L. Callaham, Charles B. Delahunt, Zachary G. Nicolaou, Kathleen Champion,
Jared L. Callaham, Charles B. Delahunt, Zachary G. Nicolaou, Kathleen Champion,
Jean-Christophe Loiseau, J. Nathan Kutz, and Steven L. Brunton.
*PySINDy: A comprehensive Python package for robust sparse system identification.*
arXiv preprint arXiv:2111.08481 (2021).
@@ -342,7 +342,7 @@ Thanks to the members of the community who have contributed to PySINDy!

.. |JOSS1| image:: https://joss.theoj.org/papers/82d080bbe10ac3ab4bc03fa75f07d644/status.svg
:target: https://joss.theoj.org/papers/82d080bbe10ac3ab4bc03fa75f07d644

.. |JOSS2| image:: https://joss.theoj.org/papers/10.21105/joss.03994/status.svg
:target: https://doi.org/10.21105/joss.03994

56 changes: 39 additions & 17 deletions docs/conf.py
@@ -1,5 +1,6 @@
import importlib
import pathlib
import shutil
from pathlib import Path

author = "dynamicslab"
project = "pysindy" # package name
@@ -15,16 +16,17 @@
master_doc = "index"

extensions = [
"nbsphinx",
"sphinxcontrib.apidoc",
"sphinx.ext.autodoc",
"sphinx.ext.todo",
"sphinx.ext.viewcode",
"sphinx.ext.autosummary",
"sphinx.ext.napoleon",
"sphinx.ext.mathjax",
"sphinx_nbexamples",
"sphinx.ext.intersphinx",
]
nb_execution_mode = "off"

apidoc_module_dir = f"../{project}"
apidoc_excluded_paths = ["tests"]
@@ -34,19 +36,14 @@
autodoc_member_order = "bysource"
autoclass_content = "init"

language = None
language = "en"

here = pathlib.Path(__file__).parent
here = Path(__file__).parent.resolve()

if (here / "static/custom.css").exists():

html_static_path = ["static"]

def setup(app):
app.add_css_file("custom.css")


exclude_patterns = ["build", "_build"]
exclude_patterns = ["build", "_build", "Youtube"]
# pygments_style = "sphinx"

add_module_names = True
@@ -61,13 +58,6 @@ def setup(app):
default_role = "any"
html_sourcelink_suffix = ""

example_gallery_config = dict(
dont_preprocess=True,
examples_dirs=["../examples"],
gallery_dirs=["examples"],
pattern=".+.ipynb",
)

intersphinx_mapping = {
"derivative": ("https://derivative.readthedocs.io/en/latest/", None)
}
@@ -110,3 +100,35 @@ def patched_parse(self):

GoogleDocstring._unpatched_parse = GoogleDocstring._parse
GoogleDocstring._parse = patched_parse


def setup(app):
    """Our sphinx extension for copying from examples/ to docs/examples

    Since nbsphinx does not handle glob/regex paths, we need to
    manually copy documentation source files from examples. See issue
    # 230.
    """
    doc_examples = here / "examples"
    if not doc_examples.exists():
        (here / "examples").mkdir()
    example_source = (here / "../examples").resolve()
    source_notebooks = example_source.glob("**/*.ipynb")
    shutil.copy(example_source / "README.rst", doc_examples / "index.rst")
    for notebook in source_notebooks:
        if notebook.parent == example_source:
            new_dir = doc_examples / notebook.stem
        else:
            new_dir = doc_examples / notebook.parent.stem
        new_dir.mkdir(exist_ok=True)
        new_file = new_dir / "example.ipynb"
        print(f"Creating file {new_file}")
        shutil.copy(notebook, new_file)
    # Notebook 15 uses an image file
    (doc_examples / "15_pysindy_lectures/data").mkdir(exist_ok=True)
    shutil.copy(
        example_source / "data/optimizer_summary.jpg",
        doc_examples / "15_pysindy_lectures/data/optimizer_summary.jpg",
    )
    if (here / "static/custom.css").exists():
        app.add_css_file("custom.css")
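
The notebook-to-folder mapping in `setup` above can be tried in isolation. The following is a minimal sketch that assumes only the standard library. The helper names `destination_for` and `copy_notebooks`, and the notebook file names, are hypothetical and chosen for illustration. It reproduces the rule that a top-level notebook gets a folder named after itself, while a nested notebook gets a folder named after its immediate parent directory.

```python
import shutil
import tempfile
from pathlib import Path


def destination_for(notebook, example_source, doc_examples):
    """Mirror the mapping in setup(): one docs folder per notebook."""
    if notebook.parent == example_source:
        # Top-level notebook: the folder is named after the notebook itself.
        new_dir = doc_examples / notebook.stem
    else:
        # Nested notebook: the folder is named after its immediate parent.
        new_dir = doc_examples / notebook.parent.stem
    return new_dir / "example.ipynb"


def copy_notebooks(example_source, doc_examples):
    """Copy every notebook under example_source into doc_examples."""
    doc_examples.mkdir(parents=True, exist_ok=True)
    copied = []
    for notebook in sorted(example_source.glob("**/*.ipynb")):
        target = destination_for(notebook, example_source, doc_examples)
        target.parent.mkdir(exist_ok=True)
        shutil.copy(notebook, target)
        copied.append(target)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "examples"
    docs = root / "docs" / "examples"
    # Build a tiny, hypothetical example tree.
    (source / "nested_demo").mkdir(parents=True)
    (source / "1_top_level.ipynb").write_text("{}")
    (source / "nested_demo" / "inner.ipynb").write_text("{}")
    copied = copy_notebooks(source, docs)
    relative = sorted(p.relative_to(docs).as_posix() for p in copied)
    print(relative)
```

One consequence of this mapping, as far as the sketch shows, is that two notebooks sharing the same parent directory would be written to the same `example.ipynb`, so the later copy would overwrite the earlier one.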
18 changes: 9 additions & 9 deletions docs/tips.rst
@@ -10,18 +10,18 @@ Numerical differentiation is one of the core components of the SINDy method. Der

.. math::

\dot{X} \approx \Theta(X)\Xi.
\dot{X} \approx \Theta(X)\Xi.

If care is not taken in computing these derivatives, the quality of the learned model is likely to suffer.

By default, a second order finite difference method is used to differentiate input data. Finite difference methods tend to amplify noise in data. If the data are smooth (at least twice differentiable), then finite difference methods give accurate derivative approximations. When the data are noisy, they give derivative estimates with *more* noise than the original data. The following figure visualizes the impact of noise on numerical derivatives. Note that even a small amount of noise in the data can produce noticeable degradation in the quality of the numerical derivative.

.. figure:: figures/noisy_differentiation.png
:align: center
:alt: A toy example illustrating the effect of noise on derivatives computed with a second order finite difference method
:figclass: align-center
:align: center
:alt: A toy example illustrating the effect of noise on derivatives computed with a second order finite difference method
:figclass: align-center

A toy example illustrating the effect of noise on derivatives computed with a second order finite difference method. Left: The data to be differentiated; :math:`y=\sin(x)` with and without a small amount of additive noise (normally distributed with mean 0 and standard deviation 0.01). Right: Derivatives of the data; the exact derivative :math:`\cos(x)` (blue), the finite difference derivative of the exact data (black, dashed), and the finite difference derivative of the noisy data.
A toy example illustrating the effect of noise on derivatives computed with a second order finite difference method. Left: The data to be differentiated; :math:`y=\sin(x)` with and without a small amount of additive noise (normally distributed with mean 0 and standard deviation 0.01). Right: Derivatives of the data; the exact derivative :math:`\cos(x)` (blue), the finite difference derivative of the exact data (black, dashed), and the finite difference derivative of the noisy data.

One way to mitigate the effects of noise is to smooth the measurements before computing derivatives. The :code:`SmoothedFiniteDifference` method can be used for this purpose.
A numerical differentiation scheme with total variation regularization has also been proposed [Chartrand_2011]_ and recommended for use in SINDy [Brunton_2016]_.
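
The noise amplification described above can be reproduced with a few lines of standard-library Python. This is a rough sketch, not PySINDy's implementation: the sample spacing `h`, the seed, and the window size are assumptions, and the moving average is a simple stand-in for a smoothing step rather than the filter used by :code:`SmoothedFiniteDifference`.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

h = 0.01       # sample spacing (assumed)
sigma = 0.01   # noise standard deviation, as in the figure caption
xs = [i * h for i in range(629)]  # roughly one period of sin(x)
clean = [math.sin(x) for x in xs]
noisy = [y + random.gauss(0.0, sigma) for y in clean]


def centered_difference(ys, h):
    """Second-order centered difference at the interior points."""
    return [(ys[i + 1] - ys[i - 1]) / (2 * h) for i in range(1, len(ys) - 1)]


def rms_error(estimate, xs):
    """RMS error against the exact derivative cos(x) at interior points."""
    exact = [math.cos(x) for x in xs[1:-1]]
    total = sum((e - t) ** 2 for e, t in zip(estimate, exact))
    return math.sqrt(total / len(exact))


def moving_average(ys, half_width):
    """Plain moving average; drops half_width points at each end."""
    width = 2 * half_width + 1
    return [
        sum(ys[i - half_width:i + half_width + 1]) / width
        for i in range(half_width, len(ys) - half_width)
    ]


k = 10  # half-width of the smoothing window (assumed)
smoothed = moving_average(noisy, k)
xs_smoothed = xs[k:len(xs) - k]

clean_error = rms_error(centered_difference(clean, h), xs)
noisy_error = rms_error(centered_difference(noisy, h), xs)
smoothed_error = rms_error(centered_difference(smoothed, h), xs_smoothed)

print(f"clean data:    {clean_error:.2e}")
print(f"noisy data:    {noisy_error:.2e}")
print(f"smoothed data: {smoothed_error:.2e}")
```

With these settings the derivative error for the noisy data is far larger than the noise level itself, because the noise is divided by `2 * h`, and smoothing before differentiating reduces it. The exact numbers depend on the assumed spacing, seed, and window.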
@@ -35,11 +35,11 @@ The SINDy method assumes dynamics can be represented as a *sparse* linear combin

Typically, prior knowledge of the system of interest and its dynamics should be used to make a judicious choice of basis functions. When such information is unavailable, the default class of library functions, polynomials, are a good place to start, as smooth functions have rapidly converging Taylor series. Brunton et al. [Brunton_2016]_ showed that, equipped with a polynomial library, SINDy can recover the first few terms of the (zero-centered) Taylor series of the true right-hand side function :math:`\mathbf{f}(x)`. If one has reason to believe the dynamics can be sparsely represented in terms of Chebyshev polynomials rather than monomials, then the library should include Chebyshev polynomials.

PySINDy includes the :code:`CustomLibrary` and :code:`IdentityLibrary` objects to allow for flexibility in the library functions. When the desired library consists of a set of functions that should be applied to each measurement variable (or pair, triplet, etc. of measurement variables) in turn, the :code:`CustomLibrary` class should be used. The :code:`IdentityLibrary` class is the most customizable, but transfers the work of computing library functions over to the user. It expects that all the features one wishes to include in the library have already been computed and are present in :code:`X` before :code:`SINDy.fit` is called, as it simply applies the identity map to each variable that is passed to it.
PySINDy includes the :code:`CustomLibrary` and :code:`IdentityLibrary` objects to allow for flexibility in the library functions. When the desired library consists of a set of functions that should be applied to each measurement variable (or pair, triplet, etc. of measurement variables) in turn, the :code:`CustomLibrary` class should be used. The :code:`IdentityLibrary` class is the most customizable, but transfers the work of computing library functions over to the user. It expects that all the features one wishes to include in the library have already been computed and are present in :code:`X` before :code:`SINDy.fit` is called, as it simply applies the identity map to each variable that is passed to it.
It is best suited for situations in which one has very specific instructions for how to apply library functions (e.g. if some of the functions should be applied to only some of the input variables).

As terms are added to the library, the underlying sparse regression problem becomes increasingly ill-conditioned. Therefore it is recommended to start with a small library whose size is gradually expanded until the desired level of performance is achieved.
For example, a user may wish to start with a library of linear terms and then add quadratic and cubic terms as necessary to improve model performance.
As terms are added to the library, the underlying sparse regression problem becomes increasingly ill-conditioned. Therefore it is recommended to start with a small library whose size is gradually expanded until the desired level of performance is achieved.
For example, a user may wish to start with a library of linear terms and then add quadratic and cubic terms as necessary to improve model performance.
For the best results, the strength of regularization applied should be increased in proportion to the size of the library to account for the worsening condition number of the resulting linear system.

Users may also choose to implement library classes tailored to their applications. To do so one should have the new class inherit from our :code:`BaseFeatureLibrary` class. See the documentation for guidance on which functions the new class is expected to implement.
@@ -75,4 +75,4 @@ Some general best practices regarding regularization follow. Most problems will

.. [Champion_2019] K. Champion, P. Zheng, A. Y. Aravkin, S. L. Brunton, and J. N. Kutz, “A unified sparse optimization framework to learn parsimonious physics-informed models from data,” *arXiv preprint arXiv:1906.10612*, 2019.

.. [Bishop_2016] C. M. Bishop, Pattern recognition and machine learning. Springer, 2006.
.. [Bishop_2016] C. M. Bishop, Pattern recognition and machine learning. Springer, 2006.
32 changes: 19 additions & 13 deletions examples/10_PDEFIND_examples.ipynb

Large diffs are not rendered by default.
