Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SLEPLET: Slepian Scale-Discretised Wavelets in Python (#148) #149

Closed
15 of 30 tasks
paddyroddy opened this issue Nov 16, 2023 · 40 comments
Closed
15 of 30 tasks

SLEPLET: Slepian Scale-Discretised Wavelets in Python (#148) #149

paddyroddy opened this issue Nov 16, 2023 · 40 comments

Comments

@paddyroddy
Copy link

paddyroddy commented Nov 16, 2023

Submitting Author: Patrick J. Roddy (@paddyroddy)
All current maintainers: (@paddyroddy)
Package Name: SLEPLET
One-Line Description of Package: Slepian Scale-Discretised Wavelets in Python
Repository Link: https://github.com/astro-informatics/sleplet
Version submitted: v1.4.4
Editor: Szymon Moliński (@SimonMolinsky )
Reviewer 1: Shannon Quinn (@magsol )
Reviewer 2: Jakub Tomasz Gnyp (@gnypit )
Archive: 10.5281/zenodo.7268074
JOSS DOI: 10.21105/joss.05221
Version accepted: 1.4.7
Date accepted (month/day/year): 2024/05/21


Code of Conduct & Commitment to Maintain Package

Description

  • Include a brief paragraph describing what your package does:

SLEPLET is a Python package for the construction of Slepian wavelets in the spherical and manifold (via meshes) settings. SLEPLET handles any spherical region as well as the general manifold setting. The API is documented and easily extendible, designed in an object-orientated manner. Upon installation, SLEPLET comes with two command line interfaces - sphere and mesh - that allow one to easily generate plots on the sphere and a set of meshes using plotly. Whilst these scripts are the primary intended use, SLEPLET may be used directly to generate the Slepian coefficients in the spherical/manifold setting and use methods to convert these into real space for visualisation or other intended purposes. The construction of the sifting convolution was required to create Slepian wavelets. As a result, there are also many examples of functions on the sphere in harmonic space (rather than Slepian) that were used to demonstrate its effectiveness. SLEPLET has been used in the development of various papers.

Scope

  • Please indicate which category or categories.
    Check out our package scope page to learn more about our
    scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):

    • Data retrieval
    • Data extraction
    • Data processing/munging
    • Data deposition
    • Data validation and testing
    • Data visualization1
    • Workflow automation
    • Citation management and bibliometrics
    • Scientific software wrappers
    • Database interoperability

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an
existing community please check below:

  • For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):

    • Who is the target audience and what are scientific applications of this package?

Many fields in science and engineering measure data that inherently live on non-Euclidean geometries, such as the sphere. Techniques developed in the Euclidean setting must be extended to other geometries. Due to recent interest in geometric deep learning, analogues of Euclidean techniques must also handle general manifolds or graphs. Often, data are only observed over partial regions of manifolds, and thus standard whole-manifold techniques may not yield accurate predictions. Slepian wavelets are designed for datasets like these. Slepian wavelets are built upon the eigenfunctions of the Slepian concentration problem of the manifold: a set of bandlimited functions that are maximally concentrated within a given region. Wavelets are constructed through a tiling of the Slepian harmonic line by leveraging the existing scale-discretised framework. Whilst these wavelets were inspired by spherical datasets, like in cosmology, the wavelet construction may be utilised for manifold or graph data.

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

To the author's knowledge, there is no public software that allows one to compute Slepian wavelets
(or a similar approach) on the sphere or general manifolds/meshes. SHTools is a Python code used for spherical harmonic transforms, which allows one to compute the Slepian functions of the spherical polar cap. A series of MATLAB scripts exist in slepian_alpha, which permits the calculation of the Slepian functions on the sphere. However, these scripts are very specialised and hard to generalise.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • uses an OSI approved license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a tutorial with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration setup, such as GitHub Actions CircleCI, and/or others.

Publication Options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

  • I have read the author guide.
  • I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

Footnotes

  1. Please fill out a pre-submission inquiry before submitting a data visualization package.

@NickleDave
Copy link
Contributor

Hi @paddyroddy I'm adding the initial Editor-in-Chief checks.

The tl;dr is that we are about ready to start but I need you to do two things first:

  • Add a code of conduct
  • Add a "layperson" summary of the package to the docs page
    • The first thing someone should read on the docs is a statement of what the package does--I wouldn't assume someone will bother to click again to go to PyPI. Provide enough background that someone outside your domain will have some understanding of how the package fits into a bigger picture, kind of like the first couple of sentences of the abstract of your thesis.

I will add one more comment below but it is not required that you address it before we start the review

Editor in Chief checks

Hi there! Thank you for submitting your package for pyOpenSci
review. Below are the basic checks that your package needs to pass
to begin our review. If some of these are missing, we will ask you
to work on them before the review process begins.

Please check our Python packaging guide for more information on the elements
below.

  • Installation The package can be installed from a community repository such as PyPI (preferred), and/or a community channel on conda (e.g. conda-forge, bioconda).
    • The package imports properly into a standard Python environment import package.
  • Fit The package meets criteria for fit and overlap.
  • Documentation The package has sufficient online documentation to allow us to evaluate package function and scope without installing the package. This includes:
    • User-facing documentation that overviews how to install and start using the package.
    • Short tutorials that help a user understand how to use the package and what it can do for them.
    • API documentation (documentation for your code's functions, classes, methods and attributes): this includes clearly written docstrings with variables defined using a standard docstring format.
  • Core GitHub repository Files
    • README The package has a README.md file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
    • Contributing File The package has a CONTRIBUTING.md file that details how to install and contribute to the package.
    • Code of Conduct The package has a CODE_OF_CONDUCT.md file.
    • License The package has an OSI approved license.
      NOTE: We prefer that you have development instructions in your documentation too.
  • Issue Submission Documentation All of the information is filled out in the YAML header of the issue (located at the top of the issue template).
  • Automated tests Package has a testing suite and is tested via a Continuous Integration service.
  • Repository The repository link resolves correctly.
  • Package overlap The package doesn't entirely overlap with the functionality of other packages that have already been submitted to pyOpenSci.

  • Initial onboarding survey was filled out
    We appreciate each maintainer of the package filling out this survey individually. 🙌
    Thank you authors in advance for setting aside five to ten minutes to do this. It truly helps our organization. 🙌


Editor comments

  • when I first tested that I could import the package, it downloaded a lot of data. Many people might find this surprising. I would suggest adding data directly to the package if possible--if the files are small enough--and/or deferring download of large files

@NickleDave
Copy link
Contributor

NickleDave commented Nov 21, 2023

@paddyroddy related to that to-do item for the docs:
my understanding is that the goal for SLEPLET is to provide general access to the wavelet basis/transform you have developed for incomplete data on manifolds (feel free to correct me if I'm wrong)

Part of what we can do with this review is help you reach a broader audience. But that means we need to know where to look for editors and reviewers.

Can you please help me understand the main domains where you have applied and can apply your methods? I see you are an RSE in an astro group and that you mention cosmology in your thesis; in the papers you apply your methods to topographic data from Earth. Are these methods mainly used by people in cosmology, geophysics, computer imaging, all of the above? Could this work relate to geodesy in earth science?

As examples of places where we might look for editors/reviewers: we work with astropy and software underground. I can also imagine asking people that work on manifolds in general and geometric deep learning, and another place we might look would be users/developers of packages you rely on or mention, e.g. pyssht/py2let. But my sense is that we will want domain experts that can help you translate the methods you've developed for a broader audience.

Any more concrete detail you can give me about domains to look at for editors + reviewers would be helpful. We are looking now, but I want to get your input to make sure this review helps you.

@paddyroddy
Copy link
Author

The tl;dr is that we are about ready to start but I need you to do two things first:

* [ ]  Add a code of conduct

* [ ]  Add a "layperson" summary of the package to the docs page
  
  * The first thing someone should read on the docs is a statement of what the package does--I wouldn't assume someone will bother to click again to go to PyPI. Provide enough background that someone outside your domain will have some understanding of how the package fits into a bigger picture, kind of like the first couple of sentences of the abstract of your thesis.

I've added these. Mine isn't particularly lay, but they are a fairly specialised application, so hopefully from the description it's obvious who the target audience are!

@paddyroddy
Copy link
Author

paddyroddy commented Nov 24, 2023

Editor comments

* when I first tested that I could import the package, it downloaded a lot of data. Many people might find this surprising. I would suggest adding data directly to the package if possible--if the files are small enough--and/or deferring download of large files

Annoyingly, I hadn't spotted this, as I cache after the first import. I've had a look and the following are created:

-rw-------    1 paddy staff  74M Nov 24 18:37 EGM2008_Topography_flms_L2190.mat
-rw-------    1 paddy staff 5.6K Nov 24 18:37 meshes_laplacians_basis_functions_bird_b697_eigenvalues.npy
-rw-------    1 paddy staff  15M Nov 24 18:37 meshes_laplacians_basis_functions_bird_b697_eigenvectors.npy
-rw-------    1 paddy staff 5.6K Nov 24 18:37 meshes_laplacians_slepian_functions_bird_b697_N194_eigenvalues.npy
-rw-------    1 paddy staff 1.1M Nov 24 18:37 meshes_laplacians_slepian_functions_bird_b697_N194_eigenvectors.npy
-rw-------    1 paddy staff  132 Nov 24 18:37 slepian_masks_africa_L1.npy
-rw-------    1 paddy staff  132 Nov 24 18:37 slepian_masks_south_america_L1.npy

The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.

@paddyroddy
Copy link
Author

paddyroddy commented Nov 24, 2023

The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.

Fixed this in astro-informatics/sleplet#336 (v1.4.5), turns out I had introduced the bug using pydantic.Field(default= values. The data will still be downloaded as and when - but not on initial import. And, as before, it is cached so will only happen once.

@paddyroddy
Copy link
Author

@paddyroddy related to that to-do item for the docs: my understanding is that the goal for SLEPLET is to provide general access to the wavelet basis/transform you have developed for incomplete data on manifolds (feel free to correct me if I'm wrong)

Yep that's right!

Can you please help me understand the main domains where you have applied and can apply your methods? I see you are an RSE in an astro group and that you mention cosmology in your thesis; in the papers you apply your methods to topographic data from Earth. Are these methods mainly used by people in cosmology, geophysics, computer imaging, all of the above? Could this work relate to geodesy in earth science?

So I'm actually a general RSE (not strictly working on this), but was formerly a PhD student in an astrophysics group, specialising in cosmology. In reality, these methods could be used by any field in which data is measured only in certain parts of a manifold. Be that: oceanography, geophysics, solar physics, astrophysics, cosmology, computer graphics, machine learning etc. In reality, there are many fields which could use methods such as these.

As examples of places where we might look for editors/reviewers: we work with astropy and software underground. I can also imagine asking people that work on manifolds in general and geometric deep learning, and another place we might look would be users/developers of packages you rely on or mention, e.g. pyssht/py2let. But my sense is that we will want domain experts that can help you translate the methods you've developed for a broader audience.

Those all sound like reasonable places. Would be very interested to have others people's insights. Sorry, I'm not sure how else to help on that front!

@NickleDave
Copy link
Contributor

The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.

Sounds like you fixed the issue overall but jic you're still figuring this out: are you familiar with pooch?
A pattern a lot of core scientific Python packages seem to be converging on is "put the data in the package and use importlib-resources if it fits on PyPI; for everything else use pooch". See for example scikit-image: scikit-image/scikit-image#5146 or librosa: https://github.com/librosa/librosa/tree/main/librosa/util/example_data

@NickleDave
Copy link
Contributor

Thank you @paddyroddy for your replies about the domain, very helpful.
I am very sympathetic to developing methods that are useful across domains (FFT anyone?) but just want to make sure we're helping you reach that broader audience.

We are looking for an editor + reviewers now.

@paddyroddy
Copy link
Author

Sounds like you fixed the issue overall but jic you're still figuring this out: are you familiar with pooch?

Yes, I'm using pooch...

@NickleDave
Copy link
Contributor

Whoops, guess I could've just looked 😝 😇 We can say I'm trying to be helpful

@NickleDave
Copy link
Contributor

Hi @paddyroddy, happy new year's 🙂
I'm happy to report that @SimonMolinsky will be the editor for this review, I'll let him take it from here.
We are now actively seeking reviewers.

@SimonMolinsky
Copy link
Collaborator

Hi @paddyroddy !

From now on, I'll be responsible for the review process. :) I've looked into your package, the JOSS publication, and related publications, and everything is outstanding. I agree with @NickleDave that your package needs more usability review from potential users than a technical review. I will look for reviewers in specific communities interested in applying your algorithms in their solution.

@paddyroddy
Copy link
Author

Thank you, @SimonMolinsky! Looking forward to the next steps

@SimonMolinsky
Copy link
Collaborator

@paddyroddy I'm still looking for reviewers. Hopefully, we will start the review at the beginning of next week! (We have one good soul on board, waiting for the response from other people). Thank you for your patience!

@SimonMolinsky
Copy link
Collaborator

Hi @paddyroddy , we have our reviewers. We can start a review process.

@SimonMolinsky
Copy link
Collaborator

Editor response to review:


Editor comments

👋 Hi @magsol and @gnypit ! Thank you for volunteering to review
for pyOpenSci! I'm here to help you with the review, so if you need any support, then let me know immediately! But first, look at the steps below 👇 @magsol you may start as soon as you want; @gnypit will join us after 22nd of February!

Please fill out our pre-review survey

Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.

  • reviewer 1 survey completed.
  • reviewer 2 survey completed.

The following resources will help you complete your review:

  1. Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
  2. Here is the review template that you will need to fill out and submit
    here as a comment, once your review is complete.

Please get in touch with any questions or concerns! Your review is due: 2024-03-17

Reviewers: Shannon Quinn, Jakub Tomasz Gnyp
Due date: 2024-03-17

@SimonMolinsky
Copy link
Collaborator

Hi @magsol , @gnypit !

Please let us know how it's going :)

@magsol
Copy link

magsol commented Mar 6, 2024

@SimonMolinsky Just finished the reviewer survey, and am starting to work my way through the Reviewer Guide. Hopefully will have some more detailed updates soon :) (sorry for the delay in starting-- tl;dr it's been a very chaotic past month)

@magsol
Copy link

magsol commented Mar 8, 2024

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • Installation instructions: for the development version of the package and any non-standard dependencies in README.
  • Vignette(s) demonstrating major functionality that runs successfully locally.
  • Function Documentation: for all user-facing functions.
  • Examples for all user-facing functions.
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for:
    • Continuous integration and test coverage,
    • Docs building (if you have a documentation website),
    • A repostatus.org badge,
    • Python versions supported,
    • Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of package goals.
  • Package installation instructions
  • Any additional setup required to use the package (authentication tokens, etc.)
  • Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    • Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
  • Link to your documentation website.
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:

  • Package documentation is clear and easy to find and use.
  • The need for the package is clear
  • All functions have documentation and associated examples for use
  • The package is easy to install

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests:
    • All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • Tests cover essential functions of the package and a reasonable range of inputs and conditions.
  • Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
    A few notable highlights to look at:
    • Package supports modern versions of Python and not End of life versions.
    • Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

Somewhere in the 2-3 hours range.


Review Comments

@magsol
Copy link

magsol commented Mar 18, 2024

Sorry, it was a very busy past week + weekend. I plan on finishing this review tomorrow. Apologies for the delay!

paddyroddy added a commit to astro-informatics/sleplet that referenced this issue Mar 20, 2024
@paddyroddy
Copy link
Author

paddyroddy commented Mar 20, 2024

Thanks, @magsol! I wasn't aware of repostatus.org, I have added one now in astro-informatics/sleplet#360.

@magsol
Copy link

magsol commented Mar 20, 2024

I'm pretty happy with it! I recommend approval :) Thanks @paddyroddy! Great work!

@SimonMolinsky
Copy link
Collaborator

@magsol Thank you for your review! It's great to hear that SLEPLET is a well-written package :)

@paddyroddy I heard from @gnypit, and he will review the package soon.

@gnypit
Copy link

gnypit commented Apr 15, 2024 via email

@paddyroddy
Copy link
Author

@gnypit I'm sorry to hear about the recent events in your life. I hope things are getting back in control.

Thank you for constructive feedback regarding comments, documentation and demos.

I have tried to automate the documentation construction as much as possible, and hence it is a bit light in some places. However, just for your information, the parts in which it is documentation-light (as you say) are the sections that are private (i.e. not exposed to the users) indicated by the leading _. This does beg the question of who @pyOpenSci is aiming to, is it users, developers or both? To me, as a user of a package, it is not often I'd be digging around the internals. As a developer, it is something I might do if the situation requires it. I believe all functions (public or private) are named fairly sensibly, and have docstrings.

Regarding comments, I feel this is personal style - I tend to put them when the step is non-intuitive (for me), but I feel it can clutter the code somewhat if it is everywhere.

As for demos, I have various demos in https://github.com/astro-informatics/sleplet/tree/main/examples. Would surrounding these with explanatory prose suffice your request? I'm not a big fan of notebooks personally, but could convert the files in the examples folder to notebooks.

@SimonMolinsky
Copy link
Collaborator

SimonMolinsky commented Apr 15, 2024

Hi @gnypit , sorry to hear you have experienced such things; hopefully, your situation improves!

@gnypit Thank you for your review; just one more thing: could you copy & paste the review template into a comment in this thread and select boxes based on your review: https://www.pyopensci.org/software-peer-review/how-to/reviewer-guide.html#peer-review-template (You should be logged in to GitHub, it won't be possible from email).

@gnypit , @paddyroddy :

However, just for your information, the parts that are documentation-light (as you say) are the sections that are private (i.e., not exposed to the users) indicated by the leading _. This does beg the question of who https://github.com/pyOpenSci is aiming at: users, developers, or both?

Answering this question: docs in private functions are not required. @pyOpenSci focuses on both users and developers - as potential maintainers. Documenting private functions is nice to have but totally optional 😊

@gnypit
Copy link

gnypit commented Apr 20, 2024

Okay @SimonMolinsky & @paddyroddy - just in case, I'm still dabbling in the functionalities on a Linux VM, but:

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • Installation instructions: for the development version of the package and any non-standard dependencies in README.
  • Vignette(s) demonstrating major functionality that runs successfully locally.
  • Function Documentation: for all user-facing functions.
  • Examples for all user-facing functions.
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for:
    • Continuous integration and test coverage,
    • Docs building (if you have a documentation website),
    • A repostatus.org badge,
    • Python versions supported,
    • Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of package goals.
  • Package installation instructions
  • Any additional setup required to use the package (authentication tokens, etc.)
  • Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    • Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
  • Link to your documentation website.
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:

  • Package documentation is clear and easy to find and use.
  • The need for the package is clear
  • All functions have documentation and associated examples for use
  • The package is easy to install

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests:
    • All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • Tests cover essential functions of the package and a reasonable range of inputs and conditions.
  • Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
    A few notable highlights to look at:
    • Package supports modern versions of Python and not End of life versions.
    • Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 30h by @gnypit + time spent by @magsol


Review Comments

The overall impression is very good: the code is mainly OOP, with clear organisation of files, their names, and the names of classes, methods and functions. The documentation and examples could use a little more prose around them or comments within them, as already discussed. However, all user-faced parts of code are very thoroughly written and the private functiones, methods, etc. are clearly indicated. Nevertheless, expanding on comments and guides/tutorials might make it easier for both users and contributors to faster and better orient themselves in the library. It doesn't change the fact, that for specialists in the field who are semi-advanced in Python the clarity of the code, intrinsic messages and equations put in LaTeX, together with the required documentation, seem sufficient.

The one concept in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. It concernes the private parts of the code, as the _parallel_methods.py is using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully. It is, of course, merely a suggestion on what could be done as a part of future development of the library.

Overall, the biggest challenge before contributors is to make the documentation a little more friendly and embark on a boring and tiresome adventure to add comments or alternatives, as already discussed with author during the review. There already are examples of in-depth comments, so there is a clear and present standard to strive for. Apart from this, the library is very well-organised, written beautifully in OOP in Python, with some functional parts. It is intuitive and modern. Also, introducing multiprocessing shows the author's consideration for computing cost, which is a growing necessity in today's scientific world - especially for those who have the skills but cannot afford proper hardware.

Finally, as a personal comment, though I'm still playing around with the functionalities, as I misunderstood the scope of JOSS review, I can already recommend approving the package and will simply let yopu know here in the comments, whenever and whatever else comes up.

@paddyroddy
Copy link
Author

The one concept in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. It concernes the private parts of the code, as the _parallel_methods.py is using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully. It is, of course, merely a suggestion on what could be done as a part of future development of the library.

multiprocess is a fork of multiprocessing as you say. The reason for using it is that it uses dill rather than pickle, which makes it easier to serialise data - and much easier to work with IMO. As mentioned previously, this is in the non-public facing part of the code, so shouldn't be too important for users. However, there is definite scope for speeding up further with CUDA etc., depending on what hardware one has available.

@SimonMolinsky
Copy link
Collaborator

@gnypit thank you for your review! As I understand, we're waiting for the performance checks from your side. If you need any support with the point Packaging guidelines: The package conforms to the pyOpenSci [packaging guidelines](https://www.pyopensci.org/python-package-guide). then let me know!

@gnypit
Copy link

gnypit commented May 21, 2024

Hello all! As I have recommended package approval already in April, I thought that you're free to go on. I apologise for the confusion. I checked all the boxes in the review template. I triple-checked some minor performance issues that I had; they're caused by problems with my virtual machine and not by the code itself.

@paddyroddy coming back to the multiprocessing, my starting the subject was motivated only by giving some suggestions. Of course it's a private part of the code, so it's a secondary matter for now.

Is there anything else that's required of me @SimonMolinsky ?

@SimonMolinsky
Copy link
Collaborator

@gnypit Thank you!

@paddyroddy As I see, there are no other changes required. Thus, I'm going to approve the package!

@SimonMolinsky
Copy link
Collaborator

@paddyroddy Congratulations, SLEPLET has become officially a part of the pyOpenSci family! Thank you for submitting your package here. I will let you know when it will be updated on the pyOS website as soon as it happens :)

@gnypit & @magsol, thank you for your support and time!

@paddyroddy
Copy link
Author

@paddyroddy coming back to the multiprocessing, my starting the subject was motivated only by giving some suggestions. Of course it's a private part of the code, so it's a secondary matter for now.

Thank you for your comments and review @gnypit!

@paddyroddy
Copy link
Author

@paddyroddy Congratulations, SLEPLET has become officially a part of the pyOpenSci family! Thank you for submitting your package here. I will let you know when it will be updated on the pyOS website as soon as it happens :)

Great, thank you. Excited to have this recognition!

@SimonMolinsky
Copy link
Collaborator

SimonMolinsky commented Jun 3, 2024

@paddyroddy Thank you again. We are wrapping things up to close the review officially. I've seen on PyPI that you added the pyOS badge to your package. Please fill out the the post-review survey in your spare time :)

And again, you might be interested in writing a blog post about your package. Here are examples and the template:

pandera
movingpandas

TEMPLATE: here is a markdown example that you could use as a guide when creating your post.

Last but not least, we are growing fast and looking for editors, reviewers, and experts in their fields. Your knowledge could be invaluable to the organization. So let me know if you would like to join the pyOpenSci community! 🙂

And something that does not require your input: I've opened a thread about your package on the pyOpenSci Discourse. Your package will soon be visible on the pyOpenSci website, Pull Request with metadata is pending ⌛


🎉 SLEPLET has been approved by pyOpenSci! Thank you @paddyroddy for submitting SLEPLET, and many thanks to @magsol and @gnypit for reviewing this package! 😸

Author Wrap Up Tasks

There are a few things left to do to wrap up this submission:

  • Activate Zenodo watching the repo if you haven't already done so.
  • Tag and create a release to create a Zenodo version and DOI.
  • Add the badge for pyOpenSci peer-review to the README.md of SLEPLET. The badge should be [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-submission/issues/149).
  • Please fill out the post-review survey. All maintainers and reviewers should fill this out.

Editor Final Checks

Please complete the final steps to wrap up this review. Editor, please do the following:

  • Make sure that the maintainers filled out the post-review survey
  • Invite the maintainers to submit a blog post highlighting their package. Feel free to use / adapt language found in this comment to help guide the author.
  • Change the status tag of the issue to 6/pyOS-approved6 🚀🚀🚀.
  • Invite the package maintainer(s) and both reviewers to slack if they wish to join.
  • If the author submits to JOSS, please continue to update the labels for JOSS on this issue until the author is accepted (do not remove the 6/pyOS-approved label). Once accepted add the label 9/joss-approved to the issue. Skip this check if the package is not submitted to JOSS.
  • If the package is JOSS-accepted please add the JOSS doi to the YAML at the top of the issue.

If you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and documentation in the peer-review-guide.

@paddyroddy
Copy link
Author

Thanks, @SimonMolinsky! I've filled in the survey, and will look into the blog post.

@paddyroddy
Copy link
Author

@SimonMolinsky I've made a PR for the blog post pyOpenSci/pyopensci.github.io#412

@paddyroddy
Copy link
Author

@SimonMolinsky looks like it's on the website now so this issue can be closed?

@SimonMolinsky
Copy link
Collaborator

@paddyroddy Yes, you are right! Thanks! I'm closing the issue :)

paddyroddy added a commit to paddyroddy/talks that referenced this issue Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: joss-accepted
Development

No branches or pull requests

5 participants