Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation about contributing #227

Merged
merged 21 commits into from
Jan 29, 2018
Merged

Documentation about contributing #227

merged 21 commits into from
Jan 29, 2018

Conversation

alimanfoo
Copy link
Member

@alimanfoo alimanfoo commented Jan 3, 2018

This PR adds some documentation about contributing to Zarr.

Also discovered in working up this PR is that the pytest.mark.skipif usage is broken due to pytest-dev/pytest#568, and so this PR includes a simple workaround for the tests with optional dependencies.

TODO:

  • issue template
  • pull request template
  • get appveyor CI working
  • get coveralls CI working
  • check readthedocs integration is working

@alimanfoo
Copy link
Member Author

@jakirkham, @mrocklin, @shoyer if you have a moment, I'd appreciate any feedback on this, especially the contributing doc. I've borrowed bits from the dask and pandas contributing guides while trying to keep it fairly short. Any suggestions welcome.

@alimanfoo alimanfoo added this to the v2.2 milestone Jan 3, 2018
@alimanfoo alimanfoo added the documentation Improvements to the documentation label Jan 3, 2018
@alimanfoo alimanfoo self-assigned this Jan 3, 2018
Copy link
Member

@jhamman jhamman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

$ pip install -r requirements_dev.txt
$ pip install -r requirements_dev_optional.txt
$ pip install -e .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest also including an example of how to do this with conda. My experience is that this is a common workflow.

$ conda env create -n zarr-dev -f requirements_dev.yml  # req file would need to be added
$ source activate zarr-dev
$ pip install -e  # or python setup.py develop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't give a conda example as (last time I looked) you can't use tox with conda, and tox is the best way at the moment to run all the tests (including doctests, coverage, flake8 etc.) and verify support for PY2 and PY3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtualenv has been packaged and exists on defaults and conda-forge. Once that was done tox was added to conda-forge in short order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll leave this as-is for now as I don't have time to verify all the steps to get tox fully up and running all test environments under a conda installation, but if someone wants to do this later and can provide the necessary commands then very happy to add.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. That was only meant as an informative comment. Happy with this as is.


It's best to create a new, separate branch for each piece of work you want to do. E.g.::

git checkout -b shiny-new-feature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would preface this with:

git fetch upstream
git checkout -b shiny-new-feature upsteam/master

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will add.



@pytest.mark.skipif(h5py is None, reason='h5py not installed')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also ran into this in pydata/xarray#1557. This should work but just for reference, we created our own skipif machinery using unittest.

https://github.com/pydata/xarray/blob/186a8bbe6a79e8d94203af113297c96a045b8730/xarray/tests/__init__.py#L47-L61

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, very helpful.

@alimanfoo
Copy link
Member Author

Thanks @jhamman for the comments, much appreciated.

@coveralls
Copy link

coveralls commented Jan 4, 2018

Coverage Status

Coverage increased (+0.6%) to 100.0% when pulling 2b439d4 on contributing-20180103 into f8df70d on master.

@alimanfoo alimanfoo changed the title WIP Documentation about contributing Documentation about contributing Jan 4, 2018
@coveralls
Copy link

coveralls commented Jan 4, 2018

Coverage Status

Coverage increased (+0.6%) to 100.0% when pulling 9a09cfb on contributing-20180103 into f8df70d on master.

@alimanfoo
Copy link
Member Author

Looks like readthedocs is happy.

* How Zarr was installed (e.g., "using pip into virtual environment", or "using conda")

Also, if you think it might be relevant, please provide the output from ``pip list`` or
``conda list`` depending on which was used to install Zarr.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might suggest conda env export and pip freeze instead. These produce files that can be used to immediately install a user's environment and begin playing with it. While conda list and pip list do provide the same info, there is no one liner that turns them into an environment on a dev's machine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

TODO:
* [ ] Unit tests and/or doctests in docstrings
* [ ] ``tox -e py36`` passes locally
* [ ] ``tox -e py27`` passes locally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't think we need to be so formal. Maybe something like this for the first one. Similar for the second.

* [ ] Python 3.6 tests pass (e.g. ``tox -e py36``)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* [ ] Docstrings and API docs for any new/modified user-facing classes and functions
* [ ] New/modified features documented in docs/tutorial.rst
* [ ] Changes documented in docs/release.rst
* [ ] ``tox -e docs`` passes locally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar story as above. Here some comment about the docs building ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

<https://github.com/zarr-developers/zarr/issues/new>`_ including a link to your StackOverflow
question. We will try to respond to questions as quickly as possible, but please bear
in mind that there may be periods where we have limited time to answer questions
due to other commitments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

<!--

For bug reports, please follow the template below. For enhancement proposals, feel free
to use whatever template makes sense.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good, but it doesn't show up currently (when rendered). Any reason for it to be commented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, was just copying another example I'd seen.

@alimanfoo
Copy link
Member Author

Thanks @jakirkham. I've made some changes which hopefully address all points. Regarding developer best practices, I've added a new section at the end, also including some information about versioning and release policies, let me know what you think. Ideally I was thinking we should at least review and approve each other's PRs, although I don't want to place too heavy a burden so feel free to say "too busy at the moment" and maybe we can try to work around or reschedule.

@alimanfoo alimanfoo added the maintenance Work needed by a maintainer label Jan 4, 2018
~~~~~~~~~~~~~~~~~~~~~

If at all possible, pull requests submitted by an external contributor should be
reviewed and approved by all core developers before being merged. Pull requests
Copy link
Member

@jakirkham jakirkham Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to constrain it to all? It may be difficult to achieve due to time constraints and competing demands. Admittedly "all" is small right now though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As contrast, scikit-learn requires two approvals for merging (see Pull Request Checklist). Dask has a more informal mechanism by which core developers say they plan to merge in a particular time frame if they get no comments (typical in a day).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll edit this to say we'd like to aim for pull requests to be reviewed by at least one other core developer. With only two of us at the moment it amounts to the same thing, but I think it's probably worth saying we'd like to aim for some review of PRs. This will slow things down if either of us is unable to review due to other commitments, but I think it's probably ok to move slowly for a while and focus on consolidation and sharing knowledge between core devs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I completely agree. We also added a lot of features. So we might need to be in maintenance mode for a bit just in case any bugs show up.

function, class or method is usually a backwards compatible change. However, removing a
function, class or method; removing an argument to a function or method; adding a
required argument to a function or method; or changing the behaviour of a function or
method, are examples of **backwards-incompatible API changes**.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we did add a few notes about functions not having a stable API. We should probably include a similar note here.

Also how do you feel about documented API vs. functions found inside Zarr? Some projects only consider the documented part public. In that case even functions that don't have a leading underscore are still considered private as long as they are not in the docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry. Just got to the section on not stable API below. Thoughts about moving that section after this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will change the definition of public API to include only what is documented in the API docs, probably safer for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. We can always expand the scope later.

2.2.0 -> 2.2.1). If a release contains API changes, but all API changes are
backwards-compatible, then the minor version number should be incremented
(e.g., 2.2.1 -> 2.3.0). If a release contains any backwards-incompatible API changes,
the major version number should be incremented (e.g., 2.3.0 -> 3.0.0).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to go down the road of having release branches? Asking this partially as I would like to get an idea of how we want to handle breaking changes should they be proposed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we use release branches? Is it to allow work to be going on on the next major version (including breaking API changes) in parallel with bug fixes on the current major version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a reasonable solution to me.


Exceptions can be made for any function, class or method which has been documented as
an experimental feature, i.e., backwards-incompatible changes can be included in a
minor release, although this should be avoided wherever possible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted above, it would be nice to have this right after the API section as they seem more logically connected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will restructure.

desirable to maintain data format compatibility wherever possible. However, if a change
is needed to the storage specification, and that change would break data format
compatibility in any way, then the storage specification version number should be
incremented (e.g., 2 -> 3).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when we add something to the data format spec which is not breaking? Is there a way to version this?

Also related how do we migrate between these enhancement versions? Lastly do we have a way to check the data format spec version of Zarr and/or the files themselves?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, I cannot imagine any change to the data format spec that does not break data format compatibility in some way, in the sense that data written by any one implementation can be read correctly by all others. Hence the storage spec has only a major version number, there is no minor or micro version number. This simplifies the definition of data format compatibility, i.e., all libraries implementing a given spec version number should be fully interoperable. That said, if you have an example of a spec change that does not fit into this model, happy to discuss.

The spec version of the data files is included in the metadata (.zarray or .zgroup). The spec version that zarr implements is currently stored as the ZARR_FORMAT variable in the zarr.meta module. There is a check on reading array or group metadata, if there is a difference between the spec version in the metadata files and the ZARR_FORMAT variable then an exception is raised.

When we moved from spec version 1 to 2 I implemented a function zarr.storage.migrate_1to2 which updates the metadata resources to conform to spec version 2. I imagined that we would do something similar for any future spec changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, we recently opted to not always write metadata files and noted this in the spec. We have also made sure that current Zarr treats this as empty metadata. That said, I'm not sure if an older version of Zarr (claiming to support the same spec version) would handle this correctly. Even though what we added didn't fundamentally break the version 2 spec. Instead it provided a superset of the existing functionality. IOW this seems like a feature we added to the spec that would make it 2.1. Though this could just be me being pedantic. :)

Everything else sounds great. We might want to mention the ZARR_FORMAT variable here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a fair point, although that change was a bit of a grey area in that zarr 2.1 is happy if .zattrs is missing and the spec did not say whether .zattrs is required or not, so saying that .zattrs is optional was a clarification and did not break compatibility with previous zarr versions implementing spec version 2. I might have been a bit more cautious if I knew there were other zarr implementations out there, but it still would have been a clarification rather than a change to the spec.

We may still come across other cases which are more clear cut, in that some change is required in the spec to support a new feature, and where using something more fine-grained than a single major version number for the spec version could be useful, but I find it hard to think that through without a concrete case in front of me. The questions we would need to answer are things like, what kinds of changes would be allowed in a minor spec change (e.g., 2.0 -> 2.1), how should an implementation behave if it implements spec version 2.0 and encounters data using spec version 2.1, etc. I think we can cross those bridges if and when the need arises.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably smooth this over by adding one last patch release to 2.1 with this change. Alternatively we could make the behavior something users opt-in to. That said, I'm not too worried about it. In any event, this discussion is diverging from the spec version and can be continued in an issue if we would like.

Sure, make sense. Easy to add once we need it.

number 3.0.0. Note however that the major version number of the Zarr library may not
always correspond to the spec version number. For example, Zarr versions 2.x, 3.x, and
4.x might all implement the same version of the storage spec and thus maintain data
format compatibility, although they will not maintain API compatibility.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we store the spec version in a variable somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zarr.meta.ZARR_FORMAT

~~~~~~~~~~~~~~~~~~~~~~

Ideally, any bug fixes that don't change the public API should be released as soon as
possible. It is fine for a micro release to contain only a single bug fix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hence the branching question above. Do we want to branch out 2.2 and allow mater to get enhancements and backport patches to 2.2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this would be great, although given that developer time is limited I think I wouldn't want to make any promises to backport bug fixes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. :)

Tag the version (where "X.X.X" stands for the version number, e.g., "2.2.0")::

$ version=X.X.X
$ git tag -a v$version -m v$version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any other rules about the message? Are we allowed to highlight changes or is that undesirable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind about the message, happy with more information if useful.

Release source code to PyPI::

$ python setup.py register sdist
$ twine upload dist/zarr-${version}.tar.gz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there some rules about who gets permissions to PyPI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy for other core devs to have PyPI permissions.

@jakirkham
Copy link
Member

Sorry for the slew of questions above. Tried to be thorough. Happy to clarify anything that is unclear.

@alimanfoo
Copy link
Member Author

alimanfoo commented Jan 5, 2018 via email

@jakirkham
Copy link
Member

Anything I can help with here?

@alimanfoo
Copy link
Member Author

alimanfoo commented Jan 25, 2018 via email

@alimanfoo
Copy link
Member Author

Hi @jakirkham, I've made some edits to the contributing doc, hopefully addresses comments but let me know if any other thoughts.

@jakirkham
Copy link
Member

No worries. Know the feeling. :)

The edits look great.

Had one more comment on the version spec regarding a particular change we made. Happy to go along with whatever you decide.

Other than that I think this should be good to merge.

@alimanfoo
Copy link
Member Author

alimanfoo commented Jan 26, 2018 via email

@alimanfoo alimanfoo mentioned this pull request Jan 26, 2018
5 tasks
@jakirkham
Copy link
Member

FWIW am happy to see this merged whenever you are ready.

@alimanfoo alimanfoo merged commit c5319d6 into master Jan 29, 2018
@alimanfoo alimanfoo deleted the contributing-20180103 branch January 29, 2018 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements to the documentation maintenance Work needed by a maintainer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants