Contributing Guide #1318

joshlf · 2024-05-20T00:17:24Z

joshlf
May 20, 2024
Maintainer

How to Contribute

Thanks for your interest in zerocopy!

TL;DR

This section describes the minimum you need to know in order to contribute.

Developing
- Instead of cargo, use our cargo.sh or cargo.bat wrappers: ./cargo.sh +{msrv,stable,nightly} ...
- If UI tests are failing (tests/trybuild.rs or zerocopy-derive/tests/trybuild.rs), run ./tools/update-ui-test-files.sh
- [optional] Configure Git hooks: git config --local core.hooksPath githooks
Submitting
- One commit per PR (if you need to edit your PR, do git commit --amend or similar; don't add new commits)
- Write commit message subject lines like this: [derive] Support IntoBytes on types with parameters
- Follow this advice for commit message bodies

Developing locally

For more detail, see #1320.

Building

TL;DR: You can probably use cargo and everything will be fine. If that breaks, use ./cargo.sh instead.

Some of our tests are sensitive to what Rust toolchain version is used - especially UI tests, which expect exact compiler error message output, which can change between Rust versions. For this reason, running tests with the wrong toolchain can cause unexpected failures. In order to avoid these, we provide cargo.sh (or cargo.bat on Windows), which performs toolchain resolution automatically and then invokes cargo with the correct toolchain. For example:

$ ./cargo.sh +msrv test

...runs tests on our MSRV (minimum supported Rust version) toolchain.

Updating UI tests

If you make changes that affect compiler error output, UI tests may fail. If this happens, run ./tools/update-ui-test-files.sh to update the tests.

Git hooks

There is a pre-push git hook available at githooks/pre-push. This hook runs some quick checks locally before pushing so that these same checks won't later fail during CI. Git won't use this hook automatically. If you would like to use it, set repository config core.hooksPath to githooks. This can be done by running (at repository root):

git config --local core.hooksPath githooks

Contributing code

New files

New files should begin with the copyright header:

// Copyright 2024 The Fuchsia Authors
//
// Licensed under a BSD-style license <LICENSE-BSD>, Apache License, Version 2.0
// <LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0>, or the MIT
// license <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your option.
// This file may not be copied, modified, or distributed except according to
// those terms.

Tests

Much of the code in zerocopy has the property that, if it is buggy, those bugs may not cause user code to fail. This makes it extra important to write thorough tests, but it also makes it harder to write those tests correctly. Here are some guidelines on how to test code in zerocopy:

All code added to zerocopy must include tests that exercise it completely.
Tests must be deterministic. They must not depend on any details of the execution environment including time, threading, filesystem or network access, OS random number generators, etc. All of these can be replaced with deterministic mocks/fakes if needed.
Avoid change detector tests; tests that are unnecessarily sensitive to changes, especially ones external to the code under test, can hamper feature development and refactoring.
Since we run tests in Miri, make sure that tests exist which exercise any potential undefined behavior so that Miri can catch it.
If there's some user code that should be impossible to compile, add a trybuild test to ensure that it's properly rejected.

Abstraction boundaries and unsafe code

Zerocopy's reason for existence is to write code whose correctness cannot be tested, let alone guaranteed at compile time. To quote our documentation: we write unsafe so you don't have to. For most of the code in our codebase, the primary defense we have against bugs is a careful, methodical, and disciplined approach to development.

Any codebase benefits from modularity, clear and well-chosen abstraction boundaries, encapsulation, separation of concerns, etc. However, in most codebases, violations of these principles increase the likelihood of bugs that will result in compilation or unit test failure. In zerocopy, violations can result in code which is silently unsound.

For this reason, we are extremely diligent about defining abstraction boundaries. This leads to a number of concrete practices:

Assume all internal APIs are used adversarially

Just because an API is not public to the crate does not mean that it cannot be accidentally misused by another developer - including your future self, who may have forgotten the subtleties of the API that you understood when you wrote it.

Thus, every single internal API in zerocopy must abide by strict soundness rules:

If a function or trait is not unsafe, then it must be the case that any usage of that function or implementation of that trait which will compile will be sound
If a function or trait is unsafe, it must include a doc comment with a safety section which documents safety/soundness preconditions, postconditions, and invariants
If a type has fields which are public beyond its module, it must be sound for these fields to be set to any values whatsoever
If a type's fields are private, all code in the module is able to modify those fields, and so all code in the module is responsible for upholding that type's internal invariants

Minimize abstraction boundaries

Make modules and other abstraction boundaries as small as possible in order to shrink the set of code that has the opportunity to violate safety invariants. For an example of this principle in action, see the ref::def module, which contains the minimal definition of the Ref type, and ensures that most Ref methods do not need to be considered when reasoning about Ref's internal invariants.

Split `unsafe` and document it completely

Each unsafe operation should be in its own separate unsafe block with its own separate, and logically complete safety proof. For example:

zerocopy/src/pointer/ptr.rs

Lines 603 to 626 in 134b798

    
                   /// Converts a `Ptr` an unaligned `T` into a `Ptr` to an aligned 
        
                   /// `Unalign<T>`. 
        
                   pub(crate) fn into_unalign( 
        
                       self, 
        
                   ) -> Ptr<'a, crate::Unalign<T>, (I::Aliasing, Aligned, I::Validity)> { 
        
                       // SAFETY: 
        
                       // - This cast preserves provenance. 
        
                       // - This cast preserves address. `Unalign<T>` promises to have the 
        
                       //   same size as `T`, and so the cast returns a pointer addressing 
        
                       //   the same byte range as `p`. 
        
                       // - By the same argument, the returned pointer refers to 
        
                       //   `UnsafeCell`s at the same locations as `p`. 
        
                       let ptr = unsafe { 
        
                           #[allow(clippy::as_conversions)] 
        
                           self.cast_unsized(|p: *mut T| p as *mut crate::Unalign<T>) 
        
                       }; 
        
                       // SAFETY: `Unalign<T>` promises to have the same bit validity as 
        
                       // `T`. 
        
                       let ptr = unsafe { ptr.assume_validity::<I::Validity>() }; 
        
                       // SAFETY: `Unalign<T>` promises to have alignment 1, and so it is 
        
                       // trivially aligned. 
        
                       let ptr = unsafe { ptr.assume_alignment::<Aligned>() }; 
        
                       ptr 
        
                   }

Comments

TL;DR: Code is written once but read many times. Take the time to write clear, helpful comments.

Comments should always be full sentences with proper grammar, capitalization, and punctuation.
Comments should be self-contained in both time (across versions of the code) and in space (across the codebase):
- Comments should always describe the current state of the code. Do not write comments that make reference to planned future changes or past versions of the code.
  - An exception: it may be the case that the existence of some code only makes sense in light of an ongoing transition, in which case it may be unavoidable to reference the transition in order to explain why certain choices were made.
  - An exception: if a particular design causes problems, it may be justified to make reference to this design when commenting on the design that replaces it to explain why certain choices are made, and to ensure that the lessons of the past are not lost on future developers.
- Comments should not assume that the reader is aware of any other part of the codebase besides what's being read.
- Avoid writing comments whose correctness depends upon other parts of the codebase; it is impossible to ensure that the comment will be updated if those dependencies change.
When writing TODOs, always include an issue reference using the format TODO(#123):

Source Control Best Practices

Commits should be arranged for ease of reading; that is, incidental changes such as code movement or formatting changes should be committed separately from actual code changes.

Commits should always be focused. For example, a commit could add a feature, fix a bug, or refactor code, but not a mixture.

Commits should be thoughtfully sized; avoid overly large or complex commits which can be logically separated, but also avoid overly separated commits that require code reviews to load multiple commits into their mental working memory in order to properly understand how the various pieces fit together.

Commit Messages

Commit messages should be concise but self-contained (avoid relying on issue references as explanations for changes) and written such that they are helpful to people reading in the future (include rationale and any necessary context).

Avoid superfluous details or narrative.

Commit messages should consist of a brief subject line and a separate explanatory paragraph in accordance with the following:

If the code affects a particular subsystem, prefix the subject line with the name of that subsystem in square brackets, omitting any "zerocopy" prefix (that's implicit). For example, for a commit adding a feature to the zerocopy-derive crate:

[derive] Support IntoBytes on types with parameters

The body may be omitted if the subject is self-explanatory; e.g. when fixing a typo. The git book contains a Commit Guidelines section with much of the same advice, and the list above is part of a blog post by Chris Beams.

Commit messages should make use of issue integration. Including an issue reference like #123 will cause the GitHub UI to link the text of that reference to the referenced issue, and will also make it so that the referenced issue back-links to the commit. Use "Closes", "Fixes", or "Resolves" on its own line to automatically close an issue when your commit is merged:

Closes #123
Fixes #123
Resolves #123

When using issue integration, don't omit necessary context that may also be included in the relevant issue (see "Commit messages should be concise but self-contained" above). Git history is more likely to be retained indefinitely than issue history (for example, if this repository is migrated away from GitHub at some point in the future).

Commit messages should never contain references to any of:

Relative moments in time
Individuals
Hosted code reviews (such as on https://github.com/google/zerocopy/pulls)
- Refer to commits in this repository by their SHA-1 hash
- Refer to commits in other repositories by public web address (such as
  789b3deb)
Other entities which may not make sense to arbitrary future readers

Miscellaneous

Contributor License Agreement

Contributions to this project must be accompanied by a Contributor License Agreement. You (or your employer) retain the copyright to your contribution; this simply gives us permission to use and redistribute your contributions as part of the project. Head over to https://cla.developers.google.com/ to see your current agreements on file or to sign a new one.

You generally only need to submit a CLA once, so if you've already submitted one (even if it was for a different project), you probably don't need to do it again.

Community Guidelines

This project follows Google's Open Source Community Guidelines.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Contributing Guide #1318

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 0 comments

Select a reply

Contributing Guide #1318

joshlf May 20, 2024 Maintainer

How to Contribute

TL;DR

Developing locally

Building

Updating UI tests

Git hooks

Contributing code

New files

Tests

Abstraction boundaries and unsafe code

Assume all internal APIs are used adversarially

Minimize abstraction boundaries

Split unsafe and document it completely

Comments

Source Control Best Practices

Commit Messages

Miscellaneous

Contributor License Agreement

Community Guidelines

Replies: 0 comments

joshlf
May 20, 2024
Maintainer

Split `unsafe` and document it completely