Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MetaPtr + usize/u32 changes #4174

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Conversation

LawrenceEsswood
Copy link

Pull Request Overview

This changes some u32s into usizes as appropriate.

It also introduces a new MetaPtr type that is a pointer that has explicit target dependant metadata. On many platforms, this will just be a wrapper around usize. On CHERI platforms MetaPtr will be a capability.
See later commit.

It also adds a new syscall encoding encode_syscall_return_metaptr That is intended to work for both 32/64 platforms with or without this extra metadata.

Testing Strategy

Running on QEMU

TODO or Help Wanted

This pull request still needs review, especially around encode_syscall_return_mptr which makes some decisions on how 64-bit syscalls should work

Documentation Updated

None

Formatting

  • [x ] Ran make prepush.

This changes some u32s into usizes as appropriate.

It also introduces a new MetaPtr type that is a pointer that has explicit target dependant metadata. On many platforms, this will just be a wrapper around usize.
On CHERI platforms MetaPtr will be a capability.
See later commit.

It also adds a new syscall encoding `encode_syscall_return_metaptr`
That is intended to work for both 32/64 platforms with or without this extra metadata.

Change-Id: I40faa11c1fd53debc6e9b21d00772660cacf8cab
@github-actions github-actions bot added kernel tock-libraries This affects libraries supported by the Tock project risc-v RISC-V architecture labels Sep 18, 2024
@hudson-ayers
Copy link
Contributor

This PR seems to have broken the litex-sim-ci, and I suspect that is a true breakage, and the app is not running. Have you run any risc-v apps in qemu during your testing?

@bradjc
Copy link
Contributor

bradjc commented Sep 24, 2024

What about AuthenticatedPtr for a name?

Copy link
Member

@alevy alevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally good. I think it needs more explanation and potentially consistency in what a MetaPtr's purpose is, especially without invoking the specific CHERI use-case, since it's something to be used regardless of CHERI (or other capability) hardware being available.

kernel/src/metaptr.rs Outdated Show resolved Hide resolved
}
}

impl From<usize> for MetaPtr {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm... this seems to break the abstraction. E.g., is this OK under CHERI? Why is it ok to created a MetaPtr from any-old usize?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK assuming the metadata would say something like "invalid" or "just a usize".

On CHERI this would correspond to deriving from the NULL capability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to reopen this for a closely-related question. What do these operations do to provenance? My guess is:

  1. From<CapabilityPtr> for usize does not expose the provenance, similar to .addr
  2. From<usize> for CapabilityPtr produces a pointer without provenance, similar to core::ptr::without_provenance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those will be the semantics. If somebody wanted to maintain provenance, they should not go through usize.

new_with_metadata is an interesting one. While it seems to have a ptr argument, one might guess that the resulting thing would have its provenance. This is true in this patch. In the CHERI implementation we do the equivalent of:

secret_internal_ptr.with_addr(ptr.addr)

This is because it is unlikely that the kernel would actually have provided something of the correct provenance, and that method exists to attach it (using base/length/perms to select an appropriate internal thing to derive from).

Currently nothing in this file mentions provenance, so I wonder if this interaction should be documented at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strict provenance will be stable soon so we should probably document how these functions affect provenance.

How does new_with_metadata compare with core::ptr::with_exposed_provenance? They sound similar, but the latter API has been said to be incompatible with CHERI. Is new_with_metadata a hybrid CHERI-only mechanism or is it different?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which provenance the pointer returned from new_with_metadata is up to its implementation.

On non-CHERI, I suspect we will return the input.

In the CHERI implementation, we will derive from one of a global DDC, or a global PCC (that covers the entire address space, and all objects are derived from). I think we might even see it in a Purecap world, because we might have a pointer that is RW and need to derive one with the same bounds but with RX.

Execute,
}

impl From<MetaPtr> for usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is OK because a MetaPtr is always convertible into a raw pointer by just losing any meta information. However, might be better to implement this for <T> *const T (which can always be cast to a usize anyway)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add that too (it might even exist as a method?), but this one gets used a lot in syscall / return argument handling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah got it. Does this imply that those arguments or returns should actually be *const T?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its happening a lot because values that are usize, or even eventually u32, are being passed in registers that can hold something as wide as MetaPtr (the widest type, or so I have assumed).

@bradjc has suggested that maybe those should be a separate RegisterData type, something large enough to hold u32/isize/*const()/MetaPtr.

I suppose, fairly, one could imagine a platform with pointers that are smaller than u32, in which case really RegisterData would wrap around the largest type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documenting that CapabilityPtr is a pointer then using it to store other things is very confusing, and I think having a separate RegisterData [1] type would help that confusion. CapabilityPtr is already a confusing enough type to understand, especially for contributors who are unfamiliar with CHERI.

The fact that some CHERI systems use different registers for integers and capabilities however, does throw a wrench into the works. However, there are two obvious ways that come to mind for how we might choose to handle that at the ABI level:

  1. We can pass integers in the integer registers and capabilities in the capability registers. In this case, the kernel knows which are which so we can can use usize for the integer registers and CapabilityPtr for the capability registers [2].
  2. We can pass everything in the capability registers and ignore the integer registers. This is just treating the system as a CHERI system that uses the same registers for everything, and we'd just use RegisterData for all the register values.

In both cases, it makes sense to only use CapabilityPtr for pointer types, so I don't think it's really a blocker for having both CapabilityPtr and RegisterData. Option 1 would be somewhat awkward, though, because we would only use RegisterData for the capability registers.

[1] libtock-rs just calls this Register
[2] This ignores the possibility of non-pointer capabilities, which is another complication

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced RegisterData is really a single type. Machines have different width registers. To say that RegisterData should refer to just one of them is odd to me. If we did introduce a RegisterData here, I would expect it to wrap CapabilityPtr (with an appropriate comment saying the type is being used because it is expected to be widest). It would certainly throw a spanner if ever a different ABI were introduced that actually used different width registers, to have a RegisterData and then SomeOtherRegisterData. But I am happy to add RegisterData if that is the consensus and makes the syscall layer easier to understand.

  1. is/was the ABI for those split file machines. In a similar fashion to how you would select float/integer registers on platforms that split them.

kernel/src/metaptr.rs Outdated Show resolved Hide resolved
kernel/src/metaptr.rs Outdated Show resolved Hide resolved
kernel/src/kernel.rs Outdated Show resolved Hide resolved
kernel/src/metaptr.rs Outdated Show resolved Hide resolved
kernel/src/process.rs Show resolved Hide resolved
Comment on lines +575 to +576
/// The obvious extension of TRD104 that works for 32-bit and 64-bit platforms.
/// This makes no changes to TRD104 on 32-bit platforms.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually I think we will benefit from a dedicated PR / discussion thread updating TRD104 and more explicitly discussing the best ABI for capabilities.

I'm not sure we should necessarily block this PR on that longer process.. but this code does change our ABI, which feels like it should be partitioned off somehow more explicitly from anything not using CHERI in the short term; I'm not sure the least-painful way to do that :/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am doing to take a stab at TRD104 now

kernel/src/upcall.rs Show resolved Hide resolved
Copy link
Member

@lschuermann lschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only looked at syscall.rs for now. I agree with @ppannuto that this would benefit from an update to TRD 104 and feel rather strongly that this update should be merged before or in tandem with this PR. It's hard to fully reason about these changes and what they'll mean for new platforms and a more abstract update of the TRD will make that easier to reason about.

/// On CHERI, this grants authority.
/// Access to this return is therefore privileged.
SuccessPtr(MetaPtr),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be reflected in an update to TRD 104. Even if it isn't a breaking change, we should enumerate this variant and show onto which SyscallReturnVariant it maps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't passing a pointer back to userspace via a command a very significant change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think this crops up in two places:

The return value from allow (which I think did previously return the last allow-ed thing?)
And sbrk, which did previously return an address, but on CHERI needed to return a MetaPtr to allow access to the new heap. This is actually new.

Possibly this is where I broke libtock-c, because these codes have changed. On the CHERI fork of libtock-c the expected return from allow / sbrk have been updated.

Maybe this deserves more discussion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still having trouble deciding if this should appear in TRD 104. It can now no longer be returned on the platforms TRD 104 currently says it is for. Either TRD 104 needs to be about 64 bit / CHERI platforms, or actually this would need documenting in another document.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there is a bit of weirdness in that this probably belongs in a new TRD, but it should add to the syscall return variants, which are tracked in a table in TRD 104. I think it is important for us to track the syscall return variants in a single place (so we don't accidentally assign the same number for different variants in different TRDs). My suggestion is to update the return variants table in TRD 104, but to design the rest of this in a separate TRD.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a good suggestion (with a note they are not possible on this platform, but reserved).

kernel/src/syscall.rs Outdated Show resolved Hide resolved
@@ -522,111 +552,197 @@ impl SyscallReturn {
/// Encode the system call return value into 4 registers, following the
/// encoding specified in TRD104. Architectures which do not follow TRD104
/// are free to define their own encoding.
/// TODO: deprecate in favour of the more general one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems perfectly fine for this kernel-supplied implementation to only cover a narrow set of platforms. We deliberately allow downstream users or individual architectures to use their own implementations. The comment above seems fine as is now, and once TRD 104 is updated to reflect these changes we should update the above comment instead of adding this TODO.

/// SuccessPtr is as passed the full MetaPtr register.
/// Pointers from allow'd buffers have minimal bounds attached that cover their length,
/// and the same permissions that were checked at the syscall boundary.
pub fn encode_syscall_return_mptr(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to the CHERI architecture, where it's actually used? I'd prefer implementations only used on one platform to be shipped in that platform support code, instead of generic kernel code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one in intended to work for 32/64 bit CHERI / non-CHERI and so I think belongs in kernel.

@lschuermann
Copy link
Member

I want to cross-post my comment from tock/book#53 here:

[Are usized command-parameters] really what we want to have? Way back when we talked about this for the 2.0 system call interface, we converged on having the current system call interface and driver traits use u32s consistently, and implement an additional, 64-bit specific trait for such platforms when we have them. I know that at least for command args, we're currently not living this reality, though we arguably ought to be.

I still think this is a good idea, because this way we don't have to reason about edge-cases on different platforms for the system call interface. Driver (interfaces) behave identically across these platforms.

Does this PR depend on command arguments being usize? If so, we should think about this as an additional "breaking change". While command parameters are usize right now, they really ought to be u32s. This didn't matter previously as we only supported 32-bit platforms. However, I'm wary of a 64-bit platform locking us into this decision. We should instead feature a new command system call with 64-bit parameters to avoid our drivers having platform-dependent interfaces.

/// On CHERI, this grants authority.
/// Access to this return is therefore privileged.
SuccessPtr(MetaPtr),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't passing a pointer back to userspace via a command a very significant change?

self.encode_syscall_return_mptr(a0, a1, a2, a3);
}
} else {
panic!("encode_syscall_return used on a 64-bit platform or CHERI platform")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to remain a compile time check. This is very low level to have a panic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My logic was that people would therefore find out very fast that something was wrong and they were calling the wrong function.

I will check that const _ : () = .... does what we want here. I am just worried that the existence of the function on 32-bit platforms in, say, debug builds would cause the compiler error.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked. The static assert will trigger in an undesirable way on 64-bit platforms. Its not really an error for this function to exist, just use it.

Unless we use cfg[] to remove this function entirely, I think it is correct to just panic on 64-bit systems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this shouldn't be an assert! instead?

For @bradjc I believe (and we can confirm) that because this can be resolved at compile time (both sides of the == are const), the panic/assert are going to be elided at compile time on 32-bit platforms.

@LawrenceEsswood generally we avoid resolving comments directed to us unless it becomes irrelevant (which GitHub does automatically when the code changes enough), and instead the author of the comment (or potentially some other moderator) should resolve when their question is answered

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure this, can be assert too. Will roll into the next round of patches (and leave this open!).

kernel/src/syscall.rs Outdated Show resolved Hide resolved
Lawrence Esswood added 5 commits October 14, 2024 18:40
If we ever refactored to support strict provenance or a cheri purecap
rustc, *const() is a far better choice than usize.

Change-Id: I56947d53dc2f2ef9d1d5ac80641bc4410f383813
Change-Id: I8049bb48e48a162feb261cb4e1e15f7e5b4d8f6e
Make return variants that contain pointers and usizes explict about this
type information.

A backwards compatability mapping is also provided for legacy u32
platforms.

Change-Id: Ied5513c58bfa87be3599ba366692f66af3da1691
Change-Id: Iccb152457179f1a48eb110c4e4eb7c2efc19150d
Change-Id: I4fdbed879d85013c7e6f0e0157dcba7f92fb943b
alevy
alevy previously approved these changes Oct 20, 2024
Copy link
Member

@alevy alevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm marking approved because I think this is very close, and want to communicate my approval vote.

There are a few nits in the new doc.

I also think we should rename MetaPtr to something more descriptive, e.g. AuthenticPtr as @bradjc suggested. Or something else. Capability wouldn't be bad if it didn't imply an incorrectly high level of assurance for non-tagged architectures. Meta is just vague.

kernel/src/syscall.rs Show resolved Hide resolved
kernel/src/metaptr.rs Outdated Show resolved Hide resolved
kernel/src/metaptr.rs Outdated Show resolved Hide resolved
kernel/src/metaptr.rs Outdated Show resolved Hide resolved
kernel/src/metaptr.rs Outdated Show resolved Hide resolved
Co-authored-by: Amit Levy <aalevy@gmail.com>
Lawrence Esswood added 2 commits October 29, 2024 23:18
Change-Id: I0b59feb2d1d6f60be4aaa46adc430785651837ca
Change-Id: Iccd7ff9f675786cf04abfa16ce89e8c3667a756d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should edit old notes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Teaches me to use refactoring tools

use core::fmt::{Formatter, LowerHex, UpperHex};
use core::ops::AddAssign;

/// A pointer with target specific metadata concerning validity or access rights.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me how we could actually use this type in upstream Tock since we don't have hardware that supports this metadata.

I still think we need an abstraction which captures the intent and not the implementation. This is tied to the naming question, but is distinct. It may actually make sense to call this struct CapabilityPtr given this comment description. However, I think something like QualifiedPtr better captures the distinction between the type of pointer considered in this PR, and other types of pointers. A QualifiedPtr would be described entirely based on its use case within Tock, and not at all based on how it could/is implemented.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't expect this type to add much value over *const () in upstream Tock, as I think it only makes sense on the platforms that have such a thing, so I don't think it will have any added benefit per se.

However, I think Capability does carry some semantic abstraction without just being a CHERI capability. For example, if Tock ever targeted a platform with PAC (which I would frame as password capabilities), I could imagine the checked parts of this interface checking pointer integrity, casts from usize signing the pointer etc.

I think that "authority bearing" does convey what it is used for and where it should be used and is independent of implementation details, such as how that authority is encoded or checked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is some talking past each other here.

I don't think @bradjc is looking for "value" in non-CHERI Tock, but rather an explanation/name/whatever of the abstraction (the thing called CapabilityPtr) that makes it clear how it should be used. E.g., it is describing some concept. That concept is enforced if there are hardware capabilities. It's not enforced if there are not. But it's still meaningful to use that abstraction elsewhere, if only as a description of the meaning of some particular value.

My working model is something like: a CapabilityPtr names some address in userspace (though not necessarily currently/always accessible to userspace) that, conceptually, userspace may or may not be permitted to read, write, or execute. There are certain operations on this abstraction, and those operations carry some implicit (enforced or otherwise) affect on those permissions. For example, I can add/subtract offsets from a CapabilityPtr, and the result may or may not have the same access permissions (e.g. the permissions will drop if the offset puts me out of bounds of the region it referred to). I can also extract the word-size address component as pure data (a usize), which let's me "hide" data inside a CapabilityPtr in the same way I can "hide" data in a *const _.

Again, these properties may or may not be enforced. In this sense it's kind of like the difference between a bounds-checked slice in Rust and an array in C with no bounds checking. In all cases, correct programs will behave the same, and will use these in accordance with the rules/properties. On CHERI, an incorrect program will fault if it uses it incorrectly. Similarly, a Tock kernel that uses this concept correctly in the right places will enable correct userspace programs regardless, but a Tock kernel that uses it incorrectly will only fail to support programs under CHERI (because in non-CHERI userspace programs can also de-reference a usize/*const _ so the type that's passed up from the kernel doesn't actually matter).

OK, this concept now tells us where/how to use a CapabilityPtr. For example, if a system call is intending to return a value that might mean "a pointer userspace may dereference in some way" it should be a CapabilityPtr (e.g. the user-data passed down in the allow system call and up in upcalls). If a value is passed between userspace and the kernel that should never be treated as a pointer userspace can dereference, it should not be a CapabilityPtr.

(perhaps I'm wrong about the specific explanation of the abstraction, but hopefully this at least captures the gap in understanding)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think about this abstraction in a similar way. Maybe then, I am not sure what the suggestion for change is here. Just that Capability is still not a good name for this thing? Or that the text could do a better job explaining it use? Or its meaning?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or that the text could do a better job explaining it use? Or its meaning?

@bradjc, confirm?

I think the clarifying the meaning is necessary to review whether the rest of the PR is indeed using it in an appropriate way, etc. Especially to the extent it concerns things like changing TRD104 and related core kernel interfaces.

If that's the case, I can also take a stab at writing this out. I can see how the comment explaining this below is maybe too CHERI and implementation oriented

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I think Capability does carry some semantic abstraction without just being a CHERI capability. For example, if Tock ever targeted a platform with PAC (which I would frame as password capabilities), I could imagine the checked parts of this interface checking pointer integrity, casts from usize signing the pointer etc.

I think one of the questions I have is, in this scenario, would PAC use the same CapabilityPtr as-defined here and the same syscall interface to pass the capability pointer, etc?

More important is up-leveling from that specific question and away from something specific to CHERI or to PAC or to [thing]—is what's proposed here a general mechanism for conveying pointers with metadata? If so, how do we express what the properties of a "pointer with metadata" abstraction should be?

A related question based on @alevy's new text, is this only for userspace pointers? Are theses capabilities only enforced during userspace operation, or are there cases where we would want to use hardware capability enforcement to e.g. support stronger isolation / IFT / etc between resources passed to capsules? Not to re-bikeshed the name, but if it's really userspace-only, that feels like it should be a front a center piece of the naming/rationale.

I'm also struggling with the repeated assertion that this is only useful on platforms with hardware capabilities. I feel like that isn't the case? i.e., a non-C userspace with some kind of managed runtime could in principle [less-efficient, sure] enforce all the same properties as CHERI or equivalent hardware. Perhaps explaining the pointer abstraction here with the perspective of also answering "what would a pure-software implementation that intercepts every memory access provide" would help, as it would divorce the discussion from any particular implementation and invite thinking of things in terms of the more fundamental properties?

Copy link
Author

@LawrenceEsswood LawrenceEsswood Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would PAC use the same CapabilityPtr as-defined here and the same syscall interface to pass the capability pointer, etc

Hopefully. It may have to grow to assume a superset of hardware checked integrity (CHERI motivated bounds and W,R,X, but one can imagine more).

A related question based on @alevy's new text, is this only for userspace pointers?

No, I have left a comment there. The boundary is just the lowest hanging fruit because it is the boundary where compiler enforcement stops applying and we must rely on hardware.

i.e., a non-C userspace with some kind of managed runtime could in principle [less-efficient, sure] enforce all the same properties as CHERI or equivalent hardware

Well, I think it does, and I think that language is Rust and that is already precisely what libtock-rs does. I would argue that, in the rust type system, &T is already a capability enforced by the type system, and libtock-rs enforces that you make syscalls only when you have one of those. The point of CapabilityPtr was to be a source for authority beyond the existing type system (which is why it immediately found use across compiler boundaries). We cannot proves you used libtock-rs properly. Use of CapabilityPtr can fix issues in even Rust: when either the compiler is wrong, or unsafe Rust invokes UB. Unsafe Rust can fabricate &T. In userspace or the kernel. On some platforms (namely CHERI), it cannot fabricate a valid CapabilityPtr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is what's proposed here a general mechanism for conveying pointers with metadata?

@ppannuto I intentionally avoided defining this in terms of metadata. It has a semantic meaning relating to user references. On some platforms thateaning might be enforcible with hardware support, on other it won't.

The important bit is that this type is a (valid or invalid) userspace reference. That there may be metadata on some platforms is incidental.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does PAC stand for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -539,7 +540,7 @@ pub trait Process {
/// process's memory region.
/// - [`Error::KernelError`] if there was an internal kernel error. This is
/// a bug.
fn brk(&self, new_break: *const u8) -> Result<*const u8, Error>;
fn brk(&self, new_break: *const u8) -> Result<CapabilityPtr, Error>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument is a *const u8 because userspace is requesting access to memory, but fabricating an address ex nihilo. It returns a CapabilityPtr upon successful because the kernel has authorized access.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was the motivation.

@@ -52,48 +52,48 @@ pub(crate) fn memop(process: &dyn Process, op_type: usize, r1: usize) -> Syscall
// Op Type 1: SBRK
Copy link
Member

@alevy alevy Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this module are all backwards compatible with 32-bit userspace because when traslated to ABI, on 32-bit with into_compat SuccessUsize and SuccessPtr both get rewritten to SuccessU32.

@@ -1078,9 +1081,9 @@ pub struct FunctionCall {
/// The third argument to the function.
pub argument2: usize,
/// The fourth argument to the function.
pub argument3: usize,
pub argument3: CapabilityPtr,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where we pass back the userdata (a void*) passed in via subscribe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add comment as to why

@@ -1078,9 +1081,9 @@ pub struct FunctionCall {
/// The third argument to the function.
pub argument2: usize,
/// The fourth argument to the function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The fourth argument to the function.
/// The userdata provided by the process via `subscribe`

@bradjc
Copy link
Contributor

bradjc commented Nov 4, 2024

Is there ever a use for CapabilityPtr within the kernel? If this can only ever work for pointers within userspace, perhaps we should include that in the name?

Or, I think it is fine to separate the struct name from the concept. So if we want to have a generic concept called CapabilityPtr, with an implementation called UserspaceCapabilityPtr that would work for me. This is what we do with leasable buffers (the generic concept) and SubSlice (the implementation we use in Tock).

Comment on lines +170 to +171
// On CHERI platforms we need to maintain metadata for
// these two as they are used to access code/data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// On CHERI platforms we need to maintain metadata for
// these two as they are used to access code/data.

I think this is a useful comment for context in this PR, but given the "meaning" of CapabilityPtr I don't think it's useful to justify this in terms of CHERI

/// and will eventually be deprecated once all interfaces are updated.
pub const fn into_compat(self) -> Self {
// We only need to be backwards compatible on 32-bit systems
let compat = core::mem::size_of::<usize>() == core::mem::size_of::<u32>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only on 32-bit platforms? This would require having needing different userspace libraries for 32 and 64-bit since libraries generally expect specific return variants.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent here is not to use any new codes for existing systems (32-bit being a proxy for "legacy"). We can come up with a better compat as that evolves. Possibly this should be a feature.

This would certainly require changes to libtock-c/libtock-rs for 64-bit system. But that was going to happen anyway, because they don't support 64-bit systems.

However, it does mean they don't need any change on existing 32-bit systems. The intent is allow designing what we would like the interface to look like, without making invasive change. Possibly, on a major release, compat would always be false.

@LawrenceEsswood
Copy link
Author

@bradjc I would certainly like CapabilityPtr to make it further into the kernel (with a series of further patches). Right now, it makes it as far as this interface, but I would like for it to make its way into grants (via allow as well as subscribe), and later into drivers / devices.

One of the powers of a CHERI capability specifically is that it establishes a chain of integrity (it cannot be corrupted or fabricated). This means that once it makes it far as a driver, we can be assured that a driver can only operate on a buffer if the user really did pass it, independent of a kernel bug. Right now we rely on the Rust type system for that, but CapabilityPtr can add a second pass of hardware checks in case there is a bug in grant logic / driver logic (especially when DMA is concerned, and Rust can't save you as much).

@@ -7,26 +7,23 @@
use core::fmt::{Formatter, LowerHex, UpperHex};
use core::ops::AddAssign;

/// A pointer with target specific metadata concerning validity or access rights.
/// A pointer to userspace memory.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think it is necessary to be userspace memory. This may also find internal use.

Copy link
Member

@alevy alevy Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example? Would it feel better to say this is memory userspace can refer to (e.g.., it may not be in a processes memory region, and might in principle be shared from kernel memory or another process).

I think it's really important to nail down what exactly this abstraction represents. We can use capability hardware to implement different abstractions if we want.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a capability could legally refer to any memory, user or otherwise. Once capabilities reach drivers, they may very well point to either kernel memory or user memory. I may wish to have the same driver write into userspace or an internal buffer depending on some mux layer below it.

Specifically in the case of user memory (because of how we upgrade the syscalls interface) it might be true to say that "any capability referring to user memory has been checked for validity", but that's about it.

/// A `usize`, possibly podding with extra bits. It is therefore an appropriate choice for the type
/// of a register that may contain any one of these in the syscall ABI at a point where it is not
/// yet clear which of these it is yet.
/// [^note1]: Depending on the architecutre, this the size of a
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No "this" in this sentence?

@@ -52,48 +52,48 @@ pub(crate) fn memop(process: &dyn Process, op_type: usize, r1: usize) -> Syscall
// Op Type 1: SBRK
1 => process
.sbrk(r1 as isize)
.map(|addr| SyscallReturn::SuccessU32(addr as u32))
.map(|addr| SyscallReturn::SuccessPtr(addr))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...although into_compat will reverse this again.

@alevy
Copy link
Member

alevy commented Nov 5, 2024

@LawrenceEsswood

I would certainly like CapabilityPtr to make it further into the kernel

But it would always we referring to something meaningful to userspace. E g. This memory I'm using in this driver came from userspace, is share with userspace, etc.

Write,
ReadWrite,
Execute,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the surface, this is an odd set of possible permissions. I assume that Any means Read + Write + Execute, but if so why don't Read + Execute or Write + Execute exist. If it wasn't for Any existing I would guess that this was supposed to hard-enforce W^X.

Are these tied to CHERI or implementations in particular? Is the set of possibilities likely to remain constant or could it change as new CHERI implementations appear?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some CHERI platforms do indeed have W^X. The actual combinations are complicated. However, this list was motivated not by what CHERI supports, but by the uses in Tock. Which is this list, apart from Any, which I am now confused by.

It's long ago enough that I have to admit I am not sure why this is "Any" and not just "All" or "None". Maybe I was trying to suggest that "All" may not be possible, and so this should just be viewed as "As Much As Possible".

But now I look at my CHERI branch and I seem to have mapped this to no permissions (which is definitely any, not all). Maybe this should just be called "None". I feel there needs to be a default if someone does not care, but possibly that best default is None.

I expect the set to grow not based on hardware, but based on Tock's need to express different things.

@@ -70,6 +69,12 @@ pub enum UpcallError {
KernelError,
}

// FIXME: When we get CHERI compiler support, these can go back to the proper types
// b/274586199
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tock probably shouldn't have Google-internal links in its source code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal policy seems to suggest it is acceptable, and I would like to keep them unless Tock maintainers themselves have a problem.

I know they offer no external value, but it reduces out downstream diff, and makes it obvious what issue this links to internally if we get a patch that fixes a bug externally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see the value now. I think it would be reasonable to leave this in here, but indicate what it is:

// https://github.com/tock/tock/issues/4134
// Google-internal issue: b/274586199

pub fn encode_syscall_return(&self, a0: &mut u32, a1: &mut u32, a2: &mut u32, a3: &mut u32) {
if core::mem::size_of::<CapabilityPtr>() == core::mem::size_of::<u32>() {
// SAFETY: if the two unsized integers are the same size references to them
// can be safely transmuted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transmutes also need align_of::<u32>() >= align_of::<CapabilityPtr>(), and I don't think anything guarantees that right now. Should probably check that as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will amend.

allow_address: r2 as *const u8,
allow_size: r3,
subdriver_number: r1.into(),
allow_address: r2.as_ptr_checked::<u8>(r3.into(), MetaPermissions::Read) as *mut u8,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, if a process makes an Allow call and passes a buffer that lacks the necessary permissions, this will silently change the address to null. I don't think that's the behavior we want.

Maybe as_ptr_checked should return an Option or Result rather than null on failure?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with as_ptr_checked returning an Option, but I think that opens up the question of how we would like to handle it: by propagating the None?

On one hand, that lets the user know about the error.

On the other, the way it exists currently removes the old allow. That means that any commands that a user sends after won't accidentally be made against the previously allowed thing. At least propagating the error means they shouldn't fall victim to this. But its hard to guess at internal logic of userspace and zeroing out the allow felt safe. We could always do as_ptr_checked().unwrap_or(core::ptr::null()) if we liked this behavior but wanted the option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a conceptual level, Allow treats buffer pointers as if they are not copyable. When a userspace process invokes Allow, it passes that buffer to the kernel. The kernel should pass that buffer back eventually: either as part of the failure message for this Allow call, or as part of the success message of a future Allow call. You can see this in TRD 104:

The return variants for Read-Write Allow system calls are Failure with 2 u32 and Success with 2 u32. In both cases, Argument 0 contains an address and Argument 1 contains a length. When a driver implementing the Read-Write Allow system call returns a failure result, it MUST return the same address and length as those that were passed in the call. When a driver implementing the Read-Write Allow system call returns a success result, the returned address and length MUST be those that were passed in the previous call, unless this is the first call. On the first successful invocation of a particular Read-Write Allow system call, an driver implementation MUST return address 0 and size 0.

To be consistent with this design, you need to either return an error or keep the pointer intact so it can be returned on a later Allow. IMO, letting the user know about the error seems like the better option versus pretending to succeed then telling the syscall driver that the user passed a null buffer. It's also consistent with the kernel's behavior if userspace passes a pointer that points outside its address space:

The Tock kernel MUST check that the passed buffer is contained within the calling process's writeable address space. Every byte of a passed buffer must be readable and writeable by the process. Zero-length buffers may therefore have arbitrary addresses. If the passed buffer is not complete within the calling process's writeable address space, the kernel MUST return a failure result with an error code of INVALID.

The userspace libraries should already be compatible with this behavior, and if not, that is a bug in the userspace library.

@@ -206,7 +206,7 @@ impl kernel::syscall::UserspaceKernelBoundary for SysCall {
state.regs[R_RA] = state.pc;

// Save the PC we expect to execute.
state.pc = callback.pc as u32;
state.pc = callback.pc.as_ptr::<()>() as usize as u32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be usize::from(callback.pc) as u32? That removes one cast. Alternatively, maybe CapabilityPtr is not meant to be used for *mut T -> CapabilityPtr -> usize casts directly? (See also my question about provenance in another file).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one fewer cast is fine here.

}
}

impl From<usize> for MetaPtr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to reopen this for a closely-related question. What do these operations do to provenance? My guess is:

  1. From<CapabilityPtr> for usize does not expose the provenance, similar to .addr
  2. From<usize> for CapabilityPtr produces a pointer without provenance, similar to core::ptr::without_provenance.

/// On CHERI, this grants authority.
/// Access to this return is therefore privileged.
SuccessPtr(MetaPtr),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there is a bit of weirdness in that this probably belongs in a new TRD, but it should add to the syscall return variants, which are tracked in a table in TRD 104. I think it is important for us to track the syscall return variants in a single place (so we don't accidentally assign the same number for different variants in different TRDs). My suggestion is to update the return variants table in TRD 104, but to design the rest of this in a separate TRD.

Execute,
}

impl From<MetaPtr> for usize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documenting that CapabilityPtr is a pointer then using it to store other things is very confusing, and I think having a separate RegisterData [1] type would help that confusion. CapabilityPtr is already a confusing enough type to understand, especially for contributors who are unfamiliar with CHERI.

The fact that some CHERI systems use different registers for integers and capabilities however, does throw a wrench into the works. However, there are two obvious ways that come to mind for how we might choose to handle that at the ABI level:

  1. We can pass integers in the integer registers and capabilities in the capability registers. In this case, the kernel knows which are which so we can can use usize for the integer registers and CapabilityPtr for the capability registers [2].
  2. We can pass everything in the capability registers and ignore the integer registers. This is just treating the system as a CHERI system that uses the same registers for everything, and we'd just use RegisterData for all the register values.

In both cases, it makes sense to only use CapabilityPtr for pointer types, so I don't think it's really a blocker for having both CapabilityPtr and RegisterData. Option 1 would be somewhat awkward, though, because we would only use RegisterData for the capability registers.

[1] libtock-rs just calls this Register
[2] This ignores the possibility of non-pointer capabilities, which is another complication

use core::fmt::{Formatter, LowerHex, UpperHex};
use core::ops::AddAssign;

/// A pointer with target specific metadata concerning validity or access rights.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does PAC stand for?

///
/// [^note1]: Depending on the architecutre, this the size of a
/// [CapabilityPtr] may be a word size or larger, e.g., if registers
/// can store metadata such as access permissions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should mention CHERI somewhere in this, because I think CapabilityPtr will be hard to understand for someone who is isn't very familiar with CHERI. [^note1] seems like a good place to mention this, as it's already giving an example of a place where CapabilityPtr is useful.

@bradjc
Copy link
Contributor

bradjc commented Nov 6, 2024

What are the downsides to having multiple capability pointer types? From what I can understand we need two things for this PR to move forward:

  1. That Tock explicitly recognizes a type of special pointer which represents some sort of privileged, access-controlled memory access. (This is the general idea of a capability pointer.)
  2. A specific Rust type within the kernel which represents pointers that userspace would use to access memory.

We've long tried to avoid including general mechanisms without concrete implementations. Towards that, while we may use capability pointers for more things in the future, do we need to have a single type today that would encompass those? I feel like we should narrowly scope the new type to its use case today. The utility is still high. If we use a capability pointer for something else in the future we could name that based on its use in Tock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense in the utilities folder. Adding the capability pointer idea seems the same to me as adding the leasable buffer idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kernel risc-v RISC-V architecture tock-libraries This affects libraries supported by the Tock project waiting-on-author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants