-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFI: Fix many vtable-related problems #121962
Conversation
rustbot has assigned @petrochenkov. Use r? to explicitly pick a reviewer |
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri |
This comment has been minimized.
This comment has been minimized.
Looks like rustbot chose me due to the |
@maurer Is there no chance at factoring this down the line of the Why does this require changes in |
It adds a new kind of MIR shim. But the question then is, why is that needed? Maybe someone from @rust-lang/wg-mir-opt can help evaluate whether having this new shim is a good idea. (I don't think we have a ping group for MIR in general, just for MIR opts?) |
/// For `target_instance` which are generatable shims, this is done via inserting a cast at the | ||
/// beginning of the shim generated by that instance. | ||
/// For `Item`s, we instead build a direct call shim with that cast inserted. | ||
CfiShim { target_instance: &'tcx InstanceDef<'tcx>, invoke_ty: Ty<'tcx> }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean we can remove the code in the interpreter and codegen that currently does receiver type adjustments? In the interpreter, that's here.
If that code is still needed then I don't understand what this achieves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That code is still needed at the moment because I am conditionally generating shims - they are only generated when CFI is enabled, since they will result in worse code generation (If it's an item, there's a cast trampoline, and if it's not an item, there's a variant of the shim produced). Even if all calls were inlined, it'd still produce extra bulk as each vtable the type appears in gets its own shim.
I could try testing disabling that code when .cfi_shims()
is true if you think that would be interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could try testing disabling that code when .cfi_shims() is true if you think that would be interesting.
No, I was just going off of the PR description which lacks key information it seems. I'll wait for more details to be added. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Incorrect statement, ignore)
I'm unsure how that would work - if I have
how could I introduce this cutover at the stage of MIR->SSA? I was under the impression that by the time that was occurring, we were supposed to be doing a 1:1 transformation of Instance -> codegen structure, not Instance -> several. VTable generation has also already occurred above that abstraction level, so I don't know how I would get the vtables to point to each of the three specialized allocations. |
So this makes the two vtables point to two different functions for that example? That sounds pretty bad for code size, doesn't it? And for perf too -- either there's now extra calls and stack traffic, or it gets inlined and there's even more code bloat and the same code has to be optimized and passed to the backend multiple times. I guess I'll wait for your design document. Currently that doesn't sound like somewhat we want to do for regular builds. Maybe this only happens for CFI builds, the PR description doesn't say. |
I only meant in terms of landing the changes that need to happen before |
I'll provide a larger writeup in a bit, but here's the summary:
With regards to performance, I have not benchmarked it, though I do expect it to hurt performance. Since most of these methods would previously crash when CFI was enabled, and the shims are only generated with CFI enabled, I didn't think it was critical. If you'd like, I can benchmark enabling shims without enabling CFI, do you have a suggested workload? |
Sorry, I misunderstood and thought you wanted me to push all the logic into |
Seems unfortunate that CFI doesn't have a better way to encode the correct semantics here and requires some contortions in the generated code. (At least, it doesn't seem obvious that having such trampoline calls actually increases security compared to CFI that would be able to detect the correct calls directly.) But well, I guess it's what it is...
I didn't realize this would be enabled only for CFI, that makes perf less of a concern then. It introduces new risks though, if these codepaths are not executed in regular builds at all. I see you added some tests :) . I would encourage you to not unnecessarily split up tests into multiple files; the overhead of separate compiler invocations does add up across a large test suite like ours. (But if they need different flags then they do need different files.) |
Hmm. Why exactly is the KCFI model incompatible with this particular style of pointer cast? My understanding is that usually, other kinds of pointer-casting can be accepted, without necessarily erasing all type-checking entropy, and that in many cases, C code that needs to do similar will be performing such an erasure manually? |
Yes, they point to two different functions. This is one way to make it work for regular CFI, and mandatory for KCFI where each function is only allowed a single type. We might be able to get rid of shims some day for regular CFI by leveraging the ability to associate multiple types with a function, but KCFI will always need these shims. KCFI might be able to use only one shim if we make trait upcasting a load-bearing part of virtual calls (since then you'd only ever shim the function at the trait it was implemented for, not all child traits), but that would be future work.
|
KCFI only supports exactly one principal type for a function. This means that no casting at the indirect call site is allowed. The actual implementation mechanism is that the compiler takes the principal type, encodes it to a string, hashes it, and then puts the hashed type in a prelude bundle before the function. When a caller wants to call a function, the compiler takes the type, encodes it to a string, hashes it, and then inserts a check that the function has that hashed type right before it. In the one place where KCFI is heavily used, Linux, C code does indeed perform this erasure manually. Since they don't have objects, their equivalent of vtables are usually structs of function pointers. Their usual pattern is that your implementation takes a pointer to a struct of the abstract type, which is sometimes data-bearing. If it is not data-bearing, it will usually just be a void* you do an explicit cast to your real object on. If it is, then usually it is present as a field inside your object, and the It is possible to avoid most of these shims (drop shims can't be entirely avoided) for LLVM-CFI with appropriate multiple attachments. |
Possibly. You're right about stacked diffs not being a thing on GH. I'll wait for someone from mir-opt to make a peep about whether they think also-seeing the LLVM and symbol-mangling logic is helpful or just visual noise. Although, now that I look more closely at it, could you possibly pluck these out? They don't seem like they depend on the rest of the changes at all, though I could be mistaken. Those could be landed in their own PR, if so. ( They do touch the same file, so they should probably land together for obvious merge conflict reasons, but they both seem to be "pure fixup" with none of the design considerations involved, so that should go quickly. ) |
Is it possible to break these changes into a set of smaller self-contained pairs of issues and fixes (like I've been fixing the CFI known issues)? The reason I ask is that not only it will make it easier for me to understand the problems and what is being proposed, but also because I think some of the issues here are fixed already or are in the process of being fixed for CFI using type metadata IDs/encoding only. I understand that CFI and KCFI perform type tests differently and are implemented as different LLVM passes for KCFI to not require LTO, and because of that KCFI has the limitation that a function may have one type id assigned only, and thus why shims may be necessary. However, CFI support for Rust was designed to be as unobtrusive as possible by just generating type tests and using type metadata IDs/encoding for its implementation, and I'd like to keep it that way (at least for CFI, if not possible for KCFI). Introducing shims/trampolines may result in CFI bypasses and affect CFI performance and I'd like to avoid that. Therefore, would it be also possible to (in addition to breaking these changes into a set of smaller self-contained pairs of issues and fixes), also making them for KCFI only (and not for CFI)? I'm not sure (and others let me know what they think), but I think that--even for KCFI only--this may require at least a design doc and/or an MCP explaining what is being proposed in detail--it'd for sure help me better understand and review it. Have you investigated adding support for functions to have multiple type ids assigned to the LLVM passes for KCFI instead, and weighted the costs x benefits of doing that instead of implementing a workaround for it in the Rust compiler? I think it's important it's also documented there. |
Despite my normally laissez-faire attitude about encoding certain type relationships, I strongly agree that we wish to handle generating code that produces more gadgets that are essentially explicitly designed to allow breaking down restrictions on type encodings with extra care. |
In the case of impl MyTrait for Foo {
fn bar(&self) { // ... }
}
trait OtherTrait : MyTrait {}
impl OtherTrait for Foo {} can't we generate the |
This isn't true, see the address-taken test I added. I can elaborate more when I'm not stuck on my phone if that doesn't explain things. |
That is a cast to a function pointer, right? See the rest of my comment:
|
I'd also like to reiterate my request for, in addition to breaking these changes into a set of smaller self-contained pairs of issues and fixes, which seems you started to do, also making these changes for KCFI only (since they aren't needed for CFI, and I'm fixing those corner cases for CFI using a solution that doesn't require any shims). In general, I'd like to see what the absolute minimum necessary use of shims even for KCFI looks like. For example, for function items, closures, and Fn trait objects to be used interchangeably, there is a solution I mentioned there I'm working on (see #116404) by signature_unclosure'ing/transforming those, exactly as the compiler does at the call sites and CFI does for other equivalent types1 that works even for KCFI, so the shims aren't necessary in this case even for KCFI. Therefore, if you also wait these changes to be merged, less shims are necessary. Since KCFI landed pretty recently (for the Linux kernel standards), I'm still not convinced that it is too widely deployed yet that makes it very difficult to explore any changes to it. There are also the points I made about the possibility of inadvertently introducing KCFI bypasses, the possibility of general performance and cache coherence/locality impact of the shim/trampoline being placed farther away from the actual destination, that I think need to be looked at, and more generally what is the right thing to do longer term, both for the Rust project and the Linux kernel. After all, there is a reason the CFI project is taking long, and it's because we've been trying to do the right thing, and not rush things--if we wanted, we could have everything building and running already, but would that have been the right thing to do? Footnotes
|
@rcvalle Let's separate out discussion of possible changes to KCFI itself from discussing this PR. If Rust decides to reject this PR because we decide this PR is too-baroque an implementation in Rust, for KCFI or otherwise, that would happen independently of whether KCFI would be changed in the future. I am not against discussing it (obviously), but I would rather do so in the Zulip thread I started for such. I will treat further discussion in this PR of that subject as off-topic and mark it appropriately. |
For anyone that commented on this PR and also anyone new to this PR, see my comments in #116404 (comment). |
…ilee CFI: Skip non-passed arguments Rust will occasionally rely on fn((), X) -> Y being compatible with fn(X) -> Y, since () is a non-passed argument. Relax CFI by choosing not to encode non-passed arguments. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? `@workingjubilee`
…ilee CFI: Skip non-passed arguments Rust will occasionally rely on fn((), X) -> Y being compatible with fn(X) -> Y, since () is a non-passed argument. Relax CFI by choosing not to encode non-passed arguments. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ``@workingjubilee``
CFI: Skip non-passed arguments Rust will occasionally rely on fn((), X) -> Y being compatible with fn(X) -> Y, since () is a non-passed argument. Relax CFI by choosing not to encode non-passed arguments. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? `@workingjubilee`
CFI: Skip non-passed arguments Rust will occasionally rely on fn((), X) -> Y being compatible with fn(X) -> Y, since () is a non-passed argument. Relax CFI by choosing not to encode non-passed arguments. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? `@workingjubilee`
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? `@workingjubilee`
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ``@workingjubilee``
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ```@workingjubilee```
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ````@workingjubilee````
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? `````@workingjubilee`````
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ``````@workingjubilee``````
Rollup merge of rust-lang#122875 - maurer:cfi-transparent-termination, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ``````@workingjubilee``````
…, r=workingjubilee CFI: Support self_cell-like recursion Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`: ``` struct X<T> { x: u8, p: PhantomData<T>, } #[repr(transparent)] struct Y(X<Y>); ``` When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further. This PR was split off from rust-lang#121962 as part of fixing the larger vtable compatibility issues. r? ``````@workingjubilee``````
I'm closing this PR as I'm taking an approach much closer to the type removal one. The branch will still be there for reference until we finish fixing the overall vtable issues in case we need to crib solutions from it. |
Since we're now using an approach which is a variant of rust-lang#123082 (that transforms closures into dynamic Fn traits but isolating it to the Fn call methods only) instead of rust-lang#121962 or rust-lang#122573, skipping non-passed arguments shouldn't be necessary KCFI anymore and we can claim back the reduced granularity. This reverts commit f2f0d25.
Since we're now using an approach which is a variant of rust-lang#123082 (that transforms closures into dynamic Fn traits but isolating it to the Fn call methods only) instead of rust-lang#121962 or rust-lang#122573, skipping non-passed arguments shouldn't be necessary for KCFI anymore and we can claim back the reduced granularity. This reverts commit f2f0d25.
Since we're now using an approach which is a variant of rust-lang#123082 (that transforms closures into dynamic Fn traits but isolating it to the Fn call methods only) instead of rust-lang#121962 or rust-lang#122573, skipping non-passed arguments shouldn't be necessary for KCFI anymore and we can claim back the reduced granularity. This reverts commit f2f0d25.
This PR makes it possible to use trait objects with CFI enabled, as checked by the various tests added.
Non-landable tests I also did:
./x.py build --stage 2
with CFI hacked on for all build targets after#[no_sanitize]
ing parts of theproc-macro
crate. This gives me some confidence in the completeness of these CFI modifications.The general approach taken here is is to make
Virtual
instances and the instances put into vtables match their abstracted type, without changing the type of the underlying function instances.cc @rcvalle
cc @tmandry
cc @workingjubilee
I'm not sure who an appropriate reviewer is for this, so I'll let rustbot take a guess and post this PR in the zulip thread.