NonCopyable structs attribute and analyzer #50389

jkotas · 2021-03-29T19:27:25Z

Background and Motivation

New high-performance APIs are often exposed as structs to avoid GC heap allocation. Callers of such APIs have to be careful to avoid creating accidental copies of the struct. Failure to do so can lead to correctness or security issues. We need a capability to prevent or detect this class of bugs at build time.

ValueStringBuilder is an example of an API where creating an accidental copy is a potential security bug: #25587 (comment)

Existing implementations of similar analyzers:

Proposed API

Promote https://github.com/ufcpp/NonCopyableAnalyzer developed by @ufcpp into .NET platform API so that it can be used by the platform itself.

 namespace System
 {
+    [AttributeUsage(AttributeTargets.Struct)]
+    public class NonCopyableAttribute : Attribute
+    {
+    }
 }

Usage Examples

[NonCopyable]
public struct ValueStringBuilder
{
...
}

ValueStringBuilder vsb = new ValueStringBuilder();
f(vsb); // Error. ValueStringBuilder must be passed by reference

Alternative Designs

Full C# language feature. The difficulty in doing so is described in https://blog.paranoidcoding.com/2019/12/02/borrowing.html .

Risks

Corner cases that are missed by the analyzer, e.g. use of reflection.


jkotas · 2021-03-29T19:31:24Z

More context dotnet/designs#189 (comment)
cc @GrabYourPitchforks

sharwell · 2021-03-29T19:44:13Z

The one feature that I really wish the analyzer supported is "allow copies to read-only locations". This configuration would allow non-mutating operations to run against snapshot copies, which broadens applicability of [NonCopyable] from strictly non-copyable types to more general usage on mutable value types to avoid mutating a copy that will not be observed.

jkotas · 2021-03-29T19:49:58Z

allow copies to read-only location

It would not work for ValueStringBuilder and other similar cases where the struct holds manually managed resource.

bartonjs · 2021-03-29T19:50:20Z

The one feature that I really wish the analyzer supported is "allow copies to read-only locations"

That wouldn't solve the ValueStringBuilder problem, though. At least, not if it uses ArrayPool.

1. A VSB gets into a given state.
2. A readonly copy gets created for the current state.
3. The writable version calls Append.
   - VSB needs to grow, so it rents a new array, copies the contents, returns the old array.
4. State sanity check: the copy still looks like it did prior to the call to Append.
5. Something else happens.
   - The something else rents an array, gets the recently returned array, and uses it.
6. State sanity check fails.
   - The copy had a reference to the returned and re-rented array. That array's contents have changed.

Even without ArrayPool, overwrites would be visible on the copy (writing through into arrays), but appends wouldn't (changing fields).

I don't doubt that readonly copies can make sense in some cases, but they wouldn't solve this one, and feel like they're a weird knife's edge in general.
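
A minimal sketch of the non-ArrayPool case described above, assuming a hypothetical `TinyBuilder` struct as a stand-in (it is not the real ValueStringBuilder, and all names here are illustrative). The copy shares the backing array with the original but carries its own length field, so an overwrite shows through the copy while an append does not:

```csharp
using System;

// Hypothetical stand-in for a ValueStringBuilder-like type; not the real implementation.
public struct TinyBuilder
{
    private char[] _chars;
    private int _length;

    public TinyBuilder(int capacity)
    {
        _chars = new char[capacity];
        _length = 0;
    }

    public void Append(char c)
    {
        if (_length == _chars.Length)
        {
            // Growing replaces the array reference in this instance only.
            Array.Resize(ref _chars, Math.Max(4, _chars.Length * 2));
        }
        _chars[_length++] = c;
    }

    // Writes through to the shared backing array.
    public void Overwrite(int index, char c) => _chars[index] = c;

    public override string ToString() => new string(_chars, 0, _length);
}

public static class CopyHazardDemo
{
    public static (string Original, string Copy) Run()
    {
        var original = new TinyBuilder(4);
        original.Append('a');
        original.Append('b');

        TinyBuilder copy = original; // accidental copy: same array, separate length

        original.Overwrite(0, 'X');  // visible through the copy (shared array)
        original.Append('c');        // not visible: only original's length changes

        return (original.ToString(), copy.ToString()); // ("Xbc", "Xb")
    }
}
```

With a pooled array the situation is worse than this sketch shows, because the copy can end up pointing at a buffer that has been returned and rented by unrelated code.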

sharwell · 2021-03-29T19:54:46Z

It would not work for ValueStringBuilder and other similar cases where the struct holds manually managed resource.

Note that my comment was more for wishing we could do this:

[NonCopyable(AllowCopyToReadOnlyLocation = true)]

Types like ValueStringBuilder would use the default value false, but many value types which only need to be non-copyable during an initial mutable phase could leverage this for improved usability.

sharwell · 2021-03-29T19:57:50Z

On a totally separate note, this feature is going to be extremely difficult to implement without the compiler exposing some sort of an API specifically to analyze value copies. Even with the conservative approach used by RS0042 (fail on any unrecognized construct) we've found a significant number of holes.

tannergooding · 2021-03-29T20:02:32Z

CC. @jaredpar. Would this conflict with or be superseded by any work being done on the language side?

sharwell · 2021-03-30T02:04:57Z

I'm not aware of LDM work on this, but will let @jaredpar provide a more definitive answer. I stopped pushing on this as a language feature primarily due to questions surrounding the value of #50389 (comment) (the compiler would not be able to provide this value through built-in language features, but an analyzer could account for it as long as an analyzer API for value copies existed).

I believe the ideal handling here would be a general analyzer callback for any copy in IL (without any consideration for the type being copied), and the analyzer for this feature would filter as necessary.

jaredpar · 2021-03-30T04:25:09Z

Couple of questions:

1. Do you all have a fleshed out set of rules for NonCopyable or ValueStringBuilder that you want to enforce with this analyzer? Couple that seem to apply:
   - Disallowed as generic arguments
   - Disallow [NonCopyable] struct as a field of standard struct
   - Disallow as locals in an async method or iterator
   - etc ...
2. Generally non-copyable types end up desiring to have move semantics. Essentially copying is bad but move is okay. Do you all desire this property here and if so how do you all define what constitutes a "move"? Consider that returning a local from a function is likely a safe "move" operation.
3. Are multiple mutable references okay here? Basically can you do the following when vsb represents a [NonCopyable] struct: SomeMethod(ref vsb, ref vsb).
4. Are [NonCopyable] types forced to be ref struct or can they be simple struct? If the latter then that implies tearing is okay (is that the case)?

On a totally separate note, this feature is going to be extremely difficult to implement without the compiler exposing some sort of an API specifically to analyze value copies.

I agree. There are a lot of subtle cases where the compiler does or does not copy a value. It's possible to write conservative analyzers which flag more copies than exist but are unlikely to miss any. It's hard to write one that exactly matches the behavior of the compiler though.

It's also important to know if we care about invisible copies. There are several places where the compiler will introduce copies in synthesized members that are invisible to the customer but do constitute copies of the values (consider the places where the compiler introduces thunks to meet some metadata contract and hence has to pass values through the thunk). Does this matter for this feature? My assumption is no, but I wanted to check.

sharwell · 2021-03-30T04:56:57Z

⚠️ This comment is written from the perspective of all the reference types I've been making NonCopyable, along with the RS0042 (Do not copy value) analyzer. It may or may not apply to ValueStringBuilder.

Disallowed as generic arguments

This is still a known hole in RS0042

Disallow [NonCopyable] struct as a field of standard struct

This was one of the holes we recently fixed in RS0042 (dotnet/roslyn-analyzers#4798). It's also one of the things that makes #50389 (comment) particularly desirable.

Disallow as locals in an async method or iterator

This should be safe right?

etc ...

Boxing is the main one. Originally I wrote RS0042 to disallow boxing, but later I realized that boxing is a safe option (when it's moved to a box) and unboxing is the problematic operation. A boxed value type can be accessed using Unsafe.Unbox or through a virtual call.

Generally non-copyable types end up desiring to have move semantics. Essentially copying is bad but move is okay. Do you all desire this property here and if so how do you all define what constitutes a "move". Consider that returning a local from a function is likely a safe "move" operation.

For all the cases I've encountered to date, implicit move semantics were desirable (i.e. a copy is only a problem if the location is read after that copy). If a type is not movable, there are many additional constraints (e.g. cannot store as a field of a reference type unless that reference type is pinned, and cannot capture a local into a reference type).

Are multiple mutable references okay here?

💭 This should be a safe operation.

Are [NonCopyable] types forced to be ref struct or can they be simple struct?

For the types I've been working with, making the types a ref struct is extremely undesirable. This breaks the ability to use the types in state machines, including async code.

Joe4evr · 2021-03-30T05:15:18Z

Also related: dotnet/csharplang#2372

Joe4evr · 2021-03-30T05:32:25Z

Disallowed as generic arguments

I imagine an exception could be designed here, similar to how people want an exception to passing ref structs as a type argument:

public void M<[NonCopyableSafe] T>(ref T val)
{
    // compiler checks that all uses of T don't make "forbidden" copies,
    // then callers are allowed to substitute a non-copyable struct type
}

Of course, this can always be done later and isn't strictly necessary for an MVP. I'm just throwing some 🍝.

huoyaoyuan · 2021-03-30T05:44:02Z

Related: dotnet/csharplang#4345

benaadams · 2021-03-30T09:21:54Z

Would this work for e.g. SpinLock, which might be OK to pass by ref but which you definitely don't want to copy, as the (now) two locks then become unsynchronised?

runtime/src/libraries/System.Private.CoreLib/src/System/Threading/SpinLock.cs

Lines 51 to 53 in 82c7051

    [DebuggerDisplay("IsHeld = {IsHeld}")]
    public struct SpinLock
    {

stephentoub · 2021-03-30T11:05:50Z

@333fred, how would this play with how you're defining builders for interpolated strings? Presumably some builders (e.g. ones wrapping ArrayPool buffers) would benefit from being annotated as non-copyable, but the design as proposed today actually requires them to be passed around.

jkotas · 2021-03-30T13:49:44Z

Would this work for e.g. SpinLock

Yes. More examples of NonCopyable types in Roslyn: https://github.com/dotnet/roslyn/search?q=%5BNonCopyable%5D

GrabYourPitchforks · 2021-03-31T16:22:46Z

@jaredpar I tried to draft a list of restrictions a while back (see https://gist.github.com/GrabYourPitchforks/7ab6a440100467a82cfe5998cd1e91be). I saw Jan linked to this gist in an earlier comment, but I don't know how much it influenced this proposal or whether there was any kind of value judgment made over it. When you and I talked about this a few weeks back you had pointed out some rough spots.

333fred · 2021-03-31T16:36:05Z

@333fred, how would this play with how you're defining builders for interpolated strings? Presumably some builders (e.g. ones wrapping ArrayPool buffers) would benefit from being annotated as non-copyable, but the design as proposed today actually requires them to be passed around.

The answer for that will depend on what LDM thinks about builders passed by in or ref, I guess.

stephentoub · 2021-03-31T16:46:59Z

The answer for that will depend on what LDM thinks about builders passed by in or ref, I guess.

Yes... we should have an opinion / proposal, though.

jaredpar · 2021-03-31T18:04:05Z

The answer for that will depend on what LDM thinks about builders passed by in or ref, I guess.
Yes... we should have an opinion / proposal, though.

I actually think this is a good motivating example for why we need to think about "move" semantics.

Consider that in is unlikely to be the answer here. The builders need to be mutated in the calling function, if for nothing else than to call ToString. That is a mutating function for the builders, ValueStringBuilder returns items to the array pool, hence it's unlikely ToString will be readonly. That means it won't be callable on in unless we first copy which violates NonCopyable.

Passing by ref will work but the design specifically allows for multiple builders to be in play here. That means we're going to have cases where multiple builders are being passed by ref and the builders are likely to be ref struct. That gets into territory where it gets easy to run afoul of ref struct lifetime rules.

The idea of "moving" the value, essentially passing it to the function such that the calling function's copy is destroyed (assigned default) seems like it is the most friction free option.

benaadams · 2021-03-31T19:16:27Z

The idea of "moving" the value, essentially passing it to the function such that the calling function's copy is destroyed (assigned default) seems like it is the most friction free option.

That would mean it couldn't be used for SpinLock as that would also destroy its locking semantics just as much as copying would.

jkotas · 2021-03-31T20:51:36Z

That would mean it couldn't be used for SpinLock

We may need multiple levels:

No copy/move (e.g. SpinLock)
Move is ok (e.g. ValueStringBuilder)
Read-only copy is ok (e.g. #50389 (comment))

acaly · 2021-04-08T18:48:01Z

I have had a closely related issue. I sometimes internally use a mutable struct to implement "fast list", which is just a wrapper of array, from which references of the elements can be obtained. I remember once I marked such a struct with readonly. As a result, adding elements fails because it cannot modify the field to the newly allocated array in case of initialization and reallocation.

So in some cases, if not all, I would suggest to have a way to declare that a struct should not be used as readonly field. I guess many move-is-ok noncopyable structs should fit in this category.

This is more like a language feature instead of runtime feature.

sharwell · 2021-04-08T18:51:34Z

@benaadams @jkotas Can you explain why move semantics would not work for SpinLock? Looking at the implementation, it appears that move semantics would be fine.

acaly · 2021-04-08T18:55:17Z

@sharwell I guess it's because someone might be waiting for the lock, effectively storing a reference to the struct on stack.

stephentoub · 2021-04-08T18:56:38Z

@sharwell, if "moving" involves copying and zeroing out the original, that a) is not atomic / thread-safe and b) still leaves the original field as a default value which is a valid spin lock that's now disassociated from the copy.

sharwell · 2021-04-08T19:02:49Z

@stephentoub A move is defined as the relocation of a value from location A to a different location B, after which point no code will access location A expecting to observe the original value. Atomicity of the move is not guaranteed; code which accesses the location is expected to either provide synchronization for the move or intentionally operate on a value which can be moved atomically. Zeroing the original value is not required for a move to be treated as a move, as any read access to the original location after a move is a semantic error.

stephentoub · 2021-04-08T19:07:56Z

So how are you envisioning that would work with a SpinLock:

private SpinLock _lock;

What code do two threads write to successfully use that lock? Today it's:

_lock.Enter();
...
_lock.Exit();

What is it when SpinLock is attributed as being moveable?

sharwell · 2021-04-08T19:12:43Z

It depends entirely on the manner in which the instance is moved. For the case of StrongBox<SpinLock> (or any SpinLock stored as a field of a reference type), there is no special handling needed for the case where the compacting GC moves the storage location for the boxed value.

stephentoub · 2021-04-08T19:15:35Z

SpinLock is rarely used that way. It's typically a field in and of itself, with threads accessing that field. So how does it work in that case?

Maybe I've misunderstood what you were proposing. The earlier comment was that SpinLock shouldn't be copyable/moveable, and you asked what would be the problem with it being moveable. I'm trying to understand how you envision it being moveable safely.

sharwell · 2021-04-08T19:25:39Z

Non-copyable value types are types where the user's code is responsible for managing the storage location for the value. Likewise, moving a non-copyable value type is an operation implemented in the user's code, with all the caveats faced by the compacting GC for reference types. Application misbehavior following failure to perform a move with correct semantics does not mean a correctly-performed move operation would have the same result.

sharwell · 2021-04-08T19:29:11Z

Put another way: do we have examples of situations where a user is likely to misuse a SpinLock without a warning if we implement it as non-copyable but movable?

stephentoub · 2021-04-08T19:52:18Z

Sam and I spoke offline. The disconnect stems from two completely different definitions of moveable:

- I was using it to mean an API / language syntax for copying the value and clearing out the original or otherwise invalidating the original.
- Sam was using it to mean that its address in memory could change, which would be the case for anything non-pinned on the GC heap.
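
For illustration only, the first definition (copy the value, then clear the original) could be sketched as below. `MoveHelper.Move` and `BufferOwner` are hypothetical names, not an existing or proposed API. The copy and the clear are two separate steps, so the operation is not atomic, and the source is left as a valid default value, which is the concern raised earlier for SpinLock.

```csharp
using System;

// Hypothetical struct that owns a buffer; a stand-in for a move-is-ok type.
public struct BufferOwner
{
    public int[] Buffer;
}

public static class MoveHelper
{
    // Hypothetical helper: "move" as copy-then-clear. Not atomic or thread-safe.
    public static T Move<T>(ref T source) where T : struct
    {
        T moved = source;  // step 1: copy the value out
        source = default;  // step 2: reset the original location to its default
        return moved;
    }
}

public static class MoveDemo
{
    public static (bool SourceCleared, bool SameBuffer) Run()
    {
        var buffer = new int[8];
        var source = new BufferOwner { Buffer = buffer };

        BufferOwner destination = MoveHelper.Move(ref source);

        // The source no longer references the buffer; the destination now owns it.
        return (source.Buffer == null, ReferenceEquals(destination.Buffer, buffer));
    }
}
```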

danmoseley · 2021-12-23T19:43:14Z

@stephentoub thoughts about whether we should aim to do work here in 7.0 (either analyzer or language feature)? Seems there's a fair bit of interest in ValueStringBuilder alone.

stephentoub · 2022-01-03T03:27:54Z

@stephentoub thoughts about whether we should aim to do work here in 7.0 (either analyzer or language feature)?

In theory it's a valuable thing to do and would unblock a set of features we've been hesitant to do, making them enough less of a footgun that we'd then move forward with them. In practice the initial work associated with this would be coming up with the actual rules it would implement, and then seeing which of the things we've avoided doing would now be practical/still valuable given those rules. As is my typical wont, we'd need to be ready to use it in multiple places in the same release that we add it. That goes whether it's done as a C# language/compiler or analyzer feature. We'd also need to work through any compat concerns with applying the attribute to types we've already shipped, and what level of noise we'd be ok with on existing code.

mrpmorris · 2022-06-05T13:23:16Z

I've been stung by this twice with SpinLock

First

public readonly SpinLock Locker = new();

Second

public static void ExecuteLocked(this SpinLock locker)
{
  // Above should have `ref`
}
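
Both mistakes come down to silent copies of a mutable struct. A minimal sketch, using a hypothetical `Counter` struct as a stand-in for SpinLock so the effect is easy to observe (the type and member names are illustrative only):

```csharp
using System;

// Hypothetical mutable struct standing in for SpinLock.
public struct Counter
{
    private int _count;
    public int Count => _count;
    public void Increment() => _count++;
}

public sealed class Holder
{
    // Calling a mutating method on a readonly struct field operates on a hidden copy.
    public readonly Counter ReadOnlyCounter = new Counter();
    public Counter MutableCounter = new Counter();
}

public static class CounterExtensions
{
    // Receives a copy, so the caller's value is never changed.
    public static void IncrementByValue(this Counter counter) => counter.Increment();

    // Receives the caller's storage location, so the change is observed.
    public static void IncrementByRef(this ref Counter counter) => counter.Increment();
}

public static class SilentCopyDemo
{
    public static (int ReadOnly, int AfterByValue, int AfterByRef) Run()
    {
        var holder = new Holder();

        holder.ReadOnlyCounter.Increment();      // mutates a defensive copy
        int readOnly = holder.ReadOnlyCounter.Count;

        holder.MutableCounter.Increment();       // mutates the field itself
        holder.MutableCounter.IncrementByValue(); // mutates a copy of the field
        int afterByValue = holder.MutableCounter.Count;

        holder.MutableCounter.IncrementByRef();  // mutates the field itself
        int afterByRef = holder.MutableCounter.Count;

        return (readOnly, afterByValue, afterByRef); // (0, 1, 2)
    }
}
```

Neither case produces a compiler diagnostic today, which is the kind of gap a [NonCopyable] analyzer would be expected to close.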

jkotas added Security api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime code-analyzer Marks an issue that suggests a Roslyn analyzer labels Mar 29, 2021

dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 29, 2021

jkotas mentioned this issue Mar 29, 2021

.NET runtime security mitigations dotnet/designs#189

Merged

pgovind mentioned this issue Apr 9, 2021

Regex Capture Trees #38776

Closed

stephentoub mentioned this issue Apr 13, 2021

Add S.R.CompilerServices.InterpolatedStringBuilder #51086

Merged

This was referenced May 20, 2021

[WIP] Add rule for detecting certain readonly mutable structs dotnet/roslyn-analyzers#2831

Open

Do not hold SpinLock in fields marked as readonly #33773

Open

tannergooding removed the untriaged New issue has not been triaged by the area owner label Jul 12, 2021

tannergooding modified the milestones: 7.0.0, Future Jul 12, 2021

danmoseley mentioned this issue Dec 23, 2021

API Proposal: Add a ValueStringBuilder #25587

Open

buyaa-n modified the milestones: Future, 8.0.0 Nov 11, 2022

buyaa-n mentioned this issue Nov 16, 2022

.NET 8 developers can verify more APIs for correct usage to speed up their development #78442

Closed

dakersnar mentioned this issue Nov 30, 2022

System.Runtime work planned for .NET 8 #79053

Closed

31 tasks

tannergooding modified the milestones: 8.0.0, Future Jul 24, 2023

stephentoub mentioned this issue Apr 12, 2024

Regression in R2R compilation time #100995

Closed

fiotti mentioned this issue Aug 27, 2024

Fix S2933 FP: readonly fields in a struct re-assigned with 'this' SonarSource/sonar-dotnet#9657

Open
