Move semantics for IDisposable types #7620
Unanswered
Yen
asked this question in
Language Ideas
Replies: 1 comment 2 replies
-
While this is somewhat interesting, you have a large digression in the middle about reassigning using variables that can't happen. In |
-
Hello, this is a follow-up to my previous discussion #7611. I think this is out of scope for that discussion, however, so I am making a separate one.
This proposal also follows the comment from @HaloFour on that discussion:
I now believe this is what I want, so I will outline a first pass at how I see this working.
I will note that there are several other issues and discussions relating to RAII and move semantics in C# (#4808, #6612, #1110, #5916, #5277). However, most of these focus on implementing a new object model in C#. I would instead focus on extending IDisposable and its interaction with using statements to achieve a similar goal, while improving the use of IDisposable and using statements across existing code.
As a preface on why and how move semantics work in other languages such as C++ and Rust: their resource model is referred to as RAII. In short, it is as if all variables were IDisposable and every variable were always in a using statement of some sort. The exact semantics are considerably more complex and depend on the language, but that is the closest view of it from a C# perspective. The benefit is that types can manually manage resources (IDisposable) while we remain sure that, if they leave scope via an early return or a thrown exception, they are correctly disposed. This has significant benefits for code safety and ergonomics, and it is a large part of what lets C++ and Rust claim the level of resource safety they do. I won't go into more detail about the benefits of RAII and move semantics here, as that discussion is much longer and relates directly to their benefits in existing languages. For the most part, the arguments should be the same.
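The closest current-C# analogue of that guarantee is a using declaration. As a small sketch (the `Resource` and `Worker` names are made up for illustration), the object below is disposed even though the method exits by throwing:

```csharp
using System;

var probe = new Resource();
try
{
    Worker.FailWhileUsing(probe);
}
catch (InvalidOperationException)
{
    // Dispose already ran while the exception left the method.
    Console.WriteLine($"disposed after throw: {probe.IsDisposed}"); // prints "disposed after throw: True"
}

// Hypothetical resource type, used only for illustration.
sealed class Resource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

static class Worker
{
    public static void FailWhileUsing(Resource resource)
    {
        using var scoped = resource; // Dispose is called when the scope exits
        throw new InvalidOperationException("early exit");
    }
}
```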
Firstly, let us consider a common piece of code that I am sure almost everyone has seen:
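One plausible instance, sketched under the assumption that the mistake meant is binding the same object in two using declarations (`Resource` and `Demo` are hypothetical names):

```csharp
using System;

Console.WriteLine($"Dispose calls: {Demo.CountDisposeCalls()}"); // prints "Dispose calls: 2"

// Hypothetical resource that counts how often it is disposed.
sealed class Resource : IDisposable
{
    public int DisposeCalls { get; private set; }
    public void Dispose() => DisposeCalls++;
}

static class Demo
{
    public static int CountDisposeCalls()
    {
        Resource observed;
        {
            using var objA = new Resource();
            using var objB = objA; // the compiler accepts this as written
            observed = objA;       // extra reference kept only to inspect the count
        } // objB is disposed, then objA: the same instance, twice
        return observed.DisposeCalls;
    }
}
```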
Unfortunately, this is a programming error, but one that is common for newer developers who do not fully understand using statements or IDisposable lifetimes in general. In slightly more advanced cases, I believe tooling may even suggest the conversion to a using statement as language tooling has no real way of understanding this either. The issue here is that we don't have a way to define and subsequently transfer "ownership" of IDisposable objects.
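A second sketch of the same ownership gap, assuming the `Update` and `SetInternalValue` names used later in this post plus a hypothetical `MyObject` type and `Stored` property: a using variable is handed to a method that stores it, so the stored reference is already disposed when `Update` returns.

```csharp
#nullable enable
using System;

var holder = new MyClass();
holder.Update();
Console.WriteLine($"stored object disposed: {holder.Stored?.IsDisposed}"); // prints "stored object disposed: True"

// Hypothetical disposable type, used only for illustration.
sealed class MyObject : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

sealed class MyClass
{
    public MyObject? Stored { get; private set; }

    // Keeps the reference beyond the call.
    public void SetInternalValue(MyObject obj) => Stored = obj;

    public void Update()
    {
        using var obj = new MyObject();
        SetInternalValue(obj);
    } // obj is disposed here, although Stored still points at it
}
```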
This is also possible in current C# code, and the compiler will not show any warning that it is incorrect.
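A sketch of a call site whose correctness cannot be judged locally. The two classes are hypothetical and have the same constructor signature; only their implementations reveal whether the argument is retained:

```csharp
using System;

var kept = Demo.Build();
try
{
    kept.Read();
}
catch (ObjectDisposedException)
{
    Console.WriteLine("retained object was disposed before use");
}

// Hypothetical disposable type that rejects use after disposal.
sealed class MyObject : IDisposable
{
    private bool _disposed;
    public int Value
    {
        get
        {
            if (_disposed) throw new ObjectDisposedException(nameof(MyObject));
            return 42;
        }
    }
    public void Dispose() => _disposed = true;
}

// Reads obj during construction only.
sealed class BorrowingClass
{
    public int Snapshot { get; }
    public BorrowingClass(MyObject obj) => Snapshot = obj.Value;
}

// Same constructor signature, but keeps obj for later.
sealed class RetainingClass
{
    private readonly MyObject _obj;
    public RetainingClass(MyObject obj) => _obj = obj;
    public int Read() => _obj.Value;
}

static class Demo
{
    public static RetainingClass Build()
    {
        using var obj = new MyObject();
        _ = new BorrowingClass(obj);    // correct: nothing outlives obj
        return new RetainingClass(obj); // identical call shape, dangling after return
    }
}
```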
Lastly, we have a sample where just looking here, we can't tell if this is correct or not. We don't know from the call site alone if `MyClass` will hold on to `obj`, or if it just uses it for construction. We don't have any way of declaring the intent of `obj` when it's taken by the `MyClass` constructor, so we have no way of providing analyser support either.

With these samples, what I have attempted to outline is a lack of support in C# for declaring what a callee (or more specifically, a declaration) wants to do with an IDisposable assigned to it. Also, even if we did have a way to declare it, we have no way of declaring our intent to satisfy the callee's requirement. With that in mind, let us look at a potential solution to this using move semantics.
First, we will look again at the simplest case with the addition of a `move` keyword:

With this addition, this would become correct code in this context. The semantics of the `move` keyword are that it can only be used on an `IDisposable` variable that is currently being used. The result can also only be assigned to variables that are themselves part of a using declaration (or are also being used, but this will be covered later). In this scenario, the first value is considered moved from, and is no longer considered used. As a result, the compiler will no longer call `Dispose` on the moved-from variable. The result of this is that we can safely move the variable and ensure it will only be disposed once. We can't move from a non-used variable, and we can't assign to a non-using declaration, as these variables are not part of the scope-bound lifetime model introduced by using statements/declarations. An interesting addition here is that we can now look at the first version of this without the `move` keyword and consider it a compiler error or analyser warning, as doing this without `move` is almost always going to be a mistake.

On its own, this is not super useful, so we need to extend where
`using` can be used to allow us to solve the other problems we have seen before. For example, we extend "using declarations" to work on all declaration types. That is, `out using var` (more for convenience), and also, more importantly, using declarations in function arguments. This would look something like so:

Once again, we can now be explicit here and have our call to `SetInternalValue` analysed in the same way the using declaration above was analysed. Tooling is able to see that the argument to `SetInternalValue` is a using declaration, and as such, passing `obj` without `move` is most likely an error. This allows us to move variables into functions and, more importantly, have these variables never lose their lifetime guarantees: even in this `SetInternalValue` function, the variable is immediately being "used", and an exception would trigger the Dispose call as would be expected. I have omitted the implementation of this function right now as it requires something more we have yet to cover, but for now we can look at how using declarations on functions are implemented. Semantically they are equivalent to:

Again, we are missing something here to tie this all together, but first I wish to cover assignments and their interaction with the
`move` keyword.

Consider the following code using our `move` keyword:

With our current rules, this doesn't work fully, as we are forgetting `objA` here. We have told the compiler to not dispose of `objB` as it has been moved from, but as the value in `objB` has been reassigned to `objA`, the initial value in `objA` is lost and won't get disposed. As such, extra semantics are required for the move expression. These semantics state that when assigning to a variable with a move expression (not declaring a new variable), the variable being assigned to must be "used" currently, and as a result, `Dispose` will be called on the variable before assignment. The result is that when we move `objB` to `objA`, the current value in `objA` is disposed first. This means we no longer leak the initial value of `objA`. Although the keyword has different semantics depending on the context, it is only valid when dealing with already used variables, so this should be something the user can't easily misuse.

Another case that needs considering is when assigning to a "used" variable with a temporary variable:
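As a hedged sketch of this case (the `MyObject` and `Demo` names are illustrative): current C# rejects assignment to a `using` variable with error CS1656, so a plain local guarded by try/finally stands in here for the proposed "used" variable.

```csharp
using System;

var first = Demo.ReassignWithoutDispose();
Console.WriteLine($"first value disposed: {first.IsDisposed}"); // prints "first value disposed: False"

// Hypothetical disposable type, used only for illustration.
sealed class MyObject : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

static class Demo
{
    // Returns the first object so the caller can observe that nothing disposed it.
    public static MyObject ReassignWithoutDispose()
    {
        var objA = new MyObject(); // stand-in for a "used" variable
        var first = objA;          // extra reference kept only for inspection
        try
        {
            objA = new MyObject(); // the previous value is dropped without Dispose
        }
        finally
        {
            objA.Dispose();        // only the second object is disposed
        }
        return first;
    }
}
```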
This is of course valid, but in order to avoid changing the current C# semantics, it's likely this code would emit an analyser warning instead of implicitly calling Dispose on `objA` before assignment. The corrected code would instead look like this:

This does not work exactly how move semantics work in C++, for example, where the value being assigned is considered a temporary and as such is moved by default. But as this should for the most part be an opt-in feature, changing the default behaviour of C# assignments is likely not a good idea. The first code would leak resources in current C#, however, so I think it would be acceptable for an analyser to catch this and suggest the addition of `move` into the assignment.

We are almost finished, but there is one last part we need to cover, and that is that classes/structs can also have declarations internally in the form of members. In order to make this work, I think we need to support
`using` on these also, but not implement the full object model we might get with C++. Consider the completed example from above:

Here we have the member variable `MyObject` as part of `MyClass : IDisposable`. This member variable is marked as `using`, which for our purposes here means it owns the object and manages its lifetime. As a result, we can now implement the `SetInternalValue` function, and see that we can `move` `obj` into it. This correctly tells the compiler to no longer dispose `obj`, as well as to call `Dispose` on `MyObject` before assigning to it, as `MyObject` is marked as `using`. There is a question as to whether `using` on member variables should automatically have Dispose generated for them or something along those lines, but I think this is unnecessary as we are used to writing `Dispose` functions in C#, and at the very least, we can now add an analyser to warn when a `Dispose` function does not dispose of a member marked as `using`. I will add that there is a need for non-using and using member variables in the same class: if you simply wish to hold on to a reference to something without owning it, you just define it normally as non-using. Hopefully this is all that is required, and as a result, we can even rewrite our `Update` function to not need `SetInternalValue`:

Naturally, if the using member variable is public, we can assign to it and hopefully all of the same rules should apply. There are some edge cases around get/set properties, but hopefully the rules around that can be worked out.
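The owning member and the ownership transfer described above can be approximated by hand in current C#. The sketch below is an emulation under stated assumptions, not the proposed syntax: a `ref` parameter that is set to null stands in for `move`, a private field stands in for the `using` member, and the `MyObject` body is hypothetical.

```csharp
#nullable enable
using System;

var holder = new MyClass();
MyObject? obj = new MyObject();
var original = obj; // extra reference kept only so the demo can inspect state
try
{
    holder.SetInternalValue(ref obj); // stands in for SetInternalValue(move obj)
}
finally
{
    obj?.Dispose(); // skipped after a successful move because obj is now null
}
Console.WriteLine($"caller still owns it: {obj is not null}");   // False
Console.WriteLine($"disposed by caller: {original.IsDisposed}"); // False
holder.Dispose();
Console.WriteLine($"disposed by owner: {original.IsDisposed}");  // True

sealed class MyObject : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

sealed class MyClass : IDisposable
{
    private MyObject? _myObject; // stands in for the proposed `using` member

    public void SetInternalValue(ref MyObject? obj)
    {
        if (obj is null) throw new ArgumentNullException(nameof(obj));
        _myObject?.Dispose(); // release the value being replaced first
        _myObject = obj;      // take ownership
        obj = null;           // the caller's variable is now "moved from"
    }

    public void Dispose()
    {
        _myObject?.Dispose();
        _myObject = null;
    }
}
```

Disposing the old value before clearing the caller's variable means that if that `Dispose` throws, the caller still holds the new object and its own `finally` block cleans it up.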
Hopefully this is most of what is needed to make something like this work. As move semantics are a complex thing, you need a handful of different features to make any one of them function, but I think I have been able to cut out a minimal subset here to make it viable. We may, however, end up in situations where this model does not cover every edge case, specifically because C# is a GC language that does not require single ownership, and in situations where we want IDisposable objects to be manually managed but are also OK if they are just forgotten and a finalizer deals with them. The proposal does not model this well, but we can take a trick from Rust and use the model itself to add escape hatches for these unusual scenarios.
The first escape hatch is one that allows us to explicitly "forget" an object, that is, if we assume the addition of the static function `IDisposable.Forget`:

Note here that we are explicitly moving into it, and the result is that we would pass ownership of `obj` to `Forget`, but `Forget` won't actually dispose of the object and is just an empty operation. To implement this, we would likely need attributes, the same as how we have attributes for nullable types to tell the compiler to trust us. As such, the `Forget` function might be implemented as such:

If we were to implement this as `Forget<T>(using T obj)`, `obj` would be destroyed at the end of the function, which would undermine the point. Having the `[UsingDeclaration]` attribute lets us cheat out of this, and in most cases it shouldn't need to be used directly, as `IDisposable.Forget` hides this implementation detail. For most cases, this would serve the same purpose as Rust's std::mem::forget and is generally reserved for edge cases.

The other trick we can do is `IDisposable.Dispose` (this might be badly named, but it's a static function on the interface instead):

This doesn't require any tricks and is implemented as such:
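Since `using` parameters do not exist today, a rough current-C# stand-in (the `DisposableHelpers`, `DisposeNow` and `MyObject` names are hypothetical) wraps the argument in a using statement with an empty body:

```csharp
using System;

var obj = new MyObject();
DisposableHelpers.DisposeNow(obj);
Console.WriteLine($"disposed: {obj.IsDisposed}"); // prints "disposed: True"

// Hypothetical disposable type, used only for illustration.
sealed class MyObject : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

static class DisposableHelpers
{
    // Stand-in for a helper that takes ownership and disposes immediately.
    public static void DisposeNow<T>(T obj) where T : IDisposable
    {
        using (obj) { } // empty body: obj is disposed as soon as the block exits
    }
}
```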
This simply takes in the variable and immediately disposes it. You will note its implementation is almost identical to `Forget`, but it uses a real `using` declaration, and as such, disposes of `obj`. This is useful if you don't want to wait until the end of the scope to dispose a variable, and is mainly available as an optimisation. This is analogous to Rust's std::mem::drop, which is even implemented in the same way (`pub fn drop<T>(_x: T) {}`) :).

It is possible that, to avoid generation of a try/finally block, one might wish to implement `Dispose` like this instead, but it is semantically equivalent:

I think that is all that is needed for now. I understand this is a feature that at first was targeted at lower-level code, but assuming it can be made approachable using analyzers and warnings to pretty much dictate to the user whether a `move` should be used or not, `using` member variables in classes could become commonplace in normal code. I think there is a benefit to having at least basic tooling to express where ownership of resources lies in regard to `IDisposable`, and I hope that this could become a commonplace tool, the same as nullable reference types, as a way to avoid common mistakes that the compiler can catch if given enough information.

Thanks