Thread local Cell methods. #3184

m-ou-se · 2021-10-17T18:20:06Z

This is a relatively small RFC. But I figured it'd be nice to discuss the API and alternatives here where it gets a bit more attention. (And without getting distracted by the implementation details of .set().)

m-ou-se · 2021-10-17T18:29:42Z

cc @rust-lang/libs-api

BurntSushi · 2021-10-18T12:29:59Z

This seems like a nice quality of life improvement to me.

comex · 2021-10-18T21:38:58Z

Alternatives

Add THREAD_LOCAL.borrow() and THREAD_LOCAL.borrow_mut(), just like RefCell has.

This wouldn't be sound. One could move the returned proxy object into a thread local that outlives this thread local. (Or just Box::leak() it.)

I don't want to derail the discussion, but it seems like it should be possible to provide a macro that safely returns a reference to a thread-local. Create a dummy local variable, make sure it can't be moved, and tie the reference lifetime to that.

~~The main difficulty I can think of would be preventing the macro from being used in async fns (or at least prevent the reference from being held across suspend points).~~ Actually, I think it would just be a matter of making the dummy variable's type !Send.

m-ou-se · 2021-10-19T12:42:05Z

@comex That's an interesting idea and worth investigating at some point, thanks. :) But even if we had such a macro, I'd still want the changes in this RFC.

m-ou-se · 2021-10-19T12:42:14Z

@rfcbot merge

rfcbot · 2021-10-19T12:42:20Z

Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

~~should-set-skip-the-initializer-or-not~~ resolved by Thread local Cell methods. #3184 (comment)

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Mark-Simulacrum · 2021-10-25T13:18:57Z

text/0000-thread-local-cell-methods.md

+}
+```
+
+For `.set()`, this *skips the initialization expression*:


This feels a bit weird to me, and inconsistent with "desugaring" to with(...) -- can you say more about why we want it?

In particular, set() presumably does run the destructor in general, so the behavior would only be different for the first-time initialization. So we probably can't avoid checking whether the value is initialized before writing to the thread local.

Maybe we can adjust thread local declarations to permit omitting the initialization expression entirely and have set() still work on those, but other methods (like get(), take(), with()) fail?

This feels a bit weird to me, and inconsistent with "desugaring" to with(...) -- can you say more about why we want it?

The alternative is that .set() (the first time on each thread) will run the (possibly expensive) initialization expression, directly drop the resulting value, and then write the value you gave it. With X.with(|x| x.set(..)) that might be somewhat expected (but still easy to miss), but with just X.set(..) that'd be quite surprising.

So we probably can't avoid checking whether the value is initialized before writing to the thread local.

That is true. But we can avoid running the initialization expression, which can be expensive.

Maybe we can adjust thread local declarations to permit omitting the initialization expression entirely and have set() still work on those, but other methods (like get(), take(), with()) fail?

With the proposed changes, you can achieve that with:

thread_local! { static ID: Cell<i32> = panic!("ID not set on this thread!"); }

If this pattern is used often, we can consider making the panic implicit when the initialization expression is missing:

// Maybe? thread_local! { static ID: Cell<i32>; // Will panic when used before `.set()`'ing it. }

With X.with(|x| x.set(..)) that might be somewhat expected (but still easy to miss), but with just X.set(..) that'd be quite surprising.

Huh. I guess maybe our model for the desugaring is different: the way I saw it is that X desugars to X.with(|x| x).set(...), basically akin to a lazy static, which implies that the result here is not surprising, and indeed is the expected behavior.

I also see the "initialization expression" being run as not something that happens in set(...) but sort of separately -- this obviously doesn't match what actually happens in our current implementation.

Maybe it makes sense to move and/or also have a separate initialize(...) on LocalKey<T> that has this behavior? It seems like we actually sort of always have a cell-like primitive here tracking the initialization regardless, so that should be possible, and lets the user get this behavior if they want it without complicating set to have separate codepaths/mental models for already initialized and not initialized thread locals.

I also see the "initialization expression" being run as not something that happens in set(...) but sort of separately -- this obviously doesn't match what actually happens in our current implementation.

This is not just an implementation detail of our current implementaion, but a guarantee we already make in the documentation:

"Initialization is dynamically performed on the first call to with within a thread"

So yes, a thread_local!{} is already basically a 'Lazy' / 'OnceCell'.*

(*Except those with a const { .. } initializer, which are currently unstable. See rust-lang/rust#84223 (comment))

a separate initialize(...) on LocalKey

Yeah maybe, and is_initialized() etc. too. However, right now, if the initializer is just a constant (and doesn't need Drop), we could compile it into a thread local that's not lazily but directly/always initialized, since the difference is not noticable. Adding .set() as proposed here will not break that. Adding a .initialize() will make the difference observable, since .initialize() would then fail or succeed depending on if the optimization is made, or we'd always have to track the 'is_initialized' state even when that would otherwise not be necessary.

(And even if we have .initialize() on all (non-const (?)) thread locals, I'd still like .set() on a thread local Cell without unnecessarily initializing and instantly dropping a value before setting the value you want to set. Unlike all the other functions (including .with, .set is the only one that doesn't give you access to the stored value, so initializing it first seems quite useless. And there are good use cases for skipping it, such as the panic!() example above.)

I guess we can maybe add this to unresolved questions? I think skipping user-written code on particular API calls feels like a footgun, since e.g. maybe there's validation of some kind there (can only be used on particular threads, or some such). Obviously the user maybe shouldn't have written that, but it feels iffy to just bypass user-written code like this.

I feel like the panic example is illustrative of why this is potentially not good: if the user actually wants that behavior, we should encourage use of something like the suggested initialization APIs, with a std-provided panic on with/get/etc, rather than having the user write that.

In particular, it feels weird that an API like set(...) has different behavior if the value in the thread local is not initialized vs. is. I think it'll probably be fine in most cases, but I don't think it buys us much for the potential harm it causes. Anyone actually needing that kind of behavior seems to want a more explicit initialization step, I'd imagine, rather than an arbitrary set call being sufficient...

rfcbot · 2021-10-25T21:19:46Z

🔔 This is now entering its final comment period, as per the review above. 🔔

Julian-Wollersberger · 2021-10-30T23:28:55Z

+1 for these methods in general, but I'm concerned that the special case on .set() will hurt readability. It also introduces non-local edge cases.

Let's assume one is reading code like this for the first time.

// Inspired by the epoch garbage collector in crossbeam.
thread_local! {
    static GARBAGE_QUEUE: RefCell<Vec<Atomic>> = {
        register_thread();
        RefCell::new(Vec::new());
   }
}

First, when reading the code, one can clearly see the initializer. Since it looks similar to a normal static, it leads to the assumtion that it is always called too. You'd have to read the documentation of the innocent looking .set() method that this isn't the case.

Next, the initializer could have important side effects, like in the example. All .set() calls would need to make sure the side effects happen. Since those calls can be in many places, this is error-prone.

Finally, there might be some weird thread shenanigans elsewhere in the program that the code author hasn't thought about, that breaks assumptions about .set() being called before anything else. Like someone using your library from an async context and tokio migrating a task to a new thread behind your back. Arguably, if this is a problem, then the use of thread_local!() is buggy, but good luck reproducing such a bug.
If .set() always called the initializer, this would be less of a problem.

m-ou-se · 2021-10-31T18:40:31Z

Since it looks similar to a normal static, it leads to the assumtion that it is always called too.

But that already isn't the case. Thread locals like this are lazily initialized on the first call to .with(). If you don't call .with() on some thread, the initializer never gets run on that thread at all.

Mark-Simulacrum · 2021-10-31T22:22:58Z

My own trade-off analysis is that the cost-saving from not calling the initializer is outweighed by the potential for not knowing of this edge case when the initializer does something important (e.g., registering a thread somewhere), so I still lean towards not having that special case.

@m-ou-se do you think it's just very unlikely that such an initializer exists? I think we may just have a different judgement there, and that leads to different evaluations of this tradeoff perhaps?

Do you agree that if you had written such an initializer, and depended on access to thread local running it, the ability to bypass that with a new std API which sets the variable directly is a footgun?

kennytm · 2021-11-01T06:40:51Z

someone from T-libs-api should register a concern regarding #3184 (comment) (unless you all believe that this can reach a consensus in 4 days)

m-ou-se · 2021-11-01T09:33:12Z

cc @rust-lang/libs-api opinions on the .set() behaviour? See above.

m-ou-se · 2021-11-01T09:50:28Z

@Mark-Simulacrum I think the opposite is a footgun. If I have a thread local ID: u64 = generate_id(); and on some threads I manually .set() a (reserved) ID, I wouldn't expect additional unused IDs to be generated for those threads.

Mark-Simulacrum · 2021-11-01T13:10:15Z

That seems to speak in favor of the initialize or similar method I suggested earlier, I think? That is, in that situation you know you want to skip the initializer, so you can reach for the right tool. But the reverse is not true: if you aren't aware of this behavior, then you can accidentally skip the initializer and get the footgun behavior I had earlier.

IOW, I think I'm in some sense agreeing that both sides of the failure mode are feasible, but it seems like we can solve both by having a dedicated API for only initializing and not have this behavior on set.

At minimum, I think it makes sense for the set(...) method to return some kind of bool (maybe a wrapper struct) to indicate whether it skipped the initializer: if you do need that to happen, you probably should write an assert! rather than hoping it's the first thing on the thread. It seems really easy for someone to introduce an extra .get() earlier that runs the initializer and then runs off with the wrong ID (from your example).

kennytm · 2021-11-01T19:24:16Z

maybe name the method .set_bypass_init() to clarify the initializer is skipped.

m-ou-se · 2021-11-03T21:49:19Z

@rfcbot concern should-set-skip-the-initializer-or-not

text/0000-thread-local-cell-methods.md

danielhenrymantilla · 2021-11-10T21:44:09Z

Oh yeah, I like this proposal very much. There is a naming nitpick though: I find the names .with_{ref,mut} to be too terse: they kind of hide the fact that a locking operation is taking place.
What about naming those methods .with_borrow() and .with_borrow_mut()?

m-ou-se · 2021-12-08T19:56:54Z

@rfcbot resolve should-set-skip-the-initializer-or-not

In the libs meeting from a few weeks ago, we agreed to continue with the behaviour as described currently in the RFC, and leave it as an open question for the tracking issue.

rfcbot · 2021-12-08T19:56:57Z

🔔 This is now entering its final comment period, as per the review above. 🔔

Co-authored-by: Josh Triplett <josh@joshtriplett.org>

danielhenrymantilla · 2021-12-08T20:16:52Z

What about the naming concern? with_borrow{,_mut} makes more sense when dealing with a RefCell:

// OK
THING.with(|_| {
    THING.with(|_| { ... }) // OK
})

// OK
THING.with_ref(|_| {
    THING.with_mut(|_| { ... }) // panics
})

rfcbot · 2021-12-18T19:57:45Z

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

m-ou-se · 2021-12-20T12:29:47Z

What about the naming concern? with_borrow{,_mut} makes more sense when dealing with a RefCell:

I'll add it to the open questions. I'll make the first implementation with the naming as you suggested, so we can try it out.

m-ou-se · 2021-12-20T12:37:17Z

This is now merged. 🎉

Tracking issue: rust-lang/rust#92122

QuineDot · 2021-12-27T09:35:16Z

The Rendered link is 404.

danielhenrymantilla · 2021-12-27T10:57:04Z

Rendered

…joshtriplett Implement RFC 3184 - thread local cell methods This implements [RFC 3184](rust-lang/rfcs#3184), with `@danielhenrymantilla's` [suggestion](rust-lang/rfcs#3184 (comment)) for the `with_` method names. Tracking issue: rust-lang#92122

Add thread_local_cell_methods RFC.

be03a7a

m-ou-se added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 17, 2021

m-ou-se added 5 commits October 17, 2021 20:40

Update.

5c0bc10

Insert PR number.

a322edc

Update.

1174fff

Update.

93b07c5

Update.

5bde562

rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Oct 19, 2021

Mark-Simulacrum reviewed Oct 25, 2021

View reviewed changes

rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Oct 25, 2021

rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Nov 3, 2021

m-ou-se added the I-libs-api-nominated Indicates that an issue has been nominated for prioritizing at the next libs-api team meeting. label Nov 3, 2021

joshtriplett reviewed Nov 10, 2021

View reviewed changes

text/0000-thread-local-cell-methods.md Outdated Show resolved Hide resolved

rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Dec 8, 2021

Update text/0000-thread-local-cell-methods.md

7237fc5

Co-authored-by: Josh Triplett <josh@joshtriplett.org>

m-ou-se removed the I-libs-api-nominated Indicates that an issue has been nominated for prioritizing at the next libs-api team meeting. label Dec 8, 2021

rfcbot added finished-final-comment-period The final comment period is finished for this RFC. to-announce and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Dec 18, 2021

m-ou-se mentioned this pull request Dec 20, 2021

Tracking issue for thread local Cell methods rust-lang/rust#92122

Closed

4 tasks

Update RFC 3184.

b766399

m-ou-se merged commit 4a97533 into rust-lang:master Dec 20, 2021

m-ou-se deleted the thread-local-cell-methods branch December 20, 2021 12:36

m-ou-se mentioned this pull request Dec 20, 2021

Implement RFC 3184 - thread local cell methods rust-lang/rust#92123

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Thread local Cell methods. #3184

Thread local Cell methods. #3184

m-ou-se commented Oct 17, 2021 •

edited

Loading

m-ou-se commented Oct 17, 2021

BurntSushi commented Oct 18, 2021

comex commented Oct 18, 2021 •

edited

Loading

m-ou-se commented Oct 19, 2021

m-ou-se commented Oct 19, 2021

rfcbot commented Oct 19, 2021 •

edited

Loading

Mark-Simulacrum Oct 25, 2021

m-ou-se Oct 25, 2021

Mark-Simulacrum Oct 25, 2021

m-ou-se Oct 25, 2021

Mark-Simulacrum Oct 26, 2021

rfcbot commented Oct 25, 2021

Julian-Wollersberger commented Oct 30, 2021

m-ou-se commented Oct 31, 2021

Mark-Simulacrum commented Oct 31, 2021

kennytm commented Nov 1, 2021 •

edited

Loading

m-ou-se commented Nov 1, 2021

m-ou-se commented Nov 1, 2021

Mark-Simulacrum commented Nov 1, 2021

kennytm commented Nov 1, 2021

m-ou-se commented Nov 3, 2021

danielhenrymantilla commented Nov 10, 2021 •

edited

Loading

m-ou-se commented Dec 8, 2021

rfcbot commented Dec 8, 2021

danielhenrymantilla commented Dec 8, 2021

rfcbot commented Dec 18, 2021

m-ou-se commented Dec 20, 2021 •

edited

Loading

m-ou-se commented Dec 20, 2021

QuineDot commented Dec 27, 2021

danielhenrymantilla commented Dec 27, 2021

Thread local Cell methods. #3184

Thread local Cell methods. #3184

Conversation

m-ou-se commented Oct 17, 2021 • edited Loading

m-ou-se commented Oct 17, 2021

BurntSushi commented Oct 18, 2021

comex commented Oct 18, 2021 • edited Loading

m-ou-se commented Oct 19, 2021

m-ou-se commented Oct 19, 2021

rfcbot commented Oct 19, 2021 • edited Loading

Mark-Simulacrum Oct 25, 2021

Choose a reason for hiding this comment

m-ou-se Oct 25, 2021

Choose a reason for hiding this comment

Mark-Simulacrum Oct 25, 2021

Choose a reason for hiding this comment

m-ou-se Oct 25, 2021

Choose a reason for hiding this comment

Mark-Simulacrum Oct 26, 2021

Choose a reason for hiding this comment

rfcbot commented Oct 25, 2021

Julian-Wollersberger commented Oct 30, 2021

m-ou-se commented Oct 31, 2021

Mark-Simulacrum commented Oct 31, 2021

kennytm commented Nov 1, 2021 • edited Loading

m-ou-se commented Nov 1, 2021

m-ou-se commented Nov 1, 2021

Mark-Simulacrum commented Nov 1, 2021

kennytm commented Nov 1, 2021

m-ou-se commented Nov 3, 2021

danielhenrymantilla commented Nov 10, 2021 • edited Loading

m-ou-se commented Dec 8, 2021

rfcbot commented Dec 8, 2021

danielhenrymantilla commented Dec 8, 2021

rfcbot commented Dec 18, 2021

m-ou-se commented Dec 20, 2021 • edited Loading

m-ou-se commented Dec 20, 2021

QuineDot commented Dec 27, 2021

danielhenrymantilla commented Dec 27, 2021

m-ou-se commented Oct 17, 2021 •

edited

Loading

comex commented Oct 18, 2021 •

edited

Loading

rfcbot commented Oct 19, 2021 •

edited

Loading

kennytm commented Nov 1, 2021 •

edited

Loading

danielhenrymantilla commented Nov 10, 2021 •

edited

Loading

m-ou-se commented Dec 20, 2021 •

edited

Loading