allow tasks to migrate among threads #35688
Comments
I brought it up in #35686 (comment) but what would happen in |
Reloading TLS sounds hard, but I think we can expose task-local storage instead.
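To make the task-local storage idea concrete, here is a minimal sketch in plain C with pthreads. It is not Julia's runtime: `task_t`, `run_step`, `run_task_on_two_threads` and `calling_thread_counter` are hypothetical names, and a "migration" is simulated by running each step of the task on a fresh OS thread. State kept in the task survives the move, while state kept in thread-local storage stays behind with the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical task type: the counter lives in the task, not in a thread. */
typedef struct {
    int counter;
} task_t;

/* Thread-local counter, standing in for per-thread runtime state. */
static _Thread_local int thread_counter = 0;

/* One step of the task, executed on whichever thread picks it up. */
static void *run_step(void *arg) {
    task_t *t = arg;
    assert(t != NULL);
    t->counter += 1;      /* travels with the task */
    thread_counter += 1;  /* stays with the thread that ran this step */
    return NULL;
}

/* Runs two steps of the same task, each on a new OS thread.
   Returns the task-local counter, or -1 if a thread could not be created. */
int run_task_on_two_threads(task_t *t) {
    assert(t != NULL);
    for (int i = 0; i < 2; i++) {
        pthread_t th;
        if (pthread_create(&th, NULL, run_step, t) != 0)
            return -1;
        pthread_join(th, NULL);
    }
    return t->counter;
}

/* Thread-local counter as seen by the calling thread. */
int calling_thread_counter(void) {
    return thread_counter;
}
```

Calling `run_task_on_two_threads` on a zero-initialized `task_t` accumulates both steps in the task, while the caller's own thread-local counter is untouched because neither step ran on the calling thread.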
Since this bites me with an occasional 30-60% performance hit, I've had a look at it. (I'm an old Cilk fan and sometimes throw in spawns indiscriminately.) If I define MIGRATE_TASKS in options.h and rebuild Julia, some test code I've written now runs fine with stable performance. However, other real-world code invariably segfaults in the GC. I haven't managed to figure out where the ptls needs to be reloaded. Presumably it's only after a task has been switched out, but that seems to be taken care of by the #ifdefs. The Julia part of the scheduler does not seem to save the ptls across any jl_switch. I must admit that I do not fully understand the GC; is that where the ptls becomes outdated?
I have done a little experiment. I have uncommented
Now, what are we doing here? We are comparing three ways to identify a thread in Julia. Threads.threadid() looks up the thread-local storage by calling jl_get_ptls_states() and extracts the tid field; this is done in jl_threadid() in threading.c. mythread() calls pthread_self(), which returns a thread ID, an unsigned long identifying the current OS thread. myptls() returns the address of Julia's thread-local storage block by calling jl_get_ptls_states(). Given any one of a tid, a pthread or a ptls, the other two should be determined; after all, they all refer to the same thread. But this turns out not to be true. The output is:
In the last four lines, the pthread changes without the tid changing, and then the tid changes without the pthread changing. The ptls and the tid, however, always seem to be in sync. The only way this can happen is if jl_get_ptls_states() returns different thread-local storage on successive calls in the same thread, and the same value for calls from different threads. That is, jl_get_ptls_states_static() in julia.h does not seem to work as it should. This will wreak all sorts of havoc with scheduling and GC. How that is possible, I am not certain. I thought the JL_CONST_FUNC attribute on some of these functions could be the culprit, but removing it made no difference. Moreover, if I replace the sleep() with a yield(), everything seems fine and in sync.
Fixed by #40715
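The invariant this comment relies on can be checked outside Julia. The following is a minimal sketch in plain C with pthreads, assuming a hypothetical thread-local `tls_block` as a stand-in for the ptls (none of these names come from the Julia source). Each worker reads `pthread_self()` and the address of its thread-local block, waits while the main thread compares the readings across threads, then reads both again in the same thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define NTHREADS 4

/* Stand-in for a per-thread runtime block (hypothetical, not Julia's ptls). */
static _Thread_local int tls_block;

typedef struct {
    pthread_t self;  /* first pthread_self() reading */
    void *tls;       /* first &tls_block reading */
    int stable;      /* 1 if a second reading in the same thread matches */
} reading_t;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int arrived;   /* workers that have taken their first reading */
static int released;  /* set once the main thread has compared them */

static void *worker(void *arg) {
    reading_t *r = arg;
    assert(r != NULL);
    r->self = pthread_self();
    r->tls = &tls_block;

    /* Report arrival, then block until the main thread releases us,
       so every worker is alive while the readings are compared. */
    pthread_mutex_lock(&lock);
    arrived++;
    pthread_cond_broadcast(&cond);
    while (!released)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);

    /* Second reading on the same thread must match the first. */
    r->stable = pthread_equal(r->self, pthread_self())
                && r->tls == (void *)&tls_block;
    return NULL;
}

/* Returns 1 if both identifiers are stable within each thread and
   distinct across threads, 0 otherwise. */
int check_thread_identity(void) {
    pthread_t th[NTHREADS];
    reading_t r[NTHREADS];
    int ok = 1;

    arrived = 0;
    released = 0;
    for (int i = 0; i < NTHREADS; i++)
        if (pthread_create(&th[i], NULL, worker, &r[i]) != 0)
            abort();  /* sketch only: give up if a thread cannot start */

    pthread_mutex_lock(&lock);
    while (arrived < NTHREADS)
        pthread_cond_wait(&cond, &lock);
    /* All workers are alive and blocked: compare across threads. */
    for (int i = 0; i < NTHREADS; i++)
        for (int j = i + 1; j < NTHREADS; j++)
            if (pthread_equal(r[i].self, r[j].self) || r[i].tls == r[j].tls)
                ok = 0;
    released = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(th[i], NULL);
        if (!r[i].stable)
            ok = 0;
    }
    return ok;
}
```

With code pinned to one OS thread, `check_thread_identity()` is expected to return 1. Nothing in this sketch moves a function between threads mid-execution, so it only shows the baseline that the observations above appear to violate.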
Currently when a task blocks, it must be restarted on the same thread it ran on before. This is because some of our code (in both the runtime and in generated code) assumes the ptls pointer cannot change during a task. We need to arrange to reload it in the right places.
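The hazard can be sketched in plain C with pthreads. This is an illustration under stated assumptions, not Julia's runtime: `thread_state_t`, `get_thread_state`, `resumable_task_t` and `migrate_once` are hypothetical names, and the migration is simulated by running the second half of a task on a different OS thread. A pointer to thread-local state that was cached before the switch still refers to the first thread's state, so it has to be reloaded after resuming.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread state, standing in for the ptls block. */
typedef struct {
    int tid;
} thread_state_t;

static _Thread_local thread_state_t tls_state;

/* Returns the calling thread's state block. */
thread_state_t *get_thread_state(void) {
    return &tls_state;
}

/* Hypothetical task that caches the state pointer when it starts. */
typedef struct {
    thread_state_t *cached;  /* loaded once, before the task blocks */
    int cached_is_stale;     /* 1 if cached differs from a fresh load */
    int reloaded_tid;        /* tid seen through the reloaded pointer */
} resumable_task_t;

/* Second half of the task, resumed on a different OS thread. */
static void *resume_on_other_thread(void *arg) {
    resumable_task_t *t = arg;
    assert(t != NULL);
    get_thread_state()->tid = 2;                   /* this thread's own id */
    thread_state_t *current = get_thread_state();  /* reload after resuming */
    t->cached_is_stale = (t->cached != current);
    t->reloaded_tid = current->tid;
    return NULL;
}

/* First half runs on the calling thread, second half on a new thread.
   Returns 0 on success, -1 if the thread could not be created. */
int migrate_once(resumable_task_t *t) {
    assert(t != NULL);
    get_thread_state()->tid = 1;
    t->cached = get_thread_state();  /* cached before the switch */

    pthread_t th;
    if (pthread_create(&th, NULL, resume_on_other_thread, t) != 0)
        return -1;
    pthread_join(th, NULL);
    return 0;
}
```

After `migrate_once`, the cached pointer still reports the first thread's `tid` of 1, while the reloaded pointer reports 2. Code that keeps using the cached pointer after the switch would read and write the wrong thread's state.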
We've been aware of this limitation the whole time, but I don't believe there's an issue to track it yet.