Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct TSAN tasking integration #36929

Merged
merged 1 commit into from
Aug 7, 2020
Merged

Correct TSAN tasking integration #36929

merged 1 commit into from
Aug 7, 2020

Conversation

Keno
Copy link
Member

@Keno Keno commented Aug 5, 2020

In order for tsan to work, any setjmp/longjmp must be executed while
that task's tsan state is active. As a result, we cannot switch the
tsan state until said setjmp is completed, but need to do it
(as basically the only thing happening) between the setjmp and the
subsequent longjmp. To facilitate this without too much disruption,
move the tsan state into the jl_ucontext_t, which seems appropriate
since it's additional state that needs to be restored on context
switch.

Also forbid using TSAN from Clang < 11, where the runtime library
has bugs that cause us to exhaust the maximum number of allowed
mappings.

With this, TSAN works, but obviously has a fairly large number of complaints.
For reference, here is my Make.user:

USE_BINARYBUILDER_LLVM=0
USECLANG=1
SANITIZE=1
SANITIZE_THREAD=1
JULIA_PRECOMPILE=0
override CC=/home/keno/llvm-project-build/bin/clang
override CXX=/home/keno/llvm-project-build/bin/clang++

@Keno Keno requested review from vchuravy and vtjnash August 5, 2020 18:50
@Keno Keno force-pushed the kf/fixtsan branch 5 times, most recently from b69fa10 to 7f9ab34 Compare August 6, 2020 04:33
Comment on lines +434 to +437
ifeq ($(LLVM_ASSERTIONS),0)
CXX_DISABLE_ASSERTION += -DNDEBUG
endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this variable does not have the actual runtime value (which should get set from llvm-config), we have a header (julia_assert.h) that is supposed to already deal with this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like llvm-config no longer sets NDEBUG, but it still needs to match what LLVM expects for ABI reasons.

#ifdef JL_TSAN_ENABLED
if (t->ctx.tsan_state) {
t->ctx.tsan_state = NULL;
__tsan_destroy_fiber(t->ctx.tsan_state);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a NULL pointer?

Suggested change
__tsan_destroy_fiber(t->ctx.tsan_state);
__tsan_destroy_fiber(NULL);

src/task.c Outdated
@@ -279,6 +277,10 @@ JL_DLLEXPORT jl_task_t *jl_get_next_task(void)
return ptls->current_task;
}

#ifdef JL_TSAN_ENABLED
char tsan_state_corruption[] = "TSAN state corrupted. Exiting HARD!\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char tsan_state_corruption[] = "TSAN state corrupted. Exiting HARD!\n";
const char tsan_state_corruption[] = "TSAN state corrupted. Exiting HARD!\n";

src/task.c Outdated
// Something went really wrong - don't even assume that we can
// use assert/abort which involve lots of signal handling that
// looks at the tsan state.
write(STDERR_FILENO, tsan_state_corruption, sizeof(tsan_state_corruption));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
write(STDERR_FILENO, tsan_state_corruption, sizeof(tsan_state_corruption));
write(STDERR_FILENO, tsan_state_corruption, sizeof(tsan_state_corruption) - 1);

src/task.c Outdated
tsan_destroy_ctx(ptls, &lastt->ctx);
jl_set_fiber(&t->ctx); // (doesn't return)
abort(); // unreachable
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: line break between } and else

In order for tsan to work, any setjmp/longjmp must be executed while
that task's tsan state is active. As a result, we cannot switch the
tsan state until said setjmp is completed, but need to do it
(as basically the only thing happening) between the setjmp and the
subsequent longjmp. To facilitate this without too much disruption,
move the tsan state into the jl_ucontext_t, which seems appropriate
since it's additional state that needs to be restored on context
switch.

Also forbid using TSAN from Clang < 11, where the runtime library
has bugs that cause us to exhaust the maximum number of allowed
mappings.
@Keno Keno merged commit 7c0cb30 into master Aug 7, 2020
@Keno Keno deleted the kf/fixtsan branch August 7, 2020 05:02
simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
In order for tsan to work, any setjmp/longjmp must be executed while
that task's tsan state is active. As a result, we cannot switch the
tsan state until said setjmp is completed, but need to do it
(as basically the only thing happening) between the setjmp and the
subsequent longjmp. To facilitate this without too much disruption,
move the tsan state into the jl_ucontext_t, which seems appropriate
since it's additional state that needs to be restored on context
switch.

Also forbid using TSAN from Clang < 11, where the runtime library
has bugs that cause us to exhaust the maximum number of allowed
mappings.
ararslan added a commit that referenced this pull request Feb 19, 2021
This a component of #36929 which happens to fix building on FreeBSD 11.

Co-authored-by: Keno Fischer <keno@juliacomputing.com>
Keno added a commit that referenced this pull request Oct 1, 2021
Looks like #36929 got reverted
without comment. Not sure what happened there. Re-apply it to fix TSAN.
Keno added a commit that referenced this pull request Oct 1, 2021
Looks like #36929 got reverted
without comment. Not sure what happened there. Re-apply it to fix TSAN.
Keno added a commit that referenced this pull request Oct 1, 2021
Looks like #36929 got reverted
without comment. Not sure what happened there. Re-apply it to fix TSAN.
vtjnash pushed a commit that referenced this pull request Oct 3, 2021
Looks like #36929 got reverted
without comment. Not sure what happened there. Re-apply it to fix TSAN.
vtjnash pushed a commit that referenced this pull request Oct 7, 2021
This reverts commit 15772ba
"Update references to tsan state (#42440)", and fixes integration.

Looks like #36929 got reverted
when trying to simplify the code, but it now isn't legal. Since these
must be inlined in the right point in the runtime call/return context,
use a macro to ensure that.
vtjnash pushed a commit that referenced this pull request Oct 8, 2021
This reverts commit 15772ba
"Update references to tsan state (#42440)", and fixes integration.

Looks like #36929 got reverted
when trying to simplify the code, but it now isn't legal. Since these
must be inlined in the right point in the runtime call/return context,
use a macro to ensure that.
vtjnash pushed a commit that referenced this pull request Oct 14, 2021
This reverts commit 15772ba
"Update references to tsan state (#42440)", and fixes integration.

Looks like #36929 got reverted
when trying to simplify the code, but it now isn't legal. Since these
must be inlined in the right point in the runtime call/return context,
use a macro to ensure that.
vtjnash pushed a commit that referenced this pull request Oct 14, 2021
This reverts commit 15772ba
"Update references to tsan state (#42440)", and fixes integration.

Looks like #36929 got reverted
when trying to simplify the code, but it now isn't legal. Since these
must be inlined in the right point in the runtime call/return context,
use a macro to ensure that.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
This reverts commit 15772ba
"Update references to tsan state (JuliaLang#42440)", and fixes integration.

Looks like JuliaLang#36929 got reverted
when trying to simplify the code, but it now isn't legal. Since these
must be inlined in the right point in the runtime call/return context,
use a macro to ensure that.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
This reverts commit 15772ba
"Update references to tsan state (JuliaLang#42440)", and fixes integration.

Looks like JuliaLang#36929 got reverted
when trying to simplify the code, but it now isn't legal. Since these
must be inlined in the right point in the runtime call/return context,
use a macro to ensure that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants