Skip to content

Race detection in OCaml using the ThreadSanitizer runtime analysis.

License

Notifications You must be signed in to change notification settings

ocaml-multicore/ocaml-tsan

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ThreadSanitizer support for OCaml 5

ThreadSanitizer is an effective approach to locate data races in concurrent code bases. This is an experimental extension of the OCaml compiler to add ThreadSanitizer support to OCaml 5.0.

Caveats

  • OCaml ThreadSanitizer support inherits the portability limitations of ThreadSanitizer: it is not available on Windows.

  • TSan investigates a particular execution and therefore will not detect races in code paths that are not visited.

  • TSan may still report false positives in some rare cases (see section 6.4 of the WBIA ’09 paper below).

  • Currently, ThreadSanitizer detection only works for native x86_64 executables; not bytecode programs, nor other architectures.

Trying it out

For Linux, you first need to install libunwind stack unwinding library on your system. Most Linux distributions provide a package.

Then you will need to work in a dedicated opam switch:

opam update
opam switch create 5.0.0+tsan

Building the switch may take a bit longer than usual due to some unavoidable ThreadSanitizer-incurred slowdowns in parts of the process. Expect a few minutes of build time.

On this switch, your programs will now be compiled with ThreadSanitizer instrumentation, including the packages you may install with opam.

If you link against C files, it is best to instrument them to detect data races in them as well, by passing -fsanitize=thread to GCC or clang. Note that if you use Dune, this setting can be added to a Dune profile by adding e.g. the following to your dune-workspace:

(env
 (tsan
  (c_flags -fsanitize=thread)))

Then you only need to select this profile, e.g. dune exec --profile tsan myexecutable.exe or dune runtest --profile tsan.

Examples

A simple race

The simplest of data races can be created by having two accesses to the same mutable location, one of them being a write:

type t = { mutable x : int }

let v = { x = 0 }

let () =
  let t1 = Domain.spawn (fun () -> v.x <- 10; Unix.sleep 1) in
  let t2 = Domain.spawn (fun () -> ignore v.x; Unix.sleep 1) in
  Domain.join t1;
  Domain.join t2

Running this program with ThreadSanitizer instrumentation will output a report looking like this:

==================
WARNING: ThreadSanitizer: data race (pid=4170290)
  Read of size 8 at 0x7f072bbfe498 by thread T4 (mutexes: write M0):
    #0 camlSimpleRace__fun_524 /tmp/simpleRace.ml:7 (simpleRace.exe+0x43dc9d)
    #1 camlStdlib__Domain__body_696 /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/stdlib/domain.ml:202 (simpleRace.exe+0x47b5dc)
    #2 caml_start_program ??:? (simpleRace.exe+0x4f51c3)
    #3 caml_callback_exn /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:168 (simpleRace.exe+0x4c2b93)
    #4 caml_callback /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:256 (simpleRace.exe+0x4c36e3)
    #5 domain_thread_func /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/domain.c:1093 (simpleRace.exe+0x4c6ad1)

  Previous write of size 8 at 0x7f072bbfe498 by thread T1 (mutexes: write M1):
    #0 camlSimpleRace__fun_520 /tmp/simpleRace.ml:6 (simpleRace.exe+0x43dc45)
    #1 camlStdlib__Domain__body_696 /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/stdlib/domain.ml:202 (simpleRace.exe+0x47b5dc)
    #2 caml_start_program ??:? (simpleRace.exe+0x4f51c3)
    #3 caml_callback_exn /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:168 (simpleRace.exe+0x4c2b93)
    #4 caml_callback /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:256 (simpleRace.exe+0x4c36e3)
    #5 domain_thread_func /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/domain.c:1093 (simpleRace.exe+0x4c6ad1)

  Mutex M0 (0x000000567ad8) created at:
    #0 pthread_mutex_init /home/olivier/other_projects/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1316 (libtsan.so.0+0x3cafb)
    [...]

SUMMARY: ThreadSanitizer: data race /tmp/simpleRace.ml:7 in camlSimpleRace__fun_524
==================
ThreadSanitizer: reported 1 warnings

If the mutable field is replaced with an Atomic reference, the warning disappears:

let v = Atomic.make 0

let () =
  let t1 = Domain.spawn (fun () -> Atomic.set v 10; Unix.sleep 1) in
  let t2 = Domain.spawn (fun () -> ignore (Atomic.get v); Unix.sleep 1) in
  Domain.join t1;
  Domain.join t2

Synchronization using an atomic variable

Synchronizing the two accesses above by busy-waiting on an atomic boolean will be detected by ThreadSanitizer and no data race will be reported:

type t = { mutable x : int }

let v = { x = 0 }

let v_modified = Atomic.make false

let () =
  let t1 =
    Domain.spawn (fun () ->
        v.x <- 10;
        Atomic.set v_modified true;
        Unix.sleep 1)
  in
  let t2 =
    Domain.spawn (fun () ->
        while not (Atomic.get v_modified) do () done;
        ignore v.x;
        Unix.sleep 1)
  in
  Domain.join t1;
  Domain.join t2

More efficiently, such synchronization can be implemented using a Mutex.t with the same result.

Background

There are two components to ThreadSanitizer (TSan): 1. A runtime library to track accesses to shared data and report races 2. A compiler instrumentation feature that inserts calls to that runtime library.

Internally the runtime library associates with each word of application memory at least 2 "shadow words". Each shadow word contains information about a recent memory access to that word, including a "scalar clock". Those clocks serve to establish a happens-before (HB) relation, i.e. an event orderings that we are certain of.

This information is maintained as a "shadow state" in a separate memory region, and updated at every (instrumented) memory access. A data race is reported every time two memory accesses are made to overlapping memory regions, and: - one of them is a write, and - there is no established happens-before relation between them. More information about TSan’s algorithm on their wiki.

Performance cost

ThreadSanitizer for OCaml incurs a slowdown and increases memory consumption. Preliminary benchmarks show a slowdown in the range 7x-13x and a memory consumption increase in the range 2x-7x.

Troubleshooting

OCaml-TSan is still considered experimental: although tested on large projects, there is a slight chance that it introduces crashes in your code. If you encounters problems using it, please do let us know and we will do our best to improve things.

  • My program crashed when compiled on your ThreadSanitizer switch.
    Try to check whether your program pushes a lot of stack frames on the stack. A call tree of depth >64k can cause a buffer overflow in the ThreadSanitizer runtime. This is a ThreadSanitizer limitation that we are trying to work around.

  • TSan warns me about this data race, but it’s benign / I don’t want to deal with it for now.
    You can use a suppressions file to silence TSan warnings.