Add a donotdelete builtin #44036

Merged
merged 1 commit into from
Feb 9, 2022

Commits on Feb 8, 2022

  1. Add a DCE barrier builtin

    In #43852 we noticed that the compiler is getting good enough to
    completely DCE a number of our benchmarks. We need to add some sort
    of mechanism to prevent the compiler from doing so. This adds just
    such an intrinsic. The intrinsic itself doesn't do anything, but
    it is considered effectful by our optimizer, preventing it from
    being DCE'd. At the LLVM level, it turns into a volatile store to
    an alloca (or an llvm.sideeffect if the values passed to the
    `dcebarrier` do not have any actual LLVM-level representation).
    
    The docs for the new intrinsic are as follows:
    ```
        dcebarrier(args...)
    
    This function prevents dead-code elimination (DCE) of itself and any arguments
    passed to it, but is otherwise the lightest barrier possible. In particular,
    it is not a GC safepoint, does not model an observable heap effect, does not expand
    to any code itself and may be re-ordered with respect to other side effects
    (though the total number of executions may not change).
    
    A useful model for this function is that it hashes all memory `reachable` from
    args and escapes this information through some observable side-channel that does
    not otherwise impact program behavior. Of course that's just a model. The
    function does nothing and returns `nothing`.
    
    This is intended for use in benchmarks that want to guarantee that `args` are
    actually computed. (Otherwise DCE may see that the result of the benchmark is
    unused and delete the entire benchmark code).
    
    **Note**: `dcebarrier` does not affect constant folding. For example, in
              `dcebarrier(1+1)`, no add instruction needs to be executed at runtime and
              the code is semantically equivalent to `dcebarrier(2)`.
    
    # Examples
    
    function loop()
        for i = 1:1000
            # The compiler must guarantee that there are 1000 program points (in the correct
            # order) at which the value of `i` is in a register, but has otherwise
            # total control over the program.
            dcebarrier(i)
        end
    end
    ```
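
    A minimal sketch of the benchmark use case described in the docs above. It
    assumes a Julia version that ships this builtin as `Base.donotdelete` (the
    name in the PR title; the commit message still calls it `dcebarrier`). The
    two `sum_squares_*` functions are hypothetical names made up for this
    illustration.

    ```julia
    # Assumption: the builtin is reachable as `Base.donotdelete`.

    # Hypothetical benchmark kernel whose result is never used.
    function sum_squares_unused(n)
        acc = 0
        for i in 1:n
            acc += i * i
        end
        return nothing  # `acc` is dead here, so DCE may delete the whole loop
    end

    # Same kernel, with the result passed through the barrier.
    function sum_squares_kept(n)
        acc = 0
        for i in 1:n
            acc += i * i
        end
        Base.donotdelete(acc)  # the value of `acc` must now actually be computed
        return nothing
    end
    ```

    Per the note on constant folding, the barrier only guarantees that `acc`
    is computed. The optimizer may still simplify how it is computed.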
    
    I believe the volatile store at the LLVM level is actually somewhat
    stronger than what we want here. Ideally the `dcebarrier` would not
    end up generating any machine code at all and would also be compatible
    with optimizations like SROA and vectorization. However, I think this
    is fine for now.
    Keno committed Feb 8, 2022
    3f2a323