`@asmcall` should take an option for emitting as `volatile` #320

Seelengrab · 2022-12-01T15:49:01Z

While helping out debugging some slowness on an A64FX machine, I encountered this:

julia> using MCAnalyzer
[ Info: Precompiling MCAnalyzer [a81df072-f4bb-11e8-03d3-cfaeda626d18]

julia> function mysum(A)
           acc = zero(eltype(A))
           for i in eachindex(A)
               mark_start()
               @inbounds acc += A[i]
           end
           mark_end()
           return acc
       end
mysum (generic function with 1 method)

julia> analyze(mysum, (Vector{Int},))
/tmp/jl_zCoQt8/a.S:67:3: error: found an invalid region end directive
        # LLVM-MCA-END
         ^
/tmp/jl_zCoQt8/a.S:67:3: note: unable to find an active anonymous region
        # LLVM-MCA-END
         ^

This is reproducible with https://github.com/JuliaPerf/MCAnalyzer.jl/tree/no_iaca, while the example above works fine on the most recent release of MCAnalyzer.jl. The reason this fails seems to be due to the mark in the assembly being emitted twice, which of course llvm-mca then doesn't like:

	.text
	.file	"start"
	.globl	julia_mysum_1409                # -- Begin function julia_mysum_1409
	.p2align	4, 0x90
	.type	julia_mysum_1409,@function
julia_mysum_1409:                       # @julia_mysum_1409
	.cfi_startproc
# %bb.0:                                # %top
	movq	8(%rdi), %rcx
	testq	%rcx, %rcx
	je	.LBB0_1
# %bb.2:                                # %L13.preheader
	xorl	%edx, %edx
	xorl	%eax, %eax
	.p2align	4, 0x90
.LBB0_3:                                # %L13
                                        # =>This Inner Loop Header: Depth=1
	#APP
	# LLVM-MCA-BEGIN

	#NO_APP
	movq	(%rdi), %rsi
	addq	(%rsi,%rdx,8), %rax
	incq	%rdx
	cmpq	%rdx, %rcx
	jne	.LBB0_3
# %bb.4:                                # %L31
	#APP
	# LLVM-MCA-END

	#NO_APP
	retq
.LBB0_1:
	xorl	%eax, %eax
	#APP
	# LLVM-MCA-END

	#NO_APP
	retq
.Lfunc_end0:
	.size	julia_mysum_1409, .Lfunc_end0-julia_mysum_1409
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

I think this wouldn't happen if this were marked as volatile, but @asmcall doesn't have this in its interface.

The text was updated successfully, but these errors were encountered:

vchuravy · 2022-12-01T19:44:10Z

The LLVM LangRef does not specify volatile for assembly https://llvm.org/docs/LangRef.html#inline-assembler-expressions

IIRC C level volatile get's translated to side-effect=true. This is more a codegen problem that the return block got split and the assembly duplicated.

Seelengrab · 2022-12-01T21:25:50Z

Hm, that makes sense. I misremembered that I used volatile in a similar situation before, but that too only has sideeffect. Where should I look into to fix this (if necessary just for MCAnalyzer)? Base? Some custom pass in the IR using GPUCompiler..?

vchuravy · 2022-12-02T04:36:16Z

Uhm... It's probably somewhere in LLVM. We could put in a pseudo intrinsic that we noop after optimization.

Seelengrab · 2022-12-02T14:18:06Z

Yeah, that might be necessary.. I've noticed that adding this via the sideeffect/memory clobber breaks autovectorization pretty bad :/ Should I close this and track this in MCAnalyzer instead?

Seelengrab mentioned this issue Dec 1, 2022

Remove IACA specific byte marks JuliaPerf/MCAnalyzer.jl#29

Draft

vchuravy closed this as not planned Won't fix, can't repro, duplicate, stale Dec 2, 2022

Seelengrab mentioned this issue Dec 2, 2022

Find a replacement for @asmcall for mark_end()/mark_start() JuliaPerf/MCAnalyzer.jl#31

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`@asmcall` should take an option for emitting as `volatile` #320

`@asmcall` should take an option for emitting as `volatile` #320

Seelengrab commented Dec 1, 2022

vchuravy commented Dec 1, 2022

Seelengrab commented Dec 1, 2022 •

edited

Loading

vchuravy commented Dec 2, 2022

Seelengrab commented Dec 2, 2022

@asmcall should take an option for emitting as volatile #320

@asmcall should take an option for emitting as volatile #320

Comments

Seelengrab commented Dec 1, 2022

vchuravy commented Dec 1, 2022

Seelengrab commented Dec 1, 2022 • edited Loading

vchuravy commented Dec 2, 2022

Seelengrab commented Dec 2, 2022

`@asmcall` should take an option for emitting as `volatile` #320

`@asmcall` should take an option for emitting as `volatile` #320

Seelengrab commented Dec 1, 2022 •

edited

Loading