Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Special case Mono in SpanHelpers.Fill/CleanWithoutReference #99059

Closed
wants to merge 3 commits into from

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Feb 28, 2024

@ghost
Copy link

ghost commented Feb 28, 2024

Tagging subscribers to this area: @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes dotnet/perf-autofiling-issues#29872 regression from #98623

Author: EgorBo
Assignees: EgorBo
Labels:

area-System.Memory

Milestone: -

Unsafe.Add(ref dest, (nint)i + 6) = value;
Unsafe.Add(ref dest, (nint)i + 7) = value;
// broadcast the value to all 8 bytes of the ulong and write it to memory
Unsafe.WriteUnaligned(ref Unsafe.Add(ref dest, i), value * 0x101010101010101ul);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not just for Mono, but it's shorter (and faster)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 64-bit multiplication. Is it faster on 32-bit platforms as well?

Comment on lines +304 to +310
#if MONO
if (sizeof(T) == 1)
{
Unsafe.InitBlockUnaligned(ref Unsafe.As<T, byte>(ref _reference), *(byte*)&value, (uint)_length);
return;
}
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth doing this over updating Mono to recognize/special-case the same scenarios as dotnet/runtime?

At the very least, should we ensure an issue tracking that same support is added so this #if MONO can be removed long term?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether Mono folks want to add more intrinsics in Mini and Interpreter considering this short path just works as is, so leaving that decision up to them, cc @BrzVlad

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of having MONO ifdefs in SPC. Also I think the regression is limited to interpreter, so disabling for all mono might be uncalled for. Also this should be easy to intrinsify, I'll take a look.

Copy link
Member

@matouskozak matouskozak Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of having MONO ifdefs in SPC. Also I think the regression is limited to interpreter, so disabling for all mono might be uncalled for. Also this should be easy to intrinsify, I'll take a look.

The regression can be seen for mini (JIT) as well (https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu%2022.04_LLVM=false_MonoAOT=false_MonoInterpreter=false_RunKind=micro_mono/System.Memory.Span(Char).Clear(Size:%2033).html)
image

Most likely for AOT as well since, the microbenchmark is written using generics, however, our AOT perf lines are currently broken.

Having Mono ifdefs in managed code doesn't seem to be a good long-term solution. If we go with this solution and the solution proves to remove the regression, I think we should add a tracking issue for adding these intrinsics to Mono. What do you think about this @fanyang-mono?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look a quick look at the original PR (#98623). It seems to me that to intrinsify SpanHelpers.Memmove, the code path for intrinsifying Buffer.Memmove was used with some tweak. Mono intrinsify Buffer.Memmove as well. See the code below.

if (in_corlib && !strcmp (m_class_get_name (cmethod->klass), "Buffer")) {
if (!strcmp (cmethod->name, "Memmove") && fsig->param_count == 3 && m_type_is_byref (fsig->params [0]) && m_type_is_byref (fsig->params [1]) && !cmethod->is_inflated) {
MonoBasicBlock *end_bb;
NEW_BBLOCK (cfg, end_bb);
// do nothing if len == 0 (even if src or dst are nulls)
MONO_EMIT_NEW_BIALU_IMM (cfg, OP_COMPARE_IMM, -1, args [2]->dreg, 0);
MONO_EMIT_NEW_BRANCH_BLOCK (cfg, OP_IBEQ, end_bb);
// throw NRE if src or dst are nulls
MONO_EMIT_NEW_BIALU_IMM (cfg, OP_COMPARE_IMM, -1, args [0]->dreg, 0);
MONO_EMIT_NEW_COND_EXC (cfg, EQ, "NullReferenceException");
MONO_EMIT_NEW_BIALU_IMM (cfg, OP_COMPARE_IMM, -1, args [1]->dreg, 0);
MONO_EMIT_NEW_COND_EXC (cfg, EQ, "NullReferenceException");
MONO_INST_NEW (cfg, ins, OP_MEMMOVE);
ins->sreg1 = args [0]->dreg; // i1* dst
ins->sreg2 = args [1]->dreg; // i1* src
ins->sreg3 = args [2]->dreg; // i32/i64 len
MONO_ADD_INS (cfg->cbb, ins);
MONO_START_BB (cfg, end_bb);
}
}

I think the proper fix should be updating the above code as CoreCLR did.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a github issue for it #99161

@EgorBo
Copy link
Member Author

EgorBo commented Feb 29, 2024

Closing in favor of fixes in Mono itself

@EgorBo EgorBo closed this Feb 29, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Perf] Linux/x64: 19 Regressions on 2/25/2024 4:37:10 PM
6 participants