Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Z80] Add feature to enable SLI/SLL instruction #2

Merged
merged 2 commits into from
Jan 31, 2022

Conversation

b-s-a
Copy link

@b-s-a b-s-a commented Nov 27, 2019

Also this patch contains fix for requirements for TST instruction (it is documented since Z180).

Copy link
Owner

@jacobly0 jacobly0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of the undoc feature was to handle these instructions because they are not implemented on every "z80 compatible" processor produced. Also, I have been avoiding adding unused instructions/registers to the tblgen files because these files may need to be rewritten yet again at some point and I would rather that they didn't get too complicated before then. Also, extra registers can cause problems with implicit regclass computation and regalloc that I'm not sure how well I have fixed yet. In any case, random encodings aren't that useful, and useful ones won't be known until they are needed.

@b-s-a
Copy link
Author

b-s-a commented Nov 27, 2019

Some features may on may not be enabled on cpu implementations. For example, I use my own ez80 implementation (z80 mode only) with sli, but without in f,(c), out (c),0, and all FD/DD CB dd xx. R800 supports in f,(c), but do not sli and others. So putting all these ops under one flag will cause later refactoring.

Also, I try to implement AsmParser but codegen causes lot of errors.

@jacobly0 jacobly0 force-pushed the z80 branch 4 times, most recently from 6696e2f to 7681d6e Compare January 2, 2020 16:13
@jacobly0 jacobly0 force-pushed the z80 branch 2 times, most recently from 9886223 to 506f7ea Compare January 16, 2020 22:16
jacobly0 pushed a commit that referenced this pull request Jan 20, 2020
…F optimizing part. #2.

Summary:
This patch relands D71271. The problem with D71271 is that it has cyclic dependency:
CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid cyclic dependency this patch
puts implementation for DWARFOptimizer into separate library: lib/DWARFLinker.

Thus the difference between this patch and D71271 is in that DWARFOptimizer renamed into
DWARFLinker and it`s files are put into lib/DWARFLinker.

Reviewers: JDevlieghere, friss, dblaikie, aprantl

Reviewed By: JDevlieghere

Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D71839
jacobly0 pushed a commit that referenced this pull request Jan 20, 2020
…t binding

This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30):

PASS:
./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit
  * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325
    frame #1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295
    frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2
    frame #3: 0x000000000040113a a.out`func_b at main.c:18:2
    frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2
    frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2
    frame #6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243
    frame #7: 0x000000000040106e a.out`_start + 46

vs.

FAIL - unrecognized abort() function:
./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit
  * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325
    frame #1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295
    frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2
    frame #3: 0x000000000040113a a.out`func_b at main.c:18:2
    frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2
    frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2
    frame #6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243
    frame #7: 0x000000000040106e a.out`.annobin_init.c.hot + 46

The extra ELF symbols are there due to Annobin (I did not investigate why this
problem happened specifically since F-30 and not since F-28).

It is due to:

Symbol table '.dynsym' contains 2361 entries:
Valu e          Size Type   Bind   Vis     Name
0000000000022769   5 FUNC   LOCAL  DEFAULT _nl_load_domain.cold
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_abort.c.unlikely
...
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_loadmsgcat.c_end.unlikely
...
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_textdomain.c_end.unlikely
000000000002276e 548 FUNC   GLOBAL DEFAULT abort
000000000002276e 548 FUNC   GLOBAL DEFAULT abort@@GLIBC_2.2.5
000000000002276e 548 FUNC   LOCAL  DEFAULT __GI_abort
0000000000022992   0 NOTYPE LOCAL  HIDDEN  .annobin_abort.c_end.unlikely

GDB has some more complicated preferences between overlapping and/or sharing
address symbols, I have made here so far the most simple fix for this case.

Differential revision: https://reviews.llvm.org/D63540
jacobly0 pushed a commit that referenced this pull request Jan 20, 2020
The test is currently failing on some systems with ASAN enabled due to:
```
==22898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003da4 at pc 0x00010951c33d bp 0x7ffee6709e00 sp 0x7ffee67095c0
READ of size 5 at 0x603000003da4 thread T0
    #0 0x10951c33c in wrap_memmove+0x16c (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x1833c)
    #1 0x7fff4a327f57 in CFDataReplaceBytes+0x1ba (CoreFoundation:x86_64+0x13f57)
    #2 0x7fff4a415a44 in __CFDataInit+0x2db (CoreFoundation:x86_64+0x101a44)
    #3 0x1094f8490 in main main.m:424
    #4 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084)
0x603000003da4 is located 0 bytes to the right of 20-byte region [0x603000003d90,0x603000003da4)
allocated by thread T0 here:
    #0 0x109547c02 in wrap_calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x43c02)
    #1 0x7fff763ad3ef in class_createInstance+0x52 (libobjc.A.dylib:x86_64+0x73ef)
    #2 0x7fff4c6b2d73 in NSAllocateObject+0x12 (Foundation:x86_64+0x1d73)
    #3 0x7fff4c6b5e5f in -[_NSPlaceholderData initWithBytes:length:copy:deallocator:]+0x40 (Foundation:x86_64+0x4e5f)
    #4 0x7fff4c6d4cf1 in -[NSData(NSData) initWithBytes:length:]+0x24 (Foundation:x86_64+0x23cf1)
    #5 0x1094f8245 in main main.m:404
    #6 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084)
```

The reason is that we create a string "HELLO" but get the size wrong (it's 5 bytes instead
of 4). Later on we read the buffer and pretend it is 5 bytes long, causing an OOB read
which ASAN detects.

In general this test probably needs some cleanup as it produces on macOS 10.15 around
100 compiler warnings which isn't great, but let's first get the bot green.
jacobly0 pushed a commit that referenced this pull request Jan 26, 2020
This reverts commit e57a9ab.

Parser/cxx2a-placeholder-type-constraint.cpp has MSan failures.

Present at 7b81c3f:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/17133/steps/check-clang%20msan/logs/stdio
not present at eaa594f:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/17132/steps/check-clang%20msan/logs/stdio

Stack trace:
```
==57032==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xccfe016 in clang::AutoTypeLoc::getLocalSourceRange() const /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/AST/TypeLoc.h:2036:19
    #1 0xcc56758 in CheckDeducedPlaceholderConstraints(clang::Sema&, clang::AutoType const&, clang::AutoTypeLoc, clang::QualType) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4505:56
    #2 0xcc550ce in clang::Sema::DeduceAutoType(clang::TypeLoc, clang::Expr*&, clang::QualType&, llvm::Optional<unsigned int>, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4707:11
    #3 0xcc52407 in clang::Sema::DeduceAutoType(clang::TypeSourceInfo*, clang::Expr*&, clang::QualType&, llvm::Optional<unsigned int>, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4457:10
    #4 0xba38332 in clang::Sema::deduceVarTypeFromInitializer(clang::VarDecl*, clang::DeclarationName, clang::QualType, clang::TypeSourceInfo*, clang::SourceRange, bool, clang::Expr*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:11351:7
    #5 0xba3a8a9 in clang::Sema::DeduceVariableDeclarationType(clang::VarDecl*, bool, clang::Expr*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:11385:26
    #6 0xba3c520 in clang::Sema::AddInitializerToDecl(clang::Decl*, clang::Expr*, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Sema/SemaDecl.cpp:11725:9
    #7 0xb39c498 in clang::Parser::ParseDeclarationAfterDeclaratorAndAttributes(clang::Declarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2399:17
    #8 0xb394d80 in clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::SourceLocation*, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2128:21
    #9 0xb383bbf in clang::Parser::ParseSimpleDeclaration(clang::DeclaratorContext, clang::SourceLocation&, clang::Parser::ParsedAttributesWithRange&, bool, clang::Parser::ForRangeInit*, clang::SourceLocation*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:1848:10
    #10 0xb383129 in clang::Parser::ParseDeclaration(clang::DeclaratorContext, clang::SourceLocation&, clang::Parser::ParsedAttributesWithRange&, clang::SourceLocation*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/ADT/PointerUnion.h
    #11 0xb53a388 in clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*, clang::Parser::ParsedAttributesWithRange&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:221:13
    #12 0xb539309 in clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:106:20
    #13 0xb55610e in clang::Parser::ParseCompoundStatementBody(bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:1079:11
    #14 0xb559529 in clang::Parser::ParseFunctionStatementBody(clang::Decl*, clang::Parser::ParseScope&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:2204:21
    #15 0xb33c13e in clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::LateParsedAttrList*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:1339:10
    #16 0xb394703 in clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::SourceLocation*, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2068:11
    #17 0xb338e52 in clang::Parser::ParseDeclOrFunctionDefInternal(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:1099:10
    #18 0xb337674 in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:1115:12
    #19 0xb334a96 in clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:935:12
    #20 0xb32f12a in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/Parser.cpp:686:12
    #21 0xb31e193 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:158:20
    #22 0x80263f0 in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:936:8
    #23 0x7f2a257 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:965:33
    #24 0x8288bef in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:290:25
    #25 0xad44c2 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/tools/driver/cc1_main.cpp:239:15
    #26 0xacd76a in ExecuteCC1Tool(llvm::ArrayRef<char const*>) /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/tools/driver/driver.cpp:325:12
    #27 0xacc9fd in main /b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/tools/driver/driver.cpp:398:12
    #28 0x7f7d82cdb2e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #29 0xa4dde9 in _start (/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm_build_msan/bin/clang-11+0xa4dde9)
```
jacobly0 pushed a commit that referenced this pull request Jan 26, 2020
Using the same strategy as c38e425.

D69825 revealed (introduced?) a problem when building with ASan, and
some memory leaks somewhere. More details are available in the original
patch.

Looks like we missed one failing tests, this patch adds the workaround
to this test as well.
@jacobly0 jacobly0 force-pushed the z80 branch 3 times, most recently from 14276c6 to 132d81e Compare January 27, 2020 00:44
jacobly0 pushed a commit that referenced this pull request Feb 9, 2020
…itions

This is a revert-of-revert (i.e. this reverts commit 802bec8, which
itself reverted fa4701e and 79daafc) with a fix folded in. The problem
was that call site tags weren't emitted properly when LTO was enabled
along with split-dwarf. This required a minor fix. I've added a reduced
test case in test/DebugInfo/X86/fission-call-site.ll.

Original commit message:

This allows a call site tag in CU A to reference a callee DIE in CU B
without resorting to creating an incomplete duplicate DIE for the callee
inside of CU A.

We already allow cross-CU references of subprogram declarations, so it
doesn't seem like definitions ought to be special.

This improves entry value evaluation and tail call frame synthesis in
the LTO setting. During LTO, it's common for cross-module inlining to
produce a call in some CU A where the callee resides in a different CU,
and there is no declaration subprogram for the callee anywhere. In this
case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram
in order to fill in the call site tag. That empty 'definition' defeats
entry value evaluation etc., because the debugger can't figure out what
it means.

As a follow-up, maybe we could add a DWARF verifier check that a
DW_TAG_subprogram at least has a DW_AT_name attribute.

Update #1:

Reland with a fix to create a declaration DIE when the declaration is
missing from the CU's retainedTypes list. The declaration is left out
of the retainedTypes list in two cases:

1) Re-compiling pre-r266445 bitcode (in which declarations weren't added
   to the retainedTypes list), and
2) Doing LTO function importing (which doesn't update the retainedTypes
   list).

It's possible to handle (1) and (2) by modifying the retainedTypes list
(in AutoUpgrade, or in the LTO importing logic resp.), but I don't see
an advantage to doing it this way, as it would cause more DWARF to be
emitted compared to creating the declaration DIEs lazily.

Update #2:

Fold in a fix for call site tag emission in the split-dwarf + LTO case.

Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a
ReleaseLTO-g build of the test suite.

rdar://46577651, rdar://57855316, rdar://57840415, rdar://58888440

Differential Revision: https://reviews.llvm.org/D70350
@jacobly0 jacobly0 force-pushed the z80 branch 2 times, most recently from ec762b0 to ab7efaa Compare February 9, 2020 04:22
@jacobly0 jacobly0 force-pushed the z80 branch 6 times, most recently from e5934f7 to 5b472dc Compare March 7, 2020 23:41
@jacobly0 jacobly0 force-pushed the z80 branch 3 times, most recently from 0832f55 to 55bc950 Compare January 30, 2022 23:05
@jacobly0 jacobly0 merged commit af3048c into jacobly0:z80 Jan 31, 2022
jacobly0 pushed a commit that referenced this pull request Feb 8, 2022
We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f2bb3d152e4 in ?? ()
#3  0x00007ffcc5f0cc80 in ?? ()
#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00007f2bb3d144ee in ?? ()
#8  0x746e692f706d742f in ?? ()
#9  0x692d747065637265 in ?? ()
#10 0x2f653631326b3034 in ?? ()
#11 0x646d632e35353532 in ?? ()
#12 0x0000000000000000 in ?? ()
```

I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.

All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.

Reviewed-by: NoQ

Differential Revision: https://reviews.llvm.org/D118439
nebulatgs pushed a commit to nebulatgs/llvm-project that referenced this pull request Mar 23, 2022
We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
jacobly0#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
jacobly0#2  0x00007f2bb3d152e4 in ?? ()
jacobly0#3  0x00007ffcc5f0cc80 in ?? ()
jacobly0#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
jacobly0#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
jacobly0#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
jacobly0#7  0x00007f2bb3d144ee in ?? ()
jacobly0#8  0x746e692f706d742f in ?? ()
jacobly0#9  0x692d747065637265 in ?? ()
jacobly0#10 0x2f653631326b3034 in ?? ()
jacobly0#11 0x646d632e35353532 in ?? ()
jacobly0#12 0x0000000000000000 in ?? ()
```

I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.

All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.

Reviewed-by: NoQ

Differential Revision: https://reviews.llvm.org/D118439

(cherry picked from commit d919d02)
jacobly0 pushed a commit that referenced this pull request Jun 4, 2022
This reverts commit c274b6e.

The x86_64 debian bot got a failure with this patch,
https://lab.llvm.org/buildbot#builders/68/builds/33078
where
SymbolFile/DWARF/x86/DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s
is crashing here -

 #2 0x0000000000425a9f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f57160e9140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140)
 #4 0x00007f570d911e43 lldb_private::SourceManager::GetFile(lldb_private::FileSpec const&) crtstuff.c:0:0
 #5 0x00007f570d914270 lldb_private::SourceManager::DisplaySourceLinesWithLineNumbers(lldb_private::FileSpec const&, unsigned int, unsigned int, unsigned int, unsigned int, char const*, lldb_private::Stream*, lldb_private::SymbolContextList const*) crtstuff.c:0:0
 #6 0x00007f570da662c8 lldb_private::StackFrame::GetStatus(lldb_private::Stream&, bool, bool, bool, char const*) crtstuff.c:0:0

I don't get a failure here my mac, I'll review this method more
closely tomorrow.
jacobly0 pushed a commit that referenced this pull request Aug 6, 2024
```
  UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the
test isn't Linux-specific at all). With
`UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a
duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
    #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.

This turns out to be another fallout from fixing
`__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In
`sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the
`pc` arg is the return address, while `pc1` from the stack frame
(`fr_savpc`) is the address of the `call` insn, leading to a double
entry for the innermost frame in `trace_buffer[]`.

This patch fixes this by moving the adjustment before all uses.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`
(with the `ubsan/TestCases/Misc/Linux` tests enabled).
jacobly0 pushed a commit that referenced this pull request Aug 22, 2024
…lvm#104148)

`hasOperands` does not always execute matchers in the order they are
written. This can cause issue in code using bindings when one operand
matcher is relying on a binding set by the other. With this change, the
first matcher present in the code is always executed first and any
binding it sets are available to the second matcher.

Simple example with current version (1 match) and new version (2
matches):
```bash
> cat tmp.cpp
int a = 13;
int b = ((int) a) - a;
int c = a - ((int) a);

> clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
int a = 13;
^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
int b = ((int)a) - a;
        ^~~~~~~~~~~~
1 match.

> ./build/bin/clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
    2 | int b = ((int)a) - a;
      |         ^~~~~~~~~~~~

Match #2:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:3:9: note: "root" binds here
    3 | int c = a - ((int)a);
      |         ^~~~~~~~~~~~
2 matches.
```

If this should be documented or regression tested anywhere please let me
know where.
jacobly0 pushed a commit that referenced this pull request Aug 22, 2024
…104523)

Compilers and language runtimes often use helper functions that are
fundamentally uninteresting when debugging anything but the
compiler/runtime itself. This patch introduces a user-extensible
mechanism that allows for these frames to be hidden from backtraces and
automatically skipped over when navigating the stack with `up` and
`down`.

This does not affect the numbering of frames, so `f <N>` will still
provide access to the hidden frames. The `bt` output will also print a
hint that frames have been hidden.

My primary motivation for this feature is to hide thunks in the Swift
programming language, but I'm including an example recognizer for
`std::function::operator()` that I wished for myself many times while
debugging LLDB.

rdar://126629381


Example output. (Yes, my proof-of-concept recognizer could hide even
more frames if we had a method that returned the function name without
the return type or I used something that isn't based off regex, but it's
really only meant as an example).

before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
(lldb) 
```

after

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
jacobly0 pushed a commit that referenced this pull request Aug 26, 2024
)

Currently, process of replacing bitwise operations consisting of
`LSR`/`LSL` with `And` is performed by `DAGCombiner`.

However, in certain cases, the `AND` generated by this process
can be removed.

Consider following case:
```
        lsr x8, x8, llvm#56
        and x8, x8, #0xfc
        ldr w0, [x2, x8]
        ret
```

In this case, we can remove the `AND` by changing the target of `LDR`
to `[X2, X8, LSL #2]` and right-shifting amount change to 56 to 58.

after changed:
```
        lsr x8, x8, llvm#58
        ldr w0, [x2, x8, lsl #2]
        ret
```

This patch checks to see if the `SHIFTING` + `AND` operation on load
target can be optimized and optimizes it if it can.
jacobly0 pushed a commit that referenced this pull request Aug 28, 2024
`JITDylibSearchOrderResolver` local variable can be destroyed before
completion of all callbacks. Capture it together with `Deps` in
`OnEmitted` callback.

Original error:

```
==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978
READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm)
    #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58
    #1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25
    #2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5
    #3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12
    #4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr,
```
jacobly0 pushed a commit that referenced this pull request Sep 6, 2024
Static destructor can race with calls to notify and trigger tsan
warning.

```
WARNING: ThreadSanitizer: data race (pid=5787)
  Write of size 1 at 0x55bec9df8de8 by thread T23:
    #0 pthread_mutex_destroy [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1344](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1344&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12affb) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #1 __libcpp_recursive_mutex_destroy [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:91](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=91&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #2 std::__tsan::recursive_mutex::~recursive_mutex() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:52](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=52&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9)
    #3 ~SmartMutex [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:28](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=28&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #4 (anonymous namespace)::PerfJITEventListener::~PerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:65](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=65&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe)
    #5 cxa_at_exit_callback_installed_at(void*) [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:437](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=437&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b172cb9) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #6 llvm::JITEventListener::createPerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:496](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=496&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcad8f5) (BuildId: ff25ace8b17d9863348bb1759c47246c)
```
```
Previous atomic read of size 1 at 0x55bec9df8de8 by thread T192 (mutexes: write M0, write M1):
    #0 pthread_mutex_unlock [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1387](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1387&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12b6bb) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #1 __libcpp_recursive_mutex_unlock [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:87](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=87&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #2 std::__tsan::recursive_mutex::unlock() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:64](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=64&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589)
    #3 unlock [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:47](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=47&cl=669089572):16 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #4 ~lock_guard [third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h:39](third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h?l=39&cl=669089572):101 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968)
    #5 (anonymous namespace)::PerfJITEventListener::notifyObjectLoaded(unsigned long, llvm::object::ObjectFile const&, llvm::RuntimeDyld::LoadedObjectInfo const&) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:290](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=290&cl=669089572):1 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968)
    #6 llvm::orc::RTDyldObjectLinkingLayer::onObjEmit(llvm::orc::MaterializationResponsibility&, llvm::object::OwningBinary<llvm::object::ObjectFile>, std::__tsan::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::__tsan::default_delete<llvm::RuntimeDyld::MemoryManager>>, std::__tsan::unique_ptr<llvm::RuntimeDyld::LoadedObjectInfo, std::__tsan::default_delete<llvm::RuntimeDyld::LoadedObjectInfo>>, std::__tsan::unique_ptr<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>, std::__tsan::default_delete<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>>>, llvm::Error) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:386](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp?l=386&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bc404a8) (BuildId: ff25ace8b17d9863348bb1759c47246c)
```
jacobly0 pushed a commit that referenced this pull request Sep 9, 2024
…llvm#94981)

This extends default argument deduction to cover class templates as
well, applying only to partial ordering, adding to the provisional
wording introduced in llvm#89807.

This solves some ambuguity introduced in P0522 regarding how template
template parameters are partially ordered, and should reduce the
negative impact of enabling `-frelaxed-template-template-args` by
default.

Given the following example:
```C++
template <class T1, class T2 = float> struct A;
template <class T3> struct B;

template <template <class T4> class TT1, class T5> struct B<TT1<T5>>;   // #1
template <class T6, class T7>                      struct B<A<T6, T7>>; // #2

template struct B<A<int>>;
```
Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This
patch restores the pre-P0522 behavior, `#2` is picked again.
jacobly0 pushed a commit that referenced this pull request Sep 20, 2024
When SPARC Asan testing is enabled by PR llvm#107405, many Linux/sparc64
tests just hang like
```
#0  0xf7ae8e90 in syscall () from /usr/lib32/libc.so.6
#1  0x701065e8 in __sanitizer::FutexWait(__sanitizer::atomic_uint32_t*, unsigned int) ()
    at compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:766
#2  0x70107c90 in Wait ()
    at compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35
#3  0x700f7cac in Lock ()
    at compiler-rt/lib/asan/../sanitizer_common/sanitizer_mutex.h:196
#4  Lock ()
    at compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_registry.h:98
#5  LockThreads ()
    at compiler-rt/lib/asan/asan_thread.cpp:489
#6  0x700e9c8c in __asan::BeforeFork() ()
    at compiler-rt/lib/asan/asan_posix.cpp:157
#7  0xf7ac83f4 in ?? () from /usr/lib32/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
```
It turns out that this happens in tests using `internal_fork` (e.g.
invoking `llvm-symbolizer`): unlike most other Linux targets, which use
`clone`, Linux/sparc64 has to use `__fork` instead. While `clone`
doesn't trigger `pthread_atfork` handlers, `__fork` obviously does,
causing the hang.

To avoid this, this patch disables `InstallAtForkHandler` and lets the
ASan tests run to completion.

Tested on `sparc64-unknown-linux-gnu`.
jacobly0 pushed a commit that referenced this pull request Sep 20, 2024
…ap (llvm#108825)

This attempts to improve user-experience when LLDB stops on a
verbose_trap. Currently if a `__builtin_verbose_trap` triggers, we
display the first frame above the call to the verbose_trap. So in the
newly added test case, we would've previously stopped here:
```
(lldb) run
Process 28095 launched: '/Users/michaelbuch/a.out' (arm64)
Process 28095 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access
    frame #1: 0x0000000100003f5c a.out`std::__1::vector<int>::operator[](this=0x000000016fdfebef size=0, (null)=10) at verbose_trap.cpp:6:9
   3    template <typename T>
   4    struct vector {
   5        void operator[](unsigned) {
-> 6            __builtin_verbose_trap("Bounds error", "out-of-bounds access");
   7        }
   8    };
```

After this patch, we would stop in the first non-`std` frame:
```
(lldb) run
Process 27843 launched: '/Users/michaelbuch/a.out' (arm64)
Process 27843 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access
    frame #2: 0x0000000100003f44 a.out`g() at verbose_trap.cpp:14:5
   11  
   12   void g() {
   13       std::vector<int> v;
-> 14       v[10];
   15   }
   16  
```

rdar://134490328
jacobly0 pushed a commit that referenced this pull request Oct 9, 2024
…ext is not fully initialized (llvm#110481)

As this comment around target initialization implies:
```
  // This can be NULL if we don't know anything about the architecture or if
  // the target for an architecture isn't enabled in the llvm/clang that we
  // built
```

There are cases where we might fail to call `InitBuiltinTypes` when
creating the backing `ASTContext` for a `TypeSystemClang`. If that
happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are
not initialized and dereferencing them as we do in
`GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to
nullptr-dereferences. Example backtrace:
```
(lldb) run
Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958.
Process 2680 stopped
* thread #15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
    frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + 
liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes
, bool) (.cold.1):
->  0x10cdf3cdc <+0>:  stp    x29, x30, [sp, #-0x10]!
    0x10cdf3ce0 <+4>:  mov    x29, sp
    0x10cdf3ce4 <+8>:  adrp   x0, 545
    0x10cdf3ce8 <+12>: add    x0, x0, #0xa25 ; "ParseObjCMethod"
Target 0: (lldb) stopped.
(lldb) bt
* thread #15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
    frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128
    frame #3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284
  * frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + 
    frame #5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188
    frame #6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384
```

This patch adds a one-time user-visible warning for when we fail to
initialize the AST to indicate that initialization went wrong for the
given target. Additionally, we add checks for whether one of the
`ASTContext` `QualType`s is invalid before dereferencing any builtin
types.

The warning would look as follows:
```
(lldb) target create "a.out"
Current executable set to 'a.out' (arm64).
(lldb) b main
warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly.
Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90
```

rdar://134869779
jacobly0 pushed a commit that referenced this pull request Oct 9, 2024
Fixes llvm#102703.

https://godbolt.org/z/nfj8xsb1Y

The following pattern:

```
%2 = and i32 %0, 254
%3 = icmp eq i32 %2, 0
```
is optimised by instcombine into:

```%3 = icmp ult i32 %0, 2```

However, post instcombine leads to worse aarch64 than the unoptimised version.

Pre instcombine:
```
        tst     w0, #0xfe
        cset    w0, eq
        ret
```
Post instcombine:
```
        and     w8, w0, #0xff
        cmp     w8, #2
        cset    w0, lo
        ret
```


In the unoptimised version, SelectionDAG converts `SETCC (AND X 254) 0 EQ` into `CSEL 0 1 1 (ANDS X 254)`, which gets emitted as a `tst`.

In the optimised version, SelectionDAG converts `SETCC (AND X 255) 2 ULT` into `CSEL 0 1 2 (SUBS (AND X 255) 2)`, which gets emitted as an `and`/`cmp`.

This PR adds an optimisation to `AArch64ISelLowering`, converting `SETCC (AND X Y) Z ULT` into `SETCC (AND X (Y & ~(Z - 1))) 0 EQ` when `Z` is a power of two. This makes SelectionDAG/Codegen produce the same optimised code for both examples.
jacobly0 pushed a commit that referenced this pull request Nov 20, 2024
…onger cause a crash (llvm#116569)

This PR fixes a bug introduced by llvm#110199, which causes any half float
argument to crash the compiler on MIPS64.

Currently compiling this bit of code with `llc -mtriple=mips64`: 
```
define void @half_args(half %a) nounwind {
entry:
        ret void
}
```

Crashes with the following log:
```
LLVM ERROR: unable to allocate function argument #0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: llc -mtriple=mips64
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'MIPS DAG->DAG Pattern Instruction Selection' on function '@half_args'
 #0 0x000055a3a4013df8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32d0df8)
 #1 0x000055a3a401199e llvm::sys::RunSignalHandlers() (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32ce99e)
 #2 0x000055a3a40144a8 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f00bde558c0 __restore_rt libc_sigaction.c:0:0
 #4 0x00007f00bdea462c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007f00bde55822 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #6 0x00007f00bde3e4af abort ./stdlib/abort.c:81:7
 #7 0x000055a3a3f80e3c llvm::report_fatal_error(llvm::Twine const&, bool) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x323de3c)
 #8 0x000055a3a2e20dfa (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x20dddfa)
 #9 0x000055a3a2a34e20 llvm::MipsTargetLowering::LowerFormalArguments(llvm::SDValue, unsigned int, bool, llvm::SmallVectorImpl<llvm::ISD::InputArg> const&, llvm::SDLoc const&, llvm::SelectionDAG&, llvm::SmallVectorImpl<llvm::SDValue>&) const MipsISelLowering.cpp:0:0
#10 0x000055a3a3d896a9 llvm::SelectionDAGISel::LowerArguments(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30466a9)
#11 0x000055a3a3e0b3ec llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c83ec)
#12 0x000055a3a3e09e21 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c6e21)
#13 0x000055a3a2aae1ca llvm::MipsDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) MipsISelDAGToDAG.cpp:0:0
#14 0x000055a3a3e07706 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c4706)
#15 0x000055a3a3051ed6 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x230eed6)
#16 0x000055a3a35a3ec9 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x2860ec9)
#17 0x000055a3a35ac3b2 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x28693b2)
#18 0x000055a3a35a499c llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x286199c)
#19 0x000055a3a262abbb main (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x18e7bbb)
#20 0x00007f00bde3fc4c __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#21 0x00007f00bde3fd05 call_init ./csu/../csu/libc-start.c:128:20
#22 0x00007f00bde3fd05 __libc_start_main@GLIBC_2.2.5 ./csu/../csu/libc-start.c:347:5
#23 0x000055a3a2624921 _start /builddir/glibc-2.39/csu/../sysdeps/x86_64/start.S:117:0
```

This is caused by the fact that after the change, `f16`s are no longer
lowered as `f32`s in calls.

Two possible fixes are available:
- Update calling conventions to properly support passing `f16` as
integers.
- Update `useFPRegsForHalfType()` to return `true` so that `f16` are
still kept in `f32` registers, as before llvm#110199.

This PR implements the first solution to not introduce any more ABI
changes as llvm#110199 already did.

As of what is the correct ABI for halfs, I don't think there is a
correct answer. GCC doesn't support halfs on MIPS, and I couldn't find
any information on old MIPS ABI manuals either.
jacobly0 pushed a commit that referenced this pull request Nov 27, 2024
…#116656)

The main issue to solve is that OpenMP modifiers can be specified in any
order, so the parser cannot expect any specific modifier at a given
position. To solve that, define modifier to be a union of all allowable
specific modifiers for a given clause.

Additionally, implement modifier descriptors: for each modifier the
corresponding descriptor contains a set of properties of the modifier
that allow a common set of semantic checks. Start with the syntactic
properties defined in the spec: Required, Unique, Exclusive, Ultimate,
and implement common checks to verify each of them.

OpenMP modifier overhaul: #2/3
jacobly0 pushed a commit that referenced this pull request Nov 29, 2024
…plementation (llvm#108413. llvm#117704) (llvm#117894)

Relands llvm#117704, which relanded changes from llvm#108413 - this was reverted
due to build issues. The new offload library did not build with
`LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by
pre-merge testing.

The last commit contains the fix; everything else is otherwise identical
to the approved PR.
___

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
jacobly0 pushed a commit that referenced this pull request Nov 29, 2024
…abort (llvm#117603)

Hey guys, I found that Flang's built-in ABORT function is incomplete
when I was using it. Compared with gfortran's ABORT (which can both
abort and print out a backtrace), flang's ABORT implementation lacks the
function of printing out a backtrace. This feature is essential for
debugging and understanding the call stack at the failure point.

To solve this problem, I completed the "// TODO:" of the abort function,
and then implemented an additional built-in function BACKTRACE for
flang. After a brief reading of the relevant source code, I used
backtrace and backtrace_symbols in "execinfo.h" to quickly implement
this. But since I used the above two functions directly, my
implementation is slightly different from gfortran's implementation (in
the output, the function call stack before main is additionally output,
and the function line number is missing). In addition, since I used the
above two functions, I did not need to add -g to embed debug information
into the ELF file, but needed -rdynamic to ensure that the symbols are
added to the dynamic symbol table (so that the function name will be
printed out).

Here is a comparison of the output between gfortran 's backtrace and my
implementation:
gfortran's implemention output:
```
#0  0x557eb71f4184 in testfun2_
        at /home/hunter/plct/fortran/test.f90:5
#1  0x557eb71f4165 in testfun1_
        at /home/hunter/plct/fortran/test.f90:13
#2  0x557eb71f4192 in test_backtrace
        at /home/hunter/plct/fortran/test.f90:17
#3  0x557eb71f41ce in main
        at /home/hunter/plct/fortran/test.f90:18
```
my impelmention output:
```
Backtrace:
#0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92]
#1 ./test(testfun2_+0x14) [0x574f07efc7b4]
#2 ./test(testfun1_+0xd) [0x574f07efc7cd]
#3 ./test(_QQmain+0x9) [0x574f07efc7e9]
#4 ./test(main+0x12) [0x574f07efc802]
#5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08]
#6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc]
#7 ./test(_start+0x25) [0x574f07efc6c5]
```
test program is:
```
function testfun2() result(err)
  implicit none
  integer :: err
  err = 1
  call backtrace
end function testfun2

subroutine testfun1()
  implicit none
  integer :: err
  integer :: testfun2

  err = testfun2()
end subroutine testfun1

program test_backtrace
  call testfun1()
end program test_backtrace
```
I am well aware of the importance of line numbers, so I am now working
on implementing line numbers (by parsing DWARF information) and
supporting cross-platform (Windows) support.
jacobly0 pushed a commit that referenced this pull request Nov 29, 2024
jacobly0 pushed a commit that referenced this pull request Dec 6, 2024
…ne symbol size as symbols are created (llvm#117079)"

This reverts commit ba668eb.

Below test started failing again on x86_64 macOS CI. We're unsure
if this patch is the exact cause, but since this patch has broken
this test before, we speculatively revert it to see if it was indeed
the root cause.
```
FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162)
******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument]
RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
         ^
<stdin>:26:64: note: scanning from here
 frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
                                                               ^
<stdin>:27:2: note: possible intended match here
 frame #2: 0x00007ff7bfeff6c0
 ^

Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           21:  0x100003ed1 <+0>: pushq %rbp
           22:  0x100003ed2 <+1>: movq %rsp, %rbp
           23: (lldb) thread backtrace -u
           24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
           25:  * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar
           26:  frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
check:21'0                                                                    X error: no match found
           27:  frame #2: 0x00007ff7bfeff6c0
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:21'1      ?                             possible intended match
           28:  frame #3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  frame #4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  frame #5: 0x00007ff8193cc41f dyld`start + 1903
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: (lldb) exit
check:21'0     ~~~~~~~~~~~~
>>>>>>
```
jacobly0 pushed a commit that referenced this pull request Dec 7, 2024
## Description

This PR fixes a segmentation fault that occurs when passing options
requiring arguments via `-Xopenmp-target=<triple>`. The issue was that
the function `Driver::getOffloadArchs` did not properly parse the
extracted option, but instead assumed it was valid, leading to a crash
when incomplete arguments were provided.

## Backtrace

```sh
llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o
1.      Compilation construction
2.      Building compilation actions
 #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b)
 #1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70)
 #4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d)
 #5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0
 #6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093)
 #7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d)
 #8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c)
 #9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6)
#10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3
#11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e)
[1]    2628042 segmentation fault (core dumped)   main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu  -o
```
jacobly0 pushed a commit that referenced this pull request Dec 7, 2024
llvm#118923)

…d reentry.

These utilities provide new, more generic and easier to use support for
lazy compilation in ORC.

LazyReexportsManager is an alternative to LazyCallThroughManager. It
takes requests for lazy re-entry points in the form of an alias map:
lazy-reexports = {
  ( <entry point symbol #1>, <implementation symbol #1> ),
  ( <entry point symbol #2>, <implementation symbol #2> ),
  ...
  ( <entry point symbol #n>, <implementation symbol #n> )
}

LazyReexportsManager then:
1. binds the entry points to the implementation names in an internal
table.
2. creates a JIT re-entry trampoline for each entry point.
3. creates a redirectable symbol for each of the entry point name and
binds redirectable symbol to the corresponding reentry trampoline.

When an entry point symbol is first called at runtime (which may be on
any thread of the JIT'd program) it will re-enter the JIT via the
trampoline and trigger a lookup for the implementation symbol stored in
LazyReexportsManager's internal table. When the lookup completes the
entry point symbol will be updated (via the RedirectableSymbolManager)
to point at the implementation symbol, and execution will proceed to the
implementation symbol.

Actual construction of the re-entry trampolines and redirectable symbols
is delegated to an EmitTrampolines functor and the
RedirectableSymbolsManager respectively.

JITLinkReentryTrampolines.h provides a JITLink-based implementation of
the EmitTrampolines functor. (AArch64 only in this patch, but other
architectures will be added in the near future).

Register state save and reentry functionality is added to the ORC
runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation
functions (the latter is generic, the former will need custom
implementations for each ABI and architecture to be supported, however
this should be much less effort than the existing OrcABISupport
approach, since the ORC runtime allows this code to be written as native
assembly).

The resulting system:
1. Works equally well for in-process and out-of-process JIT'd code.
2. Requires less boilerplate to set up.

Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC
runtime), setup is just:

```c++
auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL);
if (!RSMgr)
  return RSMgr.takeError();

auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD);
if (!LRMgr)
  return LRMgr.takeError();
```

after which lazy reexports can be introduced with:

```c++
JD.define(lazyReexports(LRMgr, <alias map>));
```

LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR
level CompileOnDemandLayer will continue to use LazyCallThroughManager
and OrcABISupport until the new system supports a wider range of
architectures and ABIs.

The llvm-jitlink utility's -lazy option now uses the new scheme. Since
it depends on the ORC runtime, the lazy-link.ll testcase and associated
helpers are moved to the ORC runtime.
jacobly0 pushed a commit that referenced this pull request Dec 16, 2024
The Clang binary (and any binary linking Clang as a library), when built
using PIE, ends up with a pretty shocking number of dynamic relocations
to apply to the executable image: roughly 400k.

Each of these takes up binary space in the executable, and perhaps most
interestingly takes start-up time to apply the relocations.

The largest pattern I identified were the strings used to describe
target builtins. The addresses of these string literals were stored into
huge arrays, each one requiring a dynamic relocation. The way to avoid
this is to design the target builtins to use a single large table of
strings and offsets within the table for the individual strings. This
switches the builtin management to such a scheme.

This saves over 100k dynamic relocations by my measurement, an over 25%
reduction. Just looking at byte size improvements, using the `bloaty`
tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and nearly 30%
reduction in `.rela.dyn` where those reloctaions are stored.

This is also visible in my benchmarking of binary start-up overhead at
least:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise
in binary execution time, this delta is pretty consistent, and
represents over 10% improvement. This is particularly interesting to me
because for very short source files, repeatedly starting the `clang`
binary is actually the dominant cost. For example, `configure` scripts
running against the `clang` compiler are slow in large part because of
binary start up time, not the time to process the actual inputs to the
compiler.

----

This PR implements the string tables using `constexpr` code and the
existing macro system. I understand that the builtins are moving towards
a TableGen model, and if complete that would provide more options for
modeling this. Unfortunately, that migration isn't complete, and even
the parts that are migrated still rely on the ability to break out of
the TableGen model and directly expand an X-macro style `BUILTIN(...)`
textually. I looked at trying to complete the move to TableGen, but it
would both require the difficult migration of the remaining targets, and
solving some tricky problems with how to move away from any macro-based
expansion.

I was also able to find a reasonably clean and effective way of doing
this with the existing macros and some `constexpr` code that I think is
clean enough to be a pretty good intermediate state, and maybe give a
good target for the eventual TableGen solution. I was also able to
factor the macros into set of consistent patterns that avoids a
significant regression in overall boilerplate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants