Skip to content
This repository has been archived by the owner on Feb 5, 2019. It is now read-only.

Add stack overflow check for ARM Thumb instruction set. #4

Closed

Conversation

neykov
Copy link

@neykov neykov commented Feb 19, 2014

Besides the mechanical changes between the ARM and Thumb functions,
because of the different instruction sets, there is difference in how the
stack limit is located. The ARM version uses hardware which isn't available
on the lower-end Thumb processors (namely system co-processor and MMU)
therefore the stack limit is placed at a predefined location in
memory - STACK_LIMIT. It is the responsibility of the wrapping runtime
to manage this location with the correct value. It can vary from a simple
constant defined by the linker to actively managed variable by a RTOS
implementation.

Besides the mechanincal changes between the ARM and Thumb functions,
because of the different instruction sets, there is difference in how the
stack limit is located. The ARM version uses hardware which isn't available
on the lower-end Thumb processors (namely system co-processor and MMU)
therefore the stack limit is placed at a predefined location in
memory - STACK_LIMIT. It is the responsibility of the wrapping runtime
to manage this location with the correct value. It can vary from a simple
constant defined by the linker to actively managed variable by a RTOS
implementation.
@alexcrichton
Copy link
Member

This even has tests, amazing!

Out of curiosity, would you be interested in helping us upstream the ARM-related segmented stack patches? All we really need are tests, and otherwise I think that upstream will be receptive to our patches.

I will merge this soon, thank you!

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 19, 2014
This updates the LLVM submodule to the `rust-llvm-2014-02-19` tag which is the
old one with rust-lang/llvm#4 cherry-picked on top.

Awesome job by @neykov for this!
@alexcrichton
Copy link
Member

Opened a PR on the rust repo as rust-lang/rust#12407 and landed as b015ecd, thanks!

@neykov
Copy link
Author

neykov commented Feb 20, 2014

Yes, I can prepare analogous patches for the ARM case.

@alexcrichton
Copy link
Member

That would be awesome! If you need any help, feel free to ask me questions.

bors added a commit to rust-lang/rust that referenced this pull request Feb 21, 2014
This updates the LLVM submodule to the `rust-llvm-2014-02-19` tag which is the
old one with rust-lang/llvm#4 cherry-picked on top.

Awesome job by @neykov for this!
@bharrisau
Copy link

The above doesn't work for thumb-1 (ARMV6-M, Cortex-M0). I had to modify the asm to make two changes.

Change MOV with MOVS (can only take an 8-bit constant). And align the STACK_LIMIT with .align 2.
Are the above constraints OK? Don't know if the 8-bit limit is too low?

@alexcrichton
Copy link
Member

Patches welcome!

@neykov
Copy link
Author

neykov commented Feb 25, 2014

@bharrisau The code is written specifically to support Thumb1 but I wouldn't be surprised if there are bugs in it. Could you give more details where you did the changes. You modified the assembly, generated by llc to compile it with GCC as? I haven't checked this workflow - will do in the following days.

  • MOV -> MOVS change - LLVM will generate the correct instruction in machine code. I wil check how it can be fixed in the assembly listing.
  • .align 2 - LLVM handles the data sections and inserts alignment instructions when necessary.
  • 8-bit limit - I was under the impression that MOVS supports shifting of the constant value (the constants are specifically selected to be a power of two for bigger values). After you pointed it out this seems not to be the case. Will have to research this in more detail.

Thanks for the feedback.

@bharrisau
Copy link

I ran as emit=asm so I will need to check a disassembly of an object to see
if this applies in that case. I'm still fighting with libstd so just used
the asm to see how it looked.
I built for thumbv6m-linux-eabi with CPU cortex-m0 if you want to try to
replicate. The error I was getting was something about suffix width can't
be guaranteed, was on the 'mov r, #x' instructions. I got the impression
that thumb 1 had to used movs for imm values.

I'm trying to get some of this done at the end of a 12 hr day, so I may be
slow in making any progress.
On 26/02/2014 3:59 am, "Svetoslav" notifications@github.com wrote:

@bharrisau https://github.com/bharrisau The code is written
specifically to support Thumb1 but I wouldn't be surprised if there are
bugs in it. Could you give more details where you did the changes. You
modified the assembly, generated by llc to compile it with GCC as? I
haven't checked this workflow - will do in the following days.

  • MOV -> MOVS change - LLVM will generate the correct instruction in
    machine code. I wil check how it can be fixed in the assembly listing.
  • .align 2 - LLVM handles the data sections and inserts alignment
    instructions when necessary.
  • 8-bit limit - I was under the impression that MOVS supports shifting
    of the constant value (the constants are specifically selected to be a
    power of two for bigger values). After you pointed it out this seems not to
    be the case. Will have to research this in more detail.

Thanks for the feedback.

Reply to this email directly or view it on GitHubhttps://github.com//pull/4#issuecomment-36050972
.

@bharrisau
Copy link

Pretty sure this will also surface if you compile with -C no-integrated-as

@bharrisau
Copy link

Emit=obj works fine, the disassembly has the proper movs instructions. So
the issue is with the asm printer or GCC.

@bharrisau
Copy link

Sorry to spam - I'll check this with a large stack frame tonight. The following suggests that the maximum movs value is 255 http://infocenter.arm.com/help/topic/com.arm.doc.dui0497a/BABHGAJI.html

LLVM might automatically change it into a LDR instruction?

@neykov
Copy link
Author

neykov commented Feb 26, 2014

Yes, if the value is larger than 255 it will have to be loaded analogously to the STACK_LIMIT address with LDR from the constant pool. At least the call to alignToARMConstant won't be necessary.

alexcrichton pushed a commit that referenced this pull request May 20, 2014
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it
into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx
more difficult.
For example:
Given
  %shr = lshr i64 %x, 4
  %and = and i64 %shr, 15
  %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and
  %0 = load i64* %arrayidx
With current shift folding, it takes 3 instrs to compute base address:
  lsr x8, x0, #1
  and x8, x8, #0x78
  add x8, x9, x8

If using ubfx, it only needs 2 instrs:
  ubfx  x8, x0, #4, #4
  add x8, x9, x8, lsl #3

This fixes bug 19589


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207702 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton pushed a commit that referenced this pull request Jun 12, 2015
…airs.

Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.

Previously, the read of w1 (see below) prevented the formation of a stp.

        str      w0, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        str     w1, [x2, #4]
        ret

We now generate the following code.

        stp      w0, w1, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        ret

All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239432 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton pushed a commit that referenced this pull request Jul 2, 2018
…ons.

Increment/decrement vector by multiple of predicate constraint
element count.

The variants added by this patch are:
 - INCH, INCW, INC 

and (saturating):
 - SQINCH, SQINCW, SQINCD
 - UQINCH, UQINCW, UQINCW
 - SQDECH, SQINCW, SQINCD
 - UQDECH, UQINCW, UQINCW

For example:
  incw z0.s, all, mul #4


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336090 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton pushed a commit that referenced this pull request Jul 10, 2018
r335553 with the non-trivial unswitching of switches.

The code correctly updated most aspects of the CFG and analyses, but
missed some crucial aspects:
1) When multiple cases have the same successor, we unswitch that
   a single time and replace the switch with a direct branch. The CFG
   here is correct, but the target of this direct branch may have had
   a PHI node with multiple entries in it.
2) When we still have to clone a successor of the switch into an
   unswitched copy of the loop, we'll delete potentially multiple edges
   entering this successor, not just one.
3) We also have to delete multiple edges entering the successors in the
   original loop when they have to be retained.
4) When the "retained successor" *also* occurs as a case successor, we
   just assert failed everywhere. This doesn't happen very easily
   because its always valid to simply drop the case -- the retained
   successor for switches is always the default successor. However, it
   is likely possible through some contrivance of different loop passes,
   unrolling, and simplifying for this to occur in practice and
   certainly there is nothing "invalid" about the IR so this pass needs
   to handle it.
5) In the case of #4, we also will replace these multiple edges with
   a direct branch much like in #1 and need to collapse the entries in
   any PHI nodes to a single enrty.

All of this stems from the delightful fact that the same successor can
show up in multiple parts of the switch terminator, and each of these
are considered a distinct edge for the purpose of PHI nodes (and
iterating the successors and predecessors) but not for unswitching
itself, the dominator tree, or many other things. For the record,
I intensely dislike this "feature" of the IR in large part because of
the complexity it causes in passes like this. We already have a ton of
logic building sets and handling duplicates, and we just had to add
a bunch more.

I've added a complex test case that covers all five of the above failure
modes. I've also added a variation on it where #4 and #5 occur in loop
exit, adding fun where we have an LCSSA PHI node with "multiple entries"
despite have dedicated exits. There were no additional issues found by
this, but it seems a useful corner case to cover with testing.

One thing that working on all of this code has made painfully clear for
me as well is how amazingly inefficient our PHI node representation is
(in terms of the in-memory data structures and the APIs used to update
them). This code has truly marvelous complexity bounds because every
time we remove an entry from a PHI node we do a linear scan to find it
and then a linear update to the data structure to remove it. We could in
theory batch all of the PHI node updates into a single linear walk of
the operands making this much more efficient, but the APIs fight hard
against this and the fact that we have to handle duplicates in the
peculiar manner we do (removing all but one in some cases) makes even
implementing that very tedious and annoying. Anyways, none of this is
new here or specific to loop unswitching. All code in LLVM that updates
PHI node operands suffers from these problems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336536 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton pushed a commit that referenced this pull request Aug 2, 2018
…d VPlan for tests."

Memory leaks in tests.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/6289/steps/check-llvm%20asan/logs/stdio

Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x554ea8 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:106
    #1 0x56cef1 in llvm::VPlanTestBase::doAnalysis(llvm::Function&) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanTestBase.h:53:14
    #2 0x56bec4 in llvm::VPlanTestBase::buildHCFG(llvm::BasicBlock*) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanTestBase.h:57:3
    #3 0x571f1e in llvm::(anonymous namespace)::VPlanHCFGTest_testVPInstructionToVPRecipesInner_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanHCFGTest.cpp:119:15
    #4 0xed2291 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
    #5 0xed44c8 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
    #6 0xed5890 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
    #7 0xef3634 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
    #8 0xef27e0 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
    #9 0xebbc23 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
    #10 0xebbc23 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:51
    #11 0x7f65569592e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)

and more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336718 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton pushed a commit that referenced this pull request Aug 2, 2018
…ering"

This reverts commit r337021.

WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7
    #1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3
    #2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3
    #3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18
    #4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23
    #5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5
    #6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const*, llvm::raw_ostream&, char const*) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5
    #7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3
    #8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26

[...]

Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv'
    #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337079 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton pushed a commit that referenced this pull request Aug 22, 2018
Summary:
Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results.

For example:


```
extern int bar(int[]);

int foo(int i) {
  int a[i]; // VLA
  asm volatile(
      "mov r7, #1"
    :
    :
    : "r7"
  );

  return 1 + bar(a);
}
```

Compiled for thumb, this gives:
```
$ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb
...
foo:
        .fnstart
@ %bb.0:                                @ %entry
        .save   {r4, r5, r6, r7, lr}
        push    {r4, r5, r6, r7, lr}
        .setfp  r7, sp, #12
        add     r7, sp, #12
        .pad    #4
        sub     sp, #4
        movs    r1, #7
        add.w   r0, r1, r0, lsl #2
        bic     r0, r0, #7
        sub.w   r0, sp, r0
        mov     sp, r0
        @app
        mov.w   r7, #1
        @NO_APP
        bl      bar
        adds    r0, #1
        sub.w   r4, r7, #12
        mov     sp, r4
        pop     {r4, r5, r6, r7, pc}
...
```

r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block.

This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward.  Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now.

The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter.

If we find a reserved register, we print a warning:
```
repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm]
      "mov r7, #1"
      ^
```

Reviewers: eli.friedman, olista01, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: efriedma, eraman, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D49727

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339257 91177308-0d34-0410-b5e6-96231b3b80d8
Sirokujira pushed a commit to Sirokujira/llvm that referenced this pull request Dec 3, 2018
…>> (32 - y) pattern"

*Seems* to be breaking sanitizer-x86_64-linux-fast buildbot,
the ELF/relocatable-versioned.s test:

==17758==MemorySanitizer CHECK failed: /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191 "((kBlockMagic)) == ((((u64*)addr)[0]))" (0x6a6cb03abcebc041, 0x0)
    #0 0x59716b in MsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/msan/msan.cc:393
    rust-lang#1 0x586635 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_termination.cc:79
    rust-lang#2 0x57d5ff in __sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191
    rust-lang#3 0x7fc21b24193f  (/lib/x86_64-linux-gnu/libc.so.6+0x3593f)
    rust-lang#4 0x7fc21b241999 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x35999)
    rust-lang#5 0x7fc21b22c2e7 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e7)
    rust-lang#6 0x57c039 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld+0x57c039)

This reverts commit r345014.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345017 91177308-0d34-0410-b5e6-96231b3b80d8
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants