Pass fat pointers in two immediate arguments #26411

dotdash · 2015-06-18T22:28:30Z

This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the byval attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
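As a rough illustration of what "two immediate arguments" means, the sketch below splits a slice reference into its data pointer and length and hands them to a callee separately. The function names `slice_parts`, `sum_parts` and `sum_slice` are made up for this example and are not part of the PR; the actual lowering happens inside the compiler and is not visible at the source level.

```rust
use std::mem::size_of;

/// Splits a slice reference into the two machine words it is made of.
fn slice_parts(s: &[u8]) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

/// Hypothetical callee that receives the two words as separate arguments.
///
/// # Safety
/// `data` and `len` must describe a live, initialized slice.
unsafe fn sum_parts(data: *const u8, len: usize) -> u32 {
    // Rebuild the slice from its two components.
    let s = unsafe { std::slice::from_raw_parts(data, len) };
    s.iter().map(|&b| u32::from(b)).sum()
}

/// The ordinary signature; how the fat pointer travels is up to the compiler.
fn sum_slice(s: &[u8]) -> u32 {
    s.iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    let bytes = [1u8, 2, 3, 4];
    let (data, len) = slice_parts(&bytes);

    // A slice reference occupies exactly two pointer-sized words.
    assert_eq!(size_of::<&[u8]>(), 2 * size_of::<usize>());

    // Both call styles compute the same result.
    let by_parts = unsafe { sum_parts(data, len) };
    assert_eq!(by_parts, sum_slice(&bytes));
    println!("len = {}, sum = {}", len, by_parts);
}
```

The point is only that a fat pointer carries two words of information, so passing both directly avoids the memory copy plus the pointer to it.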

LLVM can also optimize the code a lot better, largely because many copies
are gone or can be optimized away. In addition, we can now emit attributes
like nonnull on the data and/or vtable pointers contained in the fat
pointer, potentially allowing for even more optimizations.
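The non-null property mentioned above can be observed from the language side: since the pointers inside a fat reference are never null, `Option` can use the null bit pattern for `None`. The helper `has_null_niche` below is a hypothetical name for this sketch, and the size comparison reflects how current rustc lays these types out rather than anything this PR introduces.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns true when wrapping `T` in `Option` costs no extra space,
/// which happens when `T` has an invalid bit pattern such as null.
fn has_null_niche<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // Fat references: the data pointer can never be null.
    println!("&[u8]:       {}", has_null_niche::<&[u8]>());
    println!("&dyn Debug:  {}", has_null_niche::<&dyn Debug>());
    // Raw fat pointers may be null, so `Option` needs extra room.
    println!("*const [u8]: {}", has_null_niche::<*const [u8]>());
}
```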

This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:

text	data	filename
5671479	3941461	before/librustc-d8ace771.so
5447663	3905745	after/librustc-d8ace771.so

1944425	2394024	before/libstd-d8ace771.so
1896769	2387610	after/libstd-d8ace771.so
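To put "a few percent smaller" into numbers, the small sketch below computes the relative reduction for the `text` column of the table above. The helper name `percent_saved` is made up for this example; the byte counts are the ones quoted in the table. It works out to just under 4% for librustc and about 2.5% for libstd.

```rust
/// Relative size reduction in percent, given sizes before and after.
fn percent_saved(before: u64, after: u64) -> f64 {
    (before - after) as f64 * 100.0 / before as f64
}

fn main() {
    // (name, text size before, text size after), taken from the table.
    let rows = [
        ("librustc", 5_671_479u64, 5_447_663u64),
        ("libstd", 1_944_425u64, 1_896_769u64),
    ];
    for &(name, before, after) in rows.iter() {
        println!("{}: text is {:.2}% smaller", name, percent_saved(before, after));
    }
}
```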

I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.

Fixes #22924

Cc #22891 (at least for fat pointers the code is good now)

rust-highfive · 2015-06-18T22:28:42Z

r? @pnkfelix

(rust_highfive has picked a reviewer for you, use r? to override)

dotdash · 2015-06-18T22:30:05Z

This is still WIP: the seemingly fragile debug backtrace test keeps failing for me, apparently because of different inlining behaviour. I wanted to put this up anyway to get some feedback on it.

huonw · 2015-06-18T22:42:00Z

Will this apply to:

struct NotAFatPointer { x: usize, y: isize }

?

dotdash · 2015-06-18T22:52:08Z

No. For now this handles fat pointers only. I still plan to revamp the way
arguments are handled and then expand on this to handle all small structs.
But I'm not sure when I'll have the time to finish that.
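For context on why the question comes up: the struct from the question has the same size as a fat pointer, so size alone does not distinguish them. The sketch below checks that; `is_two_words` and `span` are hypothetical helpers for this example, and nothing here shows how the compiler actually passes either type.

```rust
use std::mem::size_of;

/// The struct from the question: two words, but not a fat pointer.
struct NotAFatPointer {
    x: usize,
    y: isize,
}

/// Returns true when `T` occupies exactly two pointer-sized words.
fn is_two_words<T>() -> bool {
    size_of::<T>() == 2 * size_of::<usize>()
}

/// Takes the small struct by value, the case this PR does not yet cover.
fn span(p: NotAFatPointer) -> isize {
    p.x as isize + p.y
}

fn main() {
    println!("NotAFatPointer is two words: {}", is_two_words::<NotAFatPointer>());
    println!("&[u8] is two words:          {}", is_two_words::<&[u8]>());
    println!("&str is two words:           {}", is_two_words::<&str>());
    println!("span = {}", span(NotAFatPointer { x: 2, y: -5 }));
}
```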

petrochenkov · 2015-06-19T00:54:26Z

What are long-term goals for argument passing? Does it make sense to get it as close to some well-established ABIs (like System V on x86_64) as possible? This PR looks like a step in that direction.

And maybe a silly question: why aren't fat pointers moved around as (one or two) immediates everywhere, not only in arguments? Has anyone tried it, and is there any difference (during and after optimization) compared to the current scheme?

Aatch · 2015-06-19T04:02:29Z

@petrochenkov the answer to your question is just that what we have now works. More specifically, before we had fat pointers we only handled word-sized or smaller types as immediates. So it stuck when we added fat pointers.

dotdash · 2015-06-19T07:29:14Z

We're probably going to get closer to the well-established ABIs because some things are plain better than what we currently have. But we'll also have to check where Rust's semantics allow us to do even better. For example, we might be able to statically omit some copies that couldn't be omitted in C when passing things by value. In that case, it might be better to keep passing pointers to the "copy" instead of using the copy-at-a-fixed-stack-offset mechanism, which usually prohibits plain forwarding of the existing copy and needs a new copy for each callee.

dotdash · 2015-06-19T12:34:40Z

So the backtrace-debuginfo test fails with optimizations enabled because LLVM can tail-merge the blocks that call dump_filelines now, which means that we get the same line info for both of these call paths:

https://github.com/dotdash/rust/blob/e4872167f5dda8eebc3b68a2050f870fa4457b50/src/test/run-pass/backtrace-debuginfo.rs#L92
https://github.com/dotdash/rust/blob/e4872167f5dda8eebc3b68a2050f870fa4457b50/src/test/run-pass/backtrace-debuginfo.rs#L103

Does anybody have an idea how to "work around" that optimization? If not, I'd like to just remove the second call, as that seems to be testing LLVM rather than rust.
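To make the problem concrete, here is a minimal sketch of the general shape of code that is open to this kind of merging. It is not the actual test; `report` and `run` are hypothetical names. Both branches end in an identical call, so an optimizer may keep a single copy of the call, and a single copy can only carry one source line.

```rust
/// Stand-in for a function whose call site we would like to identify
/// in a backtrace. `inline(never)` keeps it as a real call.
#[inline(never)]
fn report(tag: &str) -> usize {
    tag.len()
}

fn run(flag: bool) -> usize {
    if flag {
        // Call site A.
        report("same")
    } else {
        // Call site B: same callee and arguments, but a different line.
        // With optimizations on, A and B may end up as one call instruction.
        report("same")
    }
}

fn main() {
    // The observable result is identical either way; only the line
    // information attached to the call can differ.
    println!("{} {}", run(true), run(false));
}
```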

dotdash · 2015-06-19T12:56:16Z

With the debug backtrace test fixed, this passes the test suite for me locally, so I consider this ready now.

bors · 2015-06-19T22:57:23Z

☔ The latest upstream changes (presumably #26351) made this pull request unmergeable. Please resolve the merge conflicts.

dotdash · 2015-06-20T02:29:54Z

Rebased

Aatch · 2015-06-20T03:56:03Z

@bors r+

bors · 2015-06-20T03:56:03Z

📌 Commit a3d66ae has been approved by Aatch

bors · 2015-06-20T12:10:46Z

⌛ Testing commit a3d66ae with merge e25c15b...

bors · 2015-06-20T13:11:53Z

💔 Test failed - auto-mac-32-opt

dotdash · 2015-06-20T16:59:37Z

@bors r=aatch

bors · 2015-06-20T16:59:37Z

📌 Commit f777562 has been approved by aatch

bors · 2015-06-20T19:29:34Z

⌛ Testing commit f777562 with merge 306a99e...

bors · 2015-06-20T21:36:47Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt

rust-highfive assigned pnkfelix Jun 18, 2015

dotdash force-pushed the fat_in_registers branch from e487216 to 986be42 Compare June 19, 2015 12:52

dotdash changed the title ~~[WIP] Pass fat pointers in two immediate arguments instead an indirect arguments~~ Pass fat pointers in two immediate arguments Jun 19, 2015

dotdash added 3 commits June 20, 2015 03:33

f862da5 Use a single match arm for all TyRef variants when deducing function argument attributes

This makes it a lot easier to later add attributes for fat pointers.

dea5a96 Simplify argument forwarding in the various shim generators

02d74a4 Make trans_arg_datum fill a destination vector instead of returning its result

This makes it easier to support translating a single rust argument to more than one llvm argument value later.

dotdash force-pushed the fat_in_registers branch from 986be42 to a3d66ae Compare June 20, 2015 02:29

dotdash force-pushed the fat_in_registers branch from a3d66ae to f777562 Compare June 20, 2015 16:59

bors merged commit f777562 into rust-lang:master Jun 20, 2015

bors mentioned this pull request Jun 20, 2015

[RFC] add IndexAssign trait #25628

Closed

dotdash mentioned this pull request Jun 22, 2015

Rust should use registers more aggressively #26494

Closed

brson added the relnotes label Jun 23, 2015

eddyb mentioned this pull request Jul 1, 2015

ICE with type-pruned complex smart pointer #26709

Closed

hanna-kruppe mentioned this pull request Jul 19, 2015

Manipulating slice through &mut parameter not optimized well #27130

Closed

dotdash deleted the fat_in_registers branch July 27, 2015 08:49
