Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree Borrows: improved diagnostics #2828

Merged
merged 2 commits into from
Apr 28, 2023
Merged

Conversation

Vanille-N
Copy link
Contributor

@Vanille-N Vanille-N commented Mar 29, 2023

Better diagnostics for Tree Borrows violations.

  • Shows where the conflicting tags (the one that was accessed and the one that had a permission or protector that caused UB) were reborrowed, which is more readable than only <TAG>
  • Shows a small history of what happened for the faulty tag to get there (which lines caused it to lose read/write permissions)
  • Explains permissions and transitions in natural language (e.g. "does not have read permissions" instead of "is Disabled")

Not perfect, but at least testing TB will be less confusing.

Lack of range information from RangeMap makes selection of relevant events nontrivial: we reconstruct history from knowledge of (initial, final) and (offset, pi, p'i) so that initial -> final = p1 -> p1' = p2 -> p2' = p3 -> ... = final

TreeBorrowsUb { .. } => {
let helps = vec![
TreeBorrowsUb { trigger_history, faulty_history, .. } => {
let url = "https://perso.crans.org/vanille/treebor";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should link people to something helpful, and also that your write-up is helpful, but I'm not excited about linking someone to a site that's not managed by the Rust project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very understandable. I'll remove that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW the readme still links there.

We could do what we did with SB, and have a page in https://github.com/rust-lang/unsafe-code-guidelines/tree/master/wip and link there. For now that page would basically just link to Neven's blog, but in the future it could contain its own concise description of the rules (like we do for SB).

@@ -0,0 +1,39 @@
error: Undefined Behavior: read access through <TAG> is forbidden because it is a child of <TAG> which is Disabled.
Copy link
Member

@saethlin saethlin Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"which is disabled" clashes with the rest of the terminology in this diagnostic now. Where was it disabled? Nothing in this diagnostic mentions a disable. (this is rhetorical)

Not to open up the terminology question again, but I feel like in light of the other details given below it makes more sense to say "because it is a child of {tag} which lost all permissions". It seems like we have two sets of terminology here, and the one used in the helps requires a few more words but is more obvious to a reader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the diagnostic for protector violations too, it currently says "change from Active to Frozen"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should use the same set of terminology everywhere, but I think it should be the state names. Reserved, Active, Frozen, Disabled seems much clearer to me than having to talk about "future write permissions".

@RalfJung
Copy link
Member

RalfJung commented Apr 1, 2023

This required getting some more information from RangeMap: from just the offset to the actual range.

Please undo that, the range is meaningless. Instead do what we did for SB: the offset is the first byte where a conflict happens; no statement is made about later bytes. Conceptually this code iterates over bytes and checks each byte separately; the fact that RangeMap optimizes this is an implementation detail that should not be observable.

@Vanille-N
Copy link
Contributor Author

Yeah I had specifically tried to avoid SB's way of doing this, which is to reconstruct the previous range from the current offset, which led me to think I needed the range from RangeMap.
It is in fact not necessary, but I'll let you decide if you find the added complexity of extract_relevant worth it.

src/borrow_tracker/tree_borrows/diagnostics.rs Outdated Show resolved Hide resolved
@@ -0,0 +1,39 @@
error: Undefined Behavior: read access through <TAG> is forbidden because it is a child of <TAG> which is Disabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should use the same set of terminology everywhere, but I think it should be the state names. Reserved, Active, Frozen, Disabled seems much clearer to me than having to talk about "future write permissions".

LL | | n
LL | | });
| |______^
help: <TAG> lost read and write permissions due to a write access at offsets [0x0..0x8]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should tell a story, e.g.

  • TAG was created here in Reserved state by a reborrow (name parent tag, maybe?)
  • TAG later changed to Disabled state due to a Foreign write access at offsets...

I guess @saethlin will want us to avoid using special Tree Borrows terminology here, but I really don't see a good way to avoid that. Replacing state names by long permission descriptions doesn't really explain that much IMO, and this here currently describes Reserved as "with read and write permissions" which is confusing since that sounds like Active.

Maybe we have the help line be a short one using special terms, and then have a more detailed note/help below (or somewhere) that explains a bit more, and that people familiar with TB can skip?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing state names by long permission descriptions doesn't really explain that much IMO

🤷 I think it helps. It also gets us out of discussing disabled children.

Really we have a Stroustrup's Rule problem here: Everyone finds the new terminology confusing at first, but then later they want a terse notation for it. I've noticed that the pickup rate for changes in Miri can be amazingly slow, some rather experienced unsafe code users didn't notice the new span-based diagnostics until 6 months after they released.

So however you want to navigate all that is good with me. If we're going to put up all the Tree Borrows terminology, there needs to be an accessible way for people to learn what those terms mean. https://perso.crans.org/vanille/treebor/ is a great breakdown of the whole model, but it is quite intimidating for a newcomer who is just trying to figure out why Miri doesn't like their new data structure idea.

I've long wanted a sort of glossary built from common patterns that explains why they are invalid and how to fix them. https://miri.saethlin.dev/tree-borrows/ub exists now and will be complete once the upload finishes. I will work on such a glossary eventually if someone else doesn't beat me to it.

Copy link
Member

@RalfJung RalfJung Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stroustrup's Rule

Definitely. But we also have the problem that there's a lot going on and trying to cram that all into a single sentence will not result in anything readable.

If you have good suggestions for how to avoid the state names, I am all ears. But the PR as-is describes Reserved as "having read and write permissions" and that's just misleading -- and the more correct "having read and future-write permissions" is no better than just saying "Reserved". (In fact it is worse; having a concise name for a concept is very useful for documentation and teaching.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also gets us out of discussing disabled children.

It'll look more like "child of , which is Disabled", I think. We could also consider renaming Disabled to Inactive, Invalid, Deactivated, something like that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have good suggestions for how to avoid the state names, I am all ears.

I don't. I wish I did.

I was a bit surprised by the direction @Vanille-N took my comment, but as I dislike terse notation (because with my route through grad school I was constantly reading papers outside my subfield) I am not inclined to get in anyone's way when they are removing it. My initial comment was focused on the consistency or logical flow of terminology/phrasing because that's a clearer topic. If it turns out that addressing my comment that requires opening up a lot more lines of discussion let's do the incremental thing and add all the extra notes now and decide on the English that surrounds them later.

We could also consider renaming Disabled to Inactive, Invalid, Deactivated, something like that.

Please not another use of Invalid :p

I don't particularly care what color the shed is, but I want us to be happy with the term before Tree Borrows becomes a default.

@Vanille-N
Copy link
Contributor Author

Speaking of "cram that all into a single sentence", I hope this is more readable ?
I used help: by default because it's not obvious how to have that part of the code use another marker.

@bors
Copy link
Contributor

bors commented Apr 14, 2023

☔ The latest upstream changes (presumably #2843) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, just some nits for the messages!

And one more concern: could you benchmark this against tree-borrows without the history tracking? In SB we had to work a bit to avoid regressing perf too much.

src/borrow_tracker/tree_borrows/diagnostics.rs Outdated Show resolved Hide resolved
tests/fail/tree-borrows/read-to-local.stderr Outdated Show resolved Hide resolved
|
= help: the accessed tag <TAG> is a foreign tag for the conflicting tag <TAG>.
= help: the conflicting tag <TAG> is protected, and the access would cause it to transition from Reserved to Disabled, which is a loss of read and write permissions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard to follow. If the conflicting tag is protected, does its relation with the accessed tag even matter? Or could we say something more like "this access conflicts with protected tag "..."the access would cause the protected tag to transition..."..."the protected tag was created here"...? Somewhere it should also say something like "protected tags must not lose permissions" or so; it already says that for deallocation but not for these errors here.

src/borrow_tracker/tree_borrows/perms.rs Outdated Show resolved Hide resolved
src/borrow_tracker/tree_borrows/diagnostics.rs Outdated Show resolved Hide resolved
src/diagnostics.rs Outdated Show resolved Hide resolved
@Vanille-N
Copy link
Contributor Author

Vanille-N commented Apr 20, 2023

And one more concern: could you benchmark this against tree-borrows without the history tracking? In SB we had to work a bit to avoid regressing perf too much.

Tested just now. The easy comparison is between recording or ignoring the events: the only diagnostics-related operation that is done more than once per allocation is pushing an Event to the history. (The other operations are allocating a history (once per allocation), and filtering the history (once per execution))

With this measure, the two versions of the code with and without diagnostics have their mean ± σ that intersect on every single ./miri benchmark, so the cost of recording every event on every range is no more than 2%. I'll try a more in-depth comparison by also removing the computation of Span and stuff.

@RalfJung
Copy link
Member

Are there any situations where we can avoid storing an event, or discard events that we won't need any more?

@Vanille-N
Copy link
Contributor Author

Discard, yes. Events are garbage collected with the tag they are relevant to.

Avoid storing, no. An event is not an access, Event = (access, range, transition) where !transition.is_noop(), so there are by design at most 4 events (1 creation + ≤3 transitions) per tag per location.

All events can possibly be relevant to reconstruct the history of a tag, unless we're fine with having holes in the history, such as if we don't care where a tag went from Active to Frozen if the error is triggered because it is now Disabled.
I like having the current complete history of the permission, but it is true that we don't need all of it.

@RalfJung
Copy link
Member

Well I guess let's just go with this for now and maybe if @saethlin optimizes TB and this becomes an actual bottleneck we decide to only keep some of that history around. ;)

@@ -15,7 +15,7 @@ LL | *y = 1;
| ^^^^^^ write access through <TAG> (y, callee:y, caller:y) is forbidden.
|
= help: the accessed tag <TAG> (y, callee:y, caller:y) is a foreign tag for the protected tag <TAG> (callee:x).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the "foreign" part relevant here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "foreign" part is guaranteed, since protector violations outside of deallocations only ever occur on foreign accesses. If you think it distracts from the rest of the diagnostic it's easy to remove.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it is a distraction, to understand the error it is not relevant that the pointer is foreign.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm okay that also reads awkwardly... I have tweaked the wording a bit more, time to land this. :)

@saethlin
Copy link
Member

saethlin commented Apr 20, 2023

FWIW in Stacked Borrows I was rather shocked by how cheap the history is compared to the actual stacks. I think the reason this is the case is that these events are not split up per byte like aliasing properties are. In real programs I think a lot of the reborrows are many bytes in size and the history events are not split no matter what aliasing activity is below them.

@RalfJung
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Apr 28, 2023

📌 Commit 47059d4 has been approved by RalfJung

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented Apr 28, 2023

⌛ Testing commit 47059d4 with merge 81eda2a...

@bors
Copy link
Contributor

bors commented Apr 28, 2023

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 81eda2a to master...

@bors bors merged commit 81eda2a into rust-lang:master Apr 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants