Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Off-by-one spans in MIR borrowck errors #46885

Closed
davidtwco opened this issue Dec 20, 2017 · 6 comments
Closed

Off-by-one spans in MIR borrowck errors #46885

davidtwco opened this issue Dec 20, 2017 · 6 comments
Assignees
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-NLL Area: Non-lexical lifetimes (NLL)

Comments

@davidtwco
Copy link
Member

davidtwco commented Dec 20, 2017

This is a minor thing, but you can see in the below snippet that just below line 18, the borrowed value only lives until here is a character to the right of where it is in the Ast equivalent.

error[E0597]: `x` does not live long enough (Ast)
  --> $DIR/issue-46471.rs:15:6
   |
15 |     &x
   |      ^ does not live long enough
...
18 | }
   | - borrowed value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...

error[E0597]: `x` does not live long enough (Mir)
  --> $DIR/issue-46471.rs:15:5
   |
15 |     &x
   |     ^^ does not live long enough
...
18 | }
   |  - borrowed value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...

error: aborting due to 2 previous errors

I've seen it in the following tests and there are probably a handful of others:

@estebank estebank added the A-diagnostics Area: Messages for errors, warnings, and lints label Dec 20, 2017
@nikomatsakis nikomatsakis added this to the NLL Future Compat Warnings milestone Jan 4, 2018
@pnkfelix pnkfelix added the A-NLL Area: Non-lexical lifetimes (NLL) label Jan 10, 2018
@davidtwco
Copy link
Member Author

I'll tackle this one.

@davidtwco
Copy link
Member Author

Here's what I've figured out so far:

I've been tracing the span in the context of the issue-46472 test. The errors in the examples given originate from the report_borrowed_value_does_not_live_long_enough function, specifically the drop_span argument:

pub(super) fn report_borrowed_value_does_not_live_long_enough(
&mut self,
context: Context,
borrow: &BorrowData<'tcx>,
drop_span: Span,
borrows: &ActiveBorrows<'cx, 'gcx, 'tcx>
) {

This function is called (at least in the example given) by the check_access_for_conflict function:

WriteKind::StorageDeadOrDrop => {
error_reported = true;
this.report_borrowed_value_does_not_live_long_enough(
context,
borrow,
place_span.1,
flow_state.borrows.operator(),
);
}

The span is passed to that by its caller, which is the access_place function:

let conflict_error =
self.check_access_for_conflict(context, place_span, sd, rw, flow_state);

The span is passed to that function by its caller, which is the visit_statement_entry function:

StatementKind::StorageDead(local) => {
self.access_place(
ContextKind::StorageDead.new(location),
(&Place::Local(local), span),
(Shallow(None), Write(WriteKind::StorageDeadOrDrop)),
LocalMutationIsAllowed::Yes,
flow_state,
);
}

We can see from the above snippet that the span is part of a Statement, specifically it is a StatementKind::StorageDrop statement. Going further, that function is given the statement by the process_basic_block function:

fn process_basic_block(&mut self, bb: BasicBlock, flow_state: &mut Self::FlowState) {
let BasicBlockData { ref statements, ref terminator, is_cleanup: _ } =
self.mir()[bb];
let mut location = Location { block: bb, statement_index: 0 };
for stmt in statements.iter() {
flow_state.reconstruct_statement_effect(location);
self.visit_statement_entry(location, stmt, flow_state);
flow_state.apply_local_effect(location);
location.statement_index += 1;
}

By adding debug! statements anywhere in rustc_mir::build that mentions StorageDrop, we can find that it is created in the build_scope_drops function:

match drop_data.location {
Place::Local(index) if index.index() > arg_count => {
cfg.push(block, Statement {
source_info,
kind: StatementKind::StorageDead(index)
});
}
_ => continue
}

At the top of this function, we see that the span originates from the scope argument.

fn build_scope_drops<'tcx>(cfg: &mut CFG<'tcx>,
resume_block: BasicBlock,
scope: &Scope<'tcx>,
earlier_scopes: &[Scope<'tcx>],
mut block: BasicBlock,
arg_count: usize,
generator_drop: bool)
-> BlockAnd<()> {
debug!("build_scope_drops({:?} -> {:?})", block, scope);
let mut iter = scope.drops.iter().rev();
while let Some(drop_data) = iter.next() {
let source_info = scope.source_info(drop_data.span);

The caller of this function is pop_scope, where we see the scope being taken from the Builder field self.scopes:

let scope = self.scopes.pop().unwrap();
assert_eq!(scope.region_scope, region_scope.0);
self.cfg.push_end_region(self.hir.tcx(), block, region_scope.1, scope.region_scope);
let resume_block = self.resume_block();
unpack!(block = build_scope_drops(&mut self.cfg,
resume_block,
&scope,
&self.scopes,
block,
self.arg_count,
false));

If we print the value of the scope variable, we get the following:

DEBUG:<unknown>: DTW 1: Scope { visibility_scope: scope[0], region_scope: Node(ItemLocalId(4)), region_scope_span: /mnt/c/Users/David/Projects/personal/rust/src/test/ui/issue-46472.rs:13:29: 17:2, needs_cleanup: false, drops: [DropData { span: /mnt/c/Users/David/Projects/personal/rust/src/test/ui/issue-46472.rs:17:2: 17:2, location: _3, kind: Storage }, DropData { span: /mnt/c/Users/David/Projects/personal/rust/src/test/ui/issue-46472.rs:17:2: 17:2, location: _2, kind: Storage }, DropData { span: /mnt/c/Users/David/Projects/personal/rust/src/test/ui/issue-46472.rs:17:2: 17:2, location: _1, kind: Storage }], cached_exits: {}, cached_generator_drop: None, cached_unwind: CachedBlock { unwind: None, generator_drop: None } }

We can see the drops field contains the span that we're looking for. That's as far as I've gotten at the moment.

@nikomatsakis
Copy link
Contributor

So @davidtwco did some more investigation and narrowed the problem to the span produced by this line:

span: lo.to(self.prev_span),

On gitter, they wrote:

@nikomatsakis So, we have prev_span in the previous snippet. prev_span is only replaced by the new previous span in two places. These spans come from the TokenCursor::next function, which when poping over the frames from the stack, if it detects a closing brace, will create a new span that covers just this individual token.

In the closing delimiter case, it does this by taking the span from the frame (which in this case seems to be the whole function), and then finds the endpoint (which it considers to be the space after the brace) - it constructs the new span by taking the endpoint (space after function) as hi and then removes what it considers the length of the DelimToken::Brace (which is one) from that (giving us the brace symbol) and sets that as lo. ie. "} "

I think that the length of the DelimToken::Brace being one makes sense, it is one character. I think that for the logic we found with the lo.to(..) to make sense, we'd need to dig a bit deeper and find out where the stack of frames is being created, since the issue would be the closing delim frame ending after the character. We'd then end up with the lo being the character behind the brace (column 0) and the hi value being the brace (column 1) - and then lo.to(..) makes sense.

Now, it is possible that the rationale was that a span starting at column 0 makes no sense (which would happen if we modified the frame), and neither does a span over a character that doesn't actually span any BytePoss (which would happen if we modified the DelimToken length) - therefore the best option is to have the span extend out from the character by one like it does. In which case, changing to lo.until(..) is the best option.

I'm not sure, whatever you think is the best option. I've convinced myself both ways back and forth in writing this. I think that we should go with the lo.until(..) solution, unless there is precedent for spans at column 0 or with the lo and hi being the same BytePos over one character, I don't think that solution makes sense. If you consider the empty space after the brace that we're seeing a newline character, then the way the span is currently does make sense. Or not, I'm confusing myself.

It seems to me that the root problem is actually the span for end brace being "} ", but I thought I would cc @estebank, @nrc, and @petrochenkov to see if any of you has an opinion.

@davidtwco
Copy link
Member Author

davidtwco commented Jan 13, 2018

After running all the ui tests with the change to lo.until(..), it's pretty clear that this isn't a suitable solution:

---- [ui] ui/suggestions/str-array-assignment.rs stdout ----
        diff of stderr:

2         --> $DIR/str-array-assignment.rs:13:11
3          |
4       13 |   let t = if true { s[..2] } else { s };
+          |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected str, found &str
-          |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected str, found &str
6          |
7          = note: expected type `str`
8                     found type `&str`

There are lots of test results like this. We'll need to dig a bit deeper and find out what is specifically happening in the cases we're seeing an issue with.

It seems like in some cases, the span is " }" - I believe this is the intended situation. In the issue-46472 example, if I add some logging to the close_tt function where the span is created:

/// Returns the closing delimiter as a token tree.
pub fn close_tt(&self, span: Span) -> TokenTree {
let close_span = if span == DUMMY_SP {
DUMMY_SP
} else {
span.with_lo(span.hi() - BytePos(self.delim.len() as u32))
};
TokenTree::Token(close_span, self.close_token())
}

Then we find that the variables have the following values:

span=`Span { lo: BytePos(554), hi: BytePos(708), ctxt: #0 }` 
delim.len()=`1`
delim=`Brace`

Given this, I think the issue is that the span variable has hi: BytePos(708) rather than hi: BytePos(707). This function is called by the TokenStream::next function that was mentioned in the previous quoted message:

fn next(&mut self) -> TokenAndSpan {
loop {
let tree = if !self.frame.open_delim {
self.frame.open_delim = true;
Delimited { delim: self.frame.delim, tts: TokenStream::empty().into() }
.open_tt(self.frame.span)
} else if let Some(tree) = self.frame.tree_cursor.next() {
tree
} else if !self.frame.close_delim {
self.frame.close_delim = true;
Delimited { delim: self.frame.delim, tts: TokenStream::empty().into() }
.close_tt(self.frame.span)
} else if let Some(frame) = self.stack.pop() {
self.frame = frame;
continue
} else {
return TokenAndSpan { tok: token::Eof, sp: syntax_pos::DUMMY_SP }
};
match self.frame.last_token {
LastToken::Collecting(ref mut v) => v.push(tree.clone()),
LastToken::Was(ref mut t) => *t = Some(tree.clone()),
}
match tree {
TokenTree::Token(sp, tok) => return TokenAndSpan { tok: tok, sp: sp },
TokenTree::Delimited(sp, ref delimited) => {
let frame = TokenCursorFrame::new(sp, delimited);
self.stack.push(mem::replace(&mut self.frame, frame));
}
}
}
}

Here we can see that self.frame.span is what is being passed into close_tt and that self.frame is from self.stack.pop.

@davidtwco
Copy link
Member Author

So, a little more progress. In the advance_token function below, start_bytepos is BytePos(707) (the position of the } in the issue-46472 example.

fn advance_token(&mut self) -> Result<(), ()> {
match self.scan_whitespace_or_comment() {
Some(comment) => {
self.peek_span = comment.sp;
self.peek_tok = comment.tok;
}
None => {
if self.is_eof() {
self.peek_tok = token::Eof;
self.peek_span = self.mk_sp(self.filemap.end_pos, self.filemap.end_pos);
} else {
let start_bytepos = self.pos;
self.peek_tok = self.next_token_inner()?;
self.peek_span = self.mk_sp(start_bytepos, self.pos);
};
}
}
Ok(())
}

The call to next_token_inner returns the CloseDelim token we expect and advances self.pos to BytePos(708). This means that the span ends up being Span { lo: BytePos(707), hi: BytePos(708) } - ie. "} ".

If we modify the function to check if self.ch (the next character, also updated by next_token_inner) is whitespace, and if it is make the span Span { lo: BytePos(707), hi: BytePos(707) } then it fixes the issue. This feels like an ugly solution.

error[E0597]: borrowed value does not live long enough (Mir)
  --> /mnt/c/Users/David/Projects/personal/rust/src/test/ui/issue-46472.rs:14:10
   |
14 |     &mut 4
   |          ^ temporary value does not live long enough
...
17 | }
   | - temporary value only lives until here
   |

However, in doing this, we end up modifying the borrowck error to:

error[E0597]: borrowed value does not live long enough (Ast)
  --> /mnt/c/Users/David/Projects/personal/rust/src/test/ui/issue-46472.rs:14:10
   |
14 |       &mut 4
   |            ^ temporary value does not live long enough
15 |       //~^ ERROR borrowed value does not live long enough (Ast) [E0597]
16 |       //~| ERROR borrowed value does not live long enough (Mir) [E0597]
   |  ______________________________________________________________________-
17 | | }
   | |_ temporary value only lives until here
   |

This seems to suggest that this is dealt with somewhere in the borrowck code (or elsewhere) to move the BytePos back a position. Given that, I've looked through the borrowck code and found this:

fn region_end_span(&self, region: ty::Region<'tcx>) -> Option<Span> {
match *region {
ty::ReScope(scope) => {
Some(scope.span(self.tcx, &self.region_scope_tree).end_point())
}
_ => None
}
}

Notice the call to end_point on the span. The rest of the function is similar to the following code in the Mir borrow checker:

let region_scope_span = region_scope.span(self.hir.tcx(),
&self.hir.region_scope_tree);

Except, there's no call to end_point! I've submitted #47420 that fixes this issue - I really wish I'd noticed that difference earlier.

@estebank
Copy link
Contributor

@davidtwco your write up on the sleuthing you've had to do is amazingly detailed! Thank you so much for taking the time to figuring this out!

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jan 19, 2018
@nikomatsakis nikomatsakis modified the milestones: NLL run-pass without ICEs, NLL soundness violations Jan 19, 2018
@davidtwco davidtwco self-assigned this Jan 25, 2018
bors added a commit that referenced this issue Jan 27, 2018
Fix off-by-one spans in MIR borrowck errors

Fixes #46885.

r? @nikomatsakis
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-NLL Area: Non-lexical lifetimes (NLL)
Projects
None yet
Development

No branches or pull requests

5 participants