Unreadable documents no longer exist in the Compilation #41190

jasonmalinowski · 2020-01-24T01:11:17Z

#40044 changed our API behavior. Previously, if a Document could not be read, an empty syntax tree would still be placed in the Compilation. Now no syntax tree is added at all, which means this assertion is no longer valid:

var syntaxTree = await document.GetSyntaxTreeAsync();
var compilation = await document.Project.GetCompilationAsync();

Assert.True(compilation.ContainsSyntaxTree(syntaxTree));

Some of my unit tests implicitly depended on this and were broken by the change, which is what alerted me to the break. @tmat's motivation for the change was to prevent an analyzer (say "you need a file header") from running on files that couldn't be loaded, but I'm wondering if that's better achieved by leaving the syntax tree there and just not running diagnostics (i.e. treat it as a generated file, since yes, the fake empty file is "generated!").

There is other code out there that is doing ContainsSyntaxTree() and in some cases asserting if the tree doesn't exist in the compilation, for example here:

roslyn/src/Features/Core/Portable/LanguageServices/SymbolDisplayService/AbstractSymbolDisplayService.AbstractSymbolDescriptionBuilder.cs

Lines 122 to 141 in 6cb854a

    
           var model = _semanticModel.GetOriginalSemanticModel(); 
        
           if (model.Compilation.ContainsSyntaxTree(tree)) 
        
           { 
        
               return model.Compilation.GetSemanticModel(tree); 
        
           } 
        
           // it is from one of its p2p references 
        
           foreach (var referencedCompilation in model.Compilation.GetReferencedCompilations()) 
        
           { 
        
               // find the reference that contains the given tree 
        
               if (referencedCompilation.ContainsSyntaxTree(tree)) 
        
               { 
        
                   return referencedCompilation.GetSemanticModel(tree); 
        
               } 
        
           } 
        
           // the tree, a source symbol is defined in, doesn't exist in universe 
        
           // how this can happen? 
        
           Debug.Assert(false, "How?"); 
        
           return null;

Other code making similar assumptions is now broken if there's an unreadable file. I'm not sure how widespread that might be, or whether we need a different approach to prevent these breaks.

As background, the original motivation for stubbing in the empty file was to ensure that the exceptional case doesn't result in Compilation inconsistencies: anybody processing a single snapshot might not be processing "real" source, but at least won't see an inconsistency that results in crashes.
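For code that cannot rely on the tree always being present, the lookup can be guarded explicitly. The following is only a minimal sketch: `CompilationLookup` and `TryGetSemanticModelAsync` are hypothetical names, not Roslyn APIs, and the example assumes the Microsoft.CodeAnalysis.CSharp.Workspaces package is referenced so that `AdhocWorkspace` can create a C# project.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

static class CompilationLookup
{
    // Hypothetical helper (not part of Roslyn): returns null when the
    // document's tree is not in the project's compilation, instead of
    // letting Compilation.GetSemanticModel throw.
    public static async Task<SemanticModel> TryGetSemanticModelAsync(Document document)
    {
        var tree = await document.GetSyntaxTreeAsync();
        var compilation = await document.Project.GetCompilationAsync();

        // Either can be null for projects that don't support syntax trees
        // or compilations.
        if (tree == null || compilation == null)
        {
            return null;
        }

        return compilation.ContainsSyntaxTree(tree)
            ? compilation.GetSemanticModel(tree)
            : null;
    }

    static async Task Main()
    {
        // Build an in-memory project with one readable document.
        using var workspace = new AdhocWorkspace();
        var project = workspace.AddProject("Demo", LanguageNames.CSharp);
        var document = workspace.AddDocument(project.Id, "A.cs", SourceText.From("class A { }"));

        var model = await TryGetSemanticModelAsync(document);
        Console.WriteLine(model == null ? "tree missing" : "tree present");
    }
}
```

A guard like this only hides the symptom in one caller; it does not restore the invariant that the Document's tree and the Compilation agree.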

CyrusNajmabadi · 2020-01-24T01:17:01Z

This means that this assertion is no longer valid:

Yeah... the part that worries me the most is that this seems to introduce major inconsistencies between things that used to be consistent.

If we can't read in a document, what do we represent its text value as?
not having the text+tree+compilation be in sync really scares me.
i think it's fine to want to be resilient to unreadable files. but i would like the invariants and models to be consistent. To that end, i would prefer that these just be empty files, potentially with an empty syntax tree with diagnostics attached to them saying "file was unreadable", or something to that effect.

CyrusNajmabadi · 2020-01-24T01:18:06Z

but I'm wondering if that's better achieved by leaving the syntax tree there and just not running diagnostics (i.e. treat it as a generated file, since yes, the fake empty file is "generated!").

Yes. I would probably model this as a SyntaxTree either with a bit saying "i'm borked" or some critical diagnostic on it saying "i'm borked".

jasonmalinowski · 2020-01-24T01:18:53Z

So if you can't read a file, the Document's text is empty. Previously you had no good way to figure that out, but @tmat's change does add a sane API to ask that, which is fantastic. It's more the "what does this mean for the Compilation?" that's the question.

CyrusNajmabadi · 2020-01-24T01:19:45Z

I'm not sure how widespread that might be

I don't imagine it's hugely widespread. That said, this sort of inconsistency is a pretty awful thing to absorb, primarily because it's so easy for people to hold this expectation in their heads and not realize there's a land mine waiting to go off.

CyrusNajmabadi · 2020-01-24T01:20:29Z

It's more the "what does this mean for the Compilation?" that's the question.

Representing an unreadable Document as having 'empty text' seems totally sensible to me. You'll get an empty tree, and you'll have a compilation that contains that empty tree.

CyrusNajmabadi · 2020-01-24T01:23:29Z

motivation for the change was to prevent an analyzer (say "you need a file header") from running on files that couldn't be loaded, but I'm wondering if that's better achieved by leaving the syntax tree there and just not running diagnostics (i.e. treat it as a generated file, since yes, the fake empty file is "generated!").

what even is the problem if the analyzer runs? Say the analyzer then wants to offer something. is that a problem?
that said, i would also be fine to not run analyzers on broken files. they're broken. seems pretty sane to avoid any work on them.
even if we did work and reported some issue, it would also be fine for me to then filter after the fact. after all, what would it mean to apply a fix to a document that didn't even read in correctly?

IMO, the right level of abstraction here is all at the Document level (and the tools that process that). The compiler layer should be blissfully unaware of this and should just expose a model where all its constituent parts still make sense together.

tmat · 2020-01-24T01:37:55Z

The IDE reports a diagnostic for the unreadable document. Running analyzers on the empty syntax tree might result in errors that are confusing to the user. We can re-establish the invariant, which was not documented or tested anywhere BTW, by marking the tree as "generated" or in some other way making the analyzer driver skip it.

CyrusNajmabadi · 2020-01-24T01:39:26Z

We can re-establish the invariant, which was not documented or tested anywhere BTW

Definitely a pity. I think that likely fell into the "if this changes, we're screwed" camp. We should be better about actually ensuring there is some sort of test for that sort of thing. Several people who worked on these designs aren't around anymore, so it's hard to tell what actually has testing and what does not but just became a baked-in assumption over time.

CyrusNajmabadi · 2020-01-24T01:39:46Z

or in some other way making the analyzer driver skip it.

That seems a worthwhile place to go for me.

jasonmalinowski · 2020-01-29T18:26:34Z

Moving to 16.5.P3 now that we also have internal crashes that may be related.

jasonmalinowski · 2020-01-29T18:46:36Z

The design review on Monday concluded that we should keep the tree there and mark it as generated (which, in a very pedantic sense, the fake empty file is).

jasonmalinowski added the Need Design Review label Jan 24, 2020

jinujoseph added the Area-IDE, Bug, and IDE-Project labels Jan 28, 2020

jinujoseph added this to the Backlog milestone Jan 28, 2020

jasonmalinowski modified the milestones: Backlog, 16.5.P3 Jan 29, 2020

jasonmalinowski assigned tmat Jan 29, 2020

jasonmalinowski added the Urgency-Now label and removed the Need Design Review label Jan 29, 2020

tmat mentioned this issue Jan 31, 2020

Include syntax trees of files that can't be read in compilation. #41297

Merged

tmat closed this as completed Jan 31, 2020

sharwell added this to IDE: Design review Aug 22, 2023

sharwell moved this to Complete in IDE: Design review Aug 22, 2023
