-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prototype: Storing compilation outputs in Project #34698
Conversation
@jasonmalinowski @DustinCampbell @CyrusNajmabadi @heejaechang PTAL. The implementation is incomplete. Just wanted to put together a proposal. |
/cc @jinujoseph |
{ | ||
// The project doesn't produce a debuggable binary, we can't read it or it's not stored in a file on disk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requirement for storing the file on disk is temporary. It will be removed in the new EnC manager implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we went ahead with this obviously add a comment to that effect.
Will do! Note: i haven't looked, but i'm always wary of things like |
What i would probably like are more involved lifetime docs on things like: /// <summary>
/// Opens an assembly file produced by the compiler (corresponds to OutputAssembly build task parameter).
/// </summary>
public abstract Stream OpenOutputAssembly(); i.e. will this only ever be called once? if it could be called multiple times, what is the expected behavior. Should hte same instance of this type be passed to multiple compilation versions? what happens then in terms of them not stompin on each other? :) I have no protest about anything in this PR. I'm just always scared of streams/IO and how to best represent them. So any and all clarity here would be appreciated! |
The API shape is the same as An instance of |
Sure. |
Oh ok! That makes me feel much better now. If this is following patterns that have worked well for us that def makes me feel more comfortable. Will try to review in the next day. |
Cool! Thanks for the info. It's not on you to necessarily add that info. but it probably would be useful to have this in the docs. The part about the caller having to dispose is especially relevant. |
Agreed. This is just a quick prototype implementation to see how the API looks and is used, will definitely document in more details once we accept the design. |
@@ -43,9 +44,14 @@ public void SetWin32Resource(string filename) | |||
// This file is used only during emit. Since we no longer use our in-proc workspace to emit, we can ignore this value. | |||
} | |||
|
|||
public CompilationOutputFiles CompilationOutputFiles | |||
=> (CompilationOutputFiles)VisualStudioProject.CompilationOutputs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would we ever expect VisualStudioProject to not be using a CompilationOutputFiles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when redoing the project system code I was trying to keep VisualSudioProject fairly abstract and still making sure it doesn't assume:
- That projects necessarily exist on disk.
- They come from a particular source like MSBuild
- They're C# or VB.
Right now ASP.NET is doing some work where they'll be creating projects that represent in-memory compilation results, and so theirs might be using the base type in this if we enable E&C for that. But for the legacy C#/VB shims that wrap all that, it's safe to assume. This cast makes total sense to me, other than wondering out loud if Tomas wants to move it to AbstractLegacyProject.
@@ -53,8 +53,8 @@ internal sealed class VisualStudioProject | |||
private string _filePath; | |||
private CompilationOptions _compilationOptions; | |||
private ParseOptions _parseOptions; | |||
private CompilationOutputs _compilationOutputs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it just to be on the safe sid that this is the base type, and not the derived type? Or is this very intentional because you'll have a different impl sometimes? For example does EnC have it's own project type that would do things differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's intentional. I assume we may have projects that compile in-memory at some point (e.g. ASP.NET).
can it copy it over to memory map file first time it is opened and give stream to it if it is read only? or is it writable as well? if it is read only, one can give out multiple stream at the same time from memory map file. we already have all code for it. and code to sync it to OOP. |
if this become part of ProjectAttribute synchronizing with OOP is straight forward since ProjectAttribute is currently the thing we consider data we need to sync between VS. if we want it to live outside of ProjectAttribute, then it require a bit more changes in OOP sync code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not signing off since this is indeed a prototype and is obviously missing some polish (doc comments, etc.) but I think this is on the right direction. My only concern is the snapshot behavior here -- the implicit contract of stuff you get from Solution/Project/Document are snapshots, but the streams you get either reflect the current snapshot nor do they behave as snapshots. I think moving the "get me the options" API to a workspace service might be the right approach, leaving all the rest of the work in place. The API is still public and there's still plenty of layers of indirection available.
{ | ||
// The project doesn't produce a debuggable binary, we can't read it or it's not stored in a file on disk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we went ahead with this obviously add a comment to that effect.
_mvid = ReadMvid(outputPath); | ||
using (var outputAssemblyStream = outputFiles.OpenOutputAssembly()) | ||
{ | ||
if (outputAssemblyStream != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an API question: would there be needs for people to know if there is an output assembly without actually triggering opening a stream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add bool HasOutputAssembly
etc. if needed
// Unlike OutputFilePath and OutputRefFilePath, the intermediate output path isn't represented in the workspace anywhere; | ||
// thus, we won't mutate the solution. We'll still call ChangeProjectOutputPath so we have the rest of the output path tracking | ||
// for any P2P reference conversion. | ||
ChangeProjectOutputPath(ref _intermediateOutputFilePath, value, s => s, w => { }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ChangeProjectOutputPath was also implicitly hooking up the ability to convert a file path given to us from the project system to a proper project-to-project reference. I don't expect other projects to be referencing the obj but did it out of paranoia. If that's something we'll want to maintain we'll need to fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Why are we second guessing the project system on what references are p2p? msbuild knows. Shouldn't PS know as well?
@@ -121,11 +121,6 @@ public string BinOutputPath | |||
} | |||
} | |||
|
|||
internal string GetIntermediateOutputFilePath() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was CPS or F# not calling this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F# does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPS does not have IVT.
string getAbsolutePath(string path) | ||
=> (outputDirectory != null && path != null) ? Path.Combine(outputDirectory, path) : path; | ||
|
||
// TODO: calculate default paths if not specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you want to? Or just leave it at null? The defaults should be computed by the MSBuild that's feeding into this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure. Need to look into this. I think all these are initialized when the project is created from project system, while they are optional on the csc.exe command line. We should probably require them to be initialized when we receive the data from project system.
|
||
namespace Microsoft.CodeAnalysis | ||
{ | ||
public sealed class CompilationOutputFiles : CompilationOutputs, IEquatable<CompilationOutputFiles> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the "we have a base type, and then one that's file-specific".
string pdbFilePath = null, | ||
string documentationFilePath = null) | ||
{ | ||
string requireAbsolutePath(string path, string parameterName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not have a helper for this?
public abstract class CompilationOutputs | ||
{ | ||
/// <summary> | ||
/// Opens an assembly file produced by the compiler (corresponds to OutputAssembly build task parameter). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be worth calling out this is the obj thing, and might get further mutations? I admit it's a bit hard to write that since this this class is potentially pretty abstract and "obj path" isn't something really discussed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calling out this is the obj thing
We can give it as an example, but obj
is a construct of msbuild.
might get further mutations
Yes. It should say the "current content of". Although it depends what we decide to do about snapshotting.
id, version, name, assemblyName, language, | ||
filePath, outputFilePath, outputRefFilePath, defaultNamespace: null, compilationOptions, parseOptions, | ||
documents, projectReferences, metadataReferences, analyzerReferences, additionalDocuments, | ||
isSubmission, hostObjectType, hasAllInformation: true, CompilationOutputFiles.None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why CompilationOutputFiles.None versus a no-op implementation of the base?
@@ -469,6 +472,16 @@ public ProjectState UpdateParseOptions(ParseOptions options) | |||
documentStates: docMap); | |||
} | |||
|
|||
public ProjectState UpdateCompilationOutputs(CompilationOutputs outputs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is a bit funky -- this doesn't quite follow the usual snapshot model in that CompilationOutputs's stream methods actually return that doesn't match up with the current snapshot but something older -- it's implicitly the result of the last time we built, right? It also means if somebody does observe the value, we won't necessarily be persisting it for later calls to ensure immutability.
I can think of a few options:
- Don't care, and just do nothing.
- Instead make an ICompilationOutputsWorkspaceService (which would still be public) and that just has a GetCompilationOutputs method that takes a ProjectId. That way it's no longer strictly tied to the snapshot and doesn't come with the guarantees thereof.
I'm thinking 2 would be my preference. The CompilationOutputs type you've got itself I think would stay unchanged and all the work already done up in VisualStudioProject.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different from having OutputPath on Project? The CompilationOuputs instance is a an abstracted set of "paths".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more I agree with going with [2].
Actually had one more thought: making this a workspace service potentially makes other things like the remote workspace a bit saner, since that could just implement the service in a fairly independent way without having to special case how this gets passed around? |
Some things are snapshot and some aren't - they are just "references" whose content is not snapshot. We already have The proposed API allows us to snapshot if we need to (and we might). You can write implementation of |
I'll send another PR implementing the workspace service approach as suggested in #34698 (comment) |
The EnC infrastructure needs to read data from the assembly and the PDB produced by the compiler.
Currently this information is not represented in Workspace layer, which is where we need to access it.
See #34371.
Proposal
Add new APIs that provide an access to the files that is abstracted from the file system, so that it allows projects to potentially compile in-memory and still provide the necessary data.
OpenXxx
opens a new stream each time it's called. It's up to the caller to dispose the stream.If the stream is backed by a file, the file is locked while the stream is being opened. The caller must make sure to close the stream as soon as it can to allow rebuild.
A file system specific implementation: