Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with caching (discussion) #61

Closed
oleg-shilo opened this issue Apr 7, 2017 · 12 comments
Closed

Issues with caching (discussion) #61

oleg-shilo opened this issue Apr 7, 2017 · 12 comments

Comments

@oleg-shilo
Copy link
Owner

Migrated (Issue 1 topic) from broad discussion #50


When developing, the cache probing algorithm fails to detect that my hosting application's assemblies became newer and the script must therefore me rebuilt (in order to update the references to the latest hosting application assemblies):....

One more thing that might help me:
Could you roughly explain the purpose of the stamping / the binary difference between .exe/.dll and .compiled and tell me where I can find this in the source code? This might help me finding a solution (or an bug if there was one). Maybe, the best solution would be to just switch to .compiled instead of .dll for my application.

Assembly stamping is done for preserving all script execution context inside of the assembly file. Thus if the script engine finds already compiled script it can compare the current execution context with the embedded one and make a decision if the script needs recompilation or the compiled script can be executed straightaway.

Conceptually execution context is very similar to the file timestamp. Though it covers more than a time of the compilation (e.g. dependency assemblies).

CS-Script only injects execution context info into *.compiled assemblies but never into dll/exe.

The implementation can be found in csscript.MetaDataItems.StampFile(file) and csscript.MetaDataItems.ReadFileStamp(file).

@maettu-this
Copy link
Collaborator

Hi Oleg, I finally found the root cause! It's not the file's extension (only partly as I am just describing) and not the logic behind IsOutOfDate() (which works fine) and not stamping of the execution context. It's the timestamp of the compiled assembly file that is causing the issue! But the issue only applies when compiling to .dll (and probably .exe as well)...

By the way, as you can see from the screenshots below, I am now able to step into CSScript source code :-) The only thing had to change was explicitly setting the .NET 2.0 runtime compiler:
providerOptions["CompilerVersion"] = "v2.0".

Here's the details:

As described in #50 the exception happens here:
exception in asmbrowser

The exception is caused by the mismatching versions of my host application's assembly (v1.10.6338.21060 -vs- .21479 = same version but different build stamp). .21060 is the version referred to by the outdated assembly in the .NET cache whereas .21479 is the correct version currently loaded in the application domain.

I have then drilled down into the details of the source code and came to the conclusion that IsOutOfDate() and everything else works fine and the issue must indeed be caused by the .NET runtime's caching infrastructure.

It is. As soon as I disable the altering of the assembly's file timestamp, all works fine!
timestamp altering disabled

So the .NET runtime must get confused by the 'old' timestamp. If the timestamp of the resulting .dll is March 8th, when I last modified the script, I get the exception. If the timestamp remains unchanged, i.e. is the time of the compilation, all works fine.

What was the reason for changing the timestamp? If I remember correctly, that change was done quite recently, and I later posted #32 requesting to keep the .pdb consistent with the .dll/.exe. Depending on the reason for that change, I am requesting to either...

  • ...revert that change, i.e. keep the timestamp of the compilation.
  • ....only shift the timestamp if indeed none of the dependencies of the script as changed since then.
  • ...or make the behavior configurable.

PS1: CSScript actually stamps files with .dll extensions as well:
stamping

PS2: https://github.com/oleg-shilo/cs-script still refers to the outdated v3.24.2.

@oleg-shilo
Copy link
Owner Author

Great, at least we know what is triggering it.

There are a few points in your post I need to reflect on.

I am now able to step into CSScript source code...

I cannot be sure but it looks like it was the discrepancy between script and host assembly target runtime. And your providerOptions["CompilerVersion"] = ... just synchronized them. Makes sense.

...mismatching versions of my host application's assembly (v1.10.6338.21060 -vs- .21479 = same version but different build stamp...

As a small work around (not to fix problem but to make it less frequent). You may want to avoid using auto-versioning (e.g. [assembly: AssemblyVersion("1.0.*")]) and use explicit version instead.

...What was the reason for changing the timestamp? ...

The code fragment you are asking about was in the codebase for many many years :)
In Jan 19 the code was extended (4843d60) with setting timestamp of pdb files upon your request.

The purpose of timestamping is caching. It's timestamp based so the equality of script and compiled script timestamps constitutes a valid cache.

While a specific file timestamp may or may not affect .NET runtime caching (shadow copying), it certainly shouldn't. Changing the file timestamp has nothing to do with assembly probing. However there is no warranty that CLR may not have some undocummented feature that is compromised if a timestamp is changed.

I don't feel right about changing CS-Script to just accommodate some undocumented/ununderstood CLR behavior. However having an optional alternative script compilation seems an adequate compromise. But before I proceed with the changes can you please check for me if commenting out pdb-related code brings the same benefit:

if (scriptFile.Exists && asmFile.Exists)
{
    asmFile.LastWriteTimeUtc = scriptFile.LastWriteTimeUtc;
    //if (options.DBG && pdbFile.Exists)
    //    pdbFile.LastWriteTimeUtc = scriptFile.LastWriteTimeUtc;
}

oleg-shilo added a commit that referenced this issue May 11, 2017
@maettu-this
Copy link
Collaborator

maettu-this commented May 12, 2017

Commenting out .pdb-related code doesn't help, same behavior with and without. The only thing that makes a difference is whether the .dll has the current time stamp (time of compilation) or the script's time stamp.

Two things to consider:

  • The 'Changed' attribute of a file typically represents the moment when it got changed the last time. Thus, a user intuitively understands that this was the time of compilation or whatever else processing. But changing the time back to a point in the past is not very intuitive.
  • Alternatively to changing the time stamp, how about injecting the time stamp of the original script file into the generated assembly file (along with all other data that is injected? This way, the information would be available while at the same time no change of the file 'Changed' attribute would be necessary.

Regarding the assembly versioning, I am somewhat cautious to drop the * for at least the revision, since during deployment and testing things get built and rebuilt and might lead to some other issue if two versions of the same assembly have the same explicit version but are actually different. Microsoft products such as Internet Explorer or Visual Studio also have the build stamp in their version.

@oleg-shilo
Copy link
Owner Author

...Commenting out .pdb-related code doesn't help...

I expected that. Just wanted to check. Thxs. The conditional time stamping has already been implemented and will be delivered with the very next release (on this weekend):

CSScript.GlobalSettings.SuppressTimestamping = true;`

Thus, a user intuitively understands that this was the time of compilation...

This is not a problem at all as timestamped assemblies are cached scripts that are not to be accessed (or dealt with) by user but by CS-Script runtime only.

...how about injecting the time stamp...

This is actually a very good work around idea. The reason why time is set as a file attribute is performance. Reading the attribute is much faster than reading the file content. Though I am not convinced that difference would make any practical effect. Will need to access the impact of the potential change on the codebase.

... Microsoft products such as Internet Explorer or Visual Studio also have the build stamp in their version....

While this is a violation of semantic versioning it is still acceptable. What is not acceptable is the automatic increment of the last version segment on every build. It leads to the paradox when two assemblies are considered by CLR as different while being built from the same code.

It may seem like a small thing but it's not. Imagine that you lost and need to rebuild an assembly in all its glory. You cannot. The version will be different even if you use the same code as for the original assembly.

In .NET/VS at least it can be controlled but in WiX you have to go very deep to disable so called package code. I already had to deal with this in my Wix#. Was really shocked to find that it is a default behavior of WiX tools.

@maettu-this
Copy link
Collaborator

Thanks for the background info. Looking forward to integrating the upcoming release :-)

I understand that a file time stamp is much more effective than having to read its content. Since the time stamp was used as an input parameter for the caching algorithm, will caching become less efficient or even be disabled with CSScript.GlobalSettings.SuppressTimestamping = true;? Even if it does, I guess I could live with that for my hosting application. However, if it adds a significant amount of time on each execution, our full test runs (consisting of thousands of executions each night) could suffer in terms of execution time.

This is not a problem at all as timestamped assemblies are cached scripts that are not to be accessed (or dealt with) by user but by CS-Script runtime only.

I don't fully agree on that. I understand that this is the CS-Script's default behavior. But my hosting application actually compiles the assemblies to the very same path where the user's script is located, i.e. MyScript.cs becomes MyScript.dll.

Reasons for this: Possibility to reuse the assembly (not only reuse the script).

  • A developer can build a bunch of scripts into an assembly (script library) and then pass it to a college who can use the library with his own simple script. Advantage: The collegue (e.g. somebody not that experienced in programming) only has to deal with a very simple script. The complexity is hidden in the assembly and managed by the experienced developer.
  • A developer can commit the assembly into version control, so the automated test infrastructure can retrieve it without having to rebuild the script on each build or test run.

oleg-shilo added a commit that referenced this issue May 14, 2017
- Added support for resolving NuGet packages from the `netstandard` lib subfolders
- Added defaultRefAssemblies exchange in InitExecuteOptions()
- .NET Core support (wip/RC)
- Issue #61: Issues with caching; Sample: `CSScript.GlobalSettings.LegacyTimestampCahing = false;` (default is `false`)
- Added support for `//css_ac_end` to allow Extension methods in classless scripts
- Implemented caching for InMemory script assemblies
- Fixed collapsing spaces in the DefaulrRefAssemblies settings value
oleg-shilo added a commit that referenced this issue May 16, 2017
- Added support for resolving NuGet packages from the `netstandard` lib subfolders
- Added defaultRefAssemblies exchange in InitExecuteOptions()
- .NET Core support (wip/RC)
- Issue #61: Issues with caching; Sample: `CSScript.GlobalSettings.LegacyTimestampCahing = false;` (default is `false`)
- Added support for `//css_ac_end` to allow Extension methods in classless scripts
- Implemented caching for InMemory script assemblies
- Fixed collapsing spaces in the DefaulrRefAssemblies settings value
@oleg-shilo
Copy link
Owner Author

Done: Release v3.25.3.0

@maettu-this
Copy link
Collaborator

Minor:
legacyTimestampCahing should be legacyTimestampCaching

@maettu-this
Copy link
Collaborator

Verified, works :-) Thanks once more!

@oleg-shilo
Copy link
Owner Author

Txs. Fixed now.
It was just a typo in the release notes. In the code you should use LegacyTimestampCaching.

@oleg-shilo
Copy link
Owner Author

Cool. Txs for testing.

@maettu-this
Copy link
Collaborator

Actually, legacyTimestampCahing is code, fortunately only the name of the internal variable ;-)

@oleg-shilo
Copy link
Owner Author

Oh, yes, you are right. Txs.
Will be fixed. Won't not be rushing with this one though :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants