Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include PDB Checksum in Debug Directory #24711

Merged
merged 6 commits into from
Feb 20, 2018
Merged

Conversation

tmat
Copy link
Member

@tmat tmat commented Feb 8, 2018

Changes

  1. Generate Portable PDB ID deterministically based on the hash of the content, rather than from a timestamp, regardless of whether /deterministic is set or not.

  2. Store the full secure hash of the PDB content in a new Debug Directory entry. This entry is emitted by default unless emitting a script.

Details

The CodeView debug directory entry in PE/COFF file associates the PE file with one or more PDBs. The CodeView entry and the PDB both store the same PDB ID (for Portable PDB it's 20B for Windows PDB it's 16B of data). Debuggers, symbol servers and other tools use the PDB ID to match the PE file with the PDB.

Although the PDB ID is good enough for finding the right PDB for the PE file it is not good enough for validating that the PDB has not been maliciously modified. PDB Checksum is a new debug directory record that can be used for such validation.

PDB Checksum comprises of crypto hash algorithm name and the hash of the PDB content. See
Specification for details.

The implementation uses a bit of reflection as a workaround for APIs not yet implemented in System.Reflection.Metadata: https://github.com/dotnet/corefx/issues/26935. This workaround will be removed once we move to the latest SRM. Tracked by #24712.

Public API additions

Adds EmitOptions.PdbChecksumAlgorithm, EmitOptions.WithPdbChecksumAlgorithm and new EmitOptions constructor taking a hash algorithm.

Bugs this fixes

Workarounds, if any

None.

Risk

Small.

Performance impact

Small.

Is this a regression from a previous update?

Root cause analysis

How was the bug found?

Test documentation updated?

@tmat tmat requested a review from a team as a code owner February 8, 2018 19:17
@tmat tmat changed the title PDB Checksum WIP: PDB Checksum Feb 8, 2018
@tmat tmat changed the title WIP: PDB Checksum WIP: Use SHA256 instead of SHA1 and emit PDB Checksum Debug Directory entry Feb 9, 2018
@tmat tmat changed the title WIP: Use SHA256 instead of SHA1 and emit PDB Checksum Debug Directory entry WIP: Use SHA256 instead of SHA1 in emit and include PDB Checksum in Debug Directory Feb 9, 2018
@tmat tmat requested review from a team as code owners February 9, 2018 02:43
@tmat tmat changed the title WIP: Use SHA256 instead of SHA1 in emit and include PDB Checksum in Debug Directory Use SHA256 instead of SHA1 in emit and include PDB Checksum in Debug Directory Feb 9, 2018
@tmat
Copy link
Member Author

tmat commented Feb 9, 2018

@dotnet/compiler-api
@dotnet/roslyn-compiler
@ivanbasov @jinujoseph

Please review.

@gafter
Copy link
Member

gafter commented Feb 9, 2018

Can you please elaborate on the "Risk" identified? How many (and what) tools are likely to be affected?

runtimeMetadataVersion,
tolerateErrors,
includePrivateMembers,
instrumentationKinds: ImmutableArray<InstrumentationKind>.Empty,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instrumentationKinds [](start = 18, length = 20)

I think the value of this parameter should not be Empty, but rather the value of the incoming parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Apparently we are missing code coverage here.

{
IncrementalHash.CreateHash(HashAlgorithm).Dispose();
}
catch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

catch [](start = 12, length = 5)

A catch-all like this is a bit scary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are calling to external library (BCL). Any exception that happens means the hash algorithm is bad or there is a bug in BCL. Either way we report a diagnostic. Sure, we can hit OOM, or stack overflow, but such exceptions will be rethrown anyways even if we catch them here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add this explanation as a comment


In reply to: 167302067 [](ancestors = 167302067)

@tmat
Copy link
Member Author

tmat commented Feb 9, 2018

@gafter @MattGertz @jinujoseph

I hear some compat and perf concerns with switching to SHA256 by default. I think I can leave the default source hash be SHA1 and create another PR that switches it to SHA256, so to not block the PDB Checksum feature on this decision. I thought the SHA256 switch is a prereq but on second thought it doesn't need to be.

@tmat tmat changed the title Use SHA256 instead of SHA1 in emit and include PDB Checksum in Debug Directory WIP: Use SHA256 instead of SHA1 in emit and include PDB Checksum in Debug Directory Feb 9, 2018
Copy link
Member

@gafter gafter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me other than the issue in the EmitOptions ctor. I realize that this was a preexisting problem, but can you please fix it?

@tmat
Copy link
Member Author

tmat commented Feb 9, 2018

@gafter Yes, I'll address the EmitOptions bug.

@gafter
Copy link
Member

gafter commented Feb 9, 2018

I think splitting the PDB checksum feature from the change in default algorithm (i.e. have two PRs) is a good idea.

@MattGertz
Copy link
Contributor

@tmat That seems like a good beginning and we can debate the defaults later on if/as needed.

@MattGertz
Copy link
Contributor

@jac009 only as FYI since he was curious about the direction here.

@tmat tmat changed the title WIP: Use SHA256 instead of SHA1 in emit and include PDB Checksum in Debug Directory WIP: Include PDB Checksum in Debug Directory Feb 9, 2018
public string AlgorithmName { get; }

/// <summary>
/// GUID (Globally Unique Identifier) of the associated PDB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GUID (Globally Unique Identifier) of the associated PDB. [](start = 12, length = 56)

Maybe improve the comment. I think it is a checksum and we could expect it to be unique but it maybe uncommon to refer it as GUID.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the comment is wrong.

Copy link
Contributor

@ivanbasov ivanbasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@tmat tmat changed the title WIP: Include PDB Checksum in Debug Directory Include PDB Checksum in Debug Directory Feb 9, 2018
@tmat
Copy link
Member Author

tmat commented Feb 9, 2018

@dotnet/compiler-api
@dotnet/roslyn-compiler
@ivanbasov @jinujoseph

Updated the PR to only add PDB Checksum and leave usage of SHA1 as is (except for hashing PDBs).

@tmat
Copy link
Member Author

tmat commented Feb 10, 2018

Test failure: #24758

@tmat
Copy link
Member Author

tmat commented Feb 10, 2018

test windows_release_unit64_prtest please

@tmat
Copy link
Member Author

tmat commented Feb 13, 2018

@dotnet/roslyn-compiler Can I have a second review please?

@tmat
Copy link
Member Author

tmat commented Feb 13, 2018

The change to SHA256 is now in a separate PR: #24820

using Xunit;

namespace Roslyn.Utilities.UnitTests.InternalUtilities
{
using System.Linq;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move the using outside the namespace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix. Somehow the "Add using" added it here :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this using is required inside the namespace. Both Roslyn.Utilities and System.Linq namespaces have EnumerableExtensions. We want to test the latter here.

@@ -1309,7 +1310,9 @@ public new CSharpCommandLineArguments Parse(IEnumerable<string> args, string bas
fileAlignment: fileAlignment,
subsystemVersion: subsystemVersion,
runtimeMetadataVersion: runtimeMetadataVersion,
instrumentationKinds: instrumentationKinds.ToImmutableAndFree()
instrumentationKinds: instrumentationKinds.ToImmutableAndFree(),
// TODO: set from /checksumalgorithm (see https://github.com/dotnet/roslyn/issues/24735)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we holding off on this part, seems important here?

Copy link
Member Author

@tmat tmat Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not that important. The algorithm used for PDB checksum in this PR is hardcoded to SHA256, which is good enough. Allowing to set the value to SHA384 or SHA512 will be important in future, but it can wait a bit.

The reason to hold off is the default value of this switch. Right now it's SHA1, which is not acceptable for PDB checksum. The default has to be at least SHA256. If we want to use the same switch for both source checksums and PDB checksums then we need to set the default for source checksums to SHA256 first.

@tmat tmat merged commit 32a11c4 into dotnet:dev15.7.x Feb 20, 2018
@tmat tmat deleted the PdbChecksum branch February 20, 2018 17:38
333fred added a commit to 333fred/roslyn that referenced this pull request Feb 24, 2018
* features/compiler: (118 commits)
  Fix up test failures
  Post merge cleanup
  SubstituteNoPiaLocalType - use comparison of SpecialTypes to match bases across distinct core libraries. (dotnet#24965)
  Respond to PR feedback
  PR feedback
  PR Feedback
  Fix tests
  Standardized on the doc mcomments
  Unified most helpers now
  Standardize the IL helpers
  Unify the WinRT overloads
  Unify the CompileAndVerify
  Unify overloads
  Fix a name
  Implement improved overload candidates, aka "Bestest Betterness". (dotnet#24756)
  Include PDB Checksum in Debug Directory (dotnet#24711)
  Disable multicore jitting
  More renames
  Fixed up the unit tests
  Fix a compilation issue
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants