Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Estimate file versions safely #2753

Merged
merged 7 commits into from
Mar 6, 2022
Merged

Conversation

pepeiborra
Copy link
Collaborator

@pepeiborra pepeiborra commented Mar 5, 2022

For a long time, defineEarlyCutoff has been accessing the Values store directly
to compute GetModificationTime values instead of calling use, breaking the
invariant. The values are used to associate the rule result to a file version,
which gets recorded in the Value as well as used as the key in the Diagnostics
store.

The problem here is that the GetModificationTime rule computes a new version and
mutates the Values store, so if defineEarlyCutoff peeks in the store before
GetModificationTime has run, it will grab the old version. This leads to lost
diagnostics and potentially to misversioned Values.

Fixing the problem is tricky, because we cannot simply use GetModificationTime
inside defineEarlyCutoff for all rules. There are three issues with this:

  1. Creating a dependency on GetModificationTime: if everything depends on it,
    then we lose the ability to do early cutoff
  2. Creating cycles in the build graph, since GetModificationTime has
    dependencies itself. Because hls-graph doesn't implement cycle detection (Shake
    did), it is a nightmare to debug these cycles.
  3. Creating overhead, since GetModificationTime calls the file system for non
    FOIs and in the past this was very expensive for projects with large cartesian
    product of module paths and source folders

To work around these I had to introduce a new hls-graph primitive,
applyWithoutDependency, as well as add a bunch more fragile type tests on the key
type to decide on whether to use GetModificationTime or peek into the values
store. The type casts could be cleaned up by introducing a type class, but I'm
not sure the end result would be any better.

To understand the issue and debug the implementation of the fix, I added a
number of opentelemety traces which I'm leaving in place in case they could be
useful in the future.

For a long time, defineEarlyCutoff has been accessing the Values store directly
to compute GetModificationTime values instead of calling use, breaking the
invariant. The values are used to associate the rule result to a file version,
which gets recorded in the Value as well as used as the key in the Diagnostics
store.

The problem here is that the GetModificationTime rule computes a new version and
mutates the Values store, so if defineEarlyCutoff peeks in the store before
GetModificationTime has run, it will grab the old version. This leads to lost
diagnostics and potentially to misversioned Values

Fixing the problem is tricky, because we cannot simply use GetModificationTime
inside defineEarlyCutoff for all rules. There are three issues:

1. Creating a dependency on GetModificationTime. If everything depends on it,
then we lose the ability to do early cutoff
2. Creating cycles in the build graph, since GetModificationTime has
dependencies itself. Because hls-graph doesn't implement cycle detection (Shake
did), it is a nightmare to debug these cycles.
3. Creating overhead, since GetModification time calls the file system for non
FOIs and in the past this was very expensive for projects with large cartesian
product of module paths and source folders

To work around these I had to introduce a new hls-graph primitive,
applyWithoutDependency, as well as do a bunch of fragile type tests on the key
type to decide on whether to use GetModificationTime or peek into the values
store. The type casts could be cleaned up by introducing a type class, but I'm
not sure the end result would be any better.

To understand the issue and debug the implementation of the fix, I added a
number of opentelemety traces which I'm leaving in place in case they could be
useful in the future.
@pepeiborra pepeiborra requested a review from wz1000 March 5, 2022 20:52
@pepeiborra pepeiborra marked this pull request as ready for review March 6, 2022 07:02
@pepeiborra pepeiborra added the merge me Label to trigger pull request merge label Mar 6, 2022
@mergify mergify bot merged commit b7f37ad into master Mar 6, 2022
@pepeiborra pepeiborra deleted the estimate-file-versions-safely branch March 6, 2022 12:22
@michaelpj
Copy link
Collaborator

It would be nice if the excellent description of the problem and the solution was recorded somewhere more durable!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge me Label to trigger pull request merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants