Skip to content

Commit

Permalink
Implement signaled dep tracking for partial reevaluation, use for gen…
Browse files Browse the repository at this point in the history
…query

This CL teaches Skyframe to keep track of which previously requested
unfinished deps have completed, for parent SkyFunctions which have opted
into partial reevaluation. Those SkyFunction implementations can now take
advantage of that information to avoid wasted work.

This CL teaches the new genquery scope traversal SkyFunction to take
advantage of this new functionality.

Some terminology used in this CL ("PartialReevaluationMailbox") takes
inspiration from "actor models" of concurrency, such as Erlang and Akka.
Those frameworks associate a "mailbox" with each "actor" (e.g. concurrent
processor of work), to buffer incoming messages. In Skyframe, keys/nodes
correspond to these actors, and deps signaling their parents correspond
to message passing.

These mailboxes live alongside SkyKeyComputeState. SkyFunctions hoping to
take advantage of this signaling mechanism must store state somewhere,
and if they store it in the recommended way through the environment, they
must be robust in the case that state is discarded due to memory
pressure. Coupling mailboxes to that state makes implementation errors
less likely, because SkyFunctions won't need to implement "have dep
signals, have no compute state" recovery, because that state can't be
represented. Rather, their "initial evaluation" policy should apply, and
is almost certainly the right thing for them to do.

SkyFunctionEnvironment's implementation is inefficient for nodes opting
into partial reevaluation, especially when they take advantage of dep
signaling, because SkyFunctionEnvironment "batch prefetches" previously
requested deps before each reevaluation. Given regular (non-partial
reevaluation) SkyFunction evaluations, this is fine, because those
evaluations tend to reread previously requested deps on every restart.

However, SkyFunctions opting into partial reevaluation, especially those
that maintain SkyKeyComputeState, and doubly especially when they take
advantage of dep signaling, are highly likely to *not* need to reread
every previously requested dep. Prefetching them is wasteful. In an
adversarial sequence of SkyFunction reevaluations, this could result in
O(|deps|^2) work: if the SkyFunction yields after each new dep request,
and all previously requested deps are reread for each reevaluation.

This CL takes the first step towards fixing this inefficiency. Now, the
SkyFunctionEnvironment instantiated for partial reevaluations doesn't
prefetch previously requested deps from the graph. Instead, it reads them
only as the SkyFunction rerequests them. Unfortunately, the
SkyFunctionEnvironment still needs to be aware of whether a requested dep
is new, so even the specialized environment does O(|deps|) work,
constructing a Set of the node's previously requested deps on each
restart.

A subsequent CL will complete this effort.

Note that no attempt is made to reconcile the environment's
getMaxTransitiveSourceVersionSoFar functionality with partial
reevaluation. The only SkyFunction which depends on this method is
ActionSketchFunction, which has not opted into partial reevaluation. If
it did, then this could be made to work by doing a full fetch of
previously requested deps first, being aware of the performance penalty
of doing so.

However, maxTransitiveSourceVersion will be correct when a node gets
committed, because before any commit, AbstractParallelEvaluator fetches
previously requested deps -- because it needs to solve this version
problem, and because it needs to ensure that all previously requested
deps are still done, to be robust to rewinding.

PiperOrigin-RevId: 501015898
Change-Id: I43717aada069a0374a19e55574b101067e4b57db
  • Loading branch information
anakanemison authored and hvadehra committed Feb 14, 2023
1 parent 78d4be9 commit 6bdac9f
Show file tree
Hide file tree
Showing 13 changed files with 791 additions and 205 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -13,11 +13,14 @@
// limitations under the License.
package com.google.devtools.build.lib.rules.genquery;

import static com.google.common.base.Preconditions.checkState;

import com.google.common.base.Preconditions;
import com.google.common.collect.HashMultimap;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Interner;
import com.google.common.collect.LinkedHashMultimap;
import com.google.devtools.build.lib.cmdline.Label;
import com.google.devtools.build.lib.cmdline.PackageIdentifier;
import com.google.devtools.build.lib.concurrent.BlazeInterners;
Expand All @@ -37,8 +40,12 @@
import com.google.devtools.build.lib.skyframe.TargetLoadingUtil.TargetAndErrorIfAny;
import com.google.devtools.build.lib.skyframe.serialization.autocodec.AutoCodec;
import com.google.devtools.build.skyframe.AbstractSkyKey;
import com.google.devtools.build.skyframe.PartialReevaluationMailbox;
import com.google.devtools.build.skyframe.PartialReevaluationMailbox.Causes;
import com.google.devtools.build.skyframe.PartialReevaluationMailbox.Mail;
import com.google.devtools.build.skyframe.SkyFunction;
import com.google.devtools.build.skyframe.SkyFunction.Environment;
import com.google.devtools.build.skyframe.SkyFunction.Environment.ClassToInstanceMapSkyKeyComputeState;
import com.google.devtools.build.skyframe.SkyFunction.Environment.SkyKeyComputeState;
import com.google.devtools.build.skyframe.SkyFunctionException;
import com.google.devtools.build.skyframe.SkyFunctionException.Transience;
Expand All @@ -58,7 +65,6 @@
* information.
*/
public class GenQueryDirectPackageProviderFactory implements GenQueryPackageProviderFactory {

public static final SkyFunctionName GENQUERY_SCOPE =
SkyFunctionName.createHermetic("GENQUERY_SCOPE");

Expand Down Expand Up @@ -142,19 +148,18 @@ protected BrokenQueryScopeSkyFunctionException(
* <p>([0] In the future, {@code collectedPackages} might also contain packages needed to evaluate
* "buildfiles" functions; see b/123795023.)
*
* <p>The {@code labelsToVisitNextRestart} field contains labels of targets belonging to
* <p>The {@code labelsToVisitInLaterRestart} field contains labels of targets belonging to
* previously unloaded packages, the "frontier" of the last Skyframe evaluation attempt's
* traversal.
*/
private static class ScopeTraversal implements SkyKeyComputeState {
private final LinkedHashMap<PackageIdentifier, Package> collectedPackages =
new LinkedHashMap<>();
private final LinkedHashMap<Label, Target> collectedTargets = new LinkedHashMap<>();
private final LinkedHashSet<Label> labelsToVisitNextRestart = new LinkedHashSet<>();

private ScopeTraversal(Collection<Label> initialScope) {
labelsToVisitNextRestart.addAll(initialScope);
}
private final LinkedHashMap<Label, SkyKey> labelsToVisitInLaterRestart = new LinkedHashMap<>();
private final LinkedHashMultimap<SkyKey, Label> labelsToVisitInverse =
LinkedHashMultimap.create();
}

@Nullable
Expand All @@ -172,28 +177,74 @@ public GenQueryPackageProvider constructPackageMap(Environment env, ImmutableLis
private static GenQueryPackageProvider constructPackageMapImpl(
Environment env, ImmutableList<Label> scope)
throws InterruptedException, BrokenQueryScopeException {
ScopeTraversal traversal = env.getState(() -> new ScopeTraversal(scope));

LinkedHashSet<Label> labelsToVisit = new LinkedHashSet<>(traversal.labelsToVisitNextRestart);
traversal.labelsToVisitNextRestart.clear();
ClassToInstanceMapSkyKeyComputeState computeState =
env.getState(ClassToInstanceMapSkyKeyComputeState::new);
Mail mail = PartialReevaluationMailbox.from(computeState).getMail();
ScopeTraversal traversal = computeState.getInstance(ScopeTraversal.class, ScopeTraversal::new);

LinkedHashSet<Label> labelsToVisit = null;
switch (mail.kind()) {
case FRESHLY_INITIALIZED:
// First evaluation, or, Skyframe compute state lost due to memory pressure or errors.
// Either way, start from scratch.
checkState(traversal.collectedPackages.isEmpty(), "expected empty collectedPackages");
checkState(traversal.collectedTargets.isEmpty(), "expected empty collectedTargets");
checkState(
traversal.labelsToVisitInLaterRestart.isEmpty(),
"expected empty labelsToVisitInLaterRestart");
checkState(traversal.labelsToVisitInverse.isEmpty(), "expected empty labelsToVisitInverse");
labelsToVisit = new LinkedHashSet<>(scope);
break;
case CAUSES:
Causes causes = mail.causes();
if (causes.other()) {
labelsToVisit = new LinkedHashSet<>(traversal.labelsToVisitInLaterRestart.keySet());
traversal.labelsToVisitInLaterRestart.clear();
traversal.labelsToVisitInverse.clear();
} else {
labelsToVisit = new LinkedHashSet<>();
for (SkyKey signaledDep : causes.signaledDeps()) {
Collection<Label> labels = traversal.labelsToVisitInverse.asMap().remove(signaledDep);
// We may have been signaled by a dep whose value was observed during a previous
// restart; if so, then skip it because there is no work to do for it.
if (labels != null) {
for (Label label : labels) {
traversal.labelsToVisitInLaterRestart.remove(label);
labelsToVisit.add(label);
}
}
}
}
break;
case EMPTY:
// This reevaluation may have been triggered by a dep which completed after our previous
// reevaluation started; another reevaluation gets scheduled in such a case.
//
// Adding that dep's key to our mailbox raced with our reading our mailbox in that previous
// reevaluation. If the add won, then we consumed the key last time, and our mailbox may now
// be empty. If so, then there's no work to do now, so we return.
return null;
}

// Constructing these here minimizes garbage creation. They're used in dep traversals below.
var attrDepConsumer =
new LabelProcessor() {
LinkedHashSet<Label> nextLabelsToVisitRef = null;

boolean attrDepNeedsRestart = false;
SkyKey keyForAttrDepNeedingRestart = null;
boolean attrDepUnvisited = false;
boolean hasAspects = false;
HashMultimap<Attribute, Label> transitions = null;

@Override
public void process(Target from, @Nullable Attribute attribute, Label to) {
if (hasAspects
&& !attrDepNeedsRestart
&& traversal.labelsToVisitNextRestart.contains(to)) {
attrDepNeedsRestart = true;
return;
if (hasAspects && keyForAttrDepNeedingRestart == null) {
SkyKey skyKey = traversal.labelsToVisitInLaterRestart.get(to);
if (skyKey != null) {
keyForAttrDepNeedingRestart = skyKey;
return;
}
}
if (!traversal.collectedTargets.containsKey(to)) {
attrDepUnvisited = true;
Expand All @@ -202,7 +253,7 @@ public void process(Target from, @Nullable Attribute attribute, Label to) {
}

if (hasAspects
&& !attrDepNeedsRestart
&& keyForAttrDepNeedingRestart == null
&& !attrDepUnvisited
&& attribute != null
&& DependencyFilter.NO_NODEP_ATTRIBUTES.test((Rule) from, attribute)) {
Expand Down Expand Up @@ -234,8 +285,8 @@ public void accept(Attribute aspectAttribute, Label aspectLabel) {
// 1) discover that there is a problem with the label's package. If so, this throws
// BrokenQueryScopeException to stop this genquery evaluation.
// 2) discover that needed package information has not been computed by Skyframe. If so,
// this records that label must be visited after the next Skyframe restart by adding it
// to labelsToVisitNextRestart; at that time that package information will have been
// this records that label must be visited in a later Skyframe restart by adding it
// to labelsToVisitInLaterRestart; at that time that package information will have been
// computed.
// 3) use the package information already computed by Skyframe to collect the label's target
// and package.
Expand All @@ -252,35 +303,37 @@ public void accept(Attribute aspectAttribute, Label aspectLabel) {
// 1) if all those dependency attributes' labels' targets have been collected, then this
// code will enqueue the rule's aspect dependencies' labels for visitation.
// 2) otherwise, at least one of those dependency attributes' labels must have been added to
// labelsToVisitNextRestart, so the rule's aspect dependencies can't be computed during
// this Skyframe restart, so the rule's label also must be visited after the next
// labelsToVisitInLaterRestart, so the rule's aspect dependencies can't be computed
// during this Skyframe restart, so the rule's label also must be visited in a later
// Skyframe restart.

Target target = traversal.collectedTargets.get(label);
if (target == null) {
TargetAndErrorIfAny targetAndErrorIfAny;
try {
targetAndErrorIfAny = TargetLoadingUtil.loadTarget(env, label);
Object o = TargetLoadingUtil.loadTarget(env, label);
if (o instanceof TargetAndErrorIfAny) {
TargetAndErrorIfAny targetAndErrorIfAny = (TargetAndErrorIfAny) o;
if (!targetAndErrorIfAny.isPackageLoadedSuccessfully()) {
throw new BrokenQueryScopeException(
"errors were encountered while computing transitive closure of the scope.");
}

target = targetAndErrorIfAny.getTarget();
traversal.collectedTargets.put(label, target);
traversal.collectedPackages.put(label.getPackageIdentifier(), target.getPackage());
} else {
SkyKey missingKey = (SkyKey) o;
traversal.labelsToVisitInLaterRestart.put(label, missingKey);
traversal.labelsToVisitInverse.put(missingKey, label);
continue;
}
} catch (NoSuchTargetException | NoSuchPackageException e) {
throw new BrokenQueryScopeException(
"errors were encountered while computing transitive closure of the scope.", e);
}

if (targetAndErrorIfAny == null) {
traversal.labelsToVisitNextRestart.add(label);
continue;
}
if (!targetAndErrorIfAny.isPackageLoadedSuccessfully()) {
throw new BrokenQueryScopeException(
"errors were encountered while computing transitive closure of the scope.");
}

target = targetAndErrorIfAny.getTarget();
traversal.collectedTargets.put(label, target);
traversal.collectedPackages.put(label.getPackageIdentifier(), target.getPackage());
}

attrDepConsumer.attrDepNeedsRestart = false;
attrDepConsumer.keyForAttrDepNeedingRestart = null;
attrDepConsumer.attrDepUnvisited = false;
attrDepConsumer.hasAspects = target instanceof Rule && ((Rule) target).hasAspects();
attrDepConsumer.transitions = attrDepConsumer.hasAspects ? HashMultimap.create() : null;
Expand All @@ -291,8 +344,10 @@ public void accept(Attribute aspectAttribute, Label aspectLabel) {
continue;
}

if (attrDepConsumer.attrDepNeedsRestart) {
traversal.labelsToVisitNextRestart.add(label);
if (attrDepConsumer.keyForAttrDepNeedingRestart != null) {
traversal.labelsToVisitInLaterRestart.put(
label, attrDepConsumer.keyForAttrDepNeedingRestart);
traversal.labelsToVisitInverse.put(attrDepConsumer.keyForAttrDepNeedingRestart, label);
continue;
} else if (attrDepConsumer.attrDepUnvisited) {
// This schedules label to be visited a second time during this Skyframe restart. Because
Expand All @@ -319,7 +374,7 @@ public void accept(Attribute aspectAttribute, Label aspectLabel) {
}
labelsToVisit = nextLabelsToVisit;
}
if (env.valuesMissing() || !traversal.labelsToVisitNextRestart.isEmpty()) {
if (env.valuesMissing() || !traversal.labelsToVisitInLaterRestart.isEmpty()) {
return null;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,10 +38,10 @@ private TargetLoadingUtil() {}
*
* <p>Establishes all Skyframe dependencies needed for incremental correctness.
*
* <p>Returns {@code null} if {@code env.valuesMissing()}.
* <p>Returns {@link TargetAndErrorIfAny} if no dep was mising; otherwise, returns the {@link
* SkyKey} specifying the missing dep.
*/
@Nullable
public static TargetAndErrorIfAny loadTarget(Environment env, Label label)
public static Object loadTarget(Environment env, Label label)
throws NoSuchTargetException, NoSuchPackageException, InterruptedException {
if (label.getName().contains("/")) {
// This target is in a subdirectory, therefore it could potentially be invalidated by
Expand All @@ -50,18 +50,19 @@ public static TargetAndErrorIfAny loadTarget(Environment env, Label label)
PackageIdentifier newPkgId =
PackageIdentifier.create(label.getRepository(), containingDirectory);
ContainingPackageLookupValue containingPackageLookupValue;
SkyKey containingPackageKey = ContainingPackageLookupValue.key(newPkgId);
try {
containingPackageLookupValue =
(ContainingPackageLookupValue)
env.getValueOrThrow(
ContainingPackageLookupValue.key(newPkgId),
containingPackageKey,
BuildFileNotFoundException.class,
InconsistentFilesystemException.class);
} catch (InconsistentFilesystemException e) {
throw new NoSuchTargetException(label, e.getMessage());
}
if (containingPackageLookupValue == null) {
return null;
return containingPackageKey;
}

if (!containingPackageLookupValue.hasContainingPackage()) {
Expand All @@ -88,7 +89,7 @@ public static TargetAndErrorIfAny loadTarget(Environment env, Label label)
PackageValue packageValue =
(PackageValue) env.getValueOrThrow(packageKey, NoSuchPackageException.class);
if (packageValue == null) {
return null;
return packageKey;
}

Package pkg = packageValue.getPackage();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -220,7 +220,8 @@ private boolean hasDepThatSatisfies(
@Nullable
TargetAndErrorIfAny loadTarget(Environment env, Label label)
throws NoSuchTargetException, NoSuchPackageException, InterruptedException {
return TargetLoadingUtil.loadTarget(env, label);
Object o = TargetLoadingUtil.loadTarget(env, label);
return o instanceof TargetAndErrorIfAny ? (TargetAndErrorIfAny) o : null;
}

/**
Expand Down
Loading

0 comments on commit 6bdac9f

Please sign in to comment.