Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler][rewrite] PropagateScopeDeps hir rewrite #30894

Merged
merged 11 commits into from
Sep 12, 2024

Conversation

mofeiZ
Copy link
Contributor

@mofeiZ mofeiZ commented Sep 6, 2024

Stack from ghstack (oldest at bottom):

Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

Quick background

Temporaries

The compiler currently treats temporaries and named variables (e.g. x) differently in this pass.

  • named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple Identifier instances -- each with its own scope and mutable range)
  • temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see [compiler][fixtures] test repros: codegen, alignScope, phis #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in ReactiveScope.declarations and later promote them to named variables.
    In the same example, $4, $5, and $6 need to be promoted: $2 ->t0, $5 ->t1, and $6 ->t2.
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6

Dependencies

ReactiveScope.dependencies records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate obj.a.b without before creating x and checking objIsNull.

// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}

While other memoization strategies with different constraints exist, the current compiler requires that ReactiveScope.dependencies be re-orderable to the beginning of the reactive scope. But.. PropertyLoads from null values will throw TypeError. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See https://github.com/facebook/react-forget/pull/2709)


Rough high level overview

  1. Pass 1
    Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
  2. Pass 2 (collectTemporariesSidemap)
    Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. $3 -> {obj: props, path: ["a", "b"]})
  3. Pass 2 (collectHoistablePropertyLoads)
    a. Build a sidemap of block -> accessed variables and properties (e.g. bb0 -> [ {obj: props, path: ["a", "b"]} ])
    b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate PropertyLoad.
    A basic block can unconditionally read from identifier X if any of the following applies:
    • the block itself reads from identifier X
    • all predecessors of the block read from identifier X
    • all successors of the block read from identifier X
  4. Pass 3: (collectDependencies)
    Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
  5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences (internal link), running internally e2e tests (internal link)


Followups:

  1. Rewrite function expression deps
    This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate this function-expr hoisting bug. A short term fix here is to simply call some form of collectNonNullObjects on every function expression to find hoistable variable / paths. In the longer term, we should refactor out FunctionExpression.deps.

  2. Enable optional paths
    (a) don't count optional load temporaries as dependencies (e.g. collectOptionalLoadRValues(...)).
    (b) record optional paths in both collectHoistablePropertyLoads and dependency collection

[ghstack-poisoned]
Copy link

vercel bot commented Sep 6, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
react-compiler-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 12, 2024 9:04pm

mofeiZ added a commit that referenced this pull request Sep 6, 2024
ghstack-source-id: e0710f2e0056f1cdfba9b132c14cd96127603187
Pull Request resolved: #30894
@facebook-github-bot facebook-github-bot added CLA Signed React Core Team Opened by a member of the React Core Team labels Sep 6, 2024
[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 6, 2024
ghstack-source-id: 4d056b9ef2f5954b772272f7e5fb7509b11698ec
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 6, 2024
Adding new feature flag in preparation for #30894

ghstack-source-id: 59278028cf178d6b2c28732ded751e9c838183d2
Pull Request resolved: #30893
[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 6, 2024
ghstack-source-id: cfff1094aebf1af04631940b9570cabafc709b02
Pull Request resolved: #30894
[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 6, 2024
ghstack-source-id: 6bfd001061ec78cb289ef3e5f2dbf8c9aec0317a
Pull Request resolved: #30894
[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 6, 2024
ghstack-source-id: aa31210b4a30d4219e1c13d2911552abf0796720
Pull Request resolved: #30894
[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 6, 2024
ghstack-source-id: 4cfb5f606b1575c08dfba58d0fda485a735271e4
Pull Request resolved: #30894
assumedNonNullObjects: Set<PropertyLoadNode>;
};

export function getProperty(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getProperty and resolveTemporary were essentially moved from PropagateScopeDeps (reactive scopes version). Understanding hoistable property loads required the same properties and temporaries sidemaps as the core dependency-extraction logic. Now, we first compute the sidemaps, then use them (as ReadOnlyMaps) in both CollectHoistablePropertyLoads and PropagateScopeDeps

propertyLoadInfo: Map<BlockId, Array<PropertyLoadInfo>>;
};

function collectSidemap(
Copy link
Contributor Author

@mofeiZ mofeiZ Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(copied temporaries / properties tracking logic from PropagateScopeDeps)

Comment on lines +298 to +313
/**
* Due to current limitations of mutable range inference, there are edge cases in
* which we infer known-immutable values (e.g. props or hook params) to have a
* mutable range and scope.
* (see `destructure-array-declaration-to-context-var` fixture)
* We track known immutable identifiers to reduce regressions (as PropagateScopeDeps
* is being rewritten to HIR).
*/
const knownImmutableIdentifiers = new Set<Identifier>();
if (fn.fnType === 'Component' || fn.fnType === 'Hook') {
for (const p of fn.params) {
if (p.kind === 'Identifier') {
knownImmutableIdentifiers.add(p.identifier);
}
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patches the the regression from fixtures like destructure-array-declaration-to-context-var (see comment on original PR)

Comment on lines 337 to 340
const isMutableAtInstr =
object.mutableRange.end > object.mutableRange.start + 1 &&
object.scope != null &&
inRange(instr, object.scope.range);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little awkward, and also an example of being able to make use of more granular mutable range (instead of scope-range info)

Note that mutableRanges are no longer valid at this point, but we're only comparing relative start/ends (end === start + 1 indicates nonmutable value)

Comment on lines +119 to +143
enum PropertyAccessType {
Access = 'Access',
NonNullAccess = 'NonNullAccess',
Dependency = 'Dependency',
NonNullDependency = 'NonNullDependency',
}

const MIN_ACCESS_TYPE = PropertyAccessType.Access;
/**
* "NonNull" means that PropertyReads from a node are side-effect free,
* as the node is (1) immutable and (2) has unconditional propertyloads
* somewhere in the cfg.
*/
function isNonNull(access: PropertyAccessType): boolean {
return (
access === PropertyAccessType.NonNullAccess ||
access === PropertyAccessType.NonNullDependency
);
}
function isDependency(access: PropertyAccessType): boolean {
return (
access === PropertyAccessType.Dependency ||
access === PropertyAccessType.NonNullDependency
);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this is slightly different from the original DeriveMinimalDeps.

}
}

function deriveMinimalDependenciesInSubtree(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this logic is now much simpler than source DeriveMinimalDependencies. The key change is that we know the tree is a 'well formed tree' -- if any child nodes of a node are non-null (or unconditional), that node itself must be non-null.

}
}

function findTemporariesUsedOutsideDeclaringScope(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from PropagateScopeDeps

#declarations: Map<DeclarationId, Decl> = new Map();
#reassignments: Map<Identifier, Decl> = new Map();
// Reactive dependencies used in the current reactive scope.
#dependencies: Stack<Array<ReactiveScopeDependency>> = empty();
Copy link
Contributor Author

@mofeiZ mofeiZ Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that dependencies is now a stack since we no longer are recursing into scopes. We only ever use the top entry of this, so I'm happy to add/use a new array based utility class that only exposes push, pop, and top

@mvitousek
Copy link
Contributor

However, it may exacerbate this function-expr hoisting bug.

Link is dead

Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of questions/comments from reading through. Overall it makes sense, definitely very complex. One thing that I think might be a source of incidental complexity is the recursive nature of the approach: propagateScopeDeps() delegates to collectHoistables() which then does collectNodes() and computeSidemaps(), but those inner values from collectHoistables() are used later.

Seeing it as a sequence of a few high level steps (flatten one level of the function call graph, eg flatten away collectHoistables()) might make this easier to follow.

That said, since this is going to be off by default while you add the remaining pieces, i'm okay with shipping and iterating.

let x;
if ($[0] !== props.y || $[1] !== props.e) {
if ($[0] !== props) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm that makes sense, but what i'm getting at is that props is non-nullable, so it's safe to take props.y and props.e as deps. I think this might be the same cause as the other spot - this isn't getting inferred as a component bc there's no jsx or hook calls

let x;
if ($[0] !== props.y || $[1] !== props.e) {
if ($[0] !== props) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the case with the hook is a bug though, there we can't assume that a is non-null

Comment on lines 195 to 198
export function inRange({id}: Instruction, range: MutableRange): boolean {
return id >= range.start && id < range.end;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: implement isMutable via inRange?

Comment on lines +1163 to +1171
blockInfos: Map<
BlockId,
| {
kind: 'end';
scope: ReactiveScope;
pruned: boolean;
}
| {
kind: 'begin';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't a single block both end a previous block and start a new one?

Copy link
Contributor Author

@mofeiZ mofeiZ Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By construction, a block can be either the start or fallthrough but not both. This is because BuildReactiveScopeTerminals inserts a scope start block and a scope fallthrough block for every scope.

// before BuildReactiveScopeTerminals
bb0:
[0]
[1]  -┐
[2]   | scope@0
[3]  -┘
...

// after
bb0:
  [0]
  ScopeTerminal block=bb1 fallthr=bb2

bb1: <-- scope begin block
  [1]
  [2]
  Goto bb2

bb2: <-- scope end block
  [3]
  ...


addDependency(dep: ReactiveScopePropertyDependency): void {
const {path} = dep;
let currNode = this.#getOrCreateRoot(dep.identifier, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we pass false for isNonNull here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know whether the root node (i.e. a named identifier or promoted temporary) is non-null. A node being 'NonNull' means that it's safe to hoist reads from property of that node. (see recursive deriveMinimalDeps case)

export type CollectHoistablePropertyLoadsResult = {
nodes: ReadonlyMap<ScopeId, BlockInfo>;
temporaries: ReadonlyMap<Identifier, Identifier>;
properties: ReadonlyMap<Identifier, ReactiveScopeDependency>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus one, ideally we'd stop using Identifier as a key everywhere. With LeaveSSA gone, it should always be equivalent to use IdentifierId, if not let me know and we can fix.

Comment on lines 77 to 78
const sidemap = collectSidemap(fn, usedOutsideDeclaringScope);
const nodes = collectNodes(fn, sidemap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible it would be helpful to name these descriptively. I have no idea what these two things do w/o reading them (which i haven't gotten to yet)

fn: HIRFunction,
nodes: ReadonlyMap<BlockId, BlockInfo>,
): void {
const succ = new Map<BlockId, Set<BlockId>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use full names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use full names

Comment on lines 400 to 425
let changed = false;
for (const pred of neighbors) {
if (!traversalState.has(pred)) {
const neighborChanged = recursivelyDeriveNonNull(
pred,
direction,
traversalState,
nonNullObjectsByBlock,
);
changed ||= neighborChanged;
}
}
/**
* Active neighbors can be filtered out as we're solving for the following
* relation.
* X = Intersect(X_neighbors, X)
* Non-active neighbors with no recorded results can occur due to backedges.
* it's not safe to assume they can be filtered out (e.g. not intersected)
*/
const neighborAccesses = Set_intersect([
...(Array.from(neighbors)
.filter(n => traversalState.get(n) === 'done')
.map(n => nonNullObjectsByBlock.get(n) ?? new Set()) as Array<
Set<PropertyLoadNode>
>),
]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you can do flatMap instead of .map and avoid the [...(innerExpr)] wrapper

Comment on lines 400 to 425
let changed = false;
for (const pred of neighbors) {
if (!traversalState.has(pred)) {
const neighborChanged = recursivelyDeriveNonNull(
pred,
direction,
traversalState,
nonNullObjectsByBlock,
);
changed ||= neighborChanged;
}
}
/**
* Active neighbors can be filtered out as we're solving for the following
* relation.
* X = Intersect(X_neighbors, X)
* Non-active neighbors with no recorded results can occur due to backedges.
* it's not safe to assume they can be filtered out (e.g. not intersected)
*/
const neighborAccesses = Set_intersect([
...(Array.from(neighbors)
.filter(n => traversalState.get(n) === 'done')
.map(n => nonNullObjectsByBlock.get(n) ?? new Set()) as Array<
Set<PropertyLoadNode>
>),
]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also just curious, the Set_intersect is using object identity but what is ensuring that equivalent property loads in different blocks have the same identity?

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 470ab175a901772567139bd0bb93b3e3c71c9dc7
Pull Request resolved: #30894
[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: e1fe5d80076df15c5277c6761138f45f267e11dd
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: e1fe5d80076df15c5277c6761138f45f267e11dd
Pull Request resolved: #30894
@mofeiZ
Copy link
Contributor Author

mofeiZ commented Sep 11, 2024

However, it may exacerbate this function-expr hoisting bug.

Link is dead

Updated link (ts -> tsx 🤦‍♀️)

@mofeiZ
Copy link
Contributor Author

mofeiZ commented Sep 11, 2024

Updated per feedback from @mvitousek and @josephsavona. I amended this existing one to keep the blame history clean, but please let me know if you would prefer that I make separate feedback PRs in the future.

Double checked that these changes were no-ops by syncing internally and validating compilation output of 100+k files stayed the same.

Changes:

  • rebased on [compiler] Fork fixtures for enablePropagateDepsInHIR #30949 which forked test fixtures changed by this PR (previously, enablePropagateScopeDepsInHIR was enabled by default for all fixtures)
  • refactored to combine properties: Map<IdentifierId, PropertyLoadNode> and temporaries: Map<IdentifierId, Identifier> into a single temporaries: Map<IdentifierId, PropertyLoadNode>. This simplifies a lot of logic as previously we were checking and trying to merge entries from both maps. This should be correct as an lvalue (the identifier id) can only be mapped to a single instruction.
  • tweaked recursion logic for recursivelyDeriveNonNull.
    • previously, we initially set nonNullObjects = {}, then recursively calculated nonNullObjects = Union(nonNullObjects_own_block, Intersect(nonNullObjects of neighbors))
    • now, initially set nonNullObjects = nonNullObjects_own_block, then recursively calculate nonNullObjects = Union(nonNullObjects, Intersect(nonNullObjects of neighbors))

One thing that I think might be a source of incidental complexity is the recursive nature of the approach: propagateScopeDeps() delegates to collectHoistables() which then does collectNodes() and computeSidemaps(), but those inner values from collectHoistables() are used later.

  • moved collectSidemap call out of collectHoistables.
  • renamed collectNodes -> collectPropertyLoadsInBlocks, improved comments, changed recursion to use do-while loop, etc
  • added TODO for Destructuring instructions

Should be ready to merge. Thanks again for all the great feedback!

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 6c62db14213284bc61869b8db57198546c399f7f
Pull Request resolved: #30894
@josephsavona
Copy link
Contributor

awesome, shipit!

Copy link
Contributor

@mvitousek mvitousek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heck yeah

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Sep 12, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 2507f6ea751dce09ad1dccd353ae6fc7cf411582
Pull Request resolved: #30894
@mofeiZ mofeiZ merged commit 9e5adea into gh/mofeiZ/18/base Sep 12, 2024
19 checks passed
mofeiZ added a commit that referenced this pull request Sep 12, 2024
- flip `enablePropagateDepsInHIR` to off by default
- fork fixtures which produce compilation differences in #30894 to separate directory `propagate-scope-deps-hir-fork`, to be cleaned up when we remove this flag

ghstack-source-id: 7d5b8dc29788a65c272c846af9877b09fbf2cd60
Pull Request resolved: #30949
mofeiZ added a commit that referenced this pull request Sep 12, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 2507f6ea751dce09ad1dccd353ae6fc7cf411582
Pull Request resolved: #30894
@mofeiZ mofeiZ deleted the gh/mofeiZ/18/head branch September 12, 2024 21:41
mofeiZ added a commit that referenced this pull request Sep 30, 2024
Followup from #30894 , not sure how these got missed. Note that this PR just copies the fixtures without adding `@enablePropagateDepsInHIR`. #31032 follows and actually enables the HIR-version of propagateScopeDeps to run. I split this out into two PRs to make snapshot differences easier to review, but also happy to merge

Fixtures found from locally setting snap test runner to default to `enablePropagateDepsInHIR: 'enabled_baseline'` and forking fixtures files with different output.

ghstack-source-id: 7d7cf41aa923d83ad49f89079171b0411923ce6b
Pull Request resolved: #31030
mofeiZ added a commit that referenced this pull request Sep 30, 2024
Followup from #30894.
This adds a new flagged mode `enablePropagateScopeDepsInHIR: "enabled_with_optimizations"`, under which we infer more hoistable loads:
- it's always safe to evaluate loads from `props` (i.e. first parameter of a `component`)
- destructuring sources are safe to evaluate loads from (e.g. given `{x} = obj`, we infer that it's safe to evaluate obj.y)
- computed load sources are safe to evaluate loads from (e.g. given `arr[0]`, we can infer that it's safe to evaluate arr.length)

ghstack-source-id: 32f3bb72e9f85922825579bd785d636f4ccf724d
Pull Request resolved: #31033
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed React Core Team Opened by a member of the React Core Team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants