[compiler] Stay in SSA form through entire pipeline #30573

josephsavona · 2024-08-01T18:36:52Z

Stack from ghstack (oldest at bottom):

This PR updates to use SSA form through the entire compilation pipeline. This means that in both HIR form and ReactiveFunction form, Identifier instances map 1:1 to IdentifierId values. If two identifiers have the same IdentifierId, they are the same instance. What this means is that all our passes can use this more precise information to determine if two particular identifiers are not just the same variable, but the same SSA "version" of that variable.

However, some parts of our analysis really care about program variables as opposed to SSA versions, and were relying on LeaveSSA to reset identifiers such that all Identifier instances for a particular program variable would have the same IdentifierId (though not necessarily the same Identifier instance). With LeaveSSA removed, those analysis passes can now use DeclarationId instead to uniquely identify a program variable.

Note that this PR surfaces some opportunties to improve edge-cases around reassigned values being declared/reassigned/depended-upon across multiple scopes. Several passes could/should use IdentifierId to more precisely identify exactly which values are accessed - for example, a scope that reassigns x but doesn't use x prior to reassignment doesn't have to take a dependency on x. But today we take a dependnecy.

My approach for these cases was to add a "TODO LeaveSSA" comment with notes and the name of the fixture demonstrating the difference, but to intentionally preserve the existing behavior (generally, switching to use DeclarationId when IdentifierId would have been more precise).

Beyond updating passes to use DeclarationId instead of Identifier/IdentifierId, the other change here is to extract out the remaining necessary bits of LeaveSSA into a new pass that rewrites InstructionKind (const/let/reassign/etc) based on whether a value is actually const or has reassignments and should be let.

[ghstack-poisoned]

vercel · 2024-08-01T18:36:54Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
react-compiler-playground	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Aug 6, 2024 6:29pm

ghstack-source-id: b3c84364193cc58d405edda235252336d0aefab5 Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: d15a711d164cac86e5c80148c759b259ce45a810 Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: b6ec82a461d95283dea3b040e81f100b29c7cc4e Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: 9962038ebeeb650d5a07146308c9ee1cfb26853b Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: 7197375fff0345a5b4d2520ef863488bb1dc1d15 Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: 1b653587414b4103f33f415ab6574e8dbedd90b1 Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: 27d08cc1ee0cd4459078e25415da0c54518759ee Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: e54a9d44a89a11cd9d65939da17222d85be05712 Pull Request resolved: #30573

[ghstack-poisoned]

ghstack-source-id: 1b4da810d5086a6b90d43effcfedb2ecbb15e573 Pull Request resolved: #30573

josephsavona · 2024-08-06T00:12:45Z

Nice find, i'll add a fixture for that and fix

mofeiZ · 2024-08-06T00:21:59Z

Compare the memoization in master and this pr.

I was curious so looked into this (I'm sure someone else probably has already found the bug). I think the issue is that we're not checking that operands are mutable when extending scope ranges (fbtScope.range.start = Math.min(...))

compiler/scripts/anonymize.js

This PR updates to use SSA form through the entire compilation pipeline. This means that in both HIR form and ReactiveFunction form, `Identifier` instances map 1:1 to `IdentifierId` values. If two identifiers have the same IdentifierId, they are the same instance. What this means is that all our passes can use this more precise information to determine if two particular identifiers are not just the same variable, but the same SSA "version" of that variable. However, some parts of our analysis really care about program variables as opposed to SSA versions, and were relying on LeaveSSA to reset identifiers such that all Identifier instances for a particular program variable would have the same IdentifierId (though not necessarily the same Identifier instance). With LeaveSSA removed, those analysis passes can now use DeclarationId instead to uniquely identify a program variable. Note that this PR surfaces some opportunties to improve edge-cases around reassigned values being declared/reassigned/depended-upon across multiple scopes. Several passes could/should use IdentifierId to more precisely identify exactly which values are accessed - for example, a scope that reassigns `x` but doesn't use `x` prior to reassignment doesn't have to take a dependency on `x`. But today we take a dependnecy. My approach for these cases was to add a "TODO LeaveSSA" comment with notes and the name of the fixture demonstrating the difference, but to intentionally preserve the existing behavior (generally, switching to use DeclarationId when IdentifierId would have been more precise). Beyond updating passes to use DeclarationId instead of Identifier/IdentifierId, the other change here is to extract out the remaining necessary bits of LeaveSSA into a new pass that rewrites InstructionKind (const/let/reassign/etc) based on whether a value is actually const or has reassignments and should be let. [ghstack-poisoned]

This PR updates to use SSA form through the entire compilation pipeline. This means that in both HIR form and ReactiveFunction form, `Identifier` instances map 1:1 to `IdentifierId` values. If two identifiers have the same IdentifierId, they are the same instance. What this means is that all our passes can use this more precise information to determine if two particular identifiers are not just the same variable, but the same SSA "version" of that variable. However, some parts of our analysis really care about program variables as opposed to SSA versions, and were relying on LeaveSSA to reset identifiers such that all Identifier instances for a particular program variable would have the same IdentifierId (though not necessarily the same Identifier instance). With LeaveSSA removed, those analysis passes can now use DeclarationId instead to uniquely identify a program variable. Note that this PR surfaces some opportunties to improve edge-cases around reassigned values being declared/reassigned/depended-upon across multiple scopes. Several passes could/should use IdentifierId to more precisely identify exactly which values are accessed - for example, a scope that reassigns `x` but doesn't use `x` prior to reassignment doesn't have to take a dependency on `x`. But today we take a dependnecy. My approach for these cases was to add a "TODO LeaveSSA" comment with notes and the name of the fixture demonstrating the difference, but to intentionally preserve the existing behavior (generally, switching to use DeclarationId when IdentifierId would have been more precise). Beyond updating passes to use DeclarationId instead of Identifier/IdentifierId, the other change here is to extract out the remaining necessary bits of LeaveSSA into a new pass that rewrites InstructionKind (const/let/reassign/etc) based on whether a value is actually const or has reassignments and should be let. ghstack-source-id: 4b9b1d5f83b4421f6a49ed3ccf5c46da01cfa5f6 Pull Request resolved: #30573

josephsavona · 2024-08-06T18:05:59Z

I think the issue is that we're not checking that operands are mutable when extending scope ranges

Thanks @mofeiZ for saving me some time debugging, that was it! I updated that pass to use a new helper which only extends the range if the new range has a non-zero start

This PR updates to use SSA form through the entire compilation pipeline. This means that in both HIR form and ReactiveFunction form, `Identifier` instances map 1:1 to `IdentifierId` values. If two identifiers have the same IdentifierId, they are the same instance. What this means is that all our passes can use this more precise information to determine if two particular identifiers are not just the same variable, but the same SSA "version" of that variable. However, some parts of our analysis really care about program variables as opposed to SSA versions, and were relying on LeaveSSA to reset identifiers such that all Identifier instances for a particular program variable would have the same IdentifierId (though not necessarily the same Identifier instance). With LeaveSSA removed, those analysis passes can now use DeclarationId instead to uniquely identify a program variable. Note that this PR surfaces some opportunties to improve edge-cases around reassigned values being declared/reassigned/depended-upon across multiple scopes. Several passes could/should use IdentifierId to more precisely identify exactly which values are accessed - for example, a scope that reassigns `x` but doesn't use `x` prior to reassignment doesn't have to take a dependency on `x`. But today we take a dependnecy. My approach for these cases was to add a "TODO LeaveSSA" comment with notes and the name of the fixture demonstrating the difference, but to intentionally preserve the existing behavior (generally, switching to use DeclarationId when IdentifierId would have been more precise). Beyond updating passes to use DeclarationId instead of Identifier/IdentifierId, the other change here is to extract out the remaining necessary bits of LeaveSSA into a new pass that rewrites InstructionKind (const/let/reassign/etc) based on whether a value is actually const or has reassignments and should be let. [ghstack-poisoned]

This PR updates to use SSA form through the entire compilation pipeline. This means that in both HIR form and ReactiveFunction form, `Identifier` instances map 1:1 to `IdentifierId` values. If two identifiers have the same IdentifierId, they are the same instance. What this means is that all our passes can use this more precise information to determine if two particular identifiers are not just the same variable, but the same SSA "version" of that variable. However, some parts of our analysis really care about program variables as opposed to SSA versions, and were relying on LeaveSSA to reset identifiers such that all Identifier instances for a particular program variable would have the same IdentifierId (though not necessarily the same Identifier instance). With LeaveSSA removed, those analysis passes can now use DeclarationId instead to uniquely identify a program variable. Note that this PR surfaces some opportunties to improve edge-cases around reassigned values being declared/reassigned/depended-upon across multiple scopes. Several passes could/should use IdentifierId to more precisely identify exactly which values are accessed - for example, a scope that reassigns `x` but doesn't use `x` prior to reassignment doesn't have to take a dependency on `x`. But today we take a dependnecy. My approach for these cases was to add a "TODO LeaveSSA" comment with notes and the name of the fixture demonstrating the difference, but to intentionally preserve the existing behavior (generally, switching to use DeclarationId when IdentifierId would have been more precise). Beyond updating passes to use DeclarationId instead of Identifier/IdentifierId, the other change here is to extract out the remaining necessary bits of LeaveSSA into a new pass that rewrites InstructionKind (const/let/reassign/etc) based on whether a value is actually const or has reassignments and should be let. ghstack-source-id: 69afdaee5fadf3fdc98ce97549da805f288218b4 Pull Request resolved: #30573

josephsavona · 2024-08-06T19:02:29Z

Latest diff is P1515916566, looks like the fbt issues are fixed, i went through the majority of the remaining differences manually. They are nearly all due to ternaries or optionals. There were a couple where deps changed and i debugged them and confirmed that the previous results were unnecessarily including a dependency that wasn't actually reactive.

This PR updates to use SSA form through the entire compilation pipeline. This means that in both HIR form and ReactiveFunction form, `Identifier` instances map 1:1 to `IdentifierId` values. If two identifiers have the same IdentifierId, they are the same instance. What this means is that all our passes can use this more precise information to determine if two particular identifiers are not just the same variable, but the same SSA "version" of that variable. However, some parts of our analysis really care about program variables as opposed to SSA versions, and were relying on LeaveSSA to reset identifiers such that all Identifier instances for a particular program variable would have the same IdentifierId (though not necessarily the same Identifier instance). With LeaveSSA removed, those analysis passes can now use DeclarationId instead to uniquely identify a program variable. Note that this PR surfaces some opportunties to improve edge-cases around reassigned values being declared/reassigned/depended-upon across multiple scopes. Several passes could/should use IdentifierId to more precisely identify exactly which values are accessed - for example, a scope that reassigns `x` but doesn't use `x` prior to reassignment doesn't have to take a dependency on `x`. But today we take a dependnecy. My approach for these cases was to add a "TODO LeaveSSA" comment with notes and the name of the fixture demonstrating the difference, but to intentionally preserve the existing behavior (generally, switching to use DeclarationId when IdentifierId would have been more precise). Beyond updating passes to use DeclarationId instead of Identifier/IdentifierId, the other change here is to extract out the remaining necessary bits of LeaveSSA into a new pass that rewrites InstructionKind (const/let/reassign/etc) based on whether a value is actually const or has reassignments and should be let. ghstack-source-id: 69afdaee5fadf3fdc98ce97549da805f288218b4 Pull Request resolved: #30573

I forgot to clean this up before landing #30573. [ghstack-poisoned]

I forgot to clean this up before landing #30573. ghstack-source-id: 2141471912e410aa12545dcf7989f45447007ba9 Pull Request resolved: #30632

Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573 ### Quick background #### Temporaries The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass. - named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range) - temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables. In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`, $5 ->`t1`, and $6 ->`t2`. ```js [1] $2 = LoadGlobal(global) foo [2] $3 = LoadLocal bar$1 scope 0: [3] $4 = Call $2(<unknown> $3) scope 1: [4] $5 = Object { } scope 2: [5] $6 = Object { a: $4, b: $5 } [6] $8 = StoreLocal Const x$7 = $6 ``` #### Dependencies `ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads. All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable. In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`. ```js // reduce-reactive-deps/no-uncond.js function useFoo({ obj, objIsNull }) { const x = []; if (isFalse(objIsNull)) { x.push(obj.a.b); } return x; } ``` While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709) --- ### Rough high level overview 1. Pass 1 Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals). 2. Pass 2 (collectHoistablePropertyLoads) Walk over instructions to generate sidemaps: a. temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`) b. block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`) c. Walk over control flow graph to understand the set of object and property paths that can be read by each basic block. This analysis: - relies on post-dominator trees - traverses the CFG from entry (producing the set of variables/paths unconditionally evaluated *before* a block). - traverses the CFG from exit (producing the set of variables/paths unconditionally evaluated *after* a block). 4. Pass 3: (collectDependencies) Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here 5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq)) --- ### Followups: 1. Rewrite function expression deps This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.ts). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`. 2. Enable optional paths (a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`). (b) record optional paths in both collectHoistablePropertyLoads and dependency collection ghstack-source-id: 51e0759c68c931c19fee02665bd37a7274a45572 Pull Request resolved: #30894

Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573 ### Quick background #### Temporaries The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass. - named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range) - temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables. In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`, $5 ->`t1`, and $6 ->`t2`. ```js [1] $2 = LoadGlobal(global) foo [2] $3 = LoadLocal bar$1 scope 0: [3] $4 = Call $2(<unknown> $3) scope 1: [4] $5 = Object { } scope 2: [5] $6 = Object { a: $4, b: $5 } [6] $8 = StoreLocal Const x$7 = $6 ``` #### Dependencies `ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads. All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable. In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`. ```js // reduce-reactive-deps/no-uncond.js function useFoo({ obj, objIsNull }) { const x = []; if (isFalse(objIsNull)) { x.push(obj.a.b); } return x; } ``` While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709) --- ### Rough high level overview 1. Pass 1 Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals). 2. Pass 2 (collectTemporariesSidemap) Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`) 2. Pass 2 (collectHoistablePropertyLoads) a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`) b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`. A basic block can unconditionally read from identifier X if any of the following applies: - the block itself reads from identifier X - all predecessors of the block read from identifier X - all successors of the block read from identifier X 4. Pass 3: (collectDependencies) Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here 5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq)) --- ### Followups: 1. Rewrite function expression deps This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`. 2. Enable optional paths (a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`). (b) record optional paths in both collectHoistablePropertyLoads and dependency collection ghstack-source-id: 470ab175a901772567139bd0bb93b3e3c71c9dc7 Pull Request resolved: #30894

Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573 ### Quick background #### Temporaries The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass. - named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range) - temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables. In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`, $5 ->`t1`, and $6 ->`t2`. ```js [1] $2 = LoadGlobal(global) foo [2] $3 = LoadLocal bar$1 scope 0: [3] $4 = Call $2(<unknown> $3) scope 1: [4] $5 = Object { } scope 2: [5] $6 = Object { a: $4, b: $5 } [6] $8 = StoreLocal Const x$7 = $6 ``` #### Dependencies `ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads. All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable. In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`. ```js // reduce-reactive-deps/no-uncond.js function useFoo({ obj, objIsNull }) { const x = []; if (isFalse(objIsNull)) { x.push(obj.a.b); } return x; } ``` While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709) --- ### Rough high level overview 1. Pass 1 Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals). 2. Pass 2 (collectTemporariesSidemap) Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`) 2. Pass 2 (collectHoistablePropertyLoads) a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`) b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`. A basic block can unconditionally read from identifier X if any of the following applies: - the block itself reads from identifier X - all predecessors of the block read from identifier X - all successors of the block read from identifier X 4. Pass 3: (collectDependencies) Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here 5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq)) --- ### Followups: 1. Rewrite function expression deps This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`. 2. Enable optional paths (a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`). (b) record optional paths in both collectHoistablePropertyLoads and dependency collection ghstack-source-id: e1fe5d80076df15c5277c6761138f45f267e11dd Pull Request resolved: #30894

Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573 ### Quick background #### Temporaries The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass. - named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range) - temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables. In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`, $5 ->`t1`, and $6 ->`t2`. ```js [1] $2 = LoadGlobal(global) foo [2] $3 = LoadLocal bar$1 scope 0: [3] $4 = Call $2(<unknown> $3) scope 1: [4] $5 = Object { } scope 2: [5] $6 = Object { a: $4, b: $5 } [6] $8 = StoreLocal Const x$7 = $6 ``` #### Dependencies `ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads. All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable. In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`. ```js // reduce-reactive-deps/no-uncond.js function useFoo({ obj, objIsNull }) { const x = []; if (isFalse(objIsNull)) { x.push(obj.a.b); } return x; } ``` While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709) --- ### Rough high level overview 1. Pass 1 Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals). 2. Pass 2 (collectTemporariesSidemap) Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`) 2. Pass 2 (collectHoistablePropertyLoads) a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`) b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`. A basic block can unconditionally read from identifier X if any of the following applies: - the block itself reads from identifier X - all predecessors of the block read from identifier X - all successors of the block read from identifier X 4. Pass 3: (collectDependencies) Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here 5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq)) --- ### Followups: 1. Rewrite function expression deps This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`. 2. Enable optional paths (a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`). (b) record optional paths in both collectHoistablePropertyLoads and dependency collection ghstack-source-id: 6c62db14213284bc61869b8db57198546c399f7f Pull Request resolved: #30894

Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573 ### Quick background #### Temporaries The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass. - named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range) - temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables. In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`, $5 ->`t1`, and $6 ->`t2`. ```js [1] $2 = LoadGlobal(global) foo [2] $3 = LoadLocal bar$1 scope 0: [3] $4 = Call $2(<unknown> $3) scope 1: [4] $5 = Object { } scope 2: [5] $6 = Object { a: $4, b: $5 } [6] $8 = StoreLocal Const x$7 = $6 ``` #### Dependencies `ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads. All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable. In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`. ```js // reduce-reactive-deps/no-uncond.js function useFoo({ obj, objIsNull }) { const x = []; if (isFalse(objIsNull)) { x.push(obj.a.b); } return x; } ``` While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709) --- ### Rough high level overview 1. Pass 1 Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals). 2. Pass 2 (collectTemporariesSidemap) Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`) 2. Pass 2 (collectHoistablePropertyLoads) a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`) b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`. A basic block can unconditionally read from identifier X if any of the following applies: - the block itself reads from identifier X - all predecessors of the block read from identifier X - all successors of the block read from identifier X 4. Pass 3: (collectDependencies) Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here 5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq)) --- ### Followups: 1. Rewrite function expression deps This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`. 2. Enable optional paths (a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`). (b) record optional paths in both collectHoistablePropertyLoads and dependency collection ghstack-source-id: 2507f6ea751dce09ad1dccd353ae6fc7cf411582 Pull Request resolved: #30894

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

5b3dd3d

[ghstack-poisoned]

josephsavona mentioned this pull request Aug 1, 2024

[compiler] Add Identifier.declarationId #30569

Merged

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

4be1224

ghstack-source-id: b3c84364193cc58d405edda235252336d0aefab5 Pull Request resolved: #30573

facebook-github-bot added CLA Signed React Core Team Opened by a member of the React Core Team labels Aug 1, 2024

vercel bot deployed to Preview August 1, 2024 18:42 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

cf4bfb8

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

43520aa

ghstack-source-id: d15a711d164cac86e5c80148c759b259ce45a810 Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 20:21 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

844b586

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

6011c1e

ghstack-source-id: b6ec82a461d95283dea3b040e81f100b29c7cc4e Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 21:55 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

25b47d4

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

3dfdc94

ghstack-source-id: 9962038ebeeb650d5a07146308c9ee1cfb26853b Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 22:24 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

5a30b29

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

a3c5610

ghstack-source-id: 7197375fff0345a5b4d2520ef863488bb1dc1d15 Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 22:48 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

91d0553

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

5464ce4

ghstack-source-id: 1b653587414b4103f33f415ab6574e8dbedd90b1 Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 22:58 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

f044d1a

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

a935604

ghstack-source-id: 27d08cc1ee0cd4459078e25415da0c54518759ee Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 23:10 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

124f022

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

656a7da

ghstack-source-id: e54a9d44a89a11cd9d65939da17222d85be05712 Pull Request resolved: #30573

vercel bot deployed to Preview August 1, 2024 23:22 View deployment

Update on "[wip][compiler] Stop rewriting IdentifierId in LeaveSSA"

cf3fc94

[ghstack-poisoned]

josephsavona added a commit that referenced this pull request Aug 1, 2024

[wip][compiler] Stop rewriting IdentifierId in LeaveSSA

bfb7d22

ghstack-source-id: 1b4da810d5086a6b90d43effcfedb2ecbb15e573 Pull Request resolved: #30573

mvitousek reviewed Aug 6, 2024

View reviewed changes

compiler/scripts/anonymize.js Show resolved Hide resolved

This was referenced Aug 6, 2024

[compiler][be] Less ambiguous error messages for validateMemo bailout #30601

Merged

[compiler][repro] ValidatePreserveMemo x useRef bug #30602

Merged

[compiler] Patch ValidatePreserveMemo to bailout correctly for refs #30603

Merged

vercel bot deployed to Preview August 6, 2024 18:07 View deployment

vercel bot deployed to Preview August 6, 2024 18:29 View deployment

josephsavona merged commit 512d6f4 into gh/josephsavona/34/base Aug 6, 2024
19 checks passed

josephsavona deleted the gh/josephsavona/34/head branch August 6, 2024 19:03

josephsavona added a commit that referenced this pull request Aug 8, 2024

[compiler][be] Cleanup class naming in PromoteUsedTemporaries

c4ac65c

I forgot to clean this up before landing #30573. [ghstack-poisoned]

josephsavona mentioned this pull request Aug 8, 2024

[compiler][be] Cleanup class naming in PromoteUsedTemporaries #30632

Merged

josephsavona added a commit that referenced this pull request Aug 8, 2024

[compiler][be] Cleanup class naming in PromoteUsedTemporaries

5d010b9

I forgot to clean this up before landing #30573. ghstack-source-id: 2141471912e410aa12545dcf7989f45447007ba9 Pull Request resolved: #30632

josephsavona added a commit that referenced this pull request Aug 9, 2024

[compiler][be] Cleanup class naming in PromoteUsedTemporaries

229038e

I forgot to clean this up before landing #30573. ghstack-source-id: 2141471912e410aa12545dcf7989f45447007ba9 Pull Request resolved: #30632

eps1lon mentioned this pull request Aug 14, 2024

Update React from 187dd6a7-20240806 to 49496d49-20240814 vercel/next.js#68898

Merged

mofeiZ mentioned this pull request Sep 6, 2024

[compiler][rewrite] PropagateScopeDeps hir rewrite #30894

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[compiler] Stay in SSA form through entire pipeline #30573

[compiler] Stay in SSA form through entire pipeline #30573

josephsavona commented Aug 1, 2024 •

edited

Loading

vercel bot commented Aug 1, 2024 •

edited

Loading

josephsavona commented Aug 6, 2024

mofeiZ commented Aug 6, 2024

josephsavona commented Aug 6, 2024

josephsavona commented Aug 6, 2024

[compiler] Stay in SSA form through entire pipeline #30573

[compiler] Stay in SSA form through entire pipeline #30573

Conversation

josephsavona commented Aug 1, 2024 • edited Loading

vercel bot commented Aug 1, 2024 • edited Loading

josephsavona commented Aug 6, 2024

mofeiZ commented Aug 6, 2024

josephsavona commented Aug 6, 2024

josephsavona commented Aug 6, 2024

josephsavona commented Aug 1, 2024 •

edited

Loading

vercel bot commented Aug 1, 2024 •

edited

Loading