Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(traverse): support clone_identifier_reference method in TraverseCtx #4880

Conversation

Dunqing
Copy link
Member

@Dunqing Dunqing commented Aug 13, 2024

related: #4804

needs from: #4876

The clone_identifier_reference method is used to clone an IdentifierReference and create a Reference and insert it to SymbolTable's resolved_references.

The reason we need this is because we need to make sure that IdentifierReference's reference_id is unique

Copy link
Member Author

Dunqing commented Aug 13, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @Dunqing and the rest of your teammates on Graphite Graphite

Copy link

codspeed-hq bot commented Aug 13, 2024

CodSpeed Performance Report

Merging #4880 will not alter performance

Comparing 08-13-feat_traverse_support_clone_identifier_reference_method_in_traversectx (72a37fc) with main (47c9552)

Summary

✅ 29 untouched benchmarks

@Dunqing Dunqing requested a review from overlookmotel August 13, 2024 15:09
Copy link

graphite-app bot commented Aug 13, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@rzvxa
Copy link
Contributor

rzvxa commented Aug 13, 2024

Sorry if I'm a little out of the loop here, Out of interest I was wondering where we need 2 copies of the exact same node.
Since it is related to #4804 I'm interested to know about our use cases.
Doesn't it make more sense to have a generic API that would accept both BindingIdentifier and IdentifierReference and can output an IdentifierReference? Something like ctx.create_reference_to(it).

Edit:

Is there a reason we have something like this in our code:

            let ident = ctx.create_reference_id(
                param.span,
                param.name.clone(),
                param.symbol_id.get(),
                ReferenceFlag::Read,
            );

instead of passing param directly?

@Dunqing
Copy link
Member Author

Dunqing commented Aug 13, 2024

Sorry if I'm a little out of the loop here, Out of interest I was wondering where we need 2 copies of the exact same node. Since it is related to #4804 I'm interested to know about our use cases.

For example:

// In
a.b **= 0

// Out
var _a;
(_a = a), (_a["b"] = Math.pow(_a["b"], 0));

The final result, in this case, is the _a at least needs to clone three times, and a need to clone once

Doesn't it make more sense to have a generic API that would accept both BindingIdentifier and IdentifierReference and can output an IdentifierReference? Something like ctx.create_reference_to(it).

It would be better if we have an API to accept both BindingIdentifier and IdentifierReference and return anIdentifierReference which has the same symbol_id.

Edit:

Is there a reason we have something like this in our code:

            let ident = ctx.create_reference_id(
                param.span,
                param.name.clone(),
                param.symbol_id.get(),
                ReferenceFlag::Read,
            );

instead of passing param directly?

IdentifierReference only has reference_id. If we want to get a symbol_id through reference_id, so we need to get a reference by reference_id first, and then we can get symbol_id through reference.symbol_id()

@Dunqing
Copy link
Member Author

Dunqing commented Aug 13, 2024

Sorry if I'm a little out of the loop here, Out of interest I was wondering where we need 2 copies of the exact same node. Since it is related to #4804 I'm interested to know about our use cases.

For example:

// In
a.b **= 0

// Out
var _a;
(_a = a), (_a["b"] = Math.pow(_a["b"], 0));

The final result, in this case, is the _a at least needs to clone three times, and a need to clone once

Doesn't it make more sense to have a generic API that would accept both BindingIdentifier and IdentifierReference and can output an IdentifierReference? Something like ctx.create_reference_to(it).

It would be better if we have an API to accept both BindingIdentifier and IdentifierReference and return anIdentifierReference which has the same symbol_id. But different parameters have different implementation details

Edit:

Is there a reason we have something like this in our code:

            let ident = ctx.create_reference_id(
                param.span,
                param.name.clone(),
                param.symbol_id.get(),
                ReferenceFlag::Read,
            );

instead of passing param directly?

IdentifierReference only has reference_id. If we want to get a symbol_id through reference_id, so we need to get a reference by reference_id first, and then we can get symbol_id through reference.symbol_id()

@Dunqing Dunqing marked this pull request as draft August 14, 2024 02:08
@Dunqing
Copy link
Member Author

Dunqing commented Aug 14, 2024

According to use case #4876, only clone_identifier_reference is not enough. clone_expression can be quite common.

All of these Expressions include IdentifierReference

a.b // MemberExpression
a[b] // MemberExpression
`123 ${variable} 321` // TemplateLiteralExpression

Currently, I use move_expression instead of clone_in, but when an Expression may used to multi-place, then we must clone it

The API is best provided by Traverse, so we have to pass ScopeTree and SymbolTable

@Boshen Boshen added the blocker label Aug 14, 2024
@Dunqing
Copy link
Member Author

Dunqing commented Aug 14, 2024

I have written this code inside plugins, so does not block any yet

@Dunqing Dunqing removed the blocker label Aug 14, 2024
@overlookmotel
Copy link
Contributor

overlookmotel commented Aug 14, 2024

The problem with cloning an IdentifierReference like this is that it require an unnecessary lookup - you have to get the Reference from the reference_id to then get the SymbolId.

When you're generating multiple IdentifierReferences for the same symbol, it's better to look up the SymbolId only once, and then reuse it. Or, in this case, _a is a UID, so we have the SymbolId from the start.

Another API

There is an API hidden in helpers which is good for this use case:

/// Store for a created binding identifier
#[derive(Clone)]
pub struct BoundIdentifier<'a> {
pub name: Atom<'a>,
pub symbol_id: SymbolId,
}

// In
a.b **= 0

// Out
var _a;
(_a = a), (_a["b"] = Math.pow(_a["b"], 0));

In this case you could use it as follows:

// NB: `BindingIdentifier` + `IdentifierReference` type annotations are not required
// - just for clarity in this example

// Create `_a`
let binding = BoundIdentifier::new_uid(
    "a", // Name *without* `_` prefix
    ctx.current_scope_id(),
    SymbolFlags::FunctionScopedVariable,
    ctx
);
let id: BindingIdentifier = binding.create_binding_identifier();

// Then create your `IdentifierReference`s
let ref1: IdentifierReference = binding.create_read_reference(ctx);
let ref2: IdentifierReference = binding.create_read_reference(ctx);
let ref3: IdentifierReference = binding.create_read_reference(ctx);

// Now use them: `var <id>; (<ref1> = a), (<ref2>["b"] = Math.pow(<ref3>["b"], 0))`

Actually, this is not correct - ref1 should have ReferenceFlag::Write flag not Read. We should add a BoundIdentifier::create_write_reference method. I'll make a PR for that now (edit: here it is: #4897).

A couple of notes:

  • The reason for having a BoundIdentifier type rather than just using BindingIdentifier is that:
    • BoundIdentifier is smaller as it doesn't include Span.
    • BoundIdentifier can have symbol_id stored as pure SymbolId, not Cell<Option<SymbolId>>.
  • This BoundIdentifier API could probably be better - it could be methods on TraverseCtx instead - but I'm not sure if we should change it right now when we're in the middle of intense work on the transformer. What do you think?

Should we merge this PR?

clone_identifier_reference is good if you only need to make a single clone, and you don't know the SymbolId already.

But the danger of having this API is that it's then tempting to use it in places where it'd be better to use a more efficient method like BoundIdentifier.

Side note

The final result, in this case, is the _a at least needs to clone three times, and a need to clone once.

I don't think a should be cloned at all. It should just be moved.

@rzvxa
Copy link
Contributor

rzvxa commented Aug 14, 2024

@Dunqing Thanks for your explanation, Now it makes perfect sense to me.

I have an idea to introduce an Identifier trait that can unify these.
We can pass in an Identifier and get a BoundIdentifier from it, That way should be able to use it wherever we please.

To build upon @overlookmotel snippet it would look something like this:

// NB: `BindingIdentifier` + `IdentifierReference` type annotations are not required
// - just for clarity in this example

// where we already have `_a` binding identifier as `ident`:

let a_prime: IdentifierReference = ident.scoped_ref(ctx.scoping());

// when we want multiple copies from the same identifier
let binding: BoundIdentifier = ident.scoped_bound(ctx.scoping());

let ref1: IdentifierReference = binding.create_read_reference(ctx);
let ref2: IdentifierReference = binding.create_read_reference(ctx);
let ref3: IdentifierReference = binding.create_read_reference(ctx);


// Similarly if you have access to `_a` as `IdentfierReference` we can still write the exact same code:

let a_double_prime: IdentifierReference = ident_ref.scoped_ref(ctx.scoping());

// when we want multiple copies from the same identifier
let binding: BoundIdentifier = ident_ref.scoped_bound(ctx.scoping());

let ref1: IdentifierReference = binding.create_read_reference(ctx);
let ref2: IdentifierReference = binding.create_read_reference(ctx);
let ref3: IdentifierReference = binding.create_read_reference(ctx);

I'm not sure if it is the right time to do such a thing but I can't help but feel like having an Identifier trait can simplify a lot of hassles.

@overlookmotel
Copy link
Contributor

overlookmotel commented Aug 14, 2024

@rzvxa Your proposed trait is very elegant and flexible. But... my only worry about it is that it's maybe too flexible. Same as what I said above about clone_identifier_reference method - it makes it too easy to do the wrong (expensive) thing and create a BoundIdentifier from an IdentifierReference when you already have the SymbolId.

I actually think use cases for cloning existing IdentifierReferences are quite rare. The transform we've been talking about here is quite typical:

  • The original IdentifierReference is just moved, and doesn't need to be cloned.
  • The only binding that we do need to create multiple IdentifierReferences for is a UID which we have newly created. So BoundIdentifier::new_uid covers us for this.

Here's another transform which follows same pattern (only UIDs needs cloning):

// In
const [x, y, z] = arr;

// Out
var _arr = _slicedToArray(arr, 3); // Helper call
const x = _arr[0], _arr[1], _arr[2];

Therefore, adding an API which makes it easy to do the wrong thing, and has few use cases where its the right thing may not on balance be a benefit overall.

The background context is that once we've laid the groundwork, we hope to open up the transformer to contributions (similar to the linter). People who are less familiar with Oxc will tend to just grab whatever API they find that makes the transform work, so there's an argument for limiting our API surface in order to guide people towards the "good path".

I could be wrong that there aren't many places where we'll need to clone existing IdentifierReferences. But I suggest we wait and see if we need this API, and only add it if we do.

@overlookmotel
Copy link
Contributor

overlookmotel commented Aug 14, 2024

@Dunqing Conversation here has broadened out beyond the content of this specific PR. Just to bring it back to the specific:

Do you think BoundIdentifier (especially once #4897 is merged) can cover this, so maybe we don't need clone_identifier_reference too?

Sorry I hid BoundIdentifier away in a silly place and didn't let anyone know it was there. I got lazy!

@overlookmotel
Copy link
Contributor

overlookmotel commented Aug 14, 2024

I was wrong! The exponent operator transform does requiring cloning IdentifierReferences in some cases. So we probably should add this clone_identifier_reference method for that.

Concerning the Ident trait idea: Personally I think it's better for this to be an explicitly-named method just for IdentifierReferences, rather than a cover-all trait, so potentially bad usages of this method are easier to spot.

@Dunqing Dunqing requested a review from overlookmotel August 14, 2024 16:29
@Dunqing Dunqing marked this pull request as ready for review August 14, 2024 16:30
@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label Aug 15, 2024
Copy link

graphite-app bot commented Aug 15, 2024

Merge activity

  • Aug 15, 8:02 AM EDT: The merge label 'merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Aug 15, 8:02 AM EDT: overlookmotel added this pull request to the Graphite merge queue.
  • Aug 15, 8:05 AM EDT: overlookmotel merged this pull request with the Graphite merge queue.

…rseCtx` (#4880)

related: #4804

needs from: #4876

The `clone_identifier_reference` method is used to clone an `IdentifierReference` and create a `Reference` and insert it to `SymbolTable`'s `resolved_references`.

The reason we need this is because we need to make sure that `IdentifierReference`'s `reference_id` is unique
@overlookmotel overlookmotel force-pushed the 08-13-feat_traverse_support_clone_identifier_reference_method_in_traversectx branch from ec89b5d to 72a37fc Compare August 15, 2024 12:02
@graphite-app graphite-app bot merged commit 72a37fc into main Aug 15, 2024
24 checks passed
@graphite-app graphite-app bot deleted the 08-13-feat_traverse_support_clone_identifier_reference_method_in_traversectx branch August 15, 2024 12:05
@oxc-bot oxc-bot mentioned this pull request Aug 18, 2024
Boshen added a commit that referenced this pull request Aug 18, 2024
## [0.24.3] - 2024-08-18

### Features

- d49fb16 oxc_codegen: Support generate range leading comments (#4898)
(IWANABETHATGUY)
- 80d0d1f semantic: Check for invalid interface heritage clauses (#4928)
(DonIsaac)
- 48821c0 semantic,syntax: Add SymbolFlags::ArrowFunction (#4946)
(DonIsaac)
- f1fcdde transformer: Support react fast refresh (#4587) (Dunqing)
- 0d79122 transformer: Support logical-assignment-operators plugin
(#4890) (Dunqing)
- ab1d08c transformer: Support `optional-catch-binding` plugin (#4885)
(Dunqing)
- 69da9fd transformer: Support nullish-coalescing-operator plugin
(#4884) (Dunqing)
- 3a66e58 transformer: Support exponentiation operator plugin (#4876)
(Dunqing)
- f88cbcd transformer: Add `BoundIdentifier::new_uid_in_current_scope`
method (#4903) (overlookmotel)
- 1e6d0fe transformer: Add methods to `BoundIdentifier` (#4897)
(overlookmotel)
- fd34640 traverse: Support `generate_uid_based_on_node` method in
`TraverseCtx` (#4940) (Dunqing)
- 72a37fc traverse: Support `clone_identifier_reference` method in
`TraverseCtx` (#4880) (Dunqing)

### Bug Fixes

- c0b26f4 ast: Do not include `scope_id` fields in JSON AST (#4858)
(overlookmotel)
- bbf9ec0 codegen: Add missing `declare` to `PropertyDefinition` (#4937)
(Boshen)
- f210cf7 codegen: Print `TSSatisfiesExpression` and
`TSInstantiationExpression` (#4936) (Boshen)
- 21f5762 codegen: Minify large numbers (#4889) (Boshen)
- e8de4bd codegen: Fix whitespace issue when minifying `x in new
Error()` (#4886) (Boshen)
- a226962 codegen: Print `TSNonNullExpression` (#4869) (Boshen)
- 3da33d3 codegen: Missing parenthesis for `PrivateInExpression` (#4865)
(Boshen)
- 1808529 codegen: Dedupe pure annotation comments (#4862)
(IWANABETHATGUY)
- d3bbc62 isolated-declarations: Declare modifier of PropertyDefinition
should not be retained (#4941) (Dunqing)
- 8e80f59 isolated_declarations: Class properties should still be lifted
from private constructors (#4934) (michaelm)
- b3ec9e5 isolated_declarations: Always emit module declarations that
perform augmentation (#4919) (michaelm)
- 0fb0b71 isolated_declarations: Always emit module declarations (#4911)
(michaelm)
- 4a16916 isolated_declarations: Support expando functions (#4910)
(michaelm)
- 508644a linter/tree-shaking: Correct the calculation of `>>`, `<<` and
`>>>` (#4932) (mysteryven)
- 46cb1c1 minifier: Handle `Object.definedPropert(exports` for
@babel/types/lib/index.js (#4933) (Boshen)
- 81fd637 minifier: Do not fold `0 && (module.exports = {})` for
`cjs-module-lexer` (#4878) (Boshen)
- 879a271 minifier: Do not join `require` calls for `cjs-module-lexer`
(#4875) (Boshen)
- 1bdde2c parser: Detect @flow in `/** @flow */ comment (#4861) (Boshen)
- 2476dce transformer: Remove an `ast.copy` from
`NullishCoalescingOperator` transform (#4913) (overlookmotel)
- 248a757 transformer/typescript: Typescript syntax within
`SimpleAssignmentTarget` with `MemberExpressions` is not stripped
(#4920) (Dunqing)

### Documentation

- 47c9552 ast, ast_macros, ast_tools: Better documentation for `Ast`
helper attributes. (#4856) (rzvxa)
- 0a01a47 semantic: Improve documentation (#4850) (DonIsaac)
- 9c700ed transformer: Add README including style guide (#4899)
(overlookmotel)

### Refactor

- a6967b3 allocator: Correct code comment (#4904) (overlookmotel)
- 90d0b2b allocator, ast, span, ast_tools: Use `allocator` as var name
for `Allocator` (#4900) (overlookmotel)
- 1eb59d2 ast, isolated_declarations, transformer: Mark
`AstBuilder::copy` as an unsafe function (#4907) (overlookmotel)
- 8e8fcd0 ast_tools: Rename `oxc_ast_codegen` to `oxc_ast_tools`.
(#4846) (rzvxa)
- 786bf07 index: Shorten code and correct comment (#4905)
(overlookmotel)
- ea1e64a semantic: Make SemanticBuilder opaque (#4851) (DonIsaac)
- 5fd1701 sourcemap: Lower the `msrv`. (#4873) (rzvxa)
- 48a1c32 syntax: Inline trivial bitflags methods (#4877)
(overlookmotel)
- 452187a transformer: Rename `BoundIdentifier::new_uid_in_root_scope`
(#4902) (overlookmotel)
- 707a01f transformer: Re-order `BoundIdentifier` methods (#4896)
(overlookmotel)
- 117dff2 transformer: Improve comments for `BoundIdentifier` helper
(#4895) (overlookmotel)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0-merge Merge with Graphite Merge Queue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants