Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(transformer): support all /regex/ to new RegExp transforms #5387

Merged

Conversation

Dunqing
Copy link
Member

@Dunqing Dunqing commented Sep 1, 2024

related: #4754

The implementation port from esbuild. And cover all babel's regexp plugins


The following description was generated by Graphite 😋

TL;DR

Added support for transforming various RegExp features to ensure compatibility with older JavaScript environments.

What changed?

  • Implemented a new RegExp transformer to handle unsupported RegExp literal features
  • Added options to control different RegExp transformations (e.g., sticky flag, unicode flag, dot-all flag, etc.)
  • Updated the transformer to convert unsupported RegExp literals into new RegExp() constructor calls
  • Added test cases for different RegExp transformations
  • Integrated the new RegExp transformer into the existing transformation pipeline

How to test?

  1. Run the existing test suite to ensure no regressions
  2. Execute the new RegExp-specific tests in the tasks/transform_conformance/tests/esbuild-tests/test/fixtures/regexp/ directory
  3. Try transforming code with various RegExp features using different target environments to verify correct transformations

Copy link

graphite-app bot commented Sep 1, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@Dunqing Dunqing mentioned this pull request Sep 1, 2024
31 tasks
Copy link
Member Author

Dunqing commented Sep 1, 2024

@github-actions github-actions bot added the A-transformer Area - Transformer / Transpiler label Sep 1, 2024
@Dunqing Dunqing force-pushed the 09-01-feat_transformer_support_all_regexp_plugins branch from 8140457 to f4726ba Compare September 1, 2024 15:21
Copy link

codspeed-hq bot commented Sep 1, 2024

CodSpeed Performance Report

Merging #5387 will degrade performances by 13.86%

Comparing 09-01-feat_transformer_support_all_regexp_plugins (c59d8b3) with main (91b39c4)

Summary

❌ 2 regressions
✅ 27 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main 09-01-feat_transformer_support_all_regexp_plugins Change
transformer[antd.js] 27.7 ms 32.2 ms -13.86%
transformer[pdf.mjs] 10.2 ms 10.7 ms -4.5%

@Dunqing Dunqing force-pushed the 09-01-feat_transformer_support_all_regexp_plugins branch from a185cf8 to c458d27 Compare September 2, 2024 03:40
@Dunqing
Copy link
Member Author

Dunqing commented Sep 2, 2024

I saw a 2% performance regression in enabling plugins.

@Boshen Boshen changed the title feat(transformer): support all regexp plugins feat(transformer): support all /regex/ to new RegExp plugins Sep 2, 2024
@Boshen Boshen changed the title feat(transformer): support all /regex/ to new RegExp plugins feat(transformer): support all /regex/ to new RegExp transforms Sep 2, 2024
@Dunqing Dunqing marked this pull request as ready for review September 2, 2024 07:22
Copy link
Member

@Boshen Boshen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are these turned on or off from targets?

crates/oxc_transformer/src/regexp/mod.rs Outdated Show resolved Hide resolved
@Dunqing
Copy link
Member Author

Dunqing commented Sep 2, 2024

How are these turned on or off from targets?

Consistent with other plugins, automatically enable plugins below those versions(from parsed targets). I don't know how to turn off plugins that are enabled by targets. Might want to look into how Babel does this

@Dunqing Dunqing force-pushed the 09-01-feat_transformer_support_all_regexp_plugins branch 2 times, most recently from 279c0e9 to 6bc460a Compare September 2, 2024 13:47
Copy link
Member

@Boshen Boshen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, waiting for @overlookmotel to take a look.

@Dunqing
Copy link
Member Author

Dunqing commented Sep 3, 2024

I just found that RegExpPattern AST had merged into oxc. So this plugin doesn't need to roughly parse and check pattern anymore. That's so great! Thanks for your work! @rzvxa @leaysgur

@Dunqing Dunqing force-pushed the 09-01-feat_transformer_support_all_regexp_plugins branch 3 times, most recently from fa8ccfc to f8ce585 Compare September 3, 2024 14:41
crates/oxc_transformer/src/regexp/mod.rs Outdated Show resolved Hide resolved
crates/oxc_transformer/src/regexp/mod.rs Outdated Show resolved Hide resolved
crates/oxc_transformer/src/regexp/mod.rs Outdated Show resolved Hide resolved
@overlookmotel
Copy link
Contributor

overlookmotel commented Sep 3, 2024

I've pushed a few commits. As usual, feel free to revert them if you don't like them.

Concerning changing "RegExp".into() to Atom::from("RegExp"): This makes no actual difference. It's just my personal preference to be explicit, as I find it makes it easier to read the code. Where I see .into() I often find myself wondering "but into what?". Atom::from(...) removes that ambiguity.

crates/oxc_transformer/src/regexp/mod.rs Outdated Show resolved Hide resolved
@Dunqing
Copy link
Member Author

Dunqing commented Sep 5, 2024

@rzvxa Can you take a look at the transform_conformance job? There is a panic in

let ch = char::from_u32(cp).expect("Invalid `Character`!");

@Dunqing
Copy link
Member Author

Dunqing commented Sep 5, 2024

Well, parsing RegExp increases the performance regression from 6% to 13%.

@rzvxa
Copy link
Contributor

rzvxa commented Sep 5, 2024

@rzvxa Can you take a look at the transform_conformance job? There is a panic in

let ch = char::from_u32(cp).expect("Invalid `Character`!");

Yes, I'll take a look. I'm having a hunch as to what it might be.

@leaysgur
Copy link
Contributor

leaysgur commented Sep 5, 2024

@rzvxa Issue is caused by these match cases and when ch is not used inside like UnicodeEscape. (Sorry my suggested code)

let result = match this.kind {

These patterns were the issue.

if raw == r"/^|\udf06/ug"
  || raw == r"/\udf06/"
  || raw == r"/\udf06/u"
  || raw == r"/^|\udf06/g"

@rzvxa
Copy link
Contributor

rzvxa commented Sep 5, 2024

@leaysgur Yes, that was my theory as well. Thanks for getting to the bottom of this.

@Dunqing I've submitted a fix for it.

@graphite-app graphite-app bot closed this in 9b984b3 Sep 5, 2024
@Dunqing
Copy link
Member Author

Dunqing commented Sep 5, 2024

Ah, This PR was closed, and cannot reopen it 🥲

@Boshen
Copy link
Member

Boshen commented Sep 5, 2024

@graphite-app graphite-app bot closed this in 9b984b3 16 minutes ago

wtf

@Boshen
Copy link
Member

Boshen commented Sep 5, 2024

@Boshen Boshen reopened this now

It can only be reopened from graphite.

@overlookmotel overlookmotel added the 0-merge Merge with Graphite Merge Queue label Sep 5, 2024 — with Graphite App
Copy link

graphite-app bot commented Sep 5, 2024

Merge activity

@overlookmotel overlookmotel removed the 0-merge Merge with Graphite Merge Queue label Sep 5, 2024
@overlookmotel
Copy link
Contributor

I didn't mean to add the merge label! I added it to upstack PR and didn't notice it would also merge this. Am reading through the updated code now.

)

related: #4754

The implementation port from [esbuild](https://github.com/evanw/esbuild/blob/332727499e62315cff4ecaff9fa8b86336555e46/internal/js_parser/js_parser.go#L12820-L12840). And cover all babel's regexp plugins

---

## The following description was generated by `Graphite` 😋

### TL;DR

Added support for transforming various RegExp features to ensure compatibility with older JavaScript environments.

### What changed?

- Implemented a new `RegExp` transformer to handle unsupported RegExp literal features
- Added options to control different RegExp transformations (e.g., sticky flag, unicode flag, dot-all flag, etc.)
- Updated the transformer to convert unsupported RegExp literals into `new RegExp()` constructor calls
- Added test cases for different RegExp transformations
- Integrated the new RegExp transformer into the existing transformation pipeline

### How to test?

1. Run the existing test suite to ensure no regressions
2. Execute the new RegExp-specific tests in the `tasks/transform_conformance/tests/esbuild-tests/test/fixtures/regexp/` directory
3. Try transforming code with various RegExp features using different target environments to verify correct transformations
@overlookmotel overlookmotel force-pushed the 09-01-feat_transformer_support_all_regexp_plugins branch from d963a2a to c59d8b3 Compare September 5, 2024 11:04
@graphite-app graphite-app bot merged commit c59d8b3 into main Sep 5, 2024
25 checks passed
@graphite-app graphite-app bot deleted the 09-01-feat_transformer_support_all_regexp_plugins branch September 5, 2024 11:08
@overlookmotel
Copy link
Contributor

Well Graphite merged it anyway! I'll do some follow-up PRs.

@oxc-bot oxc-bot mentioned this pull request Sep 6, 2024
Boshen added a commit that referenced this pull request Sep 6, 2024
## [0.27.0] - 2024-09-06

- bd820f9 semantic: [**BREAKING**] Remove
`SymbolTable::get_symbol_id_from_name` and
`SymbolTable::get_scope_id_from_name` (#5480) (overlookmotel)

- cba93f5 ast: [**BREAKING**] Add `ThisExpression` variants to
`JSXElementName` and `JSXMemberExpressionObject` (#5466) (overlookmotel)

- 87c5df2 ast: [**BREAKING**] Rename `Expression::without_parentheses`
(#5448) (overlookmotel)

### Features

- e8bdd12 allocator: Add `AsMut` impl for `Box` (#5515) (overlookmotel)
- 90facd3 ast: Add `ContentHash` trait; remove noop `Hash`
implementation from `Span` (#5451) (rzvxa)
- 23285f4 ast: Add `ContentEq` trait. (#5427) (rzvxa)
- 59abf27 ast, parser: Add `oxc_regular_expression` types to the parser
and AST. (#5256) (rzvxa)
- 68a1c01 ast_tools: Add dedicated `Derive` trait. (#5278) (rzvxa)
- c782916 codegen: Print `type_parameters` in `TaggedTemplateExpression`
(#5438) (Dunqing)
- 4cb63fe index: Impl rayon related to trait for IndexVec (#5421)
(IWANABETHATGUY)
- ba4b68c minifier: Remove parenthesized expression for dce (#5439)
(Boshen)
- ed8ab6d oxc: Conditional expose `oxc_cfg` in `oxc` crate (#5524)
(IWANABETHATGUY)
- 91b39c4 oxc_diagnostic: Impl DerefMut for OxcDiagnostic (#5474)
(IWANABETHATGUY)
- 10279f5 parser: Add syntax error for hyphen in `JSXMemberExpression`
`<Foo.bar-baz />` (#5440) (Boshen)
- 0f50b1e semantic: Check for initializers in ambient
`VariableDeclaration`s (#5463) (DonIsaac)
- 62f7fff semantic: Check for non-declared, non-abstract object
accessors without bodies (#5461) (DonIsaac)
- 5407143 semantic: Check for non-declared, non-abstract class accessors
without bodies (#5460) (DonIsaac)
- 052e94c semantic: Check for parameter properties in constructor
overloads (#5459) (DonIsaac)
- 32d4bbb transformer: Add `TransformOptions::enable_all` method (#5495)
(Boshen)
- c59d8b3 transformer: Support all /regex/ to `new RegExp` transforms
(#5387) (Dunqing)
- cedf7a4 xtask: Impl `as_ast_kind` method for each variant (#5491)
(IWANABETHATGUY)

### Bug Fixes

- 0df1d9d ast, codegen, linter: Panics in fixers. (#5431) (rzvxa)
- fce549e diagnostics: Ignore `Interrupted` and `BrokenPipe` errors
while printing (#5526) (Boshen)
- ea7a52f napi/transform: Fix test (Boshen)
- 9b984b3 regex: Panic on displaying surrogated `UnicodeEscape`
characters. (#5469) (rzvxa)
- 88b7ddb regular_expression: Handle unterminated character class
(#5523) (leaysgur)
- 7a797ac semantic: Incorrect reference when `MemberExpression` used in
`TSPropertySignature` (#5525) (Dunqing)
- d8b9909 semantic: `IdentifierReference` within `TSPropertySignature`
cannot reference type-only import binding (#5441) (Dunqing)
- 8f9627d transformer: RegExp transform do not transform invalid regexps
(#5494) (overlookmotel)
- 2060efc transformer: RegExp transform don't transform all RegExps
(#5486) (overlookmotel)
- cfe5497 transformer: Do not create double reference in JSX transform
(#5414) (overlookmotel)
- 0617249 transformer/nullish-coalescing-operator: Incorrect reference
flags (#5408) (Dunqing)
- 0eb32a6 traverse: Invalid variable name generated by
`generate_uid_based_on_node` (#5407) (Dunqing)- b96bea4 Add back
lifetime (#5507) (IWANABETHATGUY)

### Performance

- bfabd8f syntax: Further optimize `is_identifier_name` (#5426)
(overlookmotel)
- aeda84f syntax: Optimize `is_identifier_name` (#5425) (overlookmotel)
- ed8937e transformer: Memoize rope instance (#5518) (Dunqing)
- bfab091 transformer: Store needed options only on `RegExp` (#5484)
(overlookmotel)
- b4765af transformer: Pre-calculate if unsupported patterns in RegExp
transform (#5483) (overlookmotel)
- 182ab91 transformer: Pre-calculate unsupported flags in RegExp
transform (#5482) (overlookmotel)

### Documentation

- 64db1b4 ast: Clarify docs for `RegExpPattern` (#5497) (overlookmotel)
- 3f204a9 span: Update docs about `ContentEq` `Vec` comparison speed
(#5478) (overlookmotel)- 00511fd Use `oxc_index` instead of `index_vec`
in doc comments (#5423) (IWANABETHATGUY)

### Refactor

- 9f6e0ed ast: Simplify `ContentEq` trait definition. (#5468) (rzvxa)
- a43e951 ast: Use loop instead of recursion (#5447) (overlookmotel)
- 2224cc4 ast: Renumber `JSXMemberExpressionObject` discriminants
(#5464) (overlookmotel)
- a952c47 ast: Use loop not recursion (#5449) (overlookmotel)
- d9d7e7c ast: Remove `IdentifierName` from `TSThisParameter` (#5327)
(overlookmotel)
- ccc8a27 ast, ast_tools: Use full method path for generated derives
trait calls. (#5462) (rzvxa)
- fdb8857 linter: Use "parsed pattern" in `no_div_regex` rule. (#5417)
(rzvxa)
- e7bd49d regular_expression: Correct typo (#5429) (overlookmotel)
- e4ed41d semantic: Change the reference flag to `ReferenceFlags::Type`
if it is used within a `TSTypeQuery` (#5444) (Dunqing)
- 94a6ac6 span: Use `Hasher` from `std` (#5476) (overlookmotel)
- b47aca0 syntax: Use `generate_derive` for `CloneIn` in types outside
of `oxc_ast` crate. (#5280) (rzvxa)
- a96866d transformer: Re-order imports (#5499) (overlookmotel)
- 6abde0a transformer: Clarify match in RegExp transform (#5498)
(overlookmotel)
- 09c522a transformer: RegExp transform report pattern parsing errors
(#5496) (overlookmotel)
- dd19823 transformer: RegExp transform do not take ownership of
`Pattern` then reallocate it (#5492) (overlookmotel)
- 2514cc9 transformer/react: Move all entry points to implementation of
Traverse trait (#5473) (Dunqing)
- c984219 transformer/typescript: Move all entry points to
implementation of Traverse trait (#5422) (Dunqing)

### Styling

- 2a43fa4 linter: Introduce the writing style from PR #5491 and reduce
the if nesting (#5512) (dalaoshu)

### Testing

- 340b535 linter/no-unused-vars: Arrow functions in tagged templates
(#5510) (Don Isaac)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-transformer Area - Transformer / Transpiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants