Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move all JS codegeneration to the generate-js pass #117

Merged
merged 9 commits into from
May 20, 2021

Conversation

Mingun
Copy link
Member

@Mingun Mingun commented Apr 27, 2021

Port of my PR to solve that problem: pegjs/pegjs#450.

Besides solving the original incapsulation problems, this PR also opens doors for the source map support (#105) by introducing place where original position information can be stored.

I'm not sure, should this be considered as breaking change or not. Strictly speaking, that it is, because it:

  • removes the consts member from the grammar node, which some third-party passes can expect (I can restore that, but if third-party passes modifies it, that modifications will not have the effect)
  • changes the name of the opcode MATCH_REGEXP to MATCH_CLASS (not strictly necessary, can be removed).

So if that changes is unacceptable now, can be held that until release 2.0.

Fixes #106

@StoneCypher
Copy link
Contributor

changes the name of the opcode MATCH_REGEXP to MATCH_CLASS (not strictly necessary, can be removed).

It seems like this would introduce confusion with regards to es6 classes?

@StoneCypher
Copy link
Contributor

removes the consts member from the grammar node, which some third-party passes can expect (I can restore that, but if third-party passes modifies it, that modifications will not have the effect)

I would like to understand what this means better, please

I assume this doesn't prevent peggy from parsing const, so, what's the goal here? Was it just previously under the wrong part of the grammar, or something like that?

@Mingun
Copy link
Member Author

Mingun commented Apr 27, 2021

I would like to understand what this means better, please

I mean, that if you wrote a plugin that attached a pass to the generate stage between generateBytecode and generateJs and that pass does something like that:

let myPass = visitor.build({
  grammar(node) {
    let consts = node.consts;// Uhh... where it now!?
    // use consts
  }
});

your plugin in a risk group.

@StoneCypher
Copy link
Contributor

There's almost no peg plugins. This is in some way a different product. This doesn't appear to interfere with the coffeescript or typescript plugins.

I'm mostly in support of this, despite that it is a breaking change, because it's a very minor breakage in a feature that's mostly unused. I think I'd be in total support of this when I knew:

  1. How hard would it be to update a plugin to your new structure?
  2. Why is it actually being done?

@Mingun
Copy link
Member Author

Mingun commented Apr 27, 2021

  1. If you not use opcodes.js and generate-bytecode.js, then nothing to do for update.
  2. Why I done that in the first time? I tried to write generator for other languages, Java, to be honest. And I realized that I need to decode JS regexps and JS code to make a port, or write my own copy of generate-bytecode.js, or not use bytecode at all. Then David create an issue where he want to remove excess evals and that change does exactly that.

@StoneCypher
Copy link
Contributor

yeah the eval is a problem, it makes a lot of build tools angry (and rightfully so)

in addition to being a security win, it'll also be a pretty big speed win, i suspect

@StoneCypher
Copy link
Contributor

i support this pr

@hildjj
Copy link
Contributor

hildjj commented Apr 28, 2021

changes the name of the opcode MATCH_REGEXP to MATCH_CLASS (not strictly necessary, can be removed).

It seems like this would introduce confusion with regards to es6 classes?

We already have a bunch of bits that refer to character classes [^abc] but it would be consistent to always say "character class" whenever we're referring to those. I'll admit to having been confused a few times looking through the code in the places where we didn't do that.

Copy link
Contributor

@hildjj hildjj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the suggested optimizations, please. They are not mandatory, but I'd like you to at least rule them out before we merge. I want to land #107 before this, and #116 after this.

lib/compiler/passes/generate-bytecode.js Show resolved Hide resolved
lib/compiler/passes/generate-bytecode.js Show resolved Hide resolved
lib/compiler/passes/generate-bytecode.js Show resolved Hide resolved
lib/compiler/passes/generate-bytecode.js Show resolved Hide resolved
lib/compiler/passes/generate-bytecode.js Show resolved Hide resolved
lib/compiler/passes/generate-js.js Outdated Show resolved Hide resolved
@hildjj
Copy link
Contributor

hildjj commented May 2, 2021

@Mingun if you rebase this, I'd be happy to merge it next.

@Mingun
Copy link
Member Author

Mingun commented May 3, 2021

I've rebased and change MATCH_CLASS to MATCH_CHAR_CLASS (I still can drop that commit completely if you wish). All other remained unchanged (justifications inline)

@Mingun Mingun force-pushed the bc-cleanup branch 3 times, most recently from 652b4e0 to 1ce2528 Compare May 4, 2021 19:41
@hildjj
Copy link
Contributor

hildjj commented May 18, 2021

Is this the next patch to land? It needs a rebase onto main at least, to resolve the conflicts, but I'll go through it with a closer eye if we're about ready.

Mingun added 6 commits May 18, 2021 22:13
…on by content types into:

* `literals` -- for literal expressions, like `'a'`
* `classes` -- for character class expressions, like `[a]`
* `expectations` -- for constants describing expected values when parse failed
* `functions` -- for constants with function code
…rating to `generateJs` pass from `generateBytecode` pass.

That allows to generate parser on other languages and more easy replace actions.
Also, this is helpful for better hiding of the peggy internals from an initializers
and functions code, which is needed for the import feature.

After this commit, a function code (semantic predicates and actions) is stored
in the `functions` field of the grammar node instead of the `consts` field.

This is also opens doors for supporting source maps, because we get a place for
the storing information about the original positions of the codes
…eneration to `generateJs` pass from `generateBytecode` pass.
…ting to `generateJs` pass from `generateBytecode` pass.
…ation to pass `generateJs` from pass `generateBytecode`.
@Mingun
Copy link
Member Author

Mingun commented May 18, 2021

Yes, it is ready for review. I've made a rebase so please review.

Copy link
Contributor

@hildjj hildjj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than finishing the backward-compat discussion for plugins, this is ready to go.

Note for myself later: the escaping routines are not new, they were just moved from a now-unnecessary separate file. I think we should discuss how we're doing regexp and string escaping in #15, since we'll need to touch this code then anyway.

@@ -4,52 +4,52 @@
const opcodes = {
// Stack Manipulation

PUSH: 0, // PUSH c
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this (as an example of several opcode changes) conflict with the typescript plugin? Is @metadevpro following the work here yet? How could they support both peggy and pegjs if they wanted?

Copy link
Member Author

@Mingun Mingun May 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will. Instead of renaming that opcode we can just leave it unused and introduce a new opcode for PUSH_EMPTY_STRING. Renaming MATCH_REGEXP also should be reverted. Also we could add an alias instead of renaming or just do nothing. Then in version 2.0 we could completely remove the unused opcode.

On the other hand, @metadevpro could create a local copy of opcodes, he anyway is not needed to use exactly the same opcodes as peggy used because they used only as a glue between generate-bytecode-ts and generate-ts passes.

Copy link
Contributor

@hildjj hildjj May 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, my suggestion from your choices above:

  • Leave PUSH as 0. Mark it as deprecated with a comment.
  • Create a new opcode for PUSH_EMPTY_STRING
  • Leave MATCH_CHAR_CLASS and MATCH_REGEXP both as opcode 20, but mark MATCH_REGEXP as deprecated with a comment.

I think this will allow plugins to support both peggy and pegjs at the same time adequately. If you expect that plugins would need to take different actions for MATCH_CHAR_CLASS and MATCH_REGEXP, then don't alias them together, and instead leave MATCH_REGEXP as opcode 20, create a new opcode for MATCH_CHAR_CLASS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All done as you said. I've make a rebase , changed only the last and the pre-pre-last commits. MATCH_REGEXP and PUSH marked @deprecated, noted this in the CHANGELOG.md

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Mingun and others added 2 commits May 21, 2021 00:26
…GEXP` -> `MATCH_CHAR_CLASS`.

New name reflects the opcode purpose, not implementation.
…h PUSH_EMPTY_STRING opcode because it only used with empty strings
@hildjj hildjj merged commit be21fa2 into peggyjs:main May 20, 2021
@Mingun Mingun deleted the bc-cleanup branch May 21, 2021 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can we somehow avoid eval?
3 participants