Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not working optimized Pillar parser #20

Closed
JurajKubelka opened this issue Jun 22, 2018 · 6 comments
Closed

Not working optimized Pillar parser #20

JurajKubelka opened this issue Jun 22, 2018 · 6 comments
Labels

Comments

@JurajKubelka
Copy link
Collaborator

Hi @kursjan,

we have migrated Pillar parser to use Petit parser 2. We have spotted one issue using optimize and PP2GuardVisitor is the responsible one.

Here is the example:

PR2PillarParser new parse: 'a'. "--> returns a PRDocument object"
PR2PillarParser new optimize parse: 'a'. "--> produces an error and it should not"

Steps to reproduce:

Currently, the Pillar parser is stored in Documenter repository. You can install it in Pharo 6.1 using:

Metacello new
   baseline: 'GToolkitDocumenter';
   repository: 'github://feenkcom/gtoolkit-documenter/src';
   load.

I will appreciate if you can help us with it. I do not understand how to debug PP2GuardVisitor optimizations.

@girba
Copy link
Collaborator

girba commented Jun 22, 2018

Indeed, it would be great to get support for this issue. I would also be very interested to learn how to debug such issues in the future, too :)

@kursjan
Copy link
Owner

kursjan commented Jun 22, 2018

Thanks for migrating to PP2!

I commited a fix 1, I will add some details about why/how to debug during the weekend.

@JurajKubelka
Copy link
Collaborator Author

Thank you! I can confirm that 36f878b fixes the issue.

@girba
Copy link
Collaborator

girba commented Jun 23, 2018

Thanks a lot, @kursjan! After migrating to PP2, we got a parsing speedup of 2x, which is very relevant for tooling.

pp1 := GtPillarExtendedParser new.
[
    100 timesRepeat: [
        pp1 parse:
    (IceRepository repositoriesLocation / 'feenkcom'/ 'gtoolkit-examples' / 'doc' / 'tutorial' / 'examples-tutorial.pillar') contents ] ] timeToRun
"0:00:00:05.545".


pp2 := PR2PillarParser new optimize.
[
    100 timesRepeat: [
        pp2 parse:
        (IceRepository repositoriesLocation / 'feenkcom'/ 'gtoolkit-examples' / 'doc' / 'tutorial' / 'examples-tutorial.pillar') contents ] ] timeToRun.  
 "0:00:00:02.347"

@girba girba closed this as completed Jun 23, 2018
@kursjan
Copy link
Owner

kursjan commented Jun 23, 2018

Thanks @girba for the speedup report. 2x speedup is reasonable, I will have a look where are the inefficiencies if we can implement optimizations to improve it even more.

@kursjan
Copy link
Owner

kursjan commented Jun 23, 2018

Now how to debug/fix this issue

I installed GUI component of PP2 (IMO Moose should do this by default).

I run your use-cases in the debug mode.

PR2PillarGrammar new debug: 'a'.
PR2PillarGrammar new optimize debug: 'a'. 

screenshot from 2018-06-22 22-25-40

In optimized mode, paragraph is properly parsed, but lineEnd parser fails because of a guard.
screenshot from 2018-06-22 22-25-26

Inspecting the guard shows that guard does not fail if an input continues with Character cr or Character lf (comming from newline). The problem seems end of input (#eoi asPParser) in lineEnd.
screenshot from 2018-06-23 12-52-35

The main supect is firstCharSet method of PP2EndOfInputNode. It is rather unintuitive, because #endOfInput is a virtual character, is not a character in the input stream. Thus the return value should be unknown character set instead of empty character set (naming is confusing here 👎)
screenshot from 2018-06-23 12-55-21

PP2 deals with the rest on its own.

I was for a while confused with guards as well. It is usefull to check PP2GuardVisitor.

  • It adds guards to all the children of choices (visitChoice), but not to the choice itself
  • It adds guards to delegates (visitDelegate), this is the guard that made trouble in this bug, optional and repeating nodes (visitOptional, visitPossesiveRepeating)
  • sequences (visitSequence) are visited guarding only the first node (not sure why 😮), trimmed nodes (visitTrimming) seem complicated (I don't remember what does it do 😢)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants