Changing root feature #304

micrenda · 2024-08-23T05:38:46Z

I would like to ask if it is possible to pass a specific target rule instead of using the main priority chain when parsing a string.

Let me clarify with an example:

Suppose I have the following rule set:

species    <- molecule ( '(' excitatopm ')' )?
molecule <- # Description of a molecule
excitation <- excitation_ele / excitation_vib / excitation_rot
excitation_ele <- # something
excitation_vib <- # something
excitation_rot <- # something

Usually, in my code, I would do something like this:

pegParser = peg::parser();
pegParser.load_grammar(s);
std::any result;
pegParser.parse("H2O(2V1)", result);

This works fine. However, in my unit tests or in other parts of the code, I might want to parse according to a specific rule. In that case, I would like to do something like this:

pegParser = peg::parser();
pegParser.load_grammar(s);
std::any result;
pegParser.parse("2V1", "excitation_vib", result);

This way, I would use excitation_vib as the root rule and expect an exception if excitation_vib does not fully consume the input.

Is this possible? With the current implementation, to achieve something like this, I would need to change the grammar by making the target rule the new root. However, I was wondering if there is a better way to do it.

micrenda · 2024-08-29T09:49:38Z

Added PR #305, which implements this feature; it may need some rework.

yhirose · 2024-08-30T03:22:37Z

@micrenda thanks for the feedback, but I don't understand the example grammar... The grammar isn't valid. ('excitatopm' is not defined, and 'excitation' is not referenced.) So pegParser.load_grammar(s); doesn't work due to the incorrect grammar. cpp-peglib doesn't allow such incorrect grammar...

micrenda · 2024-08-30T06:53:19Z

Hello

In the example I wrote, I just omitted the actual implementation because it was not important (and I also made a typo!). Let me give you a valid grammar:

species    <- molecule ( ' ' '(' excitation ')' )?
molecule <- ([A-Z] [a-z]? [0-9]?)+
excitation <- excitation_ele / excitation_vib / excitation_rot
excitation_ele <- 'A' / 'B' / 'C'
excitation_vib <- [0-9]* 'V' [0-9]+
excitation_rot <- 'J' [0-9]+

In my code, now I can do something like this:

pegParser = peg::parser();
pegParser.load_grammar(s);
std::any result;
pegParser.parse("H2O (2V1)", result);

And it will work perfectly.

However, using PR #305, it is now possible to also do this in unit tests or in other sections of the code:

pegParser = peg::parser();
pegParser.load_grammar(s);
std::any result;
pegParser.parse("2V1", result, nullptr, "excitation_vib");

For me this is a life saver :-)

yhirose · 2024-08-31T01:05:42Z

Thanks for the clear explanation. I now fully understand what you would like to do. (By the way, I put comments in your pull request to fix problems that I found, and the following sample uses the revised version.)

Unfortunately, there are some situations where the parser doesn't work properly with this approach. The %whitespace feature is one of them.

// sample.cc
#include <iostream>
#include <peglib.h>

using namespace peg;

int main(void) {
  parser parser(R"(
Start       <- A
A           <- B (',' B)*
B           <- '[one]' / '[two]'
%whitespace <- [ \t\n]*
  )");

  std::cout << std::boolalpha;

  std::cout << parser.parse("[one],[two]") << std::endl;
  std::cout << parser.parse(" [one] , [two] ") << std::endl;

  std::cout << parser.parse("[one],[two]", nullptr, "A") << std::endl;
  std::cout << parser.parse(" [one] , [two] ", nullptr, "A") << std::endl;
}

> ./sample
true
true
true
false

As you can see, %whitespace only works with Start. This is because cpp-peglib applies some special treatment only to the start rule. You can see what is added to the start rule in the perform_core function.

cpp-peglib/peglib.h

Line 3992 in 5ef7180

std::shared_ptr<Grammar> perform_core(const char *s, size_t n,

yhirose · 2024-09-01T22:25:33Z

@micrenda In #306 I made a change that allows users to specify the name of the start rule in the parser constructor and in the load_grammar method. (Unfortunately, we cannot do the same in the parse method, for the reason I explained in the comment above. But I hope this pull request satisfies your needs.)

auto grammar = R"(
  Start       <- A
  A           <- B (',' B)*
  B           <- '[one]' / '[two]'
  %whitespace <- [ \t\n]*
)";

peg::parser parser(grammar, "A"); // Start Rule is "A"

  or

peg::parser parser;
parser.load_grammar(grammar, "A"); // Start Rule is "A"

parser.parse(" [one] , [two] "); // OK

Could you take a look at it when you have time? Thanks!

micrenda pushed a commit to micrenda/cpp-peglib that referenced this issue Aug 29, 2024

Resolve yhirose#304: adding start node parameter

ea771f4

yhirose added a commit that referenced this issue Sep 1, 2024

Fix #304

5e33020

yhirose added a commit that referenced this issue Sep 2, 2024

Fix #304

6602b1d

yhirose added the enhancement label Sep 2, 2024

yhirose added a commit that referenced this issue Sep 2, 2024

Fix #304

4c30f90

yhirose closed this as completed in 79eb37c Sep 3, 2024
