Go target generated code for Java.g4 grammar doesn't compile #1397

ereyes01 · 2016-11-22T23:08:29Z

Before submitting an issue to ANTLR, please check off these boxes:

[*] I am not submitting a question on how to use ANTLR; instead, go to antlr4-discussion google group or ask at stackoverflow
[*] I have done a search of the existing issues to make sure I'm not sending in a duplicate

Expected behavior

antlr4 -Dlanguage=Go Java.g4 should generate Go code that compiles. Java.g4 was taken from github.com/antlr/grammars-v4

Other info:

antlr4 command is: alias antlr4='java -Xmx500M -cp "/home/ereyes/.m2/repository/org/antlr/antlr4/4.6-SNAPSHOT/antlr4-4.6-SNAPSHOT.jar:$CLASSPATH" org.antlr.v4.Tool'

$ go version
go version go1.7.1 linux/amd64

Actual behavior

The generated code does not compile. Here's what the Go compiler says:

$ go install ./
# /home/ereyes/code/antlr4-java/parser
./java_lexer.go:726: syntax error: unexpected _input, expecting comma or )
./java_lexer.go:739: syntax error: unexpected _input, expecting comma or )

Steps to reproduce the behavior

$ cd ~/code
$ git clone https://github.com/antlr/grammars-v4
$ mkdir -p antlr4-java/src/parser
$ export GOPATH=$HOME/code/antlr4-java
$ cd antlr4-java/src/parser
$ cp ~/code/grammars-v4/java/Java.g4 ./
$ antlr4 -Dlanguage=Go Java.g4
$ go get github.com/antlr/antlr4/runtime/Go/antlr
$ go install ./

I looked at the error and attempted to fix by applying the patch below, but it still doesn't fix the problem:

diff --git a/java_lexer.go b/java_lexer.go
index b7b70c6..4b5114f 100644
--- a/java_lexer.go
+++ b/java_lexer.go
@@ -723,7 +723,7 @@ func (p *JavaLexer) JavaLetter_Sempred(localctx antlr.RuleContext, predIndex int
 			return Character.isJavaIdentifierStart(_input.LA(-1))
 
 	case 1:
-			return Character.isJavaIdentifierStart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))
+			return Character.isJavaIdentifierStart(Character.toCodePoint((char)(_input.LA(-2)), (char)(_input.LA(-1))))
 
 	default:
 		panic("No predicate with index: " + fmt.Sprint(predIndex))
@@ -736,7 +736,7 @@ func (p *JavaLexer) JavaLetterOrDigit_Sempred(localctx antlr.RuleContext, predIn
 			return Character.isJavaIdentifierPart(_input.LA(-1))
 
 	case 3:
-			return Character.isJavaIdentifierPart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))
+			return Character.isJavaIdentifierPart(Character.toCodePoint((char)(_input.LA(-2)), (char)(_input.LA(-1))))
 
 	default:
 		panic("No predicate with index: " + fmt.Sprint(predIndex))

... I tried to fix the parentheses in the cast in java_lexer.go. I also wondered if the intention was to cast to rune instead of char, which isn't a built-in type in Go. Anyways, here are the compiler errors after applying this patch:

./java_lexer.go:723: undefined: Character in Character.isJavaIdentifierStart
./java_lexer.go:723: undefined: _input in _input.LA
./java_lexer.go:726: undefined: Character in Character.isJavaIdentifierStart
./java_lexer.go:726: undefined: char
./java_lexer.go:726: undefined: _input in _input.LA
./java_lexer.go:726: undefined: char
./java_lexer.go:726: undefined: _input in _input.LA
./java_lexer.go:736: undefined: Character in Character.isJavaIdentifierPart
./java_lexer.go:736: undefined: _input
./java_parser.go:14393: no new variables on left side of :=
./java_lexer.go:736: too many errors

... and after changing char to rune:

./java_lexer.go:723: undefined: Character in Character.isJavaIdentifierStart
./java_lexer.go:723: undefined: _input in _input.LA
./java_lexer.go:726: undefined: Character in Character.isJavaIdentifierStart
./java_lexer.go:726: undefined: _input in _input.LA
./java_lexer.go:736: undefined: Character in Character.isJavaIdentifierPart
./java_lexer.go:736: undefined: _input in _input.LA
./java_lexer.go:739: undefined: Character in Character.isJavaIdentifierPart
./java_lexer.go:739: undefined: _input in _input.LA
./java_parser.go:14393: no new variables on left side of :=

The text was updated successfully, but these errors were encountered:

ereyes01 · 2016-11-22T23:16:37Z

I should also note, I have gotten JSON.g4 to work with the Go target, and have been able to play with the generated parser apis...

KvanTTT · 2016-11-23T10:22:02Z

You should remove target-specific (Java) semantic predicates from grammar such as {Character.isJavaIdentifierStart(_input.LA(-1))}? and other.

ereyes01 · 2016-11-23T22:51:37Z

I see, this makes sense. Thanks for the quick reply! I'll try this out later this evening...

ereyes01 · 2016-11-25T03:28:28Z

I removed the inline Java code from the grammar and now I'm left with:

> go install .
# parser
./java_parser.go:14393: no new variables on left side of :=

In the generated code:

decl1>>>	_alt := p.GetInterpreter().AdaptivePredict(p.GetTokenStream(), 139, p.GetParserRuleContext())

		for _alt != 2 && _alt != antlr.ATNInvalidAltNumber {
	...
		}
		p.SetState(1232)
		p.GetErrorHandler().Sync(p)
decl2>>>	_alt := p.GetInterpreter().AdaptivePredict(p.GetTokenStream(), 140, p.GetParserRuleContext())

The code above was inside the parser for this rule in Java.g4:

arrayCreatorRest
    :   '['
        (   ']' ('[' ']')* arrayInitializer
        |   expression ']' ('[' expression ']')* ('[' ']')*
        )
    ;

Not sure if this is the same problem encountered in #1401 ?

parrt · 2016-11-28T18:35:47Z

@pboyer for you, sir!

pboyer · 2016-11-30T03:58:51Z

Thank you @ereyes01! I'll take a look at this tomorrow.

pboyer · 2016-11-30T04:23:24Z

@ereyes01 Would you mind including your modified Java.g4 grammar as well?

ereyes01 · 2016-11-30T04:32:04Z

Here's the patch I applied to Java.g4:

diff --git a/java/parser/Java.g4 b/java/parser/Java.g4
index 8743761..986dee6 100644
--- a/java/parser/Java.g4
+++ b/java/parser/Java.g4
@@ -980,10 +980,8 @@ JavaLetter
     :   [a-zA-Z$_] // these are the "java letters" below 0x7F
     |   // covers all characters above 0x7F which are not a surrogate
         ~[\u0000-\u007F\uD800-\uDBFF]
-        {Character.isJavaIdentifierStart(_input.LA(-1))}?
     |   // covers UTF-16 surrogate pairs encodings for U+10000 to U+10FFFF
         [\uD800-\uDBFF] [\uDC00-\uDFFF]
-        {Character.isJavaIdentifierStart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))}?
     ;
 
 fragment
@@ -991,10 +989,8 @@ JavaLetterOrDigit
     :   [a-zA-Z0-9$_] // these are the "java letters or digits" below 0x7F
     |   // covers all characters above 0x7F which are not a surrogate
         ~[\u0000-\u007F\uD800-\uDBFF]
-        {Character.isJavaIdentifierPart(_input.LA(-1))}?
     |   // covers UTF-16 surrogate pairs encodings for U+10000 to U+10FFFF
         [\uD800-\uDBFF] [\uDC00-\uDFFF]
-        {Character.isJavaIdentifierPart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))}?
     ;
 
 //

Thanks!

pboyer · 2016-11-30T21:02:03Z

@ereyes01 If you get the chance, do you think you can try out this fix? pboyer#90

If not, I should be able to get to it tomorrow.

ereyes01 · 2016-12-01T01:53:25Z

@pboyer it works 👍 thanks!

pboyer · 2016-12-01T03:34:33Z

Thanks so much for checking, @ereyes01!

Fix for #1397

parrt · 2016-12-01T18:49:11Z

Fixed by #1436

BurtHarris · 2017-06-18T00:03:13Z

I believe this still isn't good. It's using a predicate when a simple character class will do: according to the Java language specification there's no need for anything but ASCII letters and numbers.. See PR antlr/grammars-v4#762.

cowang · 2017-06-20T03:35:02Z

@BurtHarris No. Identifiers may be written using any of the world's languages. For more than you ever wanted to know, see my comments elsewhere.

Nhoya · 2019-02-22T10:22:31Z

This issue seems still there.. I'm using ANTLR 4.7.2 with go

go version
go1.11.5 windows/amd64

I generated the code with

java -jar antlr-4.7.2-complete.jar -Dlanguage=Go parser .\Java8.g4

The errors are in java8_lexer.go

syntax error: unexpected _input, expecting comma or ) [756:86]
syntax error: unexpected _input, expecting comma or ) [769:70]

Snippet:

L 756

return Character.isJavaIdentifierStart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))

L 769

return Character.isJavaIdentifierPart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))

KvanTTT · 2019-02-22T10:25:15Z

Try to use Java.g4 grammar instead of Java8.g4. It does not contain actions, semantic predicates and it's suitable for 8 version.

Nhoya · 2019-02-22T11:31:08Z

Hi @KvanTTT, thanks for the reply

Unfortunately Java.g4 is not available in https://github.com/antlr/grammars-v4/tree/master/java

KvanTTT · 2019-02-22T13:01:08Z

I mean the ordinary java grammar that consists of JavaLexer.g4 and JavaParser.g4.

Nhoya · 2019-02-22T13:49:54Z

I'm trying to build a Java.g4 from Lexer and Parser.. Apparently this is not well documented in the README (please correct me if I'm wrong) and the only valid reference seems this one #638 but is really messy... I tried with

java -jar antlr-4.7.2-complete.jar .\grammars-v4\java\JavaLexer.g4 .\grammars-v4\java\JavaLexer.g4 -o .

but without any output... Any hint about it?

EDIT:
solved with:

java -jar antlr-4.7.2-complete.jar -Dlanguage=Go -lib .\grammars-v4\java\*.g4 -o parser

Nhoya · 2019-02-22T14:19:42Z

Unfortunately the command ahead only generates

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----        2/22/2019     15:17          52118 JavaParser.interp
-a----        2/22/2019     15:17           2344 JavaParser.tokens
-a----        2/22/2019     15:17          36037 javaparser_base_listener.go
-a----        2/22/2019     15:17          28371 javaparser_listener.go
-a----        2/22/2019     15:17         511445 java_parser.go

As you can see java_lexer.go is missing @KvanTTT

KvanTTT · 2019-02-22T14:38:14Z

You should generate lexer at first.

Nhoya · 2019-02-22T15:04:05Z

If I use

java -jar 'C:\Javalib\antlr-4.7.2-complete.jar' -lib .\JavaLexer.g4 .\JavaParser.g4 -o build

I expect to see both the java_lexer.go and java_parser.go to be generated in the build directory. If this is not the case I'm missing something. Can you please link me the part of the documentation I'm skipping?

pboyer added a commit to pboyer/antlr4 that referenced this issue Nov 30, 2016

Potential fix for antlr/antlr4 antlr#1397

39c77ca

pboyer mentioned this issue Nov 30, 2016

Potential fix for antlr/antlr4 #1397 pboyer/antlr4#90

Merged

pboyer mentioned this issue Dec 1, 2016

Fix for #1397 #1436

Merged

parrt added a commit that referenced this issue Dec 1, 2016

Merge pull request #1436 from pboyer/patch-5

3ccb827

Fix for #1397

parrt closed this as completed Dec 1, 2016

parrt added target:go type:bug labels Dec 1, 2016

parrt added this to the 4.6 milestone Dec 1, 2016

jundl77 mentioned this issue May 27, 2017

Javascript target generated code for Java8.g4 grammar is not valid #1885

Closed

Nhoya mentioned this issue Mar 11, 2019

Unable to comple Java grammar correcty in golang #2508

Open

2 tasks

onuba mentioned this issue Jun 18, 2019

Unable to parse Java8.g4 with javascript runtime #2576

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Go target generated code for Java.g4 grammar doesn't compile #1397

Go target generated code for Java.g4 grammar doesn't compile #1397

ereyes01 commented Nov 22, 2016

ereyes01 commented Nov 22, 2016

KvanTTT commented Nov 23, 2016

ereyes01 commented Nov 23, 2016

ereyes01 commented Nov 25, 2016

parrt commented Nov 28, 2016

pboyer commented Nov 30, 2016

pboyer commented Nov 30, 2016

ereyes01 commented Nov 30, 2016

pboyer commented Nov 30, 2016

ereyes01 commented Dec 1, 2016

pboyer commented Dec 1, 2016

parrt commented Dec 1, 2016

BurtHarris commented Jun 18, 2017 •

edited

Loading

cowang commented Jun 20, 2017

Nhoya commented Feb 22, 2019 •

edited

Loading

KvanTTT commented Feb 22, 2019 •

edited

Loading

Nhoya commented Feb 22, 2019

KvanTTT commented Feb 22, 2019

Nhoya commented Feb 22, 2019 •

edited

Loading

Nhoya commented Feb 22, 2019

KvanTTT commented Feb 22, 2019

Nhoya commented Feb 22, 2019

Go target generated code for Java.g4 grammar doesn't compile #1397

Go target generated code for Java.g4 grammar doesn't compile #1397

Comments

ereyes01 commented Nov 22, 2016

Expected behavior

Actual behavior

Steps to reproduce the behavior

ereyes01 commented Nov 22, 2016

KvanTTT commented Nov 23, 2016

ereyes01 commented Nov 23, 2016

ereyes01 commented Nov 25, 2016

parrt commented Nov 28, 2016

pboyer commented Nov 30, 2016

pboyer commented Nov 30, 2016

ereyes01 commented Nov 30, 2016

pboyer commented Nov 30, 2016

ereyes01 commented Dec 1, 2016

pboyer commented Dec 1, 2016

parrt commented Dec 1, 2016

BurtHarris commented Jun 18, 2017 • edited Loading

cowang commented Jun 20, 2017

Nhoya commented Feb 22, 2019 • edited Loading

KvanTTT commented Feb 22, 2019 • edited Loading

Nhoya commented Feb 22, 2019

KvanTTT commented Feb 22, 2019

Nhoya commented Feb 22, 2019 • edited Loading

Nhoya commented Feb 22, 2019

KvanTTT commented Feb 22, 2019

Nhoya commented Feb 22, 2019

BurtHarris commented Jun 18, 2017 •

edited

Loading

Nhoya commented Feb 22, 2019 •

edited

Loading

KvanTTT commented Feb 22, 2019 •

edited

Loading

Nhoya commented Feb 22, 2019 •

edited

Loading