Bad Token's line and column when code line is broken with backslash #14

Open
grzegorz8 opened this issue Feb 1, 2014 · 3 comments

@grzegorz8

When we define a multi-line macro, such as:

6: #define THREE 1 \
7:    + \
8:    2

we would expect token.getLine() to return 8 for the Token representing the number "2". Surprisingly, the entire #define is treated as a one-line preprocessor directive, so the result is 6.

The token list recorded for such a macro (this dump is from a function-like macro A(a, b) rather than THREE) reports line 6 for every token:

[HASH@6,0]:"#"
[IDENTIFIER@6,1]:"define"
[IDENTIFIER@6,8]:"A"
[(@6,9]:"("
[IDENTIFIER@6,10]:"a"
[,@6,11]:","
[IDENTIFIER@6,13]:"b"
[)@6,14]:")"
[IDENTIFIER@6,16]:"a"
[WHITESPACE@6,17]:"    "
[+@6,21]:"+"
[WHITESPACE@6,22]:"    "
[IDENTIFIER@6,26]:"b"
[NL@6,27]:"\n"

I'm aware that the token list in Macro is not public, but the line and column numbers should still be correct.

@shevek
Owner

shevek commented Feb 2, 2014

Mm, this is presumably due to a weirdness in the cpp spec, where backslash-newline is elided and reinserted after the line. We use JoinReader to elide the backslash-newline sequences. To fix this, we will likely have to merge JoinReader into LexerSource.

Hrrrrnnnng. OK, I accept this as a good bug, but I'll have to think about how to fix it!
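
The effect described above can be reproduced without the library itself. What follows is a minimal sketch, not the library's JoinReader: `NaiveJoinDemo`, `join`, and `lineOf` are hypothetical names, and bare `\n` line endings are assumed. It removes backslash-newline pairs before any position is computed, which is enough to lose the line information.

```java
final class NaiveJoinDemo {
    /** Removes every backslash-newline pair, as a join pass run before lexing would. */
    static String join(String source) {
        return source.replace("\\\n", "");
    }

    /** Counts the newlines before index, i.e. the 0-based line offset of that character. */
    static int lineOf(String text, int index) {
        int line = 0;
        for (int i = 0; i < index; i++) {
            if (text.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }
}
```

With the THREE macro from the report starting on line 6, counting newlines in the raw source puts "2" on line 8, while counting them in the joined text puts it on line 6, because the joined text no longer contains the two newlines.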

@grzegorz8
Author

What's more, if a string is broken by backslashes into a multi-line token, the locations of the tokens that follow it are wrong. Example:

4: char *string = "a \
5:     b \
6:     c";

The semicolon's expected location is (6, 8), but the actual location is (4, 31).
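
One way to keep locations correct across continuations is to record, for every character that survives splicing, where it came from in the original file. The sketch below illustrates that idea only. `SpliceMap` and `locate` are hypothetical names, not part of the library, and the code assumes 0-based columns and bare `\n` line endings.

```java
import java.util.ArrayList;
import java.util.List;

final class SpliceMap {
    /** The spliced text, i.e. what a lexer would see after backslash-newline removal. */
    final StringBuilder text = new StringBuilder();
    /** origin.get(i) holds {line, column} of text.charAt(i) in the original source. */
    private final List<int[]> origin = new ArrayList<>();

    SpliceMap(String source, int firstLine) {
        int line = firstLine;
        int column = 0;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            boolean splice = c == '\\'
                    && i + 1 < source.length()
                    && source.charAt(i + 1) == '\n';
            if (splice) {
                i++;            // skip the newline as well
                line++;         // later characters belong to the next physical line
                column = 0;
                continue;
            }
            text.append(c);
            origin.add(new int[] {line, column});
            if (c == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
        }
    }

    /** Maps an index in the spliced text back to its original {line, column}. */
    int[] locate(int indexInSplicedText) {
        return origin.get(indexInSplicedText);
    }
}
```

For the three-line string above, starting at line 4, `locate` on the semicolon's index in the spliced text yields line 6, whereas the spliced text on its own places the semicolon on line 4. The column depends on the exact indentation in the original file.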

@shevek
Owner

shevek commented Mar 3, 2014

You're quite right. I need to merge JoinReader into LexerSource, but how one knows whether to unget a \ is a little beyond me at this time in the morning. Suggestions taken, else I'll get there soon enough. :-) I appreciate the test cases.
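
One possible approach to the unget question, offered as a sketch rather than as the library's design: read one character of lookahead after each backslash and push back the lookahead, not the backslash, when no newline follows. `SplicingReader` and its methods are hypothetical names. The code uses the standard `java.io.PushbackReader`, assumes bare `\n` line endings, and does not handle `\r\n`.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

final class SplicingReader {
    private final PushbackReader in;
    private int line;        // position of the next raw character
    private int column;
    private int lastLine;    // position of the character most recently returned
    private int lastColumn;

    SplicingReader(Reader reader, int firstLine) {
        this.in = new PushbackReader(reader, 1);
        this.line = firstLine;
    }

    /** Returns the next character after splicing, or -1 at end of input. */
    int read() throws IOException {
        for (;;) {
            int c = in.read();
            if (c == '\\') {
                int next = in.read();
                if (next == '\n') {
                    // Backslash-newline: drop both and keep counting physical lines.
                    line++;
                    column = 0;
                    continue;
                }
                if (next != -1) {
                    in.unread(next);   // ordinary backslash: put the lookahead back
                }
            }
            if (c == -1) {
                return -1;
            }
            lastLine = line;
            lastColumn = column;
            if (c == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
            return c;
        }
    }

    /** Original line of the character most recently returned by read(). */
    int getLine() { return lastLine; }

    /** Original 0-based column of the character most recently returned by read(). */
    int getColumn() { return lastColumn; }
}
```

Because line counting and splicing happen in the same place, a lexer built on such a reader can stamp each token with the physical line of its first character. Fed the THREE macro starting at line 6, the reader reports line 8 for "2".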
