Bad Token's line and column when code line is broken with backslash #14

grzegorz8 · 2014-02-01T20:23:51Z

When we define a multi-line macro, such as:

6: #define THREE 1 \
7:    + \
8:    2

we would expect token.getLine() for the Token representing the number "2" to return 8. Instead, the whole definition is treated as a one-line preprocessor directive, so the result is 6.

The token list for a macro of this kind (this dump is from a macro named A rather than THREE) reports line 6 for every token:

[HASH@6,0]:"#"
[IDENTIFIER@6,1]:"define"
[IDENTIFIER@6,8]:"A"
[(@6,9]:"("
[IDENTIFIER@6,10]:"a"
[,@6,11]:","
[IDENTIFIER@6,13]:"b"
[)@6,14]:")"
[IDENTIFIER@6,16]:"a"
[WHITESPACE@6,17]:"    "
[+@6,21]:"+"
[WHITESPACE@6,22]:"    "
[IDENTIFIER@6,26]:"b"
[NL@6,27]:"\n"

I'm aware that the token list in Macro is not public, but the line and column numbers should still be correct.


shevek · 2014-02-02T20:08:51Z

Mm, this is presumably due to a quirk in the cpp spec where backslash-newline is elided and reinserted after the line. We use JoinReader to elide the backslash-newline sequences. To fix this, we will likely have to merge JoinReader into LexerSource.

Hrrrrnnnng. OK, I accept this as a good bug, but I'll have to think about how to fix it!
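
One way to picture the merge mentioned above is a character source that removes backslash-newline pairs itself, so that it keeps counting physical lines and columns while it does so. The following is only a sketch under stated assumptions: SplicingSource and locate are hypothetical names, not jcpp's JoinReader or LexerSource, and the code assumes '\n' line endings and 0-based columns.

```java
/**
 * Hypothetical sketch, not jcpp's JoinReader or LexerSource: a character
 * source that removes backslash-newline pairs itself, so it can keep
 * counting physical lines and columns while it does so.
 * Assumes '\n' line endings and 0-based columns.
 */
final class SplicingSource {
    private final String text;
    private int pos = 0;     // offset into the raw, unjoined text
    private int line;        // physical line of the next character
    private int column = 0;  // physical column of the next character

    SplicingSource(String text, int firstLine) {
        this.text = text;
        this.line = firstLine;
    }

    /** Steps over backslash-newline pairs, moving to the start of the next physical line. */
    private void skipSplices() {
        while (pos + 1 < text.length()
                && text.charAt(pos) == '\\'
                && text.charAt(pos + 1) == '\n') {
            pos += 2;
            line++;
            column = 0;
        }
    }

    int getLine() { skipSplices(); return line; }

    int getColumn() { skipSplices(); return column; }

    /** Returns the next character of the joined text, or -1 at end of input. */
    int read() {
        skipSplices();
        if (pos >= text.length()) return -1;
        char c = text.charAt(pos++);
        if (c == '\n') { line++; column = 0; } else { column++; }
        return c;
    }

    /** Returns {line, column} of the first occurrence of target, or null if absent. */
    static int[] locate(String text, int firstLine, char target) {
        SplicingSource src = new SplicingSource(text, firstLine);
        while (true) {
            int l = src.getLine();
            int c = src.getColumn();
            int ch = src.read();
            if (ch == -1) return null;
            if (ch == target) return new int[] {l, c};
        }
    }
}
```

For the THREE macro from the report, starting at line 6, locate(text, 6, '2') in this sketch yields line 8 rather than line 6, because the line counter advances at each elided backslash-newline.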

grzegorz8 · 2014-03-02T21:44:05Z

What's more, if a string is broken by backslashes into a multi-line token, the locations of the tokens that follow it are wrong. Example:

4: char *string = "a \
5:     b \
6:     c";

The semicolon's expected location is (6, 8), but the actual location is (4, 31).

shevek · 2014-03-03T19:55:06Z

You're quite right. I need to merge JoinReader into LexerSource, but how one knows whether to unget a \ is a little beyond me at this time in the morning. Suggestions taken, else I'll get there soon enough. :-) I appreciate the test cases.
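
On the question of when to unget a backslash, one possible approach is to avoid the decision: snapshot the raw offset, line and column before every read, and have unget restore the last snapshot. This is a sketch under assumptions, not a description of jcpp: RewindableSource is a hypothetical class, it assumes '\n' line endings and 0-based columns, and its history grows without bound, which a real lexer would want to cap.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch, not part of jcpp: a splice-aware source whose
 * unget() restores a saved (offset, line, column) snapshot instead of
 * pushing characters back, so it never has to decide whether a
 * backslash-newline was crossed.
 */
final class RewindableSource {
    private final String text;
    private int pos = 0;     // offset into the raw, unjoined text
    private int line;        // physical line of the next character
    private int column = 0;  // physical column of the next character
    // One snapshot per read(); unbounded here for simplicity.
    private final Deque<int[]> history = new ArrayDeque<>();

    RewindableSource(String text, int firstLine) {
        this.text = text;
        this.line = firstLine;
    }

    /** Steps over backslash-newline pairs, moving to the next physical line. */
    private void skipSplices() {
        while (pos + 1 < text.length()
                && text.charAt(pos) == '\\'
                && text.charAt(pos + 1) == '\n') {
            pos += 2;
            line++;
            column = 0;
        }
    }

    int getLine() { skipSplices(); return line; }

    int getColumn() { skipSplices(); return column; }

    /** Returns the next character of the joined text, or -1 at end of input. */
    int read() {
        skipSplices();
        history.push(new int[] {pos, line, column});
        if (pos >= text.length()) return -1;
        char c = text.charAt(pos++);
        if (c == '\n') { line++; column = 0; } else { column++; }
        return c;
    }

    /** Undoes the most recent read(); throws NoSuchElementException if there is none. */
    void unget() {
        int[] saved = history.pop();
        pos = saved[0];
        line = saved[1];
        column = saved[2];
    }
}
```

With the input a, backslash, newline, b starting at line 1, reading both characters and calling unget() twice puts the source back at line 1, column 0, and reading "a" again leaves it at line 2, column 0.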
