Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Improve #error and #warning tokenization #251

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

fnadeau
Copy link

@fnadeau fnadeau commented Oct 31, 2017

Fix highlight of error directive

Requirements

  • Filling out the template is required. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion.
  • All new code requires tests to ensure against regressions

Description of the Change

This treats text following warning and error as plain text with no interpretation.

There is one flaw with #233:
From: https://gcc.gnu.org/onlinedocs/cpp/Diagnostics.html
'The line must consist of complete tokens. It is wisest to make the argument of these directives be a single string constant; this avoids problems with apostrophes and the like.'

Unmatch token will generate compilation warning:

fnadeau@foobar:~/github$ cat issue_233.c 
#warning "This is a warning"
#warning Well, you don't actually need quote
#warning "This is a multiple \ /* asdg */ \
int allo = 14;"

#error doesn't 

#error need "fixing

int j;

int main (void) {

  return 0;
}
fnadeau@foobar:~/github$ gcc issue_233.c -o issue_233
issue_233.c:1:2: warning: #warning "This is a warning" [-Wcpp]
 #warning "This is a warning"
  ^
issue_233.c:2:23: warning: missing terminating ' character [enabled by default]
 #warning Well, you don't actually need quote
                       ^
issue_233.c:2:2: warning: #warning Well, you don't actually need quote [-Wcpp]
 #warning Well, you don't actually need quote
  ^
issue_233.c:3:2: warning: #warning "This is a multiple \ /* asdg */ int allo = 14;" [-Wcpp]
 #warning "This is a multiple \ /* asdg */ \
  ^
issue_233.c:6:13: warning: missing terminating ' character [enabled by default]
 #error doesn't 
             ^
issue_233.c:6:2: error: #error doesn't 
 #error doesn't 
  ^
issue_233.c:8:13: warning: missing terminating " character [enabled by default]
 #error need "fixing
             ^
issue_233.c:8:2: error: #error need "fixing
 #error need "fixing
  ^

GCC does threat comments in a warning/error as the message and not has comment.

issue_233

Alternate Designs

None

Benefits

Fix highlight issue when unmatch token are in a error/warning comment

Possible Drawbacks

None known.

Applicable Issues

Fixes #233

Frédéric Nadeau, ing added 2 commits October 31, 2017 17:04
Fix highlight of error directive
Fix issue with spec.coffee failing. Message were not
properly tagged.
@fnadeau
Copy link
Author

fnadeau commented Nov 6, 2017

error/warning message are now properly scoped.

After and before screen shot
screenshot from 2017-11-06 13 25 03

fnadeau@foobar:~/github$ cat issue_233.c 
#warning "This is a warning"
#warning Well, you don't actually need quote, but it's highly recommended

#warning "This is a multiple \
int allo = 14;"

#warning "This is a different multiple" \
"int allo = 14;"

#warning "This is a double quoted multiple line with stupid stuff \ /* asdg */ \
int allo = 14;"

#warning 'This is a single quoted multiple line with stupid stuff \ /* asdg */ \
int allo = 14;'

#warning This is a unquoted quoted multiple line with stupid stuff \ /* asdg */ \
int allo = 14;

#error doesn't
#error need "fixing

#warning "Warning message are not placeholder %d since they are not processed"
#warning "Escape char such as \ are legal in a warning message"

#error An unquoted multiple \
lines error message

#error Unquoted message, /* yet another one */ but different

#error "Comment is not part of the error message." /* comment */


int main (void) {

  return 0;
}
fnadeau@foobar:~/github$ gcc issue_233.c -o issue_233 2>&1  | grep "^issue_233"
issue_233.c:1:2: warning: #warning "This is a warning" [-Wcpp]
issue_233.c:2:2: warning: #warning Well, you don't actually need quote, but it's highly recommended [-Wcpp]
issue_233.c:4:2: warning: #warning "This is a multiple int allo = 14;" [-Wcpp]
issue_233.c:7:2: warning: #warning "This is a different multiple" "int allo = 14;" [-Wcpp]
issue_233.c:10:2: warning: #warning "This is a double quoted multiple line with stupid stuff \ /* asdg */ int allo = 14;" [-Wcpp]
issue_233.c:13:2: warning: #warning 'This is a single quoted multiple line with stupid stuff \ /* asdg */ int allo = 14;' [-Wcpp]
issue_233.c:16:2: warning: #warning This is a unquoted quoted multiple line with stupid stuff \ int allo = 14; [-Wcpp]
issue_233.c:19:13: warning: missing terminating ' character [enabled by default]
issue_233.c:19:2: error: #error doesn't
issue_233.c:20:13: warning: missing terminating " character [enabled by default]
issue_233.c:20:2: error: #error need "fixing
issue_233.c:22:2: warning: #warning "Warning message are not placeholder %d since they are not processed" [-Wcpp]
issue_233.c:23:2: warning: #warning "Escape char such as \ are legal in a warning message" [-Wcpp]
issue_233.c:25:2: error: #error An unquoted multiple lines error message
issue_233.c:28:2: error: #error Unquoted message, but different
issue_233.c:30:2: error: #error "Comment is not part of the error message."

@sean-mcmanus
Copy link

What's the status of this pull request? If this fixes the issue without any regressions we would like to get the fix in.

Copy link
Contributor

@50Wliu 50Wliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Left some comments, but this will need spec coverage in order to be merged.

# unquoted strings patterns for #error/warning lines (terminates at newline w/o line_continuation_character)
'begin': '[^\'"]'
'end': '(?<!\\\\)(?=\\s*\\n)'
'name': 'string.unquoted.single.c'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just be string.unquoted.c

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name chose was based on already present string.quoted.single.c at line 133. Should I still change it to string.unquoted.c?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L133 is string.quoted.single.c because it uses single quotes. As an unquoted string (by the name) has no quotes, it should just be string.unquoted.c. This is consistent with other languages across the Atom organization.

'begin': '[^\'"]'
'end': '(?<!\\\\)(?=\\s*\\n)'
'name': 'string.unquoted.single.c'
'patterns': [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that GCC emits warnings by default if you have mismatched quotes, would it make sense to match for quotes here and scope them as invalid.deprecated.mismatched-quote.c?

{
  'match': '[\'"]'
  'name': 'invalid.deprecated.mismatched-quote.c'
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried your proposal and it looks like this:
image
I'm not sure what you means, English is not my mother tongue, I might have missed something.

My understanding is that GCC is not that cleaver. There is no warning about missing/missed used of quote for this:
#warning Well, you don't actually need quote, but it's highly recommended
but there is for:
#error doesn't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 then let's not add it to avoid highlighting quotes as errors when they shouldn't be.

'1':
'name': 'keyword.control.directive.diagnostic.$3.c'
'2':
'name': 'punctuation.definition.directive.c'
'end': '(?<!\\\\)(?=\\n)|(?=//)|(?=/\\*(?!.*\\\\\\s*\\n))'
'end': '(?<!\\\\)(?=\\n)'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the lookahead for comments was removed here. Was that intentional?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short answer, yes. I see no benefit of having this, thus I removed it. I have to admit that it might not be the proper way to do things. While we have the opportunity, could you provide an example where having this makes a difference? I couldn't come up with one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I previously thought that #error test /* comment */ test would terminate the error message at the comment, but clearly based on your tests it doesn't.

'include': '#line_continuation_character'
}
{
'include': '#comments'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this will match comments inside the unquoted error message such as #error test /* comment */ test.

Copy link
Author

@fnadeau fnadeau Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, GCC output for
#error Unquoted message, /* yet another one */ but different
will be
issue_233.c:28:2: error: #error Unquoted message, but different
Comment is not actually part of the error message when unquoted.

If error message is quoted, then comment becomes part of it.

@50Wliu
Copy link
Contributor

50Wliu commented Jan 19, 2018

Thanks for the detailed responses @fnadeau 🙇. If you could just change the scope to string.unquoted.c and then add some spec coverage I'll be happy to merge! Please tell me if you need help adding the specs.

@50Wliu 50Wliu changed the title 🐛 #233 Improve #error and #warning tokenization Jan 19, 2018
@fnadeau
Copy link
Author

fnadeau commented Jan 24, 2018

Update: I'm working on this still. I found issues with error/warning message on multiple lines when updating spec file.

@ThatXliner
Copy link

ThatXliner commented Jan 20, 2021

What’s the status on this? (Jeff-hykin made that vs code thingy but I still want to use atom)

@mavenor
Copy link

mavenor commented Aug 28, 2021

TL;DR

  • Does breaking a large error message into separate lines with escaped LF/CRLF (\⏎) work fine after these patches?
  • This entire thought assumes that the GitHub web app uses the same linting/syntax-highlighting code as atom, so if it doesn't, please do call me out on my naïveté, and literally TL;DR :)

Hi! It's a bit late where I'm from rn 🥱, so I'm not in a position to process all the regex above; please bear with me.
Have (escaped) line-breaks in the error/warning messages/tokens — i.e. #error foo \⏎ bar — been taken into consideration in this PR?
Just noticed that the GitHub web app, only when viewing code, and not in the code editor, starts treating anything after such a break as live C/C++ code.

i.e., the following seems fine…

#error The quick brown fox called 'John Appleseed' jumped around and broke my code… for a while

but stuff breaks when the viewer encounters this:

#error The quick brown fox jumped, halted, caught its breath… \
and then proceeded to step all over my code. Again, for a while.

PS: I know a lot of you hardcore folks are going to be mad that I soiled the ethos of C-like coding and all by even trying to write error/warning messages that are long enough to be broken into separate lines 😅, but once I found the little glitch, I couldn't forget about it…

@jeff-hykin
Copy link

jeff-hykin commented Aug 29, 2021

@ThatXliner

What’s the status on this? (Jeff-hykin made that vs code thingy but I still want to use atom)

Atom defaults to the tree parser. If this bug still exists on the tree parser, then we should open up an issue on the c++ tree sitter grammar repo.

However, it is still possible to use the textmate parser in atom. In that case the cpp grammar I made can be ported almost effortlessly (only metadata and file name changes).

But, the most up to date version of Atom has been the atom community edition. So we should probably make a fork of this repo, then ask them to use the fork instead, because Atom team basically isn't merging anything.

@Assaf1978-boop
Copy link

Seems,cool!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

#error directive doesn't handle ' or " correctly
7 participants