Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools: add ASCII only lint rule in lib/ #11371

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions lib/.eslintrc.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,4 @@ rules:
require-buffer: 2
buffer-constructor: 2
no-let-in-for-declaration: 2
only-ascii-characters: 2
4 changes: 2 additions & 2 deletions lib/console.js
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ function createWriteErrorHandler(stream) {
// If there was an error, it will be emitted on `stream` as
// an `error` event. Adding a `once` listener will keep that error
// from becoming an uncaught exception, but since the handler is
// removed after the event, non-console.* writes wont be affected.
// removed after the event, non-console.* writes won't be affected.
stream.once('error', noop);
}
};
Expand All @@ -90,7 +90,7 @@ function write(ignoreErrors, stream, string, errorhandler) {

stream.write(string, errorhandler);
} catch (e) {
// Sorry, theres no proper way to pass along the error here.
// Sorry, there's no proper way to pass along the error here.
} finally {
stream.removeListener('error', noop);
}
Expand Down
4 changes: 4 additions & 0 deletions lib/timers.js
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ const kOnTimeout = TimerWrap.kOnTimeout | 0;
const TIMEOUT_MAX = 2147483647; // 2^31-1


/* eslint-disable only-ascii-characters */

// HOW and WHY the timers implementation works the way it does.
//
// Timers are crucial to Node.js. Internally, any TCP I/O connection creates a
Expand Down Expand Up @@ -105,6 +107,8 @@ const TIMEOUT_MAX = 2147483647; // 2^31-1
// However, these operations combined have shown to be trivial in comparison to
// other alternative timers architectures.

/* eslint-enable only-ascii-characters */


// Object maps containing linked lists of timers, keyed and sorted by their
// duration in milliseconds.
Expand Down
65 changes: 65 additions & 0 deletions tools/eslint-rules/only-ascii-characters.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
/**
* @fileoverview Prohibit the use of non-ascii characters
* @author Kalon Hinds
*/

/* eslint no-control-regex:0 */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer see eslint-disable-line or eslint-disable-next-line to target the places where the control characters are needed.


'use strict';

//------------------------------------------------------------------------------
// Rule Definition
//------------------------------------------------------------------------------

const nonAsciiPattern = new RegExp('([^\x00-\x7F])', 'g');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be clearer to use a regex literal here instead of the RegExp constructor. Right now, \x00 and \x7F are interpreted as part of the string, so the resulting regex pattern actually contains a null character. This still works fine, but it could be confusing for debugging (e.g. if the regex is printed, it will be difficult to tell that it contains a null character).

const suggestions = {
'’': '\'',
'—': '-'
};
const reportError = ({line, column, character}, node, context) => {
const suggestion = suggestions[character];

let message = `Non-ASCII character ${character} detected.`;

message = suggestion ?
`${message} Consider replacing with: ${suggestion}` : message;

context.report({
node,
message,
loc: {
line,
column
}
});
};

module.exports = {
create: (context) => {
return {
Program: (node) => {
const source = context.getSourceCode();
const sourceTokens = source.getTokens(node);
const commentTokens = source.getAllComments();
const tokens = sourceTokens.concat(commentTokens);

tokens.forEach((token) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail to match non-ascii whitespace that could appear between tokens, so it's not quite disallowing all non-ascii characters in files. This could be fixed by matching the regex against source.text rather than against each token.

That said, I think the no-irregular-whitespace rule will cover non-ascii whitespace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the latter is true, it would be good to have a comment explaining that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment alone would not be sufficient, it is important that the linter ensures there are no irregular Unicode whitespace characters since they cannot be seen during code review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aqrln by:

If the latter is true

I mean that if:

I think the no-irregular-whitespace rule will cover non-ascii whitespace.

@not-an-aardvark's theory is correct, and the non-ASCII whitespace characters are already covered in a separate rule, then we could just use that for whitespace, and add a comment in here to explain that we don't need to worry about whitespace as it's covered in another rule.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gibfahn ah, I see, sorry. I didn't pay enough attention to that "if the latter" part so I didn't understand you right.

const { value } = token;
const matches = value.match(nonAsciiPattern);

if (!matches) return;

const { loc } = token;
const character = matches[0];
const column = loc.start.column + value.indexOf(character);

reportError({
line: loc.start.line,
column,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could result in an invalid report location if the offending character is in a block comment. For example:

/* foo
■ */

The rule reports an error for this comment at line 1, column 7, but that location doesn't actually exist.

character
}, node, context);
});
}
};
}
};