Add inclusive words filter #931

Merged 1 commit on Mar 23, 2022
67 changes: 67 additions & 0 deletions docs/source/1.0/guides/model-linters.rst
@@ -128,6 +128,73 @@ Example:
]


.. _NoninclusiveTerms:

NoninclusiveTerms
=================

Validates that all text content in a model (i.e. shape names, member names,
documentation, trait values, etc.) does not contain words that perpetuate cultural
biases. This validator has a built-in set of bias terms that are commonly found
in APIs along with suggested alternatives.

Noninclusive terms are matched as case-insensitive substrings, so a match can
be preceded or followed by any number of whitespace or non-whitespace characters.

This validator has built-in mappings from noninclusive terms to suggested
alternatives. The configuration accepts additional term-to-suggestion mappings
that either extend the built-in mappings, overriding any entry with the same
term, or replace them entirely. If a match occurs and the list of suggested
alternatives is empty, the generated warning message makes no suggestion.
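
The matching rule described above can be sketched in plain Java. This is a minimal illustration rather than the validator's code: the class name `TermMatchSketch` and the method `findMatch` are hypothetical, and the use of `Locale.ROOT` is an assumption made here to keep lowercasing locale-independent.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of Smithy: illustrates case-insensitive substring matching.
final class TermMatchSketch {
    private TermMatchSketch() {}

    /**
     * Returns the slice of {@code text} that matches {@code term}, ignoring case,
     * or an empty Optional when the term does not occur in the text.
     * Only the first occurrence is reported.
     */
    static Optional<String> findMatch(String text, String term) {
        String termLower = term.toLowerCase(Locale.ROOT);
        int start = text.toLowerCase(Locale.ROOT).indexOf(termLower);
        if (start < 0) {
            return Optional.empty();
        }
        // Slice the original text so the reported match keeps the author's casing.
        return Optional.of(text.substring(start, start + termLower.length()));
    }

    public static void main(String[] args) {
        System.out.println(findMatch("MyMasterService", "master")); // Optional[Master]
        System.out.println(findMatch("BlackListThings", "blacklist")); // Optional[BlackList]
        System.out.println(findMatch("AOutput", "master")); // Optional.empty
    }
}
```

Because the match is a plain substring search, a term embedded in a longer word is also reported, which is why surrounding characters do not matter.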

Rationale
Intent doesn't always match impact. The use of noninclusive language like
"whitelist" and "blacklist" perpetuates bias through past association of
acceptance and denial based on skin color. Other words should be used that
are not only inclusive, but more clearly communicate meaning. Words like
allowList and denyList much more clearly indicate that something is
allowed or denied.

Default severity
``WARNING``

Configuration
.. list-table::
:header-rows: 1
:widths: 20 20 60

* - Property
Contributor:

Are either of these properties required? What's the default value for appendDefaults if it's not required?

Contributor Author:

Will add documentation. It is worth noting that there is a non-trivial, perhaps counterintuitive, default behavior:

Although appendDefaults defaults to false, if the noninclusiveTerms mapping is entirely unset or empty, appendDefaults behaves as if it were true: the built-in mappings are present. noninclusiveTerms has to be non-empty before the appendDefaults behavior applies. If this behavior is acceptable, I'll focus on documenting it clearly and concisely. If not, I should change the implementation.

Contributor Author:

The property structure has changed a bit. The current properties are now documented as required or not, along with their default values.

Contributor (@kstich, Mar 15, 2022):

Is the reasoning for this structure change that you weren't happy with the behavior? I feel like a terms map and an includeDefaults-like boolean that defaults to true would be pretty clear and usable for customers.

- Type
- Description
* - terms
- { ``keyword`` -> [ ``alternatives`` ] }
- A map of noninclusive terms to suggested alternatives that either extends
the built-in mappings (overriding entries with the same term) or replaces
them. This property is required only when ``excludeDefaults`` is true. The
default value is an empty map.
* - excludeDefaults
- ``boolean``
- A flag indicating whether the mappings specified by ``terms`` replace the
built-in mappings (true) or are added to them (false). This property is
not required and defaults to false.

Example:

.. code-block:: smithy

$version: "1.0"

metadata validators = [{
name: "NoninclusiveTerms"
configuration: {
excludeDefaults: false,
terms: {
mankind: ["humankind"],
mailman: ["mail carrier", "postal worker"]
}
}
}]
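
How ``terms`` and ``excludeDefaults`` combine can be sketched as follows. This is an illustrative stand-alone example under stated assumptions, not the validator itself: `TermsConfigSketch`, `builtIn`, and `resolve` are hypothetical names, and the built-in map here is a shortened stand-in.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of how the two configuration properties interact.
final class TermsConfigSketch {
    private TermsConfigSketch() {}

    // Shortened stand-in for the validator's built-in mappings.
    static Map<String, List<String>> builtIn() {
        Map<String, List<String>> defaults = new HashMap<>();
        defaults.put("blacklist", Collections.singletonList("denyList"));
        defaults.put("whitelist", Collections.singletonList("allowList"));
        return defaults;
    }

    static Map<String, List<String>> resolve(Map<String, List<String>> terms, boolean excludeDefaults) {
        if (excludeDefaults) {
            // Replacing the defaults with nothing would leave no terms to check.
            if (terms.isEmpty()) {
                throw new IllegalArgumentException(
                        "'terms' must be non-empty when 'excludeDefaults' is true");
            }
            return Collections.unmodifiableMap(new HashMap<>(terms));
        }
        // Start from the defaults; custom entries win when the same term appears in both.
        Map<String, List<String>> merged = builtIn();
        merged.putAll(terms);
        return Collections.unmodifiableMap(merged);
    }

    public static void main(String[] args) {
        Map<String, List<String>> custom = new HashMap<>();
        custom.put("mankind", Collections.singletonList("humankind"));
        System.out.println(resolve(custom, false).size()); // 3: two defaults plus one custom term
        System.out.println(resolve(custom, true).size());  // 1: only the custom term
    }
}
```

With ``excludeDefaults`` left at false, the example configuration above therefore adds ``mankind`` and ``mailman`` to the built-in terms instead of replacing them.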


.. _ReservedWords:

ReservedWords
@@ -0,0 +1,200 @@
/*
* Copyright 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://aws.amazon.com/apache2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*/

package software.amazon.smithy.linters;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import software.amazon.smithy.model.Model;
import software.amazon.smithy.model.SourceLocation;
import software.amazon.smithy.model.knowledge.TextIndex;
import software.amazon.smithy.model.knowledge.TextInstance;
import software.amazon.smithy.model.node.NodeMapper;
import software.amazon.smithy.model.traits.Trait;
import software.amazon.smithy.model.validation.AbstractValidator;
import software.amazon.smithy.model.validation.Severity;
import software.amazon.smithy.model.validation.ValidationEvent;
import software.amazon.smithy.model.validation.ValidationUtils;
import software.amazon.smithy.model.validation.ValidatorService;
import software.amazon.smithy.utils.ListUtils;
import software.amazon.smithy.utils.MapUtils;
import software.amazon.smithy.utils.StringUtils;

/**
* <p>Validates that all shape names and values do not contain non-inclusive terms.
*/
public final class NoninclusiveTermsValidator extends AbstractValidator {
static final Map<String, List<String>> BUILT_IN_NONINCLUSIVE_TERMS = MapUtils.of(
"master", ListUtils.of("primary", "parent", "main"),
"slave", ListUtils.of("secondary", "replica", "clone", "child"),
"blacklist", ListUtils.of("denyList"),
"whitelist", ListUtils.of("allowList")
);

public static final class Provider extends ValidatorService.Provider {
public Provider() {
super(NoninclusiveTermsValidator.class, node -> {
NodeMapper mapper = new NodeMapper();
return new NoninclusiveTermsValidator(
mapper.deserialize(node, NoninclusiveTermsValidator.Config.class));
});
}
}

/**
* NoninclusiveTermsValidator configuration.
*/
public static final class Config {
private Map<String, List<String>> terms = MapUtils.of();
private boolean excludeDefaults;

public Map<String, List<String>> getTerms() {
return terms;
}

public void setTerms(Map<String, List<String>> terms) {
this.terms = terms;
}

public boolean getExcludeDefaults() {
return excludeDefaults;
}

public void setExcludeDefaults(boolean excludeDefaults) {
this.excludeDefaults = excludeDefaults;
}
}

private final Map<String, List<String>> termsMap;

private NoninclusiveTermsValidator(Config config) {
Map<String, List<String>> termsMapInit = new HashMap<>(BUILT_IN_NONINCLUSIVE_TERMS);
if (!config.getExcludeDefaults()) {
termsMapInit.putAll(config.getTerms());
termsMap = Collections.unmodifiableMap(termsMapInit);
} else {
if (config.getTerms().isEmpty()) {
//This configuration combination makes the validator a no-op.
throw new IllegalArgumentException("Cannot set 'excludeDefaults' to true and leave "
+ "'terms' empty or unspecified.");
}
termsMap = Collections.unmodifiableMap(config.getTerms());
}
}

/**
* Runs a full text scan on the given model and collects a validation event for each
* noninclusive term found in a text instance.
*
* Namespaces are checked against a global set per model.
*
* @param model Model to validate.
* @return a list of ValidationEvents, one per noninclusive term matched in each
* TextInstance found by this traversal.
*/
@Override
public List<ValidationEvent> validate(Model model) {
TextIndex textIndex = TextIndex.of(model);
List<ValidationEvent> validationEvents = new ArrayList<>();
for (TextInstance text : textIndex.getTextInstances()) {
validationEvents.addAll(getValidationEvents(text));
}
return validationEvents;
}

/**
* Generates zero or more {@link ValidationEvent}s and returns them in a collection.
*
* @param instance text instance found in the body of the model
* @return the events for every configured term that occurs in the text
*/
private Collection<ValidationEvent> getValidationEvents(TextInstance instance) {
final Collection<ValidationEvent> events = new ArrayList<>();
for (Map.Entry<String, List<String>> termEntry : termsMap.entrySet()) {
final String termLower = termEntry.getKey().toLowerCase();
final int startIndex = instance.getText().toLowerCase().indexOf(termLower);
if (startIndex != -1) {
final String matchedText = instance.getText().substring(startIndex, startIndex + termLower.length());
switch (instance.getLocationType()) {
case NAMESPACE:
//Cannot use any warning() overloads because there is no shape associated with the event.
events.add(ValidationEvent.builder()
.sourceLocation(SourceLocation.none())
.id(this.getClass().getSimpleName().replaceFirst("Validator$", ""))
.severity(Severity.WARNING)
.message(formatNonInclusiveTermsValidationMessage(termEntry, matchedText, instance))
.build());
break;
case APPLIED_TRAIT:
events.add(warning(instance.getShape(),
instance.getTrait().getSourceLocation(),
formatNonInclusiveTermsValidationMessage(termEntry, matchedText, instance)));
break;
case SHAPE:
default:
events.add(warning(instance.getShape(),
instance.getShape().getSourceLocation(),
formatNonInclusiveTermsValidationMessage(termEntry, matchedText, instance)));
}
}
}
return events;
}

private static String formatNonInclusiveTermsValidationMessage(
Map.Entry<String, List<String>> termEntry,
String matchedText,
TextInstance instance
) {
final List<String> caseCorrectedEntryValue = termEntry.getValue().stream()
.map(replacement -> Character.isUpperCase(matchedText.charAt(0))
? StringUtils.capitalize(replacement)
: StringUtils.uncapitalize(replacement))
.collect(Collectors.toList());
String replacementAddendum = !termEntry.getValue().isEmpty()
? String.format(" Consider using one of the following terms instead: %s",
ValidationUtils.tickedList(caseCorrectedEntryValue))
: "";
switch (instance.getLocationType()) {
case SHAPE:
return String.format("%s shape uses a non-inclusive term `%s`.%s",
StringUtils.capitalize(instance.getShape().getType().toString()),
matchedText, replacementAddendum);
case NAMESPACE:
return String.format("%s namespace uses a non-inclusive term `%s`.%s",
instance.getText(), matchedText, replacementAddendum);
case APPLIED_TRAIT:
if (instance.getTraitPropertyPath().isEmpty()) {
return String.format("'%s' trait has a value that contains a non-inclusive term `%s`.%s",
Trait.getIdiomaticTraitName(instance.getTrait()), matchedText,
replacementAddendum);
} else {
String valuePropertyPathFormatted = formatPropertyPath(instance.getTraitPropertyPath());
return String.format("'%s' trait value at path {%s} contains a non-inclusive term `%s`.%s",
Trait.getIdiomaticTraitName(instance.getTrait()), valuePropertyPathFormatted,
matchedText, replacementAddendum);
}
default:
throw new IllegalStateException();
}
}

private static String formatPropertyPath(List<String> traitPropertyPath) {
return String.join("/", traitPropertyPath);
}
}
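
The case correction applied to suggestions in `formatNonInclusiveTermsValidationMessage` can be illustrated without the Smithy utilities. The sketch below is an approximation under stated assumptions: `SuggestionCaseSketch` and `matchCase` are hypothetical names, and plain `Character` methods stand in for `StringUtils.capitalize` and `StringUtils.uncapitalize`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: make each suggestion start with the same case as the matched text.
final class SuggestionCaseSketch {
    private SuggestionCaseSketch() {}

    static String matchCase(String matchedText, String replacement) {
        if (matchedText.isEmpty() || replacement.isEmpty()) {
            return replacement;
        }
        char first = replacement.charAt(0);
        // Only the first character is adjusted; the rest of the suggestion is left as-is.
        char adjusted = Character.isUpperCase(matchedText.charAt(0))
                ? Character.toUpperCase(first)
                : Character.toLowerCase(first);
        return adjusted + replacement.substring(1);
    }

    static List<String> matchCase(String matchedText, List<String> replacements) {
        List<String> result = new ArrayList<>();
        for (String replacement : replacements) {
            result.add(matchCase(matchedText, replacement));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(matchCase("BlackList", Arrays.asList("denyList"))); // [DenyList]
        System.out.println(matchCase("master", Arrays.asList("Primary", "parent"))); // [primary, parent]
    }
}
```

This mirrors why a shape named `BlackListThings` is offered `DenyList` while a lowercase match would be offered `denyList`.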
@@ -1,5 +1,6 @@
software.amazon.smithy.linters.AbbreviationNameValidator$Provider
software.amazon.smithy.linters.CamelCaseValidator$Provider
software.amazon.smithy.linters.NoninclusiveTermsValidator$Provider
software.amazon.smithy.linters.InputOutputStructureReuseValidator$Provider
software.amazon.smithy.linters.MissingPaginatedTraitValidator$Provider
software.amazon.smithy.linters.RepeatedShapeNameValidator$Provider
@@ -0,0 +1,6 @@
[WARNING] -: ns.foo namespace uses a non-inclusive term `foo`. Consider using one of the following terms instead: `bar` | NoninclusiveTerms
[WARNING] ns.foo#MyMasterService: Service shape uses a non-inclusive term `Master`. Consider using one of the following terms instead: `Main`, `Parent`, `Primary` | NoninclusiveTerms
[WARNING] ns.foo#BlackListThings: Operation shape uses a non-inclusive term `BlackList`. Consider using one of the following terms instead: `DenyList` | NoninclusiveTerms
[WARNING] ns.foo#AInput$foo: Member shape uses a non-inclusive term `foo`. Consider using one of the following terms instead: `bar` | NoninclusiveTerms
[WARNING] ns.foo#AInput$foo: 'documentation' trait has a value that contains a non-inclusive term `apple`. Consider using one of the following terms instead: `banana` | NoninclusiveTerms
[WARNING] ns.foo#BlackListThings: 'documentation' trait has a value that contains a non-inclusive term `replacement`. | NoninclusiveTerms
@@ -0,0 +1,63 @@
{
"smithy": "1.0",
"shapes": {
"ns.foo#MyMasterService": {
"type": "service",
"version": "2021-10-17",
"operations": [
{
"target": "ns.foo#A"
},
{
"target": "ns.foo#BlackListThings"
}
]
},
"ns.foo#A": {
"type": "operation",
"input": {
"target": "ns.foo#AInput"
},
"output": {
"target": "ns.foo#AOutput"
},
"traits": {
"smithy.api#readonly": {}
}
},
"ns.foo#AInput": {
"type": "structure",
"members": {
"foo": {
"target": "smithy.api#String",
"traits": {
"smithy.api#documentation": "These docs are apples!"
}
}
}
},
"ns.foo#AOutput": {
"type": "structure"
},
"ns.foo#BlackListThings": {
"type": "operation",
"traits": {
"smithy.api#documentation": "Non-inclusive word with no replacement suggestion."
}
}
},
"metadata": {
"validators": [
{
"name": "NoninclusiveTerms",
"configuration": {
"terms": {
"apple": ["banana"],
"foo": ["bar"],
"replacement": []
}
}
}
]
}
}
@@ -0,0 +1,3 @@
[WARNING] -: ns.foo namespace uses a non-inclusive term `foo`. Consider using one of the following terms instead: `bar` | NoninclusiveTerms
[WARNING] ns.foo#AInput$foo: Member shape uses a non-inclusive term `foo`. Consider using one of the following terms instead: `bar` | NoninclusiveTerms
[WARNING] ns.foo#AInput$foo: 'documentation' trait has a value that contains a non-inclusive term `apple`. Consider using one of the following terms instead: `banana` | NoninclusiveTerms