Rewrite JSON schema conversion
This commit significantly cleans up the JSON schema conversion and
automatically inlines primitive references rather than inlining them
through a mapper.

The previous JSON schema implementation code had several issues:

1. It did not do a good job at handling shape ID conflicts when using
   namespace stripping. We had to add some pretty bad hacks to achieve
   this. For example, it had implicit state that was tricky to handle
   (like temporarily setting a ref strategy based on a converted shape
   index).
2. It didn't detect errors early in the process, resulting in strange
   errors when you try to use the schema.
3. It exposed too much public API (for example RefStrategy should not be
   public). Ideally with this trimmed down API surface area, we won't
   need another breaking change.
4. JSON schema names by default should not include a namespace.
5. Simple shapes by default should always be inlined. Things like list
   and set shapes aren't that important for generating good JSON
   schema or OpenAPI schemas. By inlining them, we also ensure that
   any member documentation attached to members that target list or
   set shapes isn't lost since that documentation comes from either
   the member or the targeted shape. This also reduces the possibility
   for naming conflicts when dropping the namespace from the Smithy
   shape ID and converting it to JSON Schema.
6. We were dropping member traits in some scenarios like
   documentation, pattern, range, length. This is now fixed.
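
The inlining rule described in item 5 can be sketched as a small decision function. This is only an illustration under simplifying assumptions: the `Kind` enum and the `InliningSketch` class are hypothetical stand-ins for Smithy's shape types, and the real decision is made by the ref strategy against a `Model`.

```java
final class InliningSketch {
    // Hypothetical, reduced set of shape kinds used only for this sketch.
    enum Kind { STRING, INTEGER, LIST, SET, MAP, STRUCTURE, UNION }

    // Returns true when a shape of this kind would be inlined into its
    // container instead of being emitted as a named top-level definition.
    static boolean isInlined(Kind kind, boolean hasEnumTrait) {
        switch (kind) {
            case LIST:
            case SET:
                // Collections are always inlined so member traits survive.
                return true;
            case MAP:
            case STRUCTURE:
            case UNION:
                // These are referenced with a $ref to a named definition.
                return false;
            default:
                // Simple types are inlined unless they carry the enum trait.
                return !hasEnumTrait;
        }
    }

    public static void main(String[] args) {
        System.out.println(isInlined(Kind.LIST, false));
        System.out.println(isInlined(Kind.STRING, true));
    }
}
```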

Because converting shape IDs to JSON pointers can now result in a nested
JSON pointer, the ability to select schemas from a SchemaDocument using a
JSON pointer has been implemented.
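
As a rough illustration of what selecting a schema by a nested JSON pointer involves, here is a minimal sketch that walks nested `Map` values. It is not the `SchemaDocument` implementation: the class `JsonPointerSketch`, its `select` method, and the sample document are all hypothetical, and array indexing is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class JsonPointerSketch {
    // Resolves a pointer such as "#/definitions/Foo/properties/bar"
    // against a document made of nested maps.
    static Optional<Object> select(Map<String, Object> document, String pointer) {
        String path = pointer.startsWith("#") ? pointer.substring(1) : pointer;
        if (path.isEmpty()) {
            return Optional.of(document);
        }
        if (!path.startsWith("/")) {
            return Optional.empty();
        }
        Object current = document;
        for (String rawSegment : path.substring(1).split("/", -1)) {
            if (!(current instanceof Map)) {
                return Optional.empty();
            }
            // RFC 6901 unescaping: "~1" becomes "/" first, then "~0" becomes "~".
            String segment = rawSegment.replace("~1", "/").replace("~0", "~");
            current = ((Map<?, ?>) current).get(segment);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current);
    }

    // Builds a tiny document with one definition that has one property.
    static Map<String, Object> sampleDocument() {
        Map<String, Object> bar = new LinkedHashMap<>();
        bar.put("type", "string");
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("bar", bar);
        Map<String, Object> foo = new LinkedHashMap<>();
        foo.put("type", "object");
        foo.put("properties", properties);
        Map<String, Object> definitions = new LinkedHashMap<>();
        definitions.put("Foo", foo);
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("definitions", definitions);
        return document;
    }

    public static void main(String[] args) {
        System.out.println(select(sampleDocument(), "#/definitions/Foo/properties/bar"));
        System.out.println(select(sampleDocument(), "#/definitions/Missing").isPresent());
    }
}
```

A pointer like `#/definitions/Foo/properties/bar` is the kind of nested pointer that an inlined member can now produce, which is why a flat name lookup is no longer enough.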

Further, the Smithy document shape is actually meant to be a simple type,
but it was not subclassing SimpleShape, resulting in JSON schema
conversions not working correctly (document types were creating distinct
named shapes, whereas they are intended to be inlined).

Finally, this commit fixes a bug where JSON schema extensions weren't
being injected.
mtdowling committed Feb 14, 2020
1 parent 1e14051 commit f6b45a4
Showing 25 changed files with 1,473 additions and 669 deletions.
6 changes: 6 additions & 0 deletions config/spotbugs/filter.xml
@@ -92,4 +92,10 @@
        <Class name="software.amazon.smithy.model.knowledge.ServiceIndex"/>
        <Bug pattern="NP_NULL_ON_SOME_PATH_FROM_RETURN_VALUE"/>
    </Match>

    <!-- Using a buffer here would actually allocate more, not less -->
    <Match>
        <Class name="software.amazon.smithy.jsonschema.SchemaDocument"/>
        <Bug pattern="SBSC_USE_STRINGBUFFER_CONCATENATION"/>
    </Match>
</FindBugsFilter>
@@ -0,0 +1,25 @@
/*
 * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 *  http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 */

package software.amazon.smithy.jsonschema;

/**
 * Thrown when two shapes generate the same JSON schema pointer.
 */
public class ConflictingShapeNameException extends SmithyJsonSchemaException {
    ConflictingShapeNameException(String message) {
        super(message);
    }
}
@@ -0,0 +1,131 @@
/*
 * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 *  http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 */

package software.amazon.smithy.jsonschema;

import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;
import java.util.regex.Pattern;
import software.amazon.smithy.model.Model;
import software.amazon.smithy.model.shapes.CollectionShape;
import software.amazon.smithy.model.shapes.MapShape;
import software.amazon.smithy.model.shapes.Shape;
import software.amazon.smithy.model.shapes.ShapeId;
import software.amazon.smithy.model.shapes.SimpleShape;
import software.amazon.smithy.model.traits.EnumTrait;
import software.amazon.smithy.utils.FunctionalUtils;
import software.amazon.smithy.utils.StringUtils;

/**
 * Automatically de-conflicts map shapes, list shapes, and set shapes
 * by sorting conflicting shapes by ID and then appending a formatted
 * version of the shape ID namespace to the colliding shape.
 *
 * <p>Simple types are never generated at the top level because they
 * are always inlined into complex shapes; however, string shapes
 * marked with the enum trait are never allowed to conflict since
 * they can easily drift away from compatibility over time.
 * Structures and unions are not allowed to conflict either.
 */
final class DeconflictingStrategy implements RefStrategy {

    private static final Logger LOGGER = Logger.getLogger(DeconflictingStrategy.class.getName());
    private static final Pattern SPLIT_PATTERN = Pattern.compile("\\.");

    private final RefStrategy delegate;
    private final Map<ShapeId, String> pointers = new HashMap<>();
    private final Map<String, ShapeId> reversePointers = new HashMap<>();

    DeconflictingStrategy(Model model, RefStrategy delegate) {
        this.delegate = delegate;

        // Pre-compute a map of all converted shape refs. Sort the shapes
        // to make the result deterministic.
        model.shapes().filter(FunctionalUtils.not(this::isIgnoredShape)).sorted().forEach(shape -> {
            String pointer = delegate.toPointer(shape.getId());
            if (!reversePointers.containsKey(pointer)) {
                pointers.put(shape.getId(), pointer);
                reversePointers.put(pointer, shape.getId());
            } else {
                String deconflictedPointer = deconflict(shape, pointer, reversePointers);
                LOGGER.info(() -> String.format(
                        "De-conflicted `%s` JSON schema pointer from `%s` to `%s`",
                        shape.getId(), pointer, deconflictedPointer));
                pointers.put(shape.getId(), deconflictedPointer);
                reversePointers.put(deconflictedPointer, shape.getId());
            }
        });
    }

    // Some shapes aren't converted to JSON schema at all because they
    // don't have a corresponding definition.
    private boolean isIgnoredShape(Shape shape) {
        return (shape instanceof SimpleShape && !shape.hasTrait(EnumTrait.class))
               || shape.isResourceShape()
               || shape.isServiceShape()
               || shape.isOperationShape()
               || shape.isMemberShape();
    }

    private String deconflict(Shape shape, String pointer, Map<String, ShapeId> reversePointers) {
        LOGGER.info(() -> String.format(
                "Attempting to de-conflict `%s` JSON schema pointer `%s` that conflicts with `%s`",
                shape.getId(), pointer, reversePointers.get(pointer)));

        if (!isSafeToDeconflict(shape)) {
            throw new ConflictingShapeNameException(String.format(
                    "Shape %s conflicts with %s using a JSON schema pointer of %s",
                    shape, reversePointers.get(pointer), pointer));
        }

        // Create a de-conflicted JSON schema pointer that just appends
        // the PascalCase formatted version of the shape's namespace to the
        // resulting pointer.
        StringBuilder builder = new StringBuilder(pointer);
        for (String part : SPLIT_PATTERN.split(shape.getId().getNamespace())) {
            builder.append(StringUtils.capitalize(part));
        }

        String updatedPointer = builder.toString();

        if (reversePointers.containsKey(updatedPointer)) {
            // Note: I don't know if this can ever actually happen... but just in case.
            throw new ConflictingShapeNameException(String.format(
                    "Unable to de-conflict shape %s because the de-conflicted name resolves "
                    + "to another generated name: %s", shape, updatedPointer));
        }

        return updatedPointer;
    }

    // We only want to de-conflict shapes that are generally not code-generated
    // because the de-conflicted names can potentially change over time as shapes
    // are added and removed. Things like structures, unions, and enums should
    // never be de-conflicted from this class.
    private boolean isSafeToDeconflict(Shape shape) {
        return shape instanceof CollectionShape || shape instanceof MapShape;
    }

    @Override
    public String toPointer(ShapeId id) {
        return pointers.computeIfAbsent(id, delegate::toPointer);
    }

    @Override
    public boolean isInlined(Shape shape) {
        return delegate.isInlined(shape);
    }
}
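
The de-conflicting step above can be sketched without the Smithy model classes. The sketch below is an approximation under stated assumptions: shape IDs are plain strings of the form `namespace#Name`, `basePointer` is a hypothetical stand-in for the delegate `RefStrategy` with namespaces stripped, and every input is assumed to be a list, set, or map shape (so the `isSafeToDeconflict` check is omitted).

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

final class DeconflictSketch {
    // Hypothetical delegate: drops the namespace and keeps only the shape name.
    static String basePointer(String shapeId) {
        return "#/definitions/" + shapeId.substring(shapeId.indexOf('#') + 1);
    }

    // Turns a namespace like "smithy.example" into "SmithyExample".
    static String pascalCaseNamespace(String shapeId) {
        String namespace = shapeId.substring(0, shapeId.indexOf('#'));
        StringBuilder builder = new StringBuilder();
        for (String part : namespace.split("\\.")) {
            if (!part.isEmpty()) {
                builder.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
            }
        }
        return builder.toString();
    }

    // Assigns one pointer per shape ID. IDs are processed in sorted order so
    // the first ID keeps the short pointer and later ones get the suffix.
    static Map<String, String> assignPointers(Collection<String> shapeIds) {
        Map<String, String> pointers = new LinkedHashMap<>();
        Map<String, String> reversePointers = new HashMap<>();
        for (String id : new TreeSet<>(shapeIds)) {
            String pointer = basePointer(id);
            if (reversePointers.containsKey(pointer)) {
                pointer = pointer + pascalCaseNamespace(id);
                if (reversePointers.containsKey(pointer)) {
                    throw new IllegalStateException("Unable to de-conflict " + id);
                }
            }
            pointers.put(id, pointer);
            reversePointers.put(pointer, id);
        }
        return pointers;
    }

    public static void main(String[] args) {
        System.out.println(assignPointers(java.util.Arrays.asList("c.d#Names", "a.b#Names")));
    }
}
```

Sorting before assignment is what makes the result deterministic: regardless of input order, `a.b#Names` keeps `#/definitions/Names` and `c.d#Names` becomes `#/definitions/NamesCD`.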
@@ -0,0 +1,176 @@
/*
 * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 *  http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 */

package software.amazon.smithy.jsonschema;

import java.util.regex.Pattern;
import software.amazon.smithy.model.Model;
import software.amazon.smithy.model.node.ObjectNode;
import software.amazon.smithy.model.shapes.CollectionShape;
import software.amazon.smithy.model.shapes.MemberShape;
import software.amazon.smithy.model.shapes.Shape;
import software.amazon.smithy.model.shapes.ShapeId;
import software.amazon.smithy.model.shapes.SimpleShape;
import software.amazon.smithy.model.traits.EnumTrait;
import software.amazon.smithy.utils.StringUtils;

/**
 * This ref strategy converts Smithy shapes into the following:
 *
 * <ul>
 *     <li>
 *         Structures, unions, maps, and enums are always created as a top-level
 *         JSON schema definition.
 *     </li>
 *     <li>
 *         <p>Members that target structures, unions, enums, and maps use a $ref to the
 *         targeted shape. With the exception of maps, these kinds of shapes are almost
 *         always generated as concrete types by code generators, so it's useful to reuse
 *         them throughout the schema. However, this means that member documentation
 *         and other member traits need to be moved in some way to the containing
 *         shape (for example, documentation needs to be appended to the container
 *         shape).</p>
 *         <p>Maps are included here because they are represented as objects in
 *         JSON schema, and many tools will generate a type or require an explicit
 *         name for all objects. For example, API Gateway will auto-generate a
 *         non-deterministic name for a map if one is not provided.</p>
 *     </li>
 *     <li>
 *         Members that target a collection or a simple type that does not have
 *         the enum trait are inlined into the generated container.
 *     </li>
 * </ul>
 */
final class DefaultRefStrategy implements RefStrategy {

    private static final Pattern SPLIT_PATTERN = Pattern.compile("\\.");
    private static final Pattern NON_ALPHA_NUMERIC = Pattern.compile("[^A-Za-z0-9]");

    private final Model model;
    private final boolean alphanumericOnly;
    private final boolean keepNamespaces;
    private final String rootPointer;
    private final PropertyNamingStrategy propertyNamingStrategy;
    private final ObjectNode config;

    DefaultRefStrategy(Model model, ObjectNode config, PropertyNamingStrategy propertyNamingStrategy) {
        this.model = model;
        this.propertyNamingStrategy = propertyNamingStrategy;
        this.config = config;
        rootPointer = computePointer(config);
        alphanumericOnly = config.getBooleanMemberOrDefault(JsonSchemaConstants.ALPHANUMERIC_ONLY_REFS);
        keepNamespaces = config.getBooleanMemberOrDefault(JsonSchemaConstants.KEEP_NAMESPACES);
    }

    private static String computePointer(ObjectNode config) {
        String pointer = config.getStringMemberOrDefault(JsonSchemaConstants.DEFINITION_POINTER, DEFAULT_POINTER);
        if (!pointer.endsWith("/")) {
            pointer += "/";
        }
        return pointer;
    }

    @Override
    public String toPointer(ShapeId id) {
        if (id.getMember().isPresent()) {
            MemberShape member = model.expectShape(id, MemberShape.class);
            return createMemberPointer(member);
        }

        StringBuilder builder = new StringBuilder();
        appendNamespace(builder, id);
        builder.append(id.getName());
        return rootPointer + stripNonAlphaNumericCharsIfNecessary(builder.toString());
    }

    private String createMemberPointer(MemberShape member) {
        if (!isInlined(member)) {
            return toPointer(member.getTarget());
        }

        Shape container = model.expectShape(member.getContainer());
        String parentPointer = toPointer(container.getId());

        switch (container.getType()) {
            case LIST:
            case SET:
                return parentPointer + "/items";
            case MAP:
                return member.getMemberName().equals("key")
                       ? parentPointer + "/propertyNames"
                       : parentPointer + "/additionalProperties";
            default: // union | structure
                return parentPointer + "/properties/" + propertyNamingStrategy.toPropertyName(
                        container, member, config);
        }
    }

    @Override
    public boolean isInlined(Shape shape) {
        // We could add more logic here in the future if needed to account for
        // member shapes that absolutely must generate a synthesized schema.
        if (shape.asMemberShape().isPresent()) {
            MemberShape member = shape.asMemberShape().get();
            Shape target = model.expectShape(member.getTarget());
            return isInlined(target);
        }

        // Collections (lists and sets) are always inlined. Most importantly,
        // this is done to expose any important traits of list and set members
        // in the generated JSON schema document (for example, documentation).
        // Without this inlining, list and set member documentation would be
        // lost since the items property in the generated JSON schema would
        // just be a $ref pointing to the target of the member. The more
        // things that can be inlined that don't matter the better since it
        // means traits like documentation aren't lost.
        //
        // Lists and sets are basically never a generated type in
        // any programming language because most just use some kind of
        // standard library feature. This essentially means that the names
        // of lists or sets changing when round-tripping
        // Smithy -> JSON Schema -> Smithy doesn't matter that much.
        if (shape instanceof CollectionShape) {
            return true;
        }

        // Strings with the enum trait are never inlined. This helps to ensure
        // that the name of an enum string can be round-tripped from
        // Smithy -> JSON Schema -> Smithy, helps OpenAPI code generators to
        // use a good name for any generated types, and it cuts down on the
        // duplication of documentation and constraints in the generated schema.
        if (shape.hasTrait(EnumTrait.class)) {
            return false;
        }

        // Simple types are always inlined unless the type has the enum trait.
        return shape instanceof SimpleShape;
    }

    private void appendNamespace(StringBuilder builder, ShapeId id) {
        // Append each namespace part, capitalizing each segment.
        // For example, "smithy.example" becomes "SmithyExample".
        if (keepNamespaces) {
            for (String part : SPLIT_PATTERN.split(id.getNamespace())) {
                builder.append(StringUtils.capitalize(part));
            }
        }
    }

    private String stripNonAlphaNumericCharsIfNecessary(String result) {
        return alphanumericOnly
               ? NON_ALPHA_NUMERIC.matcher(result).replaceAll("")
               : result;
    }
}
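
To make the naming rules in `toPointer` concrete, here is a standalone sketch of how a non-member pointer is assembled from the root pointer, the optional namespace, and the shape name. It mirrors the logic above but is not the class itself: `PointerNamingSketch` and its parameter list are hypothetical, and the two boolean flags stand in for the config values read in the constructor.

```java
import java.util.regex.Pattern;

final class PointerNamingSketch {
    private static final Pattern NON_ALPHA_NUMERIC = Pattern.compile("[^A-Za-z0-9]");

    // Builds a pointer such as "#/definitions/SmithyExampleFoo".
    static String toPointer(
            String rootPointer,
            String namespace,
            String name,
            boolean keepNamespaces,
            boolean alphanumericOnly
    ) {
        // Normalize the root so it always ends with exactly one added "/".
        String root = rootPointer.endsWith("/") ? rootPointer : rootPointer + "/";

        StringBuilder builder = new StringBuilder();
        if (keepNamespaces) {
            // "smithy.example" becomes "SmithyExample".
            for (String part : namespace.split("\\.")) {
                if (!part.isEmpty()) {
                    builder.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
                }
            }
        }
        builder.append(name);

        String result = builder.toString();
        if (alphanumericOnly) {
            // Only the generated name is stripped, never the root pointer.
            result = NON_ALPHA_NUMERIC.matcher(result).replaceAll("");
        }
        return root + result;
    }

    public static void main(String[] args) {
        System.out.println(toPointer("#/definitions", "smithy.example", "Foo", false, false));
        System.out.println(toPointer("#/definitions", "smithy.example", "Foo", true, false));
        System.out.println(toPointer("#/definitions/", "smithy.example", "Foo_Bar", false, true));
    }
}
```

With namespaces dropped by default, two shapes named `Foo` in different namespaces map to the same pointer, which is the conflict that `DeconflictingStrategy` either resolves or reports.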