
Extending representations to parameters and headers #762

Closed
darrelmiller opened this issue Aug 18, 2016 · 22 comments

Comments

@darrelmiller
Member

Assuming the recent PR for representations becomes part of OpenAPI 3.0, I propose that we reuse the concept to allow defining complex types for both parameter values and header values.

However, the additional complexity of 'representations' should only be required for complex types and not primitive types. This means that both header and parameter objects should be defined as having [type | representations] for describing the value.

This proposal rolls back the previous change in 3.0 that made a schema property required in parameters to describe primitive types.

By using representations to define complex types, we are able to identify the media type that is used for the purpose of serialization. This would help with issues like #401 #69 #222 and address #665.

Because representation objects have schemas we address #717 #652 #667

So a parameter can be a primitive value like this,

{
  "name": "token",
  "in": "header",
  "description": "token to be passed as a header",
  "required": true,
  "type": "string"
}

Or a complex value,


{
  "name": "token",
  "in": "header",
  "description": "token to be passed as a header",
  "required": true,
  "representations": {
    "text/csv": {
      "schema": {
        "type": "array",
        "items": {
          "type": "integer",
          "format": "int64"
        }
      }
    }
  }
}

For response headers, we can also optionally use a representations object to describe JSON-based headers.

"headers": {
  "bb-telemetry-data": {
    "description": "Client statistics",
    "representations": {
      "application/json": {
        "schema": { ... },
        "examples": [ ... ]
      }
    }
  }
}

From a tooling implementer's perspective, one implementation of the representations structure can now be reused for describing complex structures in request bodies, response bodies, parameters and headers. There are certain escaping rules that differ, so tooling will need to know where the complex type is being serialized to ensure that only valid characters are used in URLs and headers.

One challenge for implementers is that if an OpenAPI definition defines two potential representations for a URL parameter or header value, then it would be necessary to "sniff" an HTTP message to determine which representation is being used. This might be simple when it comes to differentiating between JSON and XML, but becomes more difficult when media types like text/plain and text/csv are used.
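The kind of sniffing described above could look like this minimal sketch (the `sniff_representation` helper is hypothetical, not part of any tooling, and only handles the two easy cases):

```python
import json

def sniff_representation(raw: str) -> str:
    """Guess which declared representation a raw parameter/header value uses."""
    stripped = raw.strip()
    if stripped.startswith("<"):
        # Markup start tag strongly suggests XML.
        return "application/xml"
    try:
        json.loads(stripped)
        return "application/json"
    except ValueError:
        # text/plain vs text/csv cannot be distinguished reliably:
        # "15,2,56" is valid in both.
        return "ambiguous"

print(sniff_representation('{"a": 1}'))  # application/json
print(sniff_representation('<a>1</a>'))  # application/xml
print(sniff_representation('15,2,56'))   # ambiguous
```

As the comment notes, sniffing breaks down exactly where the issue says it does: plain-text-like media types.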

@DavidBiesack

I can see where an Accept: header can act as a 'representations' selector for a response, but what is the selector for the representation for a request header? How are discriminators and other selectors specified, for example if the type is determined by a query parameter, or a field in the request body, or a URL path parameter, or a .json or .xml or .yaml extension, or....

@DavidBiesack

Would OAS also have a "representations" section (sibling of "definitions") so that items in the representations map (or the entire map) could be $ref'd? (IIRC there is a proposal for a reusable "components" structure/mapping)
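For illustration, such a reusable entry might look like the sketch below. This is not spec'd syntax; the `components`/`representations` section names and the `IntegerList` key are assumptions made for the example:

```json
{
  "components": {
    "representations": {
      "IntegerList": {
        "text/csv": {
          "schema": {
            "type": "array",
            "items": { "type": "integer", "format": "int64" }
          }
        }
      }
    }
  }
}
```

A parameter could then point at it with something like `"representations": { "$ref": "#/components/representations/IntegerList" }`.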

@darrelmiller
Member Author

@DavidBiesack Where multiple representations are defined for inbound parameters, the server is responsible for identifying which representation the client used for a URL parameter or header parameter. This is going to require some kind of sniffing algorithm. It's not ideal, but I don't think there are many scenarios where multiple representations are needed for inbound complex typed parameters.

As far as response headers are concerned, there definitely is an issue on how a server should choose the appropriate representations. It might be reasonable to re-use the Accept header provided by the client, as the Accept header is not specifically a response content-type selector, but simply a declaration by the user-agent of what media types it supports. If a user-agent declares that it understands application/json and a response header has multiple representations, one of which being application/json, then the server should probably use json for the header.

@DavidBiesack

I'm a bit concerned about ambiguities - both the response body and each response header can have multiple representations. application/json is not a real content type; it is more of a format; for APIs like ours, there are many application/vnd.+json representations (the GitHub REST API is another example of using multiple application/vnd.+json types). Each value would need its own selector/discriminator and Accept: would not work for all; I think we need a more explicit selector/discriminator for these "representations" elements (or a default for each; i.e. the Accept request header is the default selector for a response body representation).

@darrelmiller
Member Author

I believe I understand your point about having explicit selectors for different uses of representations, but I think adding multiple of these selectors is going to add too much complexity. If reusing Accept for all the uses of representations is not a solution that people can swallow, then I think I'd rather introduce a singular "representation" object for headers and parameters, so there is no ambiguity.

@darrelmiller
Member Author

@DavidBiesack Regarding the addition of a representations section to components, I think that might be a good idea. I'm not sure how much more value it brings over being able to re-use responses and parameters, but there is not a whole lot of additional complexity in allowing it.

@ePaul
Contributor

ePaul commented Aug 19, 2016

Hmm, so we consider a header/path/url parameter to be a miniature document, and use some content-type (which is not declared beside the document, just in the API definition) to determine which type of document it is, so we can then match it with some schema.
And we still need to refer to that content type's documentation (instead of something in the spec) to actually map the values to a JSON object (or something else).

@DavidBiesack

Certainly in most cases there is only one representation, so no selector is needed, and the majority of cases remain simple.

Regarding components - I favor uniformity/consistency across the spec, so if we can reuse one mechanism for expressing reusable components and apply it everywhere, that also brings simplicity.

@darrelmiller
Member Author

@ePaul The media type here is used simply to provide a "serialization strategy" as discussed in #665. The media type identifier and then optionally the schema should provide sufficient information to a client code generator to know how to deserialize the header/parameter value. I'm not sure I follow your concern about having to refer to the content type's documentation. We currently don't include all the words from RFC 7159 in the OpenAPI spec when we advertise a JSON payload.

And, I think it is important to remember that someone would only use a 'representations object' in a header or parameter if it is a non-primitive type. This means the additional complexity of the representation object should rarely be used.

@darrelmiller
Member Author

@DavidBiesack I hear you with regard to uniformity and consistency, that's one of the reasons I'm suggesting reusing representations for complex header/parameter types. The downside of uniformity is redundancy and the eternal "which mechanism should I use to do X" questions that follow. Having only one way for someone to do something means that we all do it the same way. I honestly don't know what the right answer is.

@fehguy
Contributor

fehguy commented Aug 19, 2016

@OAI/tdc let's get your input on this

@darrelmiller
Member Author

Certain members of the @OAI/tdc have raised the concern that "representations" is a lot of characters to type. So, in the interests of saving fingers, we are considering other proposals for the name of this new object.
Suggestions so far include:

"content": {
  "application/json": { ... },
  "application/jxml": { ... }
}

The downside to this suggestion is that it is not easy to distinguish between the container content object and a specific content object. This is really just a documentation issue though.
An alternative might be,

"contentTypes": {
  "application/json": { ... },
  "application/jxml": { ... }
}

Unfortunately, that only saves us 3 characters.
Another alternative is,

"bodies": {
  "application/json": { ... },
  "application/jxml": { ... }
}

This would work, but we would probably need to rename requestBody to avoid confusion. It's also a tad morbid.

"payloads": {
  "application/json": { ... },
  "application/jxml": { ... }
}

This works but was not liked by those on the call.

@darrelmiller
Member Author

To answer my own concern, maybe we can have a content object that contains content type objects. Does that sound reasonable?

@DavidBiesack

"representational" got boiled down to just "RE" in "REST", so we could just use "re": { ... } to save the most keystrokes 😏

I prefer "content", à la Content-Type. I think that is more accurate than "bodies" and "payloads" (which feel wrong to me for headers and parameters) and concise enough.

@ePaul
Contributor

ePaul commented Aug 19, 2016

@darrelmiller

I'm not sure I follow your concern about having to refer to the content type's documentation.
We currently don't include all the words from RFC 7159 in the OpenAPI spec when we
advertise a JSON payload.

While we don't do that, the OpenAPI schema objects describe (by reference to the JSON schema specification) JSON values (objects/arrays/primitives), which have an obvious mapping from/to JSON payloads (that is specified in the JSON specification).

But there is no obvious mapping of text/csv documents to JSON values. The most generic one I can imagine would be an array of objects (with primitive property values) – but this only works if there is a header line with the column names, because JSON objects don't have any notion of property ordering, so you can't express the column meanings in the schema.
Otherwise maybe an array of arrays (of primitives) would be possible.
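For concreteness, the two generic CSV-to-JSON mappings described above can be sketched with just the standard library (the sample document is invented):

```python
import csv
import io

doc = "a,b\n1,2\n3,4\n"

# With a header line: array of objects, column names from the header.
rows_as_objects = list(csv.DictReader(io.StringIO(doc)))
# [{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}]

# Without assuming a header: array of arrays of (string) primitives.
rows_as_arrays = list(csv.reader(io.StringIO(doc)))
# [['a', 'b'], ['1', '2'], ['3', '4']]
```

Note that everything comes back as strings either way; nothing in text/csv itself says a column holds int64 values, which is exactly the gap the schema would have to paper over.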

I also don't see how most documents matching RFC 4180 would fit into an HTTP header. (Maybe the CRLFs separating records would have to be percent-encoded?)

From your example, I guess you imagine a CSV document with just a single record (= line) and without column headers (i.e. text/csv;header=false), and have this single record represented as a JSON array (of integers, in this case). I don't know how any implementation would be able to guess that from the text/csv specification, though.
(Also, HTTP allows replacing a comma-separated list in a header value with multiple occurrences of the same header, or the other way around. Is the following still text/csv?)

Token: 15
Token: 2
Token: 56

Similar problems appear for application/x-www-form-urlencoded (the media type registry refers to the HTML specification, which in turn refers to a section in the URL specification) – it encodes a list of name-value tuples (where name and value are character strings) into a byte string (or decodes them again). But a list of name-value tuples is not quite the same as a JSON object (for example, the list can contain duplicate names, and a JSON object could contain non-primitive property values).
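The name-value-tuple vs. JSON-object mismatch described above is easy to demonstrate with the standard library:

```python
from urllib.parse import parse_qsl

# application/x-www-form-urlencoded decodes to a LIST of name-value
# tuples, which may contain duplicate names.
pairs = parse_qsl("token=15&token=2&token=56")
# [('token', '15'), ('token', '2'), ('token', '56')]

# Forcing it into a JSON-object-like dict silently drops duplicates.
as_object = dict(pairs)
# {'token': '56'}
```

So a round trip through a JSON object loses information that the media type itself permits.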

multipart/form-data (which would certainly not be used for headers, but maybe for the request body, similar to application/x-www-form-urlencoded; see the start of the discussion in #761) defines how a series of "form field values" (which are either plain text or file contents, possibly with a file name attached), each with a name – where names can again be duplicated (and at least for file uploads, this is a common use case) – is represented in a message (for HTTP or MIME). Again, I don't see a completely obvious way of mapping this to a JSON object.

(This is not meant to be a rant, but to show why I feel there is something missing in the specification as currently proposed.)

@darrelmiller
Member Author

@ePaul I completely agree there is a big chunk of hand waving going on between the media type and the JSON schema for non-JSON scenarios. In the TDC call today we decided that next week we are going to begin to address how to clearly describe non-JSON content. When it comes to headers and parameters, there is another challenge that you illustrated well: how do we map media types into the constraints of HTTP headers and URL parameters? Will it be sufficient to simply say that certain characters are not allowed depending on the location in the message?

In order to fix the mapping between media types and schema, I believe we have three basic directions we could take:

  • Stick with our commitment to JSON Schema as it is the 90% solution (today) and create special-case mappings from JSON Schema to key-value pairs, delimited lists, and whatever other common syntax shows up.
  • Define our own format independent data modelling syntax to be used by the schema property
  • Add support for other schema formats, e.g. XSD, ABNF, Relax-NG, JCR, protobuf-schema, but make it clear that tooling may or may not support all of these schemas.

There are pros and cons to all of these approaches, but I believe we need to make a clear decision on our future direction.
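The first direction, treating the schema as a generic model with a per-media-type serialization strategy, might be sketched like this (the `serialize` helper and its two strategies are hypothetical illustrations, not proposed spec behavior):

```python
import json

def serialize(value, media_type):
    """Serialize a generic list-of-integers model per media type."""
    if media_type == "application/json":
        # JSON strategy: the model maps directly to a JSON array.
        return json.dumps(value)
    if media_type == "text/csv":
        # Delimited-list strategy: one record, comma-separated.
        return ",".join(str(v) for v in value)
    raise ValueError("no serialization strategy for " + media_type)

print(serialize([15, 2, 56], "application/json"))  # [15, 2, 56]
print(serialize([15, 2, 56], "text/csv"))          # 15,2,56
```

The same model, two wire formats: that per-media-type dispatch is the mapping the spec would have to pin down for each supported media type.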

@DavidBiesack

Consider adding examples to content objects as per the Parameter Object samples.

@jharmn
Contributor

jharmn commented Sep 30, 2016

Boiling this down a bit further, there are a few practicalities:

  • schema is only useful for JSON (and perhaps some primitive types/regex's), based on the current spec. This is already solving JSON Schema-based parameters as well as we can.
  • collectionFormat is in items right now, effectively inside schema.

Other than perhaps tweaking collectionFormat, I'm not sure we can address other parameter serialization unless we take on the issues identified in #764.

P.S. 👍 for content if we went this route. Nice and terse, without overlapping too much with request handling terminology.

collectionFormat tweak example:

{
  "name": "tokens",
  "in": "query",
  "description": "tokens to be passed as a query parameter",
  "required": true,
  "collectionFormat": "csv" //schema would not be allowed in this case
}
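For reference, the Swagger 2.0 collectionFormat values (other than multi, which repeats the parameter instead of joining) amount to a separator choice, roughly:

```python
# Separators per Swagger 2.0 collectionFormat value; "multi" is omitted
# because it repeats the parameter rather than joining values.
SEPARATORS = {"csv": ",", "ssv": " ", "tsv": "\t", "pipes": "|"}

def join_collection(values, collection_format="csv"):
    """Serialize a list of values for a query parameter."""
    return SEPARATORS[collection_format].join(str(v) for v in values)

print(join_collection([15, 2, 56], "csv"))    # 15,2,56
print(join_collection([15, 2, 56], "pipes"))  # 15|2|56
```

The `join_collection` helper is an illustration only; actual tooling would also need to percent-encode the result for the URL.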

@darrelmiller
Member Author

darrelmiller commented Sep 30, 2016

@jharmn I think this approach of using JSON Schema to describe things that are not JSON only works if we consider the JSON Schema as describing a generic model of properties, values, lists and maps. From there you need to have a serialization strategy per media type. As you point out, standard JSON Schema tooling that expects JSON as input is not going to work.

If OpenAPI chooses to treat JSON Schema as the generic data modelling language, then it is going to have to describe the mappings to the media types we support.

The alternative evil as described in #764 is we allow other schema languages to be used that already have mappings for serialization.

There is no avoiding pain. It just comes down to what kind of pain we like better.

@jharmn
Contributor

jharmn commented Sep 30, 2016

IMO the inclusion of a mongrel JSON Schema has been a long-standing problem (which I'm glad we are fixing). The reason is that, practically speaking, tooling providers want to use existing schema parsers/validators. That wasn't fully possible before (due to a hacked JSON Schema), but tooling providers did it anyway.
The notion that anyone is going to write an XML parser/validator (or any other format) based on JSON Schema is hard for me to believe, especially in the historical context.
If we supported XSD, for instance, we could simply require that $ref be utilized (or some other linking syntax, maybe $schemaRef), so we don't have to have XSD/protobuf-schema/etc quoted into JSON Schema (which could get pretty gross).
P.S. This probably belongs in #764 as a revival comment.

@fehguy
Contributor

fehguy commented Sep 30, 2016

@darrelmiller and @jharmn to look at overlap between merged templating PR to see how one of these can go away

@darrelmiller
Member Author

Incorporated into V3


5 participants