Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON-serialized client data is wrong #712

Closed
annevk opened this issue Dec 7, 2017 · 33 comments · Fixed by #1017
Closed

JSON-serialized client data is wrong #712

annevk opened this issue Dec 7, 2017 · 33 comments · Fixed by #1017

Comments

@annevk
Copy link
Member

annevk commented Dec 7, 2017

You basically specify the correct thing by only allowing UTF-8 and such, but https://infra.spec.whatwg.org/#parse-json-from-bytes is a little more formally correct.

@annevk
Copy link
Member Author

annevk commented Dec 7, 2017

It also seems you want to produce a string out of an object using JSON.stringify. Maybe we should define that operation in Infra too.

@emlun
Copy link
Member

emlun commented Dec 7, 2017

What part(s) of the spec is this in reference to?

@annevk
Copy link
Member Author

annevk commented Dec 7, 2017

The question was about 5.8.1. I think OP actually doesn't apply since you don't use JSON.parse...

@emlun
Copy link
Member

emlun commented Dec 7, 2017

Yeah, section 5.8.1 has nothing to do with parsing JSON, only encoding to JSON. Do you mean we can close this, then?

@annevk
Copy link
Member Author

annevk commented Dec 7, 2017

No, I'm wondering whether it should be formalized similar to how parsing is formalized at https://infra.spec.whatwg.org/#parse-json-from-bytes. We could put such a definition in the Infra Standard.

@emlun
Copy link
Member

emlun commented Dec 7, 2017

Ok. If you do add such a definition to the Infra Standard that would be nice I suppose, but I'd say that the current language in the WebAuthn draft is formal enough that it's not necessary to change it.

@equalsJeffH
Copy link
Contributor

AFAICT this issue refers to the dfn of JSON-serialized client data in 5.10.1. Client data used in WebAuthn signatures (dictionary CollectedClientData), yes?

If so, I agree with @emlun's #712 (comment) above.

@emlun
Copy link
Member

emlun commented Feb 13, 2018

@annevk is @equalsJeffH's assessment correct?

@emlun
Copy link
Member

emlun commented Feb 15, 2018

I believe we can close this.

@annevk
Copy link
Member Author

annevk commented Feb 16, 2018

There's still an issue with the current text in that if someone overwrites JSON it's still ambiguous.

Another problem I just realized is that you cannot invoke a JavaScript function on an IDL value. You first need to convert the IDL value to a JavaScript value.

@annevk annevk changed the title Consider using "parse JSON from bytes" JSON-serialized client data is wrong Feb 16, 2018
@emlun
Copy link
Member

emlun commented Feb 16, 2018

There's still an issue with the current text in that if someone overwrites JSON it's still ambiguous.

This is not ambiguous, because the spec reads "the result of calling the initial value of JSON.stringify" (emphasis added). Even without the emphasized part, I'd argue that this is a reference specifically to the JSON.stringify function in the ECMAScript language spec, and not an instruction to "evaluate the expression JSON.stringify in the current JavaScript scope".

Another problem I just realized is that you cannot invoke a JavaScript function on an IDL value. You first need to convert the IDL value to a JavaScript value.

This may be technically correct, but I cannot currently see how this would ever lead to any kind of confusion.

Furthermore, on both points: These steps are to be implemented internally in the client, not by third party JS code. I don't think the exact implementation details matter so much as the end result, and we have so far observed zero interoperability issues with this.

@annevk
Copy link
Member Author

annevk commented Feb 16, 2018

@emlun it is. The initial value of JSON.stringify means just the initial value of the stringify property of some JSON object. It doesn't mean the JSON object (that's %JSON%; see Infra for how it's worded for the parser).

@emlun
Copy link
Member

emlun commented Feb 16, 2018

And that's exactly where the WebAuthn spec links to. I have a hard time seeing how that is ambiguous.

@emlun
Copy link
Member

emlun commented Feb 16, 2018

Ah, but on second thought we do spell out %ArrayBuffer% elsewhere in the spec. So perhaps we should rectify this for internal consistency if nothing else.

@emlun
Copy link
Member

emlun commented Feb 21, 2018

Hm... %JSON%.stringify returns a UTF-16 encoded string, but the JSON-serialized client data is supposed to be UTF-8 encoded. Is this an issue? I suppose that "This is the UTF-8 encoding of the result of calling %JSON%.stringify on a CollectedClientData dictionary" can be understood as instructing to recode the result from UTF-16 to UTF-8, but is that sufficiently clear?

@annevk
Copy link
Member Author

annevk commented Feb 21, 2018

It is per how Infra defines strings. There could be a problem I suppose if JSON.stringify can return a string containing a lone surrogate, but I don't think that's possible.

@selfissued selfissued modified the milestones: PR, CR Feb 21, 2018
@selfissued
Copy link
Contributor

Currently it seems to me that the spec is internally inconsistent - with JSON.stringify returning UTF-16 and the spec saying that it's UTF-8. We should make it consistent, one way or the other. I'll defer to the browser experts on which way we should go.

@selfissued selfissued modified the milestones: CR, Last Working Draft Feb 21, 2018
@annevk
Copy link
Member Author

annevk commented Feb 21, 2018

You UTF-8 encode the return value of JSON.stringify, resulting in UTF-8 bytes, no? JSON.stringify doesn't return bytes, it returns a string. I'm not sure what you think is inconsistent.

@emlun
Copy link
Member

emlun commented Feb 21, 2018

@annevk Could you please review PR #813?

@emlun
Copy link
Member

emlun commented Feb 22, 2018

In the review of #813, @domenic asserts that the current language is better than changing it to refer to %JSON%.

If anything, I think we should instead not reference the ECMAScript API at all for this. This part of the spec does not define actions to take by JavaScript code running in the client, rather it specifies behaviour internal to the client. It doesn't matter to the consumers of this JSON - i.e., the RPs - how it was produced or in what format.

@nadalin
Copy link
Contributor

nadalin commented Feb 24, 2018

@emlun @annevk @selfissued seems we have no agreed upon conclusion, is this something we punt for CR and get other eyes on this ?

@selfissued
Copy link
Contributor

I don't think we've agreed on any particular change, or even that a change is necessary. I think there is agreement that what's there will work interoperably. Therefore, I think this can be moved to the next milestone and we can continue discussion on if there's a better way to say this.

@nadalin nadalin modified the milestones: Last Working Draft, PR Feb 24, 2018
@emlun
Copy link
Member

emlun commented Feb 24, 2018

Yes, I don't think we need to hold up CR for this. Considering that we've not observed any interoperability issues with this despite the importance of these fine details, I'm thinking it seems to be unambiguous enough.

Also, I'll retract my previous idea of eliminating the reference to the ECMAScript API. I think it probably is best to keep that, so we don't have to repeat a definition of how to serialize the JavaScript object to JSON.

Considering the discussion in #813 and that noone seems to have a concrete proposal for how this should be fixed/improved, I'm leaning towards closing this as not (currently) actionable.

@selfissued
Copy link
Contributor

Rather than closing it, I'd rather that we changed the milestone because there are substantive details lurking here that we should try to get right. @nadalin - do you want to do that?

@annevk
Copy link
Member Author

annevk commented Feb 24, 2018

I think the best way to fix this would be analogous to https://infra.spec.whatwg.org/#json, but then for serialization. That requires some effort though as you'll need to upstream a change to JavaScript to get %JSONSerialize%.

@equalsJeffH
Copy link
Contributor

@emlun wrote #712 (comment)

I think we should instead not reference the ECMAScript API at all for this. This part of the spec does not define actions to take by JavaScript code running in the client, rather it specifies behaviour internal to the client.

even tho you later retracted the above, just to clarify, I'll note that "this part of the spec", i.e. both [[Create]] and [[DiscoverFromExternalSource]] algorithms, are explicitly defining "behaviour internal to the client", i.e., the browser. [ Hence, RP folk do not necessarily need to read or understand them. Hence PR #375 ]

@emlun later wrote #712 (comment)

I'll retract my previous idea of eliminating the reference to the ECMAScript API. I think it probably is best to keep that, so we don't have to repeat a definition of how to serialize the JavaScript object to JSON.

whew, agreed :)

@selfissued #712 (comment)

Rather than closing it, I'd rather that we changed the milestone because there are substantive details lurking here that we should try to get right.

Nominally agreed, tho if what we have in there now is "good enough" for PR and Recommendation maturity-levels (I spose we'll find out), then we may not want to gate those milestones on updating both https://infra.spec.whatwg.org/ and https://tc39.github.io/ecma262/.

@selfissued
Copy link
Contributor

@annevk - on the 28-Feb-18 WebAuthn working group call, we decided to ask you, as a subject matter expert, to propose specific edits to specific sections of the relevant specifications, so that these can be pursued.

It would be fine for these to be PRs for the relevant specifications that are then cited in this issue. We want these comments to become actionable. Thanks.

@selfissued selfissued assigned annevk and unassigned selfissued Feb 28, 2018
@annevk
Copy link
Member Author

annevk commented Mar 1, 2018

Could you help me by explaining what is not clear about #712 (comment)?

@equalsJeffH
Copy link
Contributor

equalsJeffH commented Mar 3, 2018

Here's a n00b attempt at defining a "JSON stringify and UTF-8 encode to bytes" operation for the Infra spec:

<p>To <dfn export>JSON stringify and UTF-8 encode to bytes</dfn> a given JavaScript value 
<var>value</var>, run these steps:

<ol>
 <li>
  <p>Let <var>jsonString</var> be the result of
  <a abstract-op>Call</a>(<a>%JSONStringify%</a>, undefined, « <var>value</var> »).
  [[!ECMA-262]]
  <p class=note>The conventions used in this step are those of the JavaScript specification.

 <li><p>Return the result of running <a>UTF-8 encode</a> on <var>jsonString</var>. [[!ENCODING]]
</ol>

If I understand correctly, [ECMA-262] needs only a new row (after the %JSONParse% row) added to Table 7 "well-known intrinsic objects" like so (?):

%JSONStringify%      JSON.Stringify     The initial value of the 
                                        stringify data property of %JSON%

@annevk
Copy link
Member Author

annevk commented Mar 3, 2018

Pretty good for a n00b attempt. That looks right to me. 😊

@equalsJeffH
Copy link
Contributor

equalsJeffH commented Mar 5, 2018

shucks, thx @annevk -- I've submitted issue whatwg/infra#193

@equalsJeffH
Copy link
Contributor

equalsJeffH commented Jul 17, 2018

PR for Infra created: whatwg/infra#207 (fixes whatwg/infra#193)

@equalsJeffH
Copy link
Contributor

ecma262 spec PR created: tc39/ecma262#1272, apparently depends to some degree on tc39/ecma262#1105

@agl agl closed this as completed in #1017 Aug 1, 2018
agl added a commit that referenced this issue Aug 1, 2018
…nt-data

fix #712 JSON-serialized client data is wrong
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment