Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define serialization and deserialization steps for CryptoKey objects #264

Merged
merged 5 commits into from
Sep 13, 2021

Conversation

twiss
Copy link
Member

@twiss twiss commented May 11, 2021

Fix #260.

  • This PR attempts to roughly define what implementations actually do, while still being concrete:
    • Blink and Gecko serialize the key material as SPKI for public keys, PKCS8 for private keys, and raw for symmetric keys.
      This PR mirrors this.
    • WebKit seems to use an internal format for RSA keys, and the raw format for EC public keys (it still uses PKCS8 for EC private keys).
      In my opinion, this can be viewed as an optimization that doesn't change the observable behavior as specified here. But let me know if you think that this should be allowed explicitly, or if we should have a note about this.
    • Similarly, Gecko serializes the "attributes" (type, extractable, and usages) as a bitmask, rather than separate elements, and in general there's some variety in how the different engines serialize the key metadata.
  • This PR makes use of the internal export and import key operations for serialization and deserialization, respectively. The import key operation is not quite a perfect fit for this purpose, as it constructs a CryptoKey object itself (after doing various validation steps), whereas in the deserialization steps we already have one, so we have to do an extra step to copy the handle.
    In my opinion, this could be refactored later, e.g. by constructing a CryptoKey object in subtle.importKey() rather than the algorithm-specific import key operations. (The downside of this refactoring would be that, if the validation fails, subtle.importKey() would have constructed the CryptoKey object unnecessarily early - but this seems like an unimportant case to optimize for.)
  • For that purpose, it defines the export key operation for HKDF and PBKDF2 keys (which didn't need it previously because they are always non-extractable) with a note that it is only invoked by the serialization steps.
  • Because of that, the steps in subtle.exportKey() and subtle.wrapKey() that check whether the export key operation exists for the relevant algorithm have been removed, as this would now always be the case. They still check whether the key is extractable, so they will throw for HKDF and PBKDF2 keys either way, but this does change the specified error from NotSupportedError to InvalidAccessError. However, all three mentioned engines already throw an InvalidAccessError in this case, so it seems that none of them implemented this check, and so this PR brings the spec closer to their actual behavior on this point.

@annevk please let me know what you think :)

@twiss twiss requested a review from annevk May 11, 2021 12:16
Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can attempt to review this, but someone more familiar with Web Crypto should do so as well. Is it correct that forStorage is ignored? These objects can thus be stored in IndexedDB and the like?

The serialization implementation details are not observable and therefore should not matter. What matters is what you can observe from JavaScript by inspecting the various properties of the resulting deserialized object.

</li>
</ol>
<div class=note>
When performing the structured clone algorithm in
order to serialize a {{CryptoKey}} object, implementations must not allow the
When deserializing a serialized {{CryptoKey}} object, implementations must not allow the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not have requirements in notes (i.e., no must). Just statements of fact. And you shouldn't have duplicate requirements either so please don't move it out of a note either. Stating something again is good though.

@mikewest
Copy link
Member

@sleevi would know who (if anyone!) is familiar enough with Chromium's internals to review this.

@twiss
Copy link
Member Author

twiss commented May 17, 2021

Hi @annevk, thanks for the review, and sorry for the delay!

Is it correct that forStorage is ignored? These objects can thus be stored in IndexedDB and the like?

Yes, that's the intention.

You should not have requirements in notes (i.e., no must). Just statements of fact. And you shouldn't have duplicate requirements either so please don't move it out of a note either. Stating something again is good though.

I've reworded the note, let me know if this looks better :)

@sleevi
Copy link
Contributor

sleevi commented May 17, 2021

@mikewest We don’t have anyone focused on Web Crypto internals. I think the most valuable reviews would come from those looking at the perspective of the overall platform; that was always a challenge for Web Crypto. While I’ll do my best here, it’ll be a few weeks before I can review in depth.

As @annevk points out, this seems to go a lot further than needed to accomplish the goal, by not only specifying the observable details but attempting to specify implementation.

I’m concerned about the HKDF/PBKDF2 approach; it was something we considered early on and intentionally rejected, because these don’t really fit. However, it seems like if this only specifies in terms of properties/slots, and not implementation details, then there’s no need for a separate algorithm, and no need to create that asymmetry. Is that wrong?

@twiss
Copy link
Member Author

twiss commented May 17, 2021

Right. So the alternative here indeed is to specify something like

Set serialized.[[Key]] to a serialized representation of the cryptographic key data represented by the [[handle]] internal slot of value.

Or something to that effect, wording up for debate of course. I didn't want to do that, as it seems overly hand-wavy, and doesn't really help you when you want to implement these steps. That's why I decided to refer to the internal export key operation instead. However, if you think that's overly specific and the above wording is better, I can change it of course. (Indeed then the advantage is that the implementer is explicitly free to serialize the key in whatever format they want.)

@sleevi
Copy link
Contributor

sleevi commented May 17, 2021

Or something to that effect, wording up for debate of course. I didn't want to do that, as it seems overly hand-wavy, and doesn't really help you when you want to implement these steps. That's why I decided to refer to the internal export key operation instead. However, if you think that's overly specific and the above wording is better, I can change it of course. (Indeed then the advantage is that the implementer is explicitly free to serialize the key in whatever format they want.)

Correct. This was an intentional decision of the WG, precisely to ensure it was platform and device agnostic.

There are WebCrypto implementations on hardware devices that ensure the key material is stored in the hardware at all times. Yes, this is abysmal performance, and no, it’s not something sites can reliably detect (also by design), but it is a valid implementation that is no longer valid by this change, even though it has the same observable contract. Even ChromeOS has such an implementation (albeit via an extension namespace, due to permissions). The proposal also does not match implementations like Edge, which further wrap keys for storage according to their internal policies.

I think this might highlight a fundamental design question as you approach taking on editing. It sounds like one goal you may have is for the spec is to document implementation approaches for implementors, rather than provide guidance and API contracts for developers. During the WG, this approach was rejected, precisely because of the desire to have implementation flexibility. Are you wanting to revisit that?

@twiss
Copy link
Member Author

twiss commented May 17, 2021

Fair enough. No, it was my intention to specify the existing behavior, not to restrict implementer freedom. But you bring up good points e.g. regarding hardware keys, and I wasn't aware of ChromeOS's and Edge's implementations. I'll change the wording.

@twiss twiss force-pushed the serialization-steps branch from ffe4a27 to 9b14426 Compare May 17, 2021 14:15
Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this by the way! Great to see this get some maintenance. This by-and-large looks good to me now and more in line with what I would expect.

<a>[[\usages]]</a> internal slot of <var>value</var>.
</li>
<li>
Set <var>serialized</var>.[[\Key]] to a serialized representation of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot say serialized representation here and not say deserialized later on. Could it be sufficient to just set Key to handle? Or is this data mutable in some way? (Please excuse my lack of knowledge about what is going on.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that [[handle]] might refer to a key object in memory, for example (e.g., a pointer to an OpenSSL key object, or whatever underlying implementation is used) whereas this might have to go on disk. But maybe I can just say "Set serialized.[[Key]] to a representation of the cryptographic key data..." - I also don't like the double "serialized" here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading the specification correctly [[handle]] is a logical key in which case I think we can just assign and implementations can figure out how to make that work. Specifications don't operate on implementation-reality, they operate at a higher-level of abstraction. https://infra.spec.whatwg.org/#algorithm-conformance might help here.

What's a bit weird to me is that I cannot find the place that sets [[handle]]. It seems importKey() might be a way to mint a CryptoKey object, but it doesn't detail how [[handle]] is set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fair enough. But even abstractly speaking, I think the name [[handle]] sort of implies that it's a "handle that refers to a cryptographic key stored elsewhere" - though the spec doesn't really say that explicitly, but it does say:

The CryptoKey object represents an opaque reference to keying material that is managed by the user agent.

So storing that reference somewhere to me feels incorrect, as to me it seems to imply that they should serialize the handle / pointer, rather than the key material.

Re. initializing [[handle]], yeah I agree, it seems to be implied in the import key operations of each of the algorithms, with language such as:

Let key be a new CryptoKey associated with the relevant global object of this [HTML], and that represents the RSA public key identified by publicKey.

But it's fairly hand-wavy indeed. I think one (small) improvement could be to construct the CryptoKey seperately (perhaps earlier on in importKey()) and then replace this with "Set the [[handle]] internal slot of key to a representation of the RSA public key identified by publicKey" or some such?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I had to guess perhaps the editors went with [[handle]] rather than [[key]] to make it clear the key is/can be stored in a different place not directly accessible from the process the CryptoKey object is in. In that case though it still seems to me like you should just set, but perhaps rather than calling it [[Key]] in the serialization you should call it [[Handle]].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sleevi Thanks for the background. Do you know if there are any implementations where the key material can indeed be "gone" after deserialization?

I believe Edge was/(is?) been such a case. They process isolated the key storage under the EdgeHTML implementation and relied on DPAPI for protection, except that DPAPI keys can be lost under certain circumstances (related to offline password changes), which would make decrypting the keys impossible.

E.g., does ChromeOS's implementation allow storing hardware-backed keys in IndexedDB? (I guess anyway in that case the hardware is built-in so probably it's not likely to disappear?)

Most SEE hardware is extremely limited in storage, which is what my previous reply was referring to. The hardware itself doesn’t store all the keys - rather, it vends an opaque blob wrapped by a key that is in hardware, and then you feed that opaque blob in later. The opaque blob is basically “your” key wrapped by the hardware key, and it’s loaded into the hardware RAM.

This is part of TPM, and thus means it is possible for keys not to be unwrapped (e.g. a TPM reset).

I don’t believe ChromeOS fully implements structured clone for these keys (recall, it’s in an extension namespace that is ACL restricted, but which otherwise follows Web Crypto semantics), but I also believe that’s being fixed - in which case, keys can totally go away.

Personally, I think that to properly support hardware-backed keys which can "go away", it would be nice to have a way for the web app to indicate that they would be able to deal with that (and perhaps have a separate case for those keys during serialization) and require all other keys to be persistent (as I suspect a lot of code already depends on this anyway).

Right, the problem with this is fundamental (and part of the past discussions): the web author has no way of actually knowing that the UA is doing this in a reliable/trustworthy way. That is, you can always polyfill on top of it to lie. The only way around that is protected attributes and attestation, which is just a fancy way of saying “DRM supercookies”.

But yeah, I suppose for platforms that have built-in hardware, theoretically speaking they could already use that today when you call generateKey with extractable=false, so to support that use case we would indeed have to store the handle rather than the key material.

Even without hardware, it’s a separation of implementation concerns. The big issue was largely wanting to work with existing crypto APIs, because of the compliance concerns with a user agent shipping its own implementation. That’s why it was left opaque as much as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I wasn't debating the value of having a handle in general, just whether, when serializing for storage, the implementation should export and serialize the key material or not (and I also wasn't trying to argue that we should force the UA to keep it in IDB forever, just that if it's in IDB, and it's deserialized, that I imagine that there's a lot of code that assumes that it can be used).

But yeah, I also see the value in doing something like Edge is/was doing. I'll change the wording 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In HTML we don't follow the rule that Record field values have to be Records or JavaScript values and effectively store abstract values in them (such as "bitmap data"). This doesn't seem problematic to me, but it's probably something that needs to be cleaned up at some point. cc @domenic

It sounds to me what should happen is that we do say that we serialize [[handle]] in an abstract-manner. And then in the deserializaton steps we deserialize [[Key]] to a [[handle]] and say that if that fails for some reason, a "DataCloneError" DOMException is thrown. (At least, there's nothing in the specification that suggests CryptoKey can operate without a [[handle]] so that seems like what should happen.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least, there's nothing in the specification that suggests CryptoKey can operate without a [[handle]] so that seems like what should happen.

We tried to handle it, but may be badly, and that’s entirely my fault, which is the OperationError bits sprinkled about. That’s because we may not know deserialization/rehydrating the handle failed until we actually try to perform an operation, at least with some crypto APIs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I guess in that case I wouldn't suggest that [[handle]] is serialized and deserialized here. I.e., back to set [[Handle]] to [[handle]]. It's just an immutable device to look up a key and when used it may fail to return a key.

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good editorially.

@twiss
Copy link
Member Author

twiss commented Jun 3, 2021

@sleevi Does it look good to you as well?

@twiss twiss requested a review from sleevi June 3, 2021 16:03
@twiss
Copy link
Member Author

twiss commented Sep 6, 2021

If no one objects, I'll merge this PR soon, on the basis that it's (intended to be) a purely editorial change, and reflect existing browser behavior. If it turns out that it doesn't, we can always open an issue and fix it later, but I'm relatively confident that it does, and it's surely better than the current text, which is just outdated.

@twiss twiss merged commit 829e57a into main Sep 13, 2021
@twiss twiss deleted the serialization-steps branch September 13, 2021 12:22
github-actions bot added a commit that referenced this pull request Sep 13, 2021
SHA: 829e57a
Reason: push, by @twiss

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
twiss added a commit that referenced this pull request Sep 15, 2021
SHA: 829e57a
Reason: push, by @twiss

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fully define [Serializable]
4 participants