Add written proposal for enhancement #103 #104

stringlytyped · 2023-09-18T12:18:33Z

Add written proposal for enhancement #103, which has been adapted from the earlier "Roadmap to Push Model Support in Keylime" document.

THS-on

I think we can merge it, so that work can begin on this.

We should clean up the APIs between agent and server during the implementation, but this is out of scope for this enhancement.

stringlytyped · 2023-09-20T10:01:09Z

@THS-on Thanks for giving the PR your approval. I agree that the APIs need a bit of a refactor/clean-up and I think it is natural to tackle that during implementation of the push model. Is anything else needed to get this merged?

THS-on · 2023-09-20T10:05:12Z

Is anything else needed to get this merged?

Let's get at least 1-2 more approvals, but because we have discussed this for a long time now, I don't think there are going to be changes.

stringlytyped · 2023-09-20T10:09:52Z

Makes sense. Thanks for clarifying the process

aplanas

That is a big one, and a very, very good one.

Honestly, I still do not have the full picture, but I very much like this path for the parts that I do understand.

103_agent-driven-attestation.md

aplanas · 2023-09-20T10:58:23Z

103_agent-driven-attestation.md

+
+3. The agent will gather the information required by the verifier (UEFI log, IMA entries and quote) and report these in a new HTTP request along with other information relevant to the quote (such as algorithms used).
+
+4. The verifier will reply with the number of seconds the agent should wait before performing the next attestation and an indication of whether the request from agent appeared well formed according to basic validation checks. Actual processing and verification of the measurements against policy can happen asynchronously after the response is returned. 
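
As a rough illustration of step 4, the verifier's immediate reply could carry just the wait time and the outcome of the basic validation checks, leaving policy verification for later. This is only a sketch: the names `AttestationReply`, `basic_reply`, and the entries in `REQUIRED_FIELDS` are hypothetical and are not taken from the proposal, which does not define a wire format here.

```python
from dataclasses import dataclass

# Hypothetical field names: the proposal lists the kinds of evidence
# (quote, algorithms used, IMA entries, UEFI log) but not their keys.
REQUIRED_FIELDS = ("quote", "hash_alg", "ima_entries", "uefi_log")


@dataclass
class AttestationReply:
    seconds_to_next_attestation: int  # how long the agent should wait
    well_formed: bool                 # result of basic validation only


def basic_reply(evidence: dict, interval_seconds: int) -> AttestationReply:
    """Run cheap structural checks only; verification against policy
    is assumed to happen asynchronously after this reply is sent."""
    well_formed = all(name in evidence for name in REQUIRED_FIELDS)
    return AttestationReply(interval_seconds, well_formed)
```

Note that `well_formed` says nothing about whether the measurements satisfy policy; it only reflects that the request had the expected shape.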


What should happen if the attestation arrives outside the window (too soon or too late)? Should that be described here, or is it too much detail?

If it is too soon, @mheese has suggested that the verifier could reply with a 429 status which I think makes a lot of sense—I just forgot to add it. I'll amend the proposal now to include that detail.

For the "too late" case: I suppose it would make sense to specify a period of validity for the nonce, after which it would expire. If an agent submits an attestation using a nonce that is no longer valid, the verifier would then reply with 400 status. What are your thoughts on this?
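
The two cases above could be sketched on the verifier side roughly as follows. Everything named here (`NonceRecord`, `check_submission`, the order of the two checks, and the 200 returned on success) is an assumption made for illustration; only the 429 for "too soon" and the 400 for an invalid or expired nonce come from this discussion.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NonceRecord:
    issued_at: float    # time at which the verifier issued the nonce
    ttl_seconds: float  # period of validity for the nonce


def check_submission(now: float, next_allowed_at: float,
                     nonce: Optional[NonceRecord]) -> Tuple[int, dict]:
    """Return (HTTP status, extra headers) for an incoming attestation."""
    if now < next_allowed_at:
        # Too soon: tell the agent how many seconds remain.
        wait = math.ceil(next_allowed_at - now)
        return 429, {"Retry-After": str(wait)}
    if nonce is None or now > nonce.issued_at + nonce.ttl_seconds:
        # Too late (or unknown nonce): the nonce is no longer valid.
        return 400, {}
    # Assumed success status; the proposal may choose a different one.
    return 200, {}
```

For example, with a nonce issued at `t=100` and a 30 second TTL, a submission at `t=140` would be rejected with 400, while one at `t=120` would be accepted.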

The "too many requests" response and the TTL for the nonce make a lot of sense 👍. We can make the TTL > TimeForNextAttestation, so that there is some slack for network delays. This clarifies the protocol.

What would be the consideration from the attestation point of view? If there is a 429, I guess that this will not mark the node as failed in the attestation state (it should still be considered trusted), but maybe the 400 should require a change of status if it is repeated?

To reiterate, a 400 is returned when the verifier receives an attestation for a nonce that is invalid or has expired.

This may occur if the agent or node is behaving unexpectedly. However, it can also occur if the network or the verifier is behaving unexpectedly. E.g., if the verifier is overloaded and taking a long time to reply to requests, this could delay the delivery of nonces and cause them to be used outside their period of validity.

So, we cannot use repeated 400s as a reliable indicator of a problem with the node. The verifier should only change the status of the node if an attestation fails verification or if no valid attestation has been received for the configured period of time.
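
To make that rule concrete, here is a minimal sketch of the status decision. The names `NodeStatus`, `next_status`, and `max_silence_seconds` are hypothetical, and recovery from a failed state is deliberately left out because it is not discussed here.

```python
from enum import Enum
from typing import Optional


class NodeStatus(Enum):
    TRUSTED = "trusted"
    FAILED = "failed"


def next_status(current: NodeStatus,
                verification_passed: Optional[bool],
                seconds_since_last_valid: float,
                max_silence_seconds: float) -> NodeStatus:
    """verification_passed is None when no attestation reached
    verification, e.g. because it was rejected with a 400 or 429."""
    if verification_passed is False:
        # The attestation was verified against policy and failed.
        return NodeStatus.FAILED
    if seconds_since_last_valid > max_silence_seconds:
        # No valid attestation within the configured period.
        return NodeStatus.FAILED
    # Rejected requests (400/429) alone never change the status.
    return current
```

Under this sketch, a run of 400 responses only affects the node once the silence exceeds the configured period, which matches the reasoning above.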

To add on to that: a successful response should always include a timestamp (or a time duration) indicating when to query again, so that the agent knows when to perform its next request. If it queries too early, it receives a 429. (At least that was my idea when we discussed it with @stringlytyped.)

@mheese I agree that for 429s the verifier should include the seconds to wait before trying again (in the Retry-After header).

But for 400s, there is no way the verifier can know how long the agent should wait before trying again. Really, if everything is working fine, then the agent should be able to try again straight away by requesting a new nonce.

However, the key thing is that the 400 condition should not occur under normal operation. So, it makes sense for the agent to wait a while and then try again, increasing the wait time at each attempt until reaching a maximum wait time. This gives the verifier a chance to recover, if it happens that the condition has been caused by an overloaded verifier.
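
A sketch of how the agent might pick its wait time in the two cases: honour `Retry-After` on a 429, and back off with a growing, capped delay on a 400. The function name `next_wait`, the doubling factor, and the default `base_seconds` and `max_seconds` values are assumptions for illustration only; the discussion only says the wait should increase up to a maximum.

```python
def next_wait(status: int, headers: dict, attempt: int,
              base_seconds: float = 1.0,
              max_seconds: float = 60.0) -> float:
    """Seconds the agent should wait before its next attempt.

    attempt counts consecutive 400 responses, starting at 0.
    """
    if status == 429:
        # Retry-After can also be an HTTP date; this sketch handles
        # only the delay-in-seconds form.
        return float(headers.get("Retry-After", base_seconds))
    if status == 400:
        # Doubling is an assumed growth rate, capped at max_seconds.
        return min(base_seconds * 2 ** attempt, max_seconds)
    raise ValueError(f"no retry rule for status {status}")
```

After the wait, the agent would request a fresh nonce before attesting again, since the old one is no longer usable after a 400.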

103_agent-driven-attestation.md

Signed-off-by: Jean Snyman <jean.snyman@hpe.com>

stringlytyped · 2023-09-20T18:06:59Z

I've made some changes to the proposal based on the above discussion, specifically:

- @aplanas you noted that you were having a bit of trouble grasping the full picture. I've gone ahead and lifted out the Agent Lifecycle section from my earlier draft and added it to the proposal. If you read through those steps, does that clarify things for you?
- Thinking about it more, I've realised that we do in fact need the nonce to expire. So, I've gone ahead and added more detailed explanations on error handling in Proposed Attestation Protocol (which have also been carried through to the Agent Lifecycle section, mentioned above).
- I've dropped the dependency change.
- Clarified those bits that were confusing.

I noticed the bit about HTTP proxy support was missing from the earlier draft by mistake, so I've added that back in also (see HTTP Proxy Support).

maugustosilva · 2023-09-21T17:16:39Z

I believe we have discussed it at a length and depth that allows the merge and the start of the actual implementation work. Of course, minor course corrections might still be proposed or requested, @stringlytyped, but let's not hold it any longer. Again, thank you for the time, effort and thought put into it!

stringlytyped mentioned this pull request Sep 18, 2023

Agent-Driven Attestation #103

Open

stringlytyped force-pushed the master branch from 0d821eb to b1791b8 Compare September 18, 2023 12:24

THS-on approved these changes Sep 19, 2023

THS-on requested review from mpeters, aplanas, mheese and maugustosilva September 20, 2023 10:03

aplanas reviewed Sep 20, 2023

stringlytyped force-pushed the master branch from b1791b8 to 0d20ee9 Compare September 20, 2023 17:45

Add written proposal for enhancement keylime#103

0a7c78b

Signed-off-by: Jean Snyman <jean.snyman@hpe.com>

stringlytyped force-pushed the master branch from 0d20ee9 to 0a7c78b Compare September 20, 2023 17:53

stringlytyped requested a review from aplanas September 20, 2023 18:12

aplanas approved these changes Sep 21, 2023

maugustosilva approved these changes Sep 21, 2023

maugustosilva merged commit 9578111 into keylime:master Sep 21, 2023
1 check passed

stringlytyped mentioned this pull request Nov 21, 2023

Meeting 22/11/23 keylime/meetings#71

Closed

24 tasks

stringlytyped commented Sep 18, 2023

THS-on left a comment

stringlytyped commented Sep 20, 2023

THS-on commented Sep 20, 2023

stringlytyped commented Sep 20, 2023

aplanas left a comment

aplanas Sep 20, 2023

stringlytyped Sep 20, 2023

aplanas Sep 21, 2023

stringlytyped Sep 21, 2023

mheese Sep 25, 2023

stringlytyped Sep 27, 2023 • edited

stringlytyped commented Sep 20, 2023

maugustosilva commented Sep 21, 2023

