-
Notifications
You must be signed in to change notification settings - Fork 368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add handshake probe. #1157
Add handshake probe. #1157
Conversation
Extension for 2.1.0 |
Could explain the probe mode a bit more. This is not clear do me 😕 |
I basically use this for mobile connectivity with DTLS 1.2 connection id and a "mostly stable server". If the mobile peer doesn't get a ACK or response that may have two different causes:
The second is sometime hard to detect; the peer's state is connected, but effectively it's not working. In that case, after some retransmissions, the peer starts a handshake. Without this PR that removes on the client the session. If the handshake timesout (though the connection is not working), the peer still requires a new handshake after the connectivity is established again. With this PR, the probe handshake starts a handshake without removing the session. If some data is received, the session is removed and the handshake gets completed. If no data is received, the peer assumes, that the connectivity is lost (even if it's own state indicates connectivity) and just timesout the request. if the connectivity is established again, just a new request could be send without a handshake. I use this for my android test app and it reduces the handshakes in my test by 80%. (current test run: 2400 request, 114 retransmissions, 3 request failed, 18 connects (12 by restart) => only 6 handshakes (reaching the server) for 2400 request while moving around.) |
Thx for the clarification, I will review this now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the sendMessage(final long nanos, final RawData message, ...
begin to be a bit too long maybe we could consider to cut it in several method to improve readability.
The new isExpired
state in handshaker in not so clear to me, isExpired means that last sent flight timed-out ? If yes why do we need to use a new timestamp attributes why not set the handshaked as expired in DTLSConnector.handleTimeout?
/** | ||
* Force handshake probe before send this message. Doesn't start a | ||
* handshake, if the connector is configured to act as server only. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should explain more what is hanshake probe (or at least add a link to #1157 (comment))
@@ -299,7 +306,8 @@ protected Handshaker(boolean isClient, int initialMessageSeq, DTLSSession sessio | |||
this.session.setMaxTransmissionUnit(maxTransmissionUnit); | |||
this.applicationLevelInfoSupplier = config.getApplicationLevelInfoSupplier(); | |||
this.inboundMessageBuffer = new InboundMessageBuffer(); | |||
|
|||
long timeout = config.getRetransmissionTimeout(); | |||
this.expiresNanosTimeout = TimeUnit.MILLISECONDS.toNanos(timeout << (config.getMaxRetransmissions() + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance to write this without <<
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would adding a comment be enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the content :
I guess you try to calculate the maximum time to wait before to consider the message as timed-out.
But I'm not sure to understand the math, e.g. for max retransmission = 3
t + t*2 + t*2*2 + t*2*2*2 = t*(1+2+4+8) = t*15
!= t*16 = t*2^(3+1) = t <<(3+1)
Did I miss something ?
And by the way the DTLS specification limit the waiting time to 60s.
So maybe it should be more like : min(t,60)+min(t*2,60) + ...
About the form :
If I'm correct about the content, I feel that *2^a is nearer from the original idea than << a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the 60s: OK, do it in a other small PR.
With that, the calculation will be adapted to a "loop", to address the 60s limit.
By the way, the expire timeout should be some larger than the usual sum of the retransmit times to ensure, that usually a handshake timesout prior to expire.
@@ -176,6 +177,11 @@ | |||
/** Realtime nanoseconds of last sending a flight */ | |||
private long flightSendNanos; | |||
|
|||
/** Expire realtime nanoseconds */ | |||
private long expiresNanos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea about a naming which express that expiresNanos
is a timestamp and expiresNanosTimeout
a timeout duration ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't "timeout" used for durations so far?
And I'm not sure, if we name the timestamps special, but I maybe wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right we do nothing special until now.
It's just I didn't get the meaning of this variable at first sight and I tell myself that maybe naming could help.
(not a strong opinion)
Maybe javadoc is enough ? (not strong opinion too)
* Usually a client handshake removes the session from the connection store | ||
* with the start. Probing removes the session only with the first data | ||
* received back. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding the link to your #1157 (comment) here too
@@ -233,6 +239,7 @@ | |||
protected int statesIndex; | |||
protected HandshakeState[] states; | |||
|
|||
protected boolean probe = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why adding all the "probe stuff" in Handshaker, it seems to me that this is only used in ResumingClientHandshaker
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you search for the usage, you should find, that it's mainly used in places, where a general Handshaker
is available. To add dummy implementations in the base and override them in the ResumingClientHandshaker
seems for me to make it more complex. But possible ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I saw that and "dummy implementations" was what I have in mind.
because this avoid to jump from class to class to understand the idea when reviewing (and so understanding the code)
(but this is not a strong opinion)
connection.resetSession(); | ||
Handshaker previous = connection.getOngoingHandshake(); | ||
if (probing) { | ||
connection.setResumptionRequired(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the end of the week so maybe I'm a bit tired, but the code seems hard to read several line above we just connection.setResumptionRequired(true)
if probing is true and now we set it to false because probing is true
And until now I believed that probing was only for abbreviated handshake (so resumption = true) 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above, the connection.setResumptionRequired(true)
was introduced to trigger the later
if (connection.isAutoResumptionRequired(timeout))
. That was implemented to support message specific handshake behaviour.
if (probing) {
connection.setResumptionRequired(false);
} else {
connection.resetSession();
}
Here the setResumptionRequired(false)
in case of probing replaces the resetSession()
, which also includes that. The difference is to keep the session.
That happens on android in "deep sleep". The timers are not cancelled, they are just paused. On wakeup, even after some minutes, they continue. That extension therefore holds them up to try to continue a stale handshake. |
551ab7f
to
e1c840f
Compare
e1c840f
to
b9db9f1
Compare
I'm not sure I get it 🤔. |
Yes, if android pushs the app in the background, android wakes it up on a conditional scheduled task (here, about every 5 minutes, when connectivity is available). In the sleep time (e.g. 4 minutes) |
Does it means you want I review #1146 ? |
Yes, since yesterday evening, I think, that will make more sense for now. |
6416366
to
d934312
Compare
d934312
to
91f4841
Compare
91f4841
to
8d23e72
Compare
Signed-off-by: Achim Kraus <achim.kraus@bosch-si.com>
8d23e72
to
dabda47
Compare
In the meantime I fixed my issues. |
// on sending, abort, keep connection for new handshake | ||
handshaker.handshakeAborted(new Exception("handshake already expired!")); | ||
} else if (handshaker.isProbing()) { | ||
if (checkOutboundEndpointContext(message, null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not linked to this PR, but maybe a method with a clearer name and without null
param could make this clearer.
if (probing) { | ||
// Only reset the resumption trigger, but keep the session for now | ||
// the session will be reseted with the first received data | ||
connection.setResumptionRequired(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get this part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can give some more hints, what is hard to get?
The intention of probing is documented in the javadoc of ResumingClientHandshaker
.
if (probing) {
// Only reset the resumption trigger, but keep the session for now
// the session will be reseted with the first received data
connection.setResumptionRequired(false);
} else {
connection.resetSession();
}
And in processRecord
("the session will be reseted with the first received data"):
if (handshaker != null && handshaker.isProbing()) {
// received record, probe successful
if (connection.hasEstablishedSession()) {
connectionStore.removeFromEstablishedSessions(connection.getEstablishedSession(), connection);
}
connection.resetSession();
handshaker.resetProbing();
LOGGER.debug("handshake probe successful {}", connection.getPeerAddress());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to be clearer.
I saw the documentation in ResumingClientHandshaker
👍
What I understand is that with probing mode, we try to do a handshake and if it failed we didn't remove the original/previous connection. (because maybe we lost connectivity but not connection)
What I didn't get is the line : connection.setResumptionRequired(false);
I knew there is a comment just above, but I didn't get why using probing means "no resumption required"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK, no it requires one resumption, but this one is started in the lines afterwards.
Reset that mark, either by connection.setResumptionRequired(false)
or
connection.resetSession()
, suppresses further resumptions without new trigger.
(see javadoc of connection.resetSession()
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I finally get it.
We will initiate a handshake and so in case where resumption was required, we currently take it into account and so we go back in a "normal state"
Generally we go back to this normal state/"no resumption required" using resetSession()
But here we want to keep the session and so we just note that resumption is taking into account.
@@ -83,11 +117,12 @@ | |||
* if session, recordLayer or config is <code>null</code> | |||
*/ | |||
public ResumingClientHandshaker(DTLSSession session, RecordLayer recordLayer, Connection connection, | |||
DtlsConnectorConfig config, int maxTransmissionUnit) { | |||
DtlsConnectorConfig config, int maxTransmissionUnit, boolean probe) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[API break] It would be easy to not break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I even don't want someone encourage to create nor implement a own Handshaker
.
For me this is one the strange points of revapi:
A public interface may be used for two use-cases:
- as callback interface, where it is intended to provide own implementations. Such interfaces can't be modified without major release. To add functionality requires additional interfaces (e.g.
CloseSupportingConnectionStore
) and a merge of these with the next major version. - as "functionality expose interface", where it is not intended to provide a own implementation. Here it's very easy to add function without breaking the API.
revapi doesn't reflect this nor provide guidance to deal with. At least I didn't found such guidance. Starting issues in revapi will help out, but as I already also said, for me this "public API definition" has, amoung other tasks, not htat prio, so I don't want to spend more of my time in it. Feel free, to spend your own time. Maybe you start to use revapi for leshan, even if the usage there is more to gain best practice with that tool (or @sophokles73 for hono).
As I already wrote "sometimes", without prior preparing the sources according agreed patterns and rules, it's in my opinion not too helpful, to apply strictness.
The Handshaker
got exposed by a request of a user (checkout for the introduction of onInitializeHandshaker
). At that time, no one concerned about unknown API design patterns. I also already asked you, to spend your time into investigate for the Handshaker
. That is unfortunately time consuming, but would help more than insist in "not changing stuff at all". With agreed patterns and rules, the user's request at that time would have been implemented using such "expose API interfaces" and would have cause more changes, than the actually chosen approach. Fell free to spend your time in provide a proposal for that.
Please don't spend your time in answers without express your opinion on the two interface use-cases and how to setup revapi accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uups I didn't expect that you take too much time to answer to this remark. As I said in our private discussion I didn't fight for this, I just notice it. (But maybe you didn't read this and so you could be irritated by my [API break] remarks, sorry about that)
About the 2 interface uses cases, I agree about those 2 ways to use interface.
Even if maybe, for the expose interface
way, using a class (facade) to do the job is maybe better. So adding method is not a problem anymore.
With java8 default implementation is a solution too.
So I would say there is many way to handle this and available solution are not the same depending of the situation. That's why I think we can not agreed about patterns or rules before to really start to do the job.
My vision when revapi raise an issue :
- if there is a simple way to handle it without exception we go with it
- if there is no solution or the solution is too ugly or complex, we eventually create an exception if this make sense.
In a general way, we could also discuss with revapi developer as he probably already think about that and could be interested by our feedback.
Maybe you start to use revapi for leshan
I will as soon as I release the 1.0.0. But we plan to consider all the public/protected stuff as part of the API. I plan 1.0.x release but I don't know if there will be a 1.1.x. The 2.0.0 will probably target LWM2M 1.1 and so will come with API break.
So I'm not sure this would help.
I also already asked you, to spend your time into investigate for the Handshaker
I already answer about that (still in the private mail) with the interface pattern way which work but taking into account your remark about extensibility, create a class (facade) is maybe better.
That is unfortunately time consuming, but would help more than insist in "not changing stuff at all".
Here this is a "straw man" of my position, anyway I suppose this remark is the result of irritability generated by the fact you didn't read my last email.
Intended to be used, if connectivity may be the cause for missing responses.
If the 1. handshake flight is not responded, the handshake is cancelled and the session still kept.
(Stop handshake after "deep sleep").
Signed-off-by: Achim Kraus achim.kraus@bosch-si.com