Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coap+tcp] Adding Capabilities and Settings Messages (CSM) support #2092

Closed
sbernard31 opened this issue Nov 29, 2022 · 38 comments
Closed

[coap+tcp] Adding Capabilities and Settings Messages (CSM) support #2092

sbernard31 opened this issue Nov 29, 2022 · 38 comments

Comments

@sbernard31
Copy link
Contributor

I create this issue to discuss about design of CSM support in Californium.

The RFC says :

Once a Transport Connection is established, each endpoint MUST send a
CSM (see Section 5.3) as its first message on the connection. This
message establishes the initial settings and capabilities for the
endpoint, such as maximum message size or support for block-wise
transfers. The absence of options in the CSM indicates that base
values are assumed.

Here some ideas I share just to know, if this could be the right approach :

  1. Adding connected() / disconnected() methods in Layer class. I don't know if this should be directly done in Layer or if we should create a dedicated ConnectionOrientedLayer (or any better name) ?

  2. I guess we need a way to identify a connection at Californium-core level, with current design this should be EndpointContext ?

  3. Then create a CapabilitiesSettingsLayer which will be responsible to handle CSM message :

  • Receive and store Capabilities and Settngs for a given connection
  • Send CSM
  • Delay Message waiting CSM from foreign peer.
  • Time-out if we don't get CSM.
  1. If we want to do that, we also need to add this kind of connected / disconnected event on Connector.
    Maybe adding a setConnectionEventHandler(ConnectionEventHandler eventHandler)
    I guess we could also need a connect/disconnect method on Connector to force connection or disconnection.
    I don't know if we should add it directly to Connector or create a new Interface ConnectionOrientedConnector (we really need a better name...) ?

That's just some ideas to start a discussion or at least identify what could looks like a first try on this topic.

@boaks
Copy link
Contributor

boaks commented Nov 29, 2022

Not that sure, if it covers the requirements:
Currently the "onContextEstablished" callback indicates, that the connection is established.
Not sure, if that works, but adding the CSM in the "CoapTcpStack" as "none critical field" in the outgoing endpoint context would enable the tcp-connector to send such a CSM as first message.

@sbernard31
Copy link
Contributor Author

sbernard31 commented Nov 29, 2022

onContextEstablished seems to be linked to "sending raw data", I'm not sure it matches so much.
The use of a connect/disconnect event seems more in line with the RFC. (and will match Client and Server side use case)

Not sure, if that works, but adding the CSM in the "CoapTcpStack" as "none critical field" in the outgoing endpoint context would enable the tcp-connector to send such a CSM as first message.

I'm not sure to get the idea.
FMPOV, the tcp-connector should not really know the CSM concept. CSM will just be a rawData to send. 🤔

@boaks
Copy link
Contributor

boaks commented Nov 29, 2022

FMPOV, the tcp-connector should not really know the CSM concept. CSM will just be a rawData to send.

The additional endpoint context field may be just a byte[] and the usage would be, that the connector sends that data first, if a new connection is required.

@sbernard31
Copy link
Contributor Author

The additional endpoint context field may be just a byte[] and the usage would be, that the connector sends that data first, if a new connection is required.

I'm not so sure this will work, the RFC says each time a new TCP connection is established we need to send a CSM. So this is not really linked to sending data.

Between both solution, I will be more comfortable with the idea about adding some new events. My guess those events will be needed for more than that anyway (especially the disconnect event for some cleaning task)

@boaks
Copy link
Contributor

boaks commented Nov 29, 2022

I'm not so sure this will work, the RFC says each time a new TCP connection is established we need to send a CSM. So this is not really linked to sending data.

Just to mention:
The implementation so far does exactly that. It "connects", when a messages is sent and no prior connection is available. The connect includes also (d)tls-handshakes. One additional field with data to send on connect, would not change that much.

Anyway, if you prefer the other way, feel free.

@sbernard31
Copy link
Contributor Author

Anyway, if you prefer the other way, feel free.

Thx 🙏 !
I will experiment this way.

Just a question do you prefer we try to have dedicated interface/class for connect / disconnect new event or should I directly add it to existing class/interface ?

E.g. should I :

  1. create a ConnectionOrientedLayer which extends Layer with connect and disconnect method ?
  2. OR add connect and disconnect method to existing Layer interface ?

First as benefit to isolate changes but drawback to create a lot of new "ConnectionOriented" class/interface. (Layer, Stack, Connector, ..)

@boaks
Copy link
Contributor

boaks commented Nov 29, 2022

I would prefer a ConnectionOrientedLayer, maybe that has an chance to reduce the changes to the CoapTcpStack.

@sbernard31
Copy link
Contributor Author

I would prefer a ConnectionOrientedLayer, maybe that has an chance to reduce the changes to the CoapTcpStack.

Typo CoapTcpStack vs CoapUdpStack?

Because for now, It seems to me that this will increase changes in CoapTcpStack but reduce impact on all UDP stack.

@boaks
Copy link
Contributor

boaks commented Nov 29, 2022

Just the way of express things.

For me: there are overall changes, and these changes are then reduced to changes in the CoapTcpStack.
And that also means, it reduces the changes in/for the CoapUdpStack.

@sbernard31
Copy link
Contributor Author

@boaks I think I need some helps/advices.

I begin to work on this in cms branch and I face some problems. Californium stack design is based on "Exchange". (I must confess that the Exchange concept of californium is still not so clear to me)

But in my case, I have the new connected event and I want to send a new CMS message but I didn't have any Exchange.
See :

public class CapabilitiesSettingsLayer extends AbstractConnectionOrientedLayer {
@Override
public void connected(EndpointContext context) {
// TODO send CSM Message ?
// But we need an exchange :S
// sendEmptyMessage(null, null); ?
// sendRequest(null, null); ?
super.connected(context);
}
}

Any idea ?

@boaks
Copy link
Contributor

boaks commented Dec 2, 2022

Until now:

An Exchange is created either for received messages or for requests. Though these Exchanges are maintained with the MessageExchangeStore, the Exchanges are created (or found in the store) before passing the processing into the stack.

So, somehow, the "connected" may be better implemented in the CoapEndpoint, if such an Exchange is required.

@rogierc
Copy link
Contributor

rogierc commented Dec 3, 2022

(we really need a better name...

A suggestion:

When no CSM is received there maybe is a connection on tcp level, but on the Coap level the connection is not established. Then the ConnectionOrientedLayer should not reach a connected state when there is no aggreement between client and server on CSMs. To try and get an aggreement is then the responsibility of this one layer.

Seems the layer is about the connection concept on Coap level. The tcp level connection is the domain of the tcp-connector.

The name of ConnectionOrientedLayer could then be CoapConnectionLayer (contrary to the tcp/tls/ws connection) because it implements specific CoAP requirements regarding connections.

@sbernard31
Copy link
Contributor Author

I moved forward on this, see cms branch.
(this is still a draft so code is not cleaned)

I'm now able to receive and send the CSM message but I don't do anything with it for now...

I tested with libcoap and this is enough to be able to send a request and get a response (unlike before).

Before to go further, I would like to get feedback about this, to be sure I'm more or less on right way.

(about Signaling Option there is a dedicated topic : #2094)

Currently I have one big concern : the threading model ! Currently at coap level the threading model is around Exchange right ? My new events (connect/disconnect) / signal message handling is out of Exchange scope, so I'm afraid that this raise new race condition issue. I haven't clear idea about this. I believe that netty solve this kind of problem by assuring that all work about same connection is done in same thread. At californium level, would it mean same identity/endpointContext ?

@boaks
Copy link
Contributor

boaks commented Dec 15, 2022

Currently I have one big concern

Yes, that's a big one.
I don't know, what the initial intention was. When I was trying to do the bug-fixing back in 2018, I found:

An Exchange is processed:

  • when messages are arriving (that may happen also in parallel)
  • when the application sends a ACK/RST or a response. Or a notification.
  • when the retransmission timer expires
  • when a different blockwise transfer cancels it
  • when cancel is called from the application

The calls where crossing many different objects, so a common "synchronization" would have caused potential issues, therefore I used the "serial execution" in order to cover that.

For blockwise we still use a "synchronization" on the resource-level, but that causes a lot of trouble.

Until now, the idea was to decouple the different processing layers. Therefore the connector threads hand over their messages to the coap-layer threads. And these, after handled the Exchange, then pass the processing to the serial execution.

I don't know, if it will be easy (or even possible) to switch to a other threading model. Issue #2091 was asking for that. If we go that way, then I would try make that "execution" configurable to keep the current for UDP. But that will still leave the exercise, work and test for that "other execution model" to someone else.

@sbernard31
Copy link
Contributor Author

sbernard31 commented Dec 15, 2022

@boaks, any opinion about cms branch itself ?

About threading model, I'm not sure to get your position ? Is it something that you would like to "improve"/"clarify" for caliornium ? or you just happy with current design ?

@sbernard31
Copy link
Contributor Author

sbernard31 commented Dec 15, 2022

For blockwise we still use a "synchronization" on the resource-level, but that causes a lot of trouble.

🤔 All works for a given identity (endpoint context or something like this) instead of Exchange this will solve the issue ?
(About identity, we already talked about idea about short term and long term identity, in this case I think this should be short term identity)

About inspiration from Netty, they also have kinds of Layer, this is called ChannelHandler and the stack is called channelPipeline. The interesting part is that this stack is not shared by all peer but it is strongly linked to a Channel (a channel is a kind of connection abstraction). That's means that each time a new connection/channel is established/created a new pipeline is created, an so new ChannelHandler instances. If an handler store data as attributes, this will only be data concerning this channel/connection. This makes code easier as you only focus on 1 peer and this should also make house keeping easier.

@boaks
Copy link
Contributor

boaks commented Dec 15, 2022

I just don't know the outcome of these intended changes.
I only know, that in the past, it was unfortunately very time-consuming to do all the tests in order to prevent leaks and deadlocks.

@sbernard31
Copy link
Contributor Author

I still don't get too much what I should do now.

I understand that :

you would have reduced that "experiment" in just sending that initial CSM in order to keep the other peer happy. That's it.

But I don't get if :

  • the current branch cms is going is the right direction (OR at least is OK for the "experiment")
  • OR you didn't look at it for now and so I should wait
  • OR ... ?

@boaks
Copy link
Contributor

boaks commented Dec 19, 2022

FMPOV:

The main question for me is, what is the intention:

  • provide some early demonstration, but no plan to finish/consolidate it?
    Then it depends on how much needed to be changed. For now, it looks for me to change quite a lot.
    With that I would prefer to "park" that on a feature branch, but at least I don't plan to spend my time in
    maintaining that.

  • provide a solution with good test coverage? I'm not sure, how much time that takes.
    Also here it depends on the required changes. Less changes should be not that issue to go into main.
    But the more "untested and not consolidated" changes and "work-arounds" it takes, the more I would
    prefer a feature branch. But again, I don't plan to spend my time in maintaining that.

@sbernard31
Copy link
Contributor Author

sbernard31 commented Dec 19, 2022

Interesting 🤔

I don't see the situation like a choice between those 2 ways.

I mean even if the idea is to start to provide a demonstration (a kind of minimal testable feature).
Doing this only makes sense if this keep the door open to the possibility to make it evolves to something more mature.

And I also think that doing this could (should?) be an opportunity to maybe improve current californium design.
But I guess we didn't agree on this part too as you said :

I think, that no one will really spend the weeks. It may start but then stick in the "pain" of testing. The current design isn't that bad.


With that I would prefer to "park" that on a feature branch, but at least I don't plan to spend my time in
maintaining that.

With this, I will never be able to integrate it in Leshan. So this doesn't really fit my needs.

Personally, for Leshan, I never expect that contributors provide a turnkey solution perfectly tested. I always expect to have to polish the contributed code. Often this is opportunity to re-design Leshan or to work on something I put aside by the past.

Finally I understand that :

  • you don't see too much benefits in coap+tcp
  • you don't see any design improvement opportunity linked directly or indirectly to coap+tcp

and so this leads you to not want to take too much time on this or take risks to have regression because of this. I can understand that.

The past also showed that we have difficulty to collaborate together and we also often have different design preferences, so this makes more difficult the idea to work on this successfully.

So maybe Californium is just not the right place to put coap+tcp forward 🤔 ?

With that in mind, should we just remove coap+tcp from californium ?
If not, I guess we should at least add a very simple CMS + ping pong message support ? (else I don't get the purpose of keep coap+tcp in that state in cf)
And if we decide to add this, Is the current way in cms branch too impacting OR could be OK to be integrated in master ?

@rogierc
Copy link
Contributor

rogierc commented Feb 17, 2023

So maybe Californium is just not the right place to put coap+tcp forward ?

I hope this does not appear true. Although coap+tcp maybe isn't of the greatest importance now, it could enlarge the number of use-cases significantly in the future IMPOV. It has some unique selling points.

@sbernard31
Copy link
Contributor Author

sbernard31 commented Feb 17, 2023

@rogierc, just by curiosity could you elaborate about :

It has some unique selling points.

(This could help me to complete https://github.com/eclipse/leshan/wiki/CoAP-over-TCP, see also this discussion : eclipse-leshan/leshan#1343)

On my side,

  • I never succeed to have something which could looks like "yes the cms is a good way to explore" from the only active committer. (or any other feedback that make me know which direction I should take)
  • And also looking at committer's web site, make me think that pushing forward TCP is not really in its priority.
  • And also public and private discussion about coap+tcp in Californium make me think that collaboration will be very difficult.

So all of this, totally discouraging me to invest more effort on this.

Currently, I'm playing with java-coap(fork), we'll see if this could be a challenger for Californium. At least collaboration/communication with its maintainer seems to be simple for me.

@rogierc
Copy link
Contributor

rogierc commented Feb 17, 2023

@rogierc, just by curiosity could you elaborate about :

It has some unique selling points.

It has the unique selling points of CoAP in general of course, on top of that:

  1. It can be used where (corporate) network regions have unsufficient UDP support.
  2. It eliminates the MID (message id) and the resulting message rate limit.

These features become more important when coap is used at larger (corporate) scale and IoT devices are integrated in a layered IoT service architecture where IoT is seamlessly connected with the API-layers of cloud and datacenters. coap(s)+tcp/ws can be a valuable asset in bridging the IoT and API layers. In some of these layers the information flow tends to get much more dense (information from many devices is gathered and/or distributed). CoAP is useless here when there is a risk of bumping into the message rate limit.

For Smart Home, Smart City applications this is maybe not a big issue. But in domains like IIoT there probably are many use cases that benefit from coap(s)+tcp/ws. It is expected that there will be a Fourth Industrial Revolution by using IoT technology in Industrial Automation. I would like to think Californium is part of that.

And also looking at committer's web site, make me think that pushing forward TCP is not really in its priority.

Isn't this more a question of available resources? When a feature is added it also needs to be maintained. With a small number of committers this always is a challenge. Seems to me a lot of effort is put into the security features. That security gets priority seems right to me. Although I value coap+tcp - security is vital to IoT. Hope we can get both forward with joint efforts.

@boaks
Copy link
Contributor

boaks commented Feb 17, 2023

@sbernard31
I never claimed to have too much interest in coap+tcp. For that I've spent already quite a lot of time in it.
I changed the job last year, and so in the meantime I'm only able to spend limited time in Californium. I've tried to get some feedback from other committers about the csm branch, but with your private e-mail answer about that request, I'm somehow mixed up and the other committers never answered. You seems to be fixed on my person ("only active committer"), but with that, I don't see the possibility to continue with this work.
So, if "java-coap" works for you, then that's a nice way to go.

@rogierc
Did you test the csm branch? My main points are listed above in my comment from the 19. Dec.. I will not be able (nor willing) to spend too much time in testing that. So, if we merge it into main and the next progress with that will then be in a couple of months or so, everyone has to live with the result. For the 3.8.0 release, the tcp-bug together with the general observe client bug took about 5 days debugging. That's not possible to spend that much time in the future, that would then be something as 3 weekends, which then easily takes a 1 month in calendar time.

@sbernard31
Copy link
Contributor Author

@rogierc thx for this feedback.

It eliminates the MID (message id) and the resulting message rate limit.

That's a good point. First time I heard about it, I will add it to my wiki page.

Isn't this more a question of available resources? When a feature is added it also needs to be maintained. With a small number of committers this always is a challenge.

Yep, so for me solutions could be :

  • bring effort / invest time to embedded new committer in the project and supervise their works.
  • OR accept that the feature is out of scope because this is a 1 man project.

For now, I see Californium as a 1 man project and if the way it is manage doesn't change I'm not sure the project will really attract new active committer.

@rogierc
Copy link
Contributor

rogierc commented Feb 17, 2023

Did you test the csm branch?

Didn't as yet. Would like to, but time is scarce as - no doubt - it is for others.

That's not possible to spend that much time in the future, that would then be something as 3 weekends, which then easily takes a 1 month in calendar time.

In open source projects efforts are welcomed, not claimed. No comments from my side. When it takes longer to complete a feature we have to accept that, I think. Another question is that coap+tcp has - in principle - a place in Californium. I don't think we need to doubt that, apart from the time we can spend, it has, do we?

bring effort / invest time to embedded new committer in the project and supervise their works.

FMPOV Californium is worth to attract contributers.

@boaks
Copy link
Contributor

boaks commented Feb 17, 2023

When it takes longer to complete a feature we have to accept that, I think.

The point is, if it should be "pending" on "main", maybe with consequences, or in a feature branch or try to separate the work in an other module (if possible). That's what I mainly wrote above in my comment from the 19. Dec..

Another question is that coap+tcp has - in principle - a place in Californium.

Otherwise I would have removed it. But it has also obvious no place in my calendar.
In the end you will see, that a "professional coap+tcp" implementation will take a lot of time. I haven't seen someone in the past years, who was willing to spend that. So no matter, if it has a place in Californium or not, for years there was no interest in implementing it, just in "having it implemented".

@sbernard31
Copy link
Contributor Author

FMPOV Californium is worth to attract contributers.

Yep but involving real active committer is time consuming.
And managing a 1 man project and managing a team project is not the same. Different skills are needed.

@rogierc
Copy link
Contributor

rogierc commented Feb 18, 2023

It has the unique selling points of CoAP in general of course, on top of that:

forgot to mention:

  1. BERT extends the block-wise transfer protocol to enable the use of
    larger messages over a reliable transport.

The point is, if it should be "pending" on "main", maybe with consequences, or in a feature branch or try to separate the work in an other module (if possible).

From the side of Mule CoAP Connector there is no urgency. There is enough work to do on UDP/DTLS side there. Tcp importance is for future usage. Either way "main" or "feature branch" will do there.

For Leshan this seems different. There is a wish to merge with main on more shorter term. What is needed to be able to do that?

  • Testing tcp can be minimal as long as it´s considered experimental?
  • Assure enough testing so there is no regression on coap(s)+udp?
  • Maintenance on tcp feature to keep pace with future releases?
  • ?

If the goal is to merge soon, one should be carefull with design changes that have impact on udp, I think.

@boaks
Copy link
Contributor

boaks commented Feb 19, 2023

About the benefits:

  1. It can be used where (corporate) network regions have unsufficient UDP support.

If "insufficient UDP support" turns out to be "HTTP only", also that approach fails. I personally switched in such cases to cellular IoT. That removes all risks for those networks as long as the modems are only feeded via "data diodes".

  1. It eliminates the MID (message id) and the resulting message rate limit.

That's more an topic of RFC7252 4.8. Transmission Parameters.

A system either "learns" and optimize these parameters or use an assumption to define matching defaults.
For RFC7252 these assumptions are mainly based on "constraint networks" and sporadic traffic, which make learning inefficient, because the ip-route may change too often.

As for now, you may therefore use either other parameters, or someone needs to spend time in CongestionControl to provide a stable solution using larger N-START values.
If you use other parameters and a larger N-START value, then coap-over-udp speeds up as well.
On the server-side we implemented already the SweepPerPeerDeduplicator, which is closer to a "TCP message window". That means, Californium considers only a configured number of last messages, and relaxes from constraints caused by the EXCHANGE_LIFETIME.

  1. BERT extends the block-wise transfer protocol to enable the use of larger messages over a reliable transport.

See BERT for non-TCP/WS applications. If your network layer supports larger datagrams, you may use BERT also for UDP.

@boaks
Copy link
Contributor

boaks commented Feb 19, 2023

one should be carefull with design changes that have impact on udp

Therefore I tried to get the opinion of two other committers about the ratio between changes, risks and benefits.
But that have been canceled. From my side, I don't want to spend more time in it. It's not only Simon, who considers the collaboration to be very difficult, I do that as well. If asking either users (votes) or other committers is denied, I don't see, how we can move forward. We stick at the same question how to proceed as two years ago.

So, lets keep the tcp work in that feature branch and see, what happens.

@sbernard31
Copy link
Contributor Author

sbernard31 commented Feb 20, 2023

Therefore I tried to get the opinion of two other committers about the ratio between changes, risks and benefits.

There is no other active committers... they didn't work on this project for years...
By the way I'm not sure that both of them understood that they should evaluate : "the ratio between changes, risks and benefits." ?
At least I wasn't understood that...
(@jvermillard, @sophokles73)

But that have been canceled.

At least not by me.
I let the door open about it while trying to be clear about my current motivation, my future involvement in this and also the reason about it. (explained above)

If asking either users (votes) or other committers is denied

I said that : user votes is bad idea to decide about design (and If I remember well @sophokles73 agreed with that)
I never said that I was against committer vote ...

I'm not even sure where vote should be involved in this case ? 🤔
I explored about missing CSM feature in californium.
I explain my work in details from the beginning.
I proposed a design, I was clear this is just a draft.
I asked about confirmation on the direction chosen before to polish and doing a real PR.
I never get answer.
I tried to understand why and started a more general (private) discussion to know your general position about coap+tcp in Californum.
I didn't succeed to get a clear answer.
I vaguely understand "OK to have coap+tcp in californium but I don't want to spend time on it and I don't want to be impacted by it"
FMPOV, this is just unrealistic ... the state of the project doesn't allow that ... Imagining that someone will provide a PR which perfectly work without affecting anything is at best naive...

So I came to the conclusion that Achim don't want this feature but are not comfortable with clearly saying it.

Anyway, I don't feel that my contribution is welcomed. The more I read Achim's answer the more I'm demotivated. (and this even begin to affect my personal life again) So @boaks please let me out of this.

@rogierc Good luck If you decide to work on this. I hope you will be better than me, feel free to reuse my work if you think it's useful 🙂. Of course if you have better idea feel free to drop it 😉. If one day you have a minimal viable feature for coap+tcp in Californium, I could try it with Leshan.

@boaks
Copy link
Contributor

boaks commented Feb 25, 2023

Looks more like the "always repeating story of misunderstandings and misinterpretations" in the communication between Simon and me. It makes me feel, that even the time invested in that communication is in vain, regardless of my limitation in English or not.
Anyway, maybe @sophokles73 or @jvermillard could give a short statement, what's your understanding of the requested "feedback" was and why you finally decide to not spent time in it.

@jvermillard
Copy link
Contributor

I don't understand why we are asking for a "statement" from me; I made some contributions in the past as a committer. I did lose interest in developing features for Californium for various reasons.
For this case, I trust Simon to produce a quality PR, engage with the community, listen for feedback and help fix any side effects; if he needs my help, he knows how to reach me.

CoAP TCP looks useful for some use cases; Simon is willing to work on it, so let's merge it in the dev version and wait for community feedback. Maybe I'm oversimplifying, but it's how a community-driven open-source project works.

@boaks
Copy link
Contributor

boaks commented Feb 27, 2023

Thanks for your feedback!

@sbernard31
Copy link
Contributor Author

Just to avoid any ambiguity :

Simon is willing to work on it

Simon was willing to work on it

@boaks
Copy link
Contributor

boaks commented Feb 27, 2023

Also thanks for that clarification.

@boaks
Copy link
Contributor

boaks commented Feb 10, 2024

In the case that someone wants to continue this work, feel free to comment or open an new issue.

@boaks boaks closed this as completed Feb 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants