
Timestamps in seconds or milliseconds ? #1304

Closed
JaroslawLegierski opened this issue Sep 1, 2022 · 31 comments
Labels
question Any question about leshan

Comments

@JaroslawLegierski
Contributor

We got the following feedback on the timestamps topic, which needs some clarification. Timestamps in the Leshan client's ManualDataSender class https://github.com/eclipse/leshan/blob/master/leshan-client-core/src/main/java/org/eclipse/leshan/client/send/ManualDataSender.java in collectData are currently calculated in milliseconds (ManualDataSender line 47). Maybe they are supposed to be calculated in seconds (as in SenML)? The problem is wider: should the timestamps in TimestampedLwM2mNode and TimestampedLwM2mNodes be in milliseconds or in seconds only?
Currently only SenML manages timestamped values (in seconds), but maybe in a future data format they could be in ms. What is your opinion on this topic?
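For illustration, the unit mismatch is just a factor of 1000 at capture time; here is a minimal sketch (not the actual ManualDataSender code) of the two options being discussed:

```java
public class TimestampUnits {
    public static void main(String[] args) {
        // Milliseconds since the Unix epoch (what the client code currently produces)
        long timestampMillis = System.currentTimeMillis();
        // Seconds since the Unix epoch (what SenML timestamps use by default)
        long timestampSeconds = timestampMillis / 1000;
        System.out.println(timestampMillis + " ms = " + timestampSeconds + " s");
    }
}
```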

@sbernard31 sbernard31 added the question Any question about leshan label Sep 1, 2022
@sbernard31
Contributor

sbernard31 commented Sep 1, 2022

Lots of good questions! And you probably found something which smells not so good...

Actually, SenML is not the only one to support timestamped values; the JSON content format (application/vnd.oma.lwm2m+json, 11543) supports it too (OK, this is also a kind of SenML format, but not the official one).

I quickly looked at the LWM2M specification and I understand that the timestamp for JSON is also in seconds.

So the conclusion is the same, because all current formats use seconds (and not milliseconds).

Currently only SenML manages timestamped values (in seconds), but maybe in a future data format they could be in ms. What is your opinion on this topic?

I was thinking this too, but reading more deeply the SenML RFC and the LWM2M spec about the JSON format, I finally understood that times are in seconds but could be floating point numbers, so you could have milli or even microseconds (or maybe even more)...

The problem is wider: should the timestamps in TimestampedLwM2mNode and TimestampedLwM2mNodes be in milliseconds or in seconds only?

Reading most of the encoders/decoders, I understand the current code is using/expecting seconds.

But what it should be is less clear... especially with what I found above. ☝️

So my opinion about this:
Short term: it seems that almost all the code is designed with timestamps in seconds in mind, so we should fix code which does not respect this and document most of the timestamp-related code (we need to make explicit that seconds are expected).

Mid/long term :

  • we need to discuss if we want more precision than seconds
  • if yes, we need to define which kind of precision
  • then how we encode it (long, double, BigInteger, Date, I don't know? I guess this will not be so easy)
  • then we need to adapt the code / documentation.

What do you think ?

@JaroslawLegierski
Contributor Author

I prepared a very small commit regarding the timestamp calculation in ManualDataSender: JaroslawLegierski@5eb946b. After this modification, timestamps in ManualDataSender should be consistent with the other timestamps in the Leshan code.

When it comes to more precise timestamps, they are needed in many IoT use cases and are very important from the point of view of device vendors; of course, we have to define the requirements as a first step.

@sbernard31
Contributor

I prepared a very small commit regarding the timestamp calculation in ManualDataSender: JaroslawLegierski@5eb946b. After this modification, timestamps in ManualDataSender should be consistent with the other timestamps in the Leshan code.

Do you plan to provide a PR ?

When it comes to more precise timestamps, they are needed in many IoT use cases - and very important from the point of view of device vendors, of course we have to define the requirements in the first step.

Yep, so let's discuss that. I think we can use this issue.

Generally, timestamps are stored in a long.

But do we need seconds, milli, micro or nanoseconds?
If we use a long, this means that:

  • seconds go up to : very very very much :)
  • milli go up to : 17 Aug 292278994 07:12:55 GMT
  • micro go up to : 10 Jan 294247 04:00:54 GMT
  • nano go up to : 11 Apr 2262 23:47:16 GMT

To get this result I used:

System.out.println(new Date(Long.MAX_VALUE).toGMTString()); // milliseconds
System.out.println(new Date(Long.MAX_VALUE / 1_000).toGMTString()); // microseconds
System.out.println(new Date(Long.MAX_VALUE / 1_000_000).toGMTString()); // nanoseconds

(Please, double check if I'm not wrong)

So if we target milli or micro, it's largely OK.
For nano, I guess this is more debatable, but being able to have timestamps for the next 240 years is maybe enough 🙂

What do you think ?

@adamsero
Contributor

adamsero commented Sep 6, 2022

Hi, I'm going to chime in and speak on behalf of myself and @JaroslawLegierski 🙂. We decided to use Double as the data type for timestamps and in terms of time units it would be seconds with millisecond precision. There are several reasons behind that decision:

  1. In Java, when one wants to obtain epoch time, it makes the most sense to use milliseconds; System.nanoTime() doesn't provide an absolute time value and is only used to calculate relative time. Besides, not every device will have an internal clock accurate enough to measure nanoseconds. We're not going to use microseconds for the same reason, since to obtain them we'd have to use System.nanoTime() anyway.
  2. Since multiple formats and standards use seconds already (as you mentioned earlier), it makes sense to convert the milliseconds returned from System.currentTimeMillis() into seconds. We should, however, use a floating point number to represent the number of seconds, as we can preserve the three decimal places in case there's a need for millisecond precision.
  3. Which brings us to the final point about using Double as the data type. It provides greater accuracy than Float - tests have shown that we lose the decimal places when converting the current time from milliseconds to seconds using Float, but using Double allows us to keep the precision (as shown in the example below). Additionally, Double's range is very large, so we never have to worry about running out of bits for storage.
Milliseconds: 1662474312644
Seconds (Float): 1.66247424E9 <-- we lose precision, specifically the number of milliseconds (644)
Seconds (Double): 1.662474312644E9 <-- every digit is preserved
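The Float vs Double behavior above can be verified with a short snippet (using the same example timestamp; the round-trip check avoids depending on exact printed representations):

```java
public class PrecisionCheck {
    public static void main(String[] args) {
        long millis = 1662474312644L;
        double asDouble = millis / 1000d; // ~15-17 significant digits: keeps the millis
        float asFloat = millis / 1000f;   // ~7 significant digits: millis are lost

        // The double round-trips back to the exact millisecond count; the float does not.
        System.out.println(Math.round(asDouble * 1000d) == millis);          // true
        System.out.println(Math.round((double) asFloat * 1000d) == millis); // false
    }
}
```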

In summary, this is roughly how measuring the current time would look:

long timestampMillis = System.currentTimeMillis();
double timestampSeconds = timestampMillis / 1000d;

Does this sound OK to you?

@sbernard31
Contributor

sbernard31 commented Sep 6, 2022

Using floating point numbers (float or double) generally comes with headache-inducing precision issues, and I prefer to avoid them unless I have no choice.

@sbernard31
Contributor

Here is some code which can be used to play with floating point numbers and see the kind of precision issues you can face:

    // requires: import java.math.BigDecimal; import java.util.GregorianCalendar;
    public static void main(String[] args) {
        // inputs
        // ------
        // Initial date, converted to nanoseconds since the epoch
        long timeInNano = (new GregorianCalendar(2022, 9, 7, 10, 6, 31).getTimeInMillis() + 951) * 1_000_000L;
        // increment in nanoseconds
        long incrementInNano = 1L;

        int difference = 0;
        for (int i = 0; i < 1000; i++) {
            double timeInSeconds = timeInNano / 1_000_000_000d;

            BigDecimal nano = new BigDecimal(timeInNano);
            BigDecimal sec = new BigDecimal(timeInSeconds);
            BigDecimal secFromString = new BigDecimal(Double.toString(timeInSeconds));
            BigDecimal secFromNano = nano.divide(new BigDecimal(1_000_000_000));

            System.out.println("=================================================");
            System.out.println("Time in nano (long)                            : " + timeInNano);
            System.out.println("Time in sec (double) (nano / 1_000_000_000d)   : " + timeInSeconds);
            System.out.println("Time in nano (Big decimal from long)           : " + nano);
            System.out.println("Time in sec  (Big decimal) from nano BigDecimal: " + secFromNano);
            System.out.println("Time in sec  (Big decimal) from sec (double)   : " + sec);
            System.out.println("Time in sec  (Big decimal) from sec (string)   : " + secFromString);

            if (sec.compareTo(secFromNano) != 0) {
                difference++;
            }
            timeInNano = timeInNano + incrementInNano;
        }
        System.out.println("=================================================");
        System.out.println("Number of differences: " + difference);
    }

@JaroslawLegierski
Contributor Author

Do you plan to provide a PR ?

Yes, of course. PR created: #1306

@adamsero
Contributor

adamsero commented Sep 8, 2022

@sbernard31
While I agree about the occasional problems with floating point numbers, I think in this case we'll be fine. Your example shows some differences that could occur when using nanoseconds, but as I said earlier, they're very unlikely to be used as an absolute timestamp in Java. When using milliseconds, however, the earlier issues seem to be nonexistent (the only difference in your example would come from using new BigDecimal(Double.toString())), so I think we'd be fine with that option.

Additionally, the SenML specification tells us which fields should use which data types, specifically that Base Time and Time should use Double. There are several examples of this across the document; e.g. section 5.1.6 provides a sample payload with the Base Time field as a floating point number (1.320078429e+09).

Ultimately, using floating point numbers would allow us to:

  • Settle on one timestamp format (epoch seconds)
  • Maintain millisecond precision
  • Be compliant with RFC specification

@sbernard31
Contributor

sbernard31 commented Sep 8, 2022

I think in this case we'll be fine. Your example shows some differences that could occur when using nanoseconds, but as I said earlier, they're very unlikely to be used as an absolute timestamp in Java.

I'm not sure I get why you would face this kind of limitation in Java.
See: Instant.now()
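For reference, Instant exposes both the epoch second and a nano-of-second adjustment, so nanosecond-precision absolute timestamps are representable in Java (the actual clock resolution depends on the platform):

```java
import java.time.Instant;

public class InstantDemo {
    public static void main(String[] args) {
        Instant now = Instant.now();
        long epochSeconds = now.getEpochSecond(); // seconds since 1970-01-01T00:00:00Z
        int nanoOfSecond = now.getNano();         // 0..999_999_999
        System.out.println(epochSeconds + "." + nanoOfSecond);
    }
}
```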

But anyway, this is a communication protocol: one peer could use Java, the other peer could use something else.

When using milliseconds, however, the earlier issues seem to be nonexistent (the only difference in your example would come from using new BigDecimal(Double.toString())), so I think we'd be fine with that option.

Maybe it's OK for milliseconds (I'm not even sure), but why would milliseconds be the right target?
Any issue with supporting nano as the timestamp abstraction?

Additionally, SenML specification tells us which fields should use which datatypes, specifically that Base Time and Time should use Double.

Not exactly: this is double for senml+xml; for senml+cbor this could be any CBOR number encoding; and for senml+json this is a JSON number. A JSON number is encoded as text and so can be any number (no real limitation), so probably a kind of BigDecimal in Java.

Anyway, I think you are mixing up 2 APIs.

  • The SenMLRecord API should probably use a floating point number to follow the RFC (probably BigDecimal).
  • The TimestampedLwM2mNode could use something different if wanted (long, BigInteger, Instant). This is an abstraction for all formats.
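To illustrate the separation of the two APIs, here is a purely hypothetical sketch (the class and field names are made up for illustration, not Leshan's real definitions): the transport-level record keeps the RFC's fractional-seconds representation, while the abstraction keeps a format-neutral type.

```java
import java.math.BigDecimal;
import java.time.Instant;

// Hypothetical sketch only: not Leshan's actual classes.
class SenMLRecordSketch {
    // SenML "t"/"bt": fractional seconds, kept as BigDecimal to follow the RFC
    BigDecimal time;
}

class TimestampedNodeSketch {
    // Format-neutral abstraction: could be long, BigInteger or Instant
    Instant timestamp;
}

public class ApiSeparationDemo {
    public static void main(String[] args) {
        SenMLRecordSketch record = new SenMLRecordSketch();
        record.time = new BigDecimal("1663159731.123456789");

        // Converting the SenML fractional seconds into the abstraction's Instant
        TimestampedNodeSketch node = new TimestampedNodeSketch();
        long seconds = record.time.longValue();
        int nanos = record.time.remainder(BigDecimal.ONE).movePointRight(9).intValue();
        node.timestamp = Instant.ofEpochSecond(seconds, nanos);

        // Prints the instant with full nanosecond precision preserved
        System.out.println(node.timestamp);
    }
}
```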

As @JaroslawLegierski said, maybe one day we will see another format with a more precise timestamp, or one which uses integers instead of floating point numbers.

To come back to the initial question:
Which kind of precision?
My opinion: we should use nano unless there is a technical issue with it, then micro unless there is a technical issue with it... etc., etc.

How do we encode it?
If long is enough, I think this is the best. (I don't know if going up to 11 Apr 2262 23:47:16 GMT is OK.)
If we need to encode more, we can consider BigInteger or Instant.
About floating point as the timestamp abstraction for TimestampedNode: I feel there are some risks we could easily avoid with other formats. So unless the formats above raise some issues, I prefer to avoid it.

@adamsero
Contributor

  • The SenMLRecord API should probably use a floating point number to follow the RFC (probably BigDecimal).
  • The TimestampedLwM2mNode could use something different if wanted (long, BigInteger, Instant). This is an abstraction for all formats.

That's a good idea; since, according to the specification, timestamp values in SenML records are in seconds, we should convert them to an integer number. I think we should use long in TimestampedLwM2mNodes with a millisecond format, since most devices will provide that precision at best anyway, and we won't have to worry about the maximum date range.
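If the abstraction went with long milliseconds, the SenML-seconds-to-millis conversion could be done losslessly through BigDecimal (a sketch, assuming the decoder already produced a BigDecimal):

```java
import java.math.BigDecimal;

public class SecondsToMillis {
    public static void main(String[] args) {
        // Fractional seconds as decoded from a SenML record
        BigDecimal senmlSeconds = new BigDecimal("1663159731.123");
        // Shift the decimal point instead of multiplying by a double:
        // no floating point rounding is involved.
        long millis = senmlSeconds.movePointRight(3).longValueExact();
        System.out.println(millis); // 1663159731123
    }
}
```

Note that longValueExact() throws ArithmeticException if the value carries more decimal places than the shift removes (e.g. micro or nanosecond fractions), which makes any silent precision loss explicit.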

@sbernard31
Contributor

sbernard31 commented Sep 13, 2022

Reading your comments, I understand that using a millisecond format for timestamps as an integer number seems important to you.

But I can't understand why 🤔
What would be the drawback of targeting nano or microseconds?

As I said above, eventually I could understand that nano could be an issue, as we can encode timestamps "just" up to 2262 (and even then I'm not sure this is a real issue; I'm not sure that Leshan will still be used in 2262 😅...)

But for micro, I don't see any drawback, so why use milliseconds and deal with an abstraction format (TimestampedLwM2mNode) less precise than the transport format (SenML)?

@sbernard31
Contributor

(Note that maybe the Instant class is a not-so-bad candidate?)

@adamsero
Contributor

After some internal discussion, I think we can settle on Instant; it looks like the best option. I'll prepare a PoC soon.

@adamsero
Contributor

I changed the SenML Record API and the new class for timestamped nodes (TimestampedLwM2mNodes). I didn't really touch the old class (TimestampedLwM2mNode) yet, except for a few things to get it to compile. The changes are on this branch: opl/new_timestamp_format

Everything's working fine except for the fact that we lose the last two decimal places when sending a timestamp with nanosecond precision using JSON. It has to do with the fact that Jackson saves this value to a double, so a timestamp encoded like this:
1663159731.123456789
ends up being truncated to this:
1663159731.1234567
It's fine with CBOR, since I was able to retrieve the value as a String and then convert it to BigDecimal.
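The truncation above is inherent to going through double; parsing the same textual value into BigDecimal keeps every digit, which is why the String-then-BigDecimal path works (a minimal illustration, independent of Jackson):

```java
import java.math.BigDecimal;

public class JsonNumberPrecision {
    public static void main(String[] args) {
        String encoded = "1663159731.123456789";

        // Via double: digits beyond binary64 precision are lost
        double viaDouble = Double.parseDouble(encoded);
        System.out.println(viaDouble);

        // Via BigDecimal: all 9 decimal places are preserved
        BigDecimal viaBigDecimal = new BigDecimal(encoded);
        System.out.println(viaBigDecimal.toPlainString()); // 1663159731.123456789
    }
}
```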

@sbernard31
Contributor

Everything's working fine except for the fact that we lose the last two decimal places when sending a timestamp with nanosecond precision using JSON.

Strange, I'm pretty sure that Jackson should be able to support BigDecimal 🤔

@adamsero
Contributor

You can indeed encode a BigDecimal into a JSON node (the ObjectNode class accepts BigDecimal as input); however, that's not the case when decoding, at least by default. You can, however, do a little trick with an ObjectMapper instance:
mapper.enable(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS), though that would mean decoding to BigDecimal for every floating point value, and I'm not sure if we want that.

@adamsero
Contributor

By the way, do you know roughly when M9 could come out, and will it include these changes? And, more broadly, when could a stable release of Leshan 2.x potentially happen?

@sbernard31
Contributor

By the way, do you know roughly when M9 could come out, and will it include these changes?

My plan has not changed since I explained it at #1222.

I'm still focused on the transport layer (#1025).
These are huge changes (probably the biggest since I started working on Leshan).
(See my first tries: #1220, #1312.)
Keep in mind this will take time (a lot of time) before I get a clean version of it. (Help is welcome.)

So I don't plan to release M9 soon.
And as explained, I don't plan to integrate new features into master soon either.

But if an M9 with just bug fixes and/or small features is needed, we can consider it on demand.

If you need a lot of changes in the next months, maybe you should consider using an internal fork while waiting for #1025 to be ready.

Sorry for the annoyance, but the situation is pretty exceptional.

also more broadly, when a stable release for Leshan v2.x could potentially be released?

Clearly I'm not able to say, but my bet is this will not happen soon.

@sbernard31
Contributor

About the Jackson / BigDecimal issue, I think this can be done in a second step.

I mean, the most important things are:

  • define a new API which will allow us to support nano (Instant + BigDecimal)
  • have a SenML decoder which supports your needs (if I understand correctly, good precision down to milliseconds)

So we can start with a PR limited to this.

@adamsero
Contributor

I finished refactoring TimestampedLwM2mNode on the opl/new_timestamp_format branch; unfortunately there's a lot of changed code, since I had to modify all the classes associated with encoders/decoders and all the tests. Do you want me to clean it up, format it and create a PR, or do you want to take a look beforehand?

@sbernard31
Contributor

I just took a quick look and I think there is no major point to discuss, so it's better to discuss it in a PR.

@sbernard31
Contributor

Oops, we just detected with #1313 that Instant is not available for Android API level 19, which is the current API level targeted by Leshan.

Instant was added in API level 26: https://developer.android.com/reference/java/time/Instant
Looking at https://apilevels.com/, if we jump to API level 26, we lose ~8-9% of devices which could use Leshan.

So the question is :

  1. is it worth losing compatibility with those devices to be able to use Instant?
  2. should we fall back to long?
  3. or try https://www.threeten.org/threetenbp/

I don't much like the idea of adding a new dependency just for that, so at first sight I would not pick 3. 🤔
Between 1 and 2, it's hard to choose. I think we wouldn't lose so many devices, but in the meantime I feel that long could also easily fit the needs.
Increasing the supported API level is a hard decision because it's very hard to go back if someone complains about it.

Note that if we go with Instant, we should probably refactor the code to remove usage of Date and replace it with Instant for the 2.0.0 release. (See https://docs.oracle.com/javase/tutorial/datetime/iso/legacy.html)
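If that Date-to-Instant migration happens, the conversion in both directions is built into the JDK (note that Date only carries millisecond precision, so anything below milliseconds is lost going back):

```java
import java.time.Instant;
import java.util.Date;

public class DateMigration {
    public static void main(String[] args) {
        Date legacyDate = new Date();

        // Date -> Instant (exact: Date holds milliseconds)
        Instant instant = legacyDate.toInstant();

        // Instant -> Date (truncates anything below milliseconds)
        Date roundTrip = Date.from(instant);

        System.out.println(roundTrip.equals(legacyDate)); // true
    }
}
```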

@adamsero
Contributor

@sbernard31 By the way, you said that

If an M9 with just bug fixes and/or small feature is needed, we can consider it on demand.

so I was wondering if it would be possible to have a smaller M9 after integrating #1313. Also, we'd like to contribute to the refactoring of the transport layer, since you said

Keep in mind this will take times (lot of times) before I get a clean version of it. (help is welcome)

If you want our help, you can elaborate on what exactly we could do.

@adamsero
Contributor

Regarding the API conflict:
Bumping the API level to 26 wouldn't be such a bad idea in my opinion, for the following reasons:

  • We could benefit from using the Instant class, since it can be converted to a variety of different formats and has utilities that could prove useful (like isBefore(), isAfter(), parse(), etc.). We also never have to wonder what format it's stored in, unlike when using long, where we have to know if it's milliseconds, microseconds, etc.
  • Limiting ourselves to API level 19 effectively restricts us from using Java 8 and all the neat syntax it provides (streams, lambda expressions, the date and time API and many more).

Ultimately I have no knowledge about how many devices could be using the old API (it's from 9 years ago), but if it's not a big deal, please consider upgrading it. Otherwise we'd probably have to fall back to using long with some helper utility methods to keep track of formats and their conversions.

@sbernard31
Contributor

Bumping the API level to 26 wouldn't be such a bad idea in my opinion

I think we will probably go that way. (I tried to get some feedback internally, and it also goes in that direction.)

so I was wondering if it would be possible to have a smaller M9 after integrating #1313.

Yep, I think this is possible.
(I know the next sentence is maybe a bit dumb.) Maybe you should double-check whether there are other bugs/small features needed which could go into that M9?

In case you want our help, you can elaborate on what exactly we could do.

For questions about what to work on, do not hesitate to use #1049.
For the transport layer in particular, we can use #1222.

@sbernard31
Contributor

I'm taking time to be able to release an M9 soon.

@sbernard31
Contributor

Here I try to clarify my thinking about timestamped values in LWM2M.

AFAIK, there is nothing in LWM2M which defines the precision of the timestamp.
We decided to use the Instant class to be able to support precision down to nanoseconds; this way we can abstract any current and future Content Format.

Currently there are 3 content formats which support timestamped values:

For Old JSON and SenML JSON, timestamp values (bt, t) are encoded as Numbers.
As nothing more is specified, we looked at the latest JSON RFC, RFC 8259.
It says that any number can be represented in JSON format, but for interoperability it is recommended to limit the range to IEEE 754 binary64 (double precision) numbers:

Since software that implements
IEEE 754 binary64 (double precision) numbers [IEEE754] is generally
available and widely used, good interoperability can be achieved by
implementations that expect no more precision or range than these
provide, in the sense that implementations will approximate JSON
numbers within the expected precision.

For SenML CBOR, it is pretty much the same thing:

For JSON Numbers, the CBOR representation can use integers,
floating-point numbers, or decimal fractions (CBOR Tag 4);
however, a representation SHOULD be chosen such that when the CBOR
value is converted to an IEEE double-precision, floating-point
value, it has exactly the same value as the original JSON Number
converted to that form. For the version number, only an unsigned
integer is allowed.

So nothing forbids using bigger precision than an IEEE 754 binary64 (double precision) number, but it is recommended to limit to it.

I guess in an ideal world, we should:

  • by default, ensure that all numbers received/sent with these formats fit a Java Double, and raise an exception if not.
  • have a way to configure encoders/decoders for these formats to be able to use maximum precision
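The "raise an exception if not" check from the first bullet could be sketched like this (a hypothetical helper, not existing Leshan code): a value fits binary64 exactly when a round-trip through double reproduces it.

```java
import java.math.BigDecimal;

public class Binary64Check {
    // Hypothetical validation helper: true if the value survives
    // a round-trip through IEEE 754 binary64 (double precision).
    static boolean fitsInDouble(BigDecimal value) {
        return new BigDecimal(Double.toString(value.doubleValue()))
                .compareTo(value) == 0;
    }

    public static void main(String[] args) {
        System.out.println(fitsInDouble(new BigDecimal("1663159731.5")));         // true
        System.out.println(fitsInDouble(new BigDecimal("1663159731.123456789"))); // false
    }
}
```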

I'm not saying we need to implement this now, but let's try to get the big picture.
@adamsero, what do you think about that?

@adamsero
Contributor

Sure, if that means better interoperability, I'm all for it. But right now I'd like to focus on getting the current changes onto master and releasing M9. By the way, I should be done with the adjustments after your review by the end of the day.

@sbernard31
Contributor

#1313 is now integrated in master and will be included in the next 2.0.0-M9 version (#1319).

I think we can close this one ?

@JaroslawLegierski
Contributor Author

Yes of course. Thank You very much.

@sbernard31
Contributor

Thanks to both of you for moving this forward 🙏!
