Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decimal literal number #1752

Merged
merged 5 commits into from
Oct 22, 2019
Merged

Conversation

leonid-s-usov
Copy link
Contributor

@leonid-s-usov leonid-s-usov commented Oct 27, 2018

Hi all

@pkoppstein @wtlangford @nicowilliams, let me present you the descendant of #1743, and a proud fixer of #1741 and such, the decimal number literal implementation.

Based on the other PR it features full PASS under valgrind.

Unlike my suggestion at #1741 I haven't implemented the simple math (+ and -) to be run by the decimal number logic, although it's trivial to achieve currently.
Comparisons, however, are fully decimal, so please feel free to try

./jq -n '100000000000000000000000000000002 < 100000000000000000000000000000003'

and vice versa

@coveralls
Copy link

coveralls commented Oct 27, 2018

Coverage Status

Coverage decreased (-0.6%) to 84.482% when pulling 45c8bbe on leonid-s-usov:dec_literal_number into 4f58a59 on stedolan:master.

@leonid-s-usov
Copy link
Contributor Author

With the latest minor change the significance of the literal is preserved.

@leonid-s-usov
Copy link
Contributor Author

@wtlangford @nicowilliams I wonder what this silence means. Does this pull request have a future?

@wtlangford
Copy link
Contributor

I wonder what this silence means. Does this pull request have a future?

Just that I've been busy this last week. I just need to scrounge up the time to take a proper look at it.

@leonid-s-usov
Copy link
Contributor Author

@wtlangford @nicowilliams so wdyt here?

@leonid-s-usov
Copy link
Contributor Author

😕

@wtlangford
Copy link
Contributor

Hey, sorry, I've been super busy. I'll have time in the next day or two to give this a good, proper read-through, due to the holiday.

@leonid-s-usov
Copy link
Contributor Author

leonid-s-usov commented Nov 26, 2018 via email

@wtlangford
Copy link
Contributor

wtlangford commented Nov 26, 2018 via email

Copy link
Contributor

@wtlangford wtlangford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, but I've finally made it through!

I need to go back and read up a bit on the decNumber library and how it works to better understand the allocations we're making and the semantics of it, but I didn't want to block the rest of my comments on that.

I'm going to run some benchmarks of my own, but in the meantime, do you have any benchmarking on how this change impacts performance (speed, mem usage, etc)?

@@ -292,11 +292,37 @@ sections:
program can be a useful way of formatting JSON output from,
say, `curl`.

An important point about the identity filter is that it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might argue that we shouldn't be so explicit about this behavior in the documentation (though the changelog/etc is a different story).

I still like to think of it as an internal detail that we're keeping the original decimal representation around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll follow your advises with regards to the documentation, because this is quite a fuzzy topic and we could argue for no practical reason.

I simply wanted to cover the thing which was one of the main motivations behind the whole PR - the fact that today jq is not able to keep the input unchanged if the number literal appears to be of grater significance than double can represent

@@ -0,0 +1,38 @@
/* ------------------------------------------------------------------ */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably omit the documentation pdf and example c files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be glad to remove everything which is not used, but I wasn't sure if we can do that with regards to the open source nature of this part. If we can exclude all other things then let's do this

const char * str = jv_number_as_string(x, buf, JVP_DTOA_FMT_MAX_LEN);
put_str(str, F, S, flags & JV_PRINT_ISATTY);
if (jvp_number_is_nan(x)) {
jv_dump_term(C, jv_null(), flags, indent, F, S);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the behavior change here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have actually fixed this annoying thing that when we fall back to null here we used output it as plain text null and it wouldn't get highlighted as real null values in the output. Just try it in console, the actual null and a null of a NAN are shown differently today.

int jv_invalid_has_msg(jv inv) {
assert(JVP_HAS_KIND(inv, JV_KIND_INVALID));
int r = JVP_HAS_FLAGS(inv, JVP_FLAGS_INVALID_MSG);
jv_free(inv);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This free changes the semantics of this function. Previously, it would not free inv, and was only freeing the extra copy on the msg performed by jv_invalid_get_msg (see https://github.com/stedolan/jq/pull/1752/files/bf7f1f2b8176597a04bf4a8c00243567d3c7c83d#diff-5c46af0444da42be404b9ff2acf24da2L124)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's a change of implementation to follow the actual semantics of every single place this function was called, which was resulting a memory leak.
Please check that in all places where it was called the parameter has been jv_copyied.

This is exactly one of the things I have tried addressing in my API name refactoring. Today there is a big mess because one can't tell by the naming convention whether the function will be consuming the value or not.

At the same time, there is a general tendency to consume parameters in the calls, so instead of fixing all places where this function was called I have decided to change the implementation. And actually optimised it as well - previously it would construct a message just to see that it wasn't nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's a change of implementation to follow the actual semantics of every single place this function was called, which was resulting a memory leak.
Please check that in all places where it was called the parameter has been jv_copyied

Ah, I hadn't realized (yes, more evidence of your point. 😃). This is fine, then.

#define JVP_FLAGS(j) ((j).kind_flags)
#define JVP_KIND(j) (JVP_FLAGS(j) & KIND_MASK)

#define JVP_HAS_FLAGS(j, flags) (JVP_FLAGS(j) == flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put parens around flags and kind here.

Also, I'd like the name to be a bit clearer. JVP_HAS_FLAGS would imply to me that I can ask about any particular bits on the flags, but this is actually saying "does it have only and exactly these specific flags?". Can we find a better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed about parens.

Naming - i'm open to suggestions

#define JVP_KIND(j) (JVP_FLAGS(j) & KIND_MASK)

#define JVP_HAS_FLAGS(j, flags) (JVP_FLAGS(j) == flags)
#define JVP_HAS_KIND(j, kind) (JVP_KIND(j) == kind)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be better called JVP_IS_KIND?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Or maybe not.

This way it's HAS_FLAGS and HAS_KIND. But i'm not too married to this.

jv_array_foreach(a, ai, aelem) {
jv_free(aelem);
jv_array_foreach(b, bi, belem) {
if (!jv_equal(jv_array_get(jv_copy(a), ai + bi), jv_copy(belem)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the jv_copy from belem here, and you can skip the jv_free(belem) below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

static struct {
uint64_t refcnt;
struct dtoa_context C;
} shared_dtoa_context;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have concerns about these shared contexts and thread-safety. While the jq vm itself is single-threaded (and the main jq executable as well), we do expose a C API that other, multi-threaded programs could use. Two programs attempting to use the same contexts here could encounter issues.

I understand the purpose was to reduce allocations/improve performance, but we have to be careful with globals and shared state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Indeed, this shared data will not be usable even with external syncrhonization techniques.
I can change it to thread local storage. The only constraint here would be that the jv vm would need to be started and used and destroyed on the same thread.

Or, we could think of some "vm context" to be used anywhere inside the implementation its needed, but it looks like a much more impacting refactor, it would require passing the context everywhere around. But it would mean that a given vm instance could be used across multiple threads given some external sync to prevent simultaneous execution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid the thread local storage so we don't have to pin a vm to its original thread. That seems like a pretty heavy cost on our users.

The vm context probably costs a bit more to solve than we'd like (certainly in implementation complexity).

That said, why do these need to be shared at all? Why not just make a dtoa_context or a dec_context on the stack when we need one? These don't seem to be particularly large (though the dtoa_context could be sneaky if it has to allocate a lot during processing). I'd like to investigate the performance cost of just making these on the stack every time.

Copy link
Contributor Author

@leonid-s-usov leonid-s-usov Nov 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's pretty heavy. The dtoa algo is creating several BigNum buffers on the heap and reusing the context will actually reuse those buffers, while creating and freeing this context will alloc and release all those BigNums.

Currently it's not an issue since the dtoa context is intialized once per parse (but actually multiple times per dump, so dumping many different numbers is not as efficient as it could have been, while dumping one array of many numbers is reusing the same context)

I have solved both issues with this approach. Actually, we could have one dtoa context per thread and expose a "thread_free" method for jv vm to cleanup once thread is not used any more.

Actually I kind of liked my approach of having shared data reused by multiple instances. If jv_object could somehow get the reference to the vm context it lives in, it could be our solution of choice to share the data between instances of the same vm and clean up once all objects in that vm are gone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I have just realised that I can come up with a thread local implementation immune to thread move.

The shared objects will be managed per thread, with refcount; once a new jv_number requires a shared context it received an initialized pointer. Then, every time it needs the contex it should pass the pointer and the system will compare it to the current thread's shared pointer. If it's different then it will reduce the refcount for the other thread's context and increment the refcount for this thread's context, and return this thread's context, so that the jv_number already has a different pointer to work with currently.
this way, the refcount will properly flow between threads and the contexts will be freed once the last object is deallocated or moved from that thread

@leonid-s-usov
Copy link
Contributor Author

🎉 let's rock and roll!

@leonid-s-usov
Copy link
Contributor Author

Re. performance.

I have linked in my other posts to the benchmarks of this library. Just to clarify, we aren't using any of the math except for comparison, and only in case two literals are being compared.

Also, if you have noticed, I have delayed the dtoa call until it's actually required to convert a decimal fraction into a binary fraction, and then i only do that once per literal.
Parsing a decimal literal from decimal string representation is cheaper so if no conversion is needed we actually should gain some performance here.
However, if all parsed numbers are also converted to double for some operations, it will be slightly less performant since it will first generate a decimal structure and then run the dtoa algo for every number.

I haven't run any checks cause I didn't have any infra for that at the time. I remember you mentioned trying things out so I thought that maybe you had some handy setup available, so left it out of my work scope.

@pkoppstein
Copy link
Contributor

maybe you had some handy setup available

Please see
https://github.com/stedolan/jq/wiki/X---Experimental-Benchmarks

@leonid-s-usov
Copy link
Contributor Author

Guys, where are we standing with this?

Is this pending on me having to run the benchmarks? I'm not sure if I'll have time for this until the end of this week.
However maybe there are other concerns/issues? If we make a clear roadmap I'll try and commit to this. However if you are not interested, which would be a bummer, I'd rather not invest additional time

Thanks.

@wtlangford
Copy link
Contributor

wtlangford commented Dec 19, 2018 via email

@wtlangford
Copy link
Contributor

Sorry for the delay! The holidays got the better of me.

After further reflection, I feel that thread-locals are the right way to go, so I pushed up a copy of your branch to my fork with an extra commit on it (wtlangford@b2cf0f9) that switches the shared contexts over to using pthread's TLD (and makes pthread no longer optional).

There's no major speed penalty to it compared to your branch, and we go back to being thread-safe. (Interestingly, most libc implementations I've looked at use similar mechanisms for their strtod context allocations).

Relating to documentation:
I'm happy to document as guarantees the semantics we intend to guarantee. For semantics we don't intend to guarantee (details about the specific implementation, etc), I'm fine with documenting them, but we need to be very explicit that we are not making guarantees about them.

I don't want to make any guarantees that we'll maintain precision on your number (or even that we'll be keeping bignum support- maybe this causes lots of issues we aren't aware of yet and want to revert back to IEEE754 doubles). However, I'm happy with documenting our current behavior as "behavior", but not a "promise":

  • Numbers you don't do math on will retain their precision

The last thing I might want is to put some #ifdef USE_DEC_NUMBER guards around all of the decnumber code and a configure flag to control it. I'd like us to be able to build without this support for testing/measurement purposes. (We'd likely also want to use that configure flag to control whether or not the decNumber sources get built into the binary, too).

@leonid-s-usov
Copy link
Contributor Author

leonid-s-usov commented Jan 5, 2019 via email

Leonid S. Usov and others added 4 commits April 3, 2019 22:01
The library adds support for decimal numbers of arbitrary length.
Downloaded from ICU, under ICU 1.8.1 license
http://download.icu-project.org/files/decNumber/decNumber-icu-368.zip
Extend jv_number to use decNumber for storing number literals. Any math
operations on the numbers will truncate them to double precision.
Comparisons when both numbers are literal numbers will compare them
without truncation.

Delay conversion of numbers to doubles until a math operation is performed,
to preserve precision. A literal jv_number will only need conversion to
double once, and will reuse the resultant double on subsequent
conversions.

Outputting literal jv_numbers preserves the original precision.

Add strong pthread requirement to manage contexts/allocations for
converting numbers between their decNumber, string, and double formats.
Allow building jq in a mode that doesn't use decnumber for benchmarking
purposes. decnumber support is enabled by default, and this option is
meant to be removed once we're happy with the performance.
@wtlangford
Copy link
Contributor

Sorry again for the delays! We're at the end of them, though; at this point, all that's left is a review from @nicowilliams!

I've pushed a branch (https://github.com/stedolan/jq/compare/dec_literal_number) which contains all of this awesome work (less the jv.c split, as we've discussed elsewhere), rebased against the current master. I squashed most of the commits together to ease the process, and added a commit providing a configure flag (off by default!) to disable the literal numbers (meant for benchmarking, etc. I don't want it to stick around particularly long, though.).

@leonid-s-usov: The commits on my branch still have you as an author. Would you prefer to reset your branch to those or do you want me to close this PR and open one under the branch I pushed up? Either is fine with me.

@nicowilliams
Copy link
Contributor

I'll review this weekend. Thanks @wtlangford!

@leonid-s-usov
Copy link
Contributor Author

leonid-s-usov commented Apr 4, 2019 via email

tests.out Outdated Show resolved Hide resolved
@nicowilliams
Copy link
Contributor

@leonid-s-usov Let's backup.

Until now all jv values, and their jvp parts were immutable as long as refcnt was larger than 1. Now when calling jv_number_value() on a number whose jvp's refcnt is larger than 1 can mutate the jvp. The way it mutates the jvp is deterministic, so the only danger is leaking.

Using an atomic CAS would prevent the leaking.

Of course, at this time we don't use any atomic operations, and we'd have to start using atomic increment/decrement for the refcnt in order to make any of this thread-safe, so this is all a bit academic.

Anyways, please disregard as I am convinced we can make this thread-safe.

@leonid-s-usov
Copy link
Contributor Author

That's cool, I understand it, and I agree.

What I'm adding on top is that I believe in most cases it will be sufficient to make an atomic CAS to a member of the private structure pointed to by the jvp, under the implementing types' responsibility. This may be much more efficient than a common requirement to make a swap of the full jvp structure.

@nicowilliams
Copy link
Contributor

Yes, that's what I had in mind.

@nicowilliams
Copy link
Contributor

Also, no need to CAS a double into place in the case of numbers... since the computation will be deterministic, having two or more threads racing to write the same double into the same address is fine.

@leonid-s-usov
Copy link
Contributor Author

For this branch it's not about the double, apparently, but about this one, for example

... 
  if (plit->literal_data == NULL) {
    int len = jvp_dec_number_ptr(n)->digits + 14;
    plit->literal_data = jv_mem_alloc(len);

    // Preserve the actual precision as we have parsed it
    // don't do decNumberTrim(pdec);

    decNumberToString(pdec, plit->literal_data);
  }```

@wtlangford
Copy link
Contributor

I've chatted with @nicowilliams: I'm going to go ahead and merge this and we'll just watch for any issues that may pop up as time goes on. The thread-safety issues we discussed above can be addressed whenever we're ready to cross that bridge.

Sorry to let this languish here for so long!

@wtlangford wtlangford merged commit 9b51a08 into jqlang:master Oct 22, 2019
@leonid-s-usov
Copy link
Contributor Author

leonid-s-usov commented Oct 22, 2019 via email

@mmotwani
Copy link

mmotwani commented Apr 5, 2022

This was merged back in 2019, but no new builds since then. It would be nice not to have to create custom builds with this fix. @stedolan @leonid-s-usov @itchyny

Any plans to cut a new build?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants