feat: representation of full uint64 range in & out of cbor #413
A better approach might be to leave The existing TODO would still be a concern though, calling |
As per @hannahhoward's suggestion, I added a public interface like I also did dagjson, but had to shunt that to a branch because I discovered refmt's json handling doesn't deal with |
I'm considering removing the possibility of negative uints from this. CBOR enables this: you can encode a negative uint and get it out safely, but at the language level we need a second signal to say that the value should be negative, and that's a little bit weird. If you're dealing with negative numbers and need to resort to this kind of thing, I suspect you're already handling it at the application level and are unlikely to need it at the encoding level. You probably either already have an "is positive" boolean hanging around (and are likely even going to encode it in your data structure adjacent to the uint value), or the value is always known to be negative and therefore no signal is needed. In most languages we're used to not thinking about negative uints; we work around that in other ways (like just using signed ints).
My primary comment was going to be about the support of negative uint64s, so I guess +1 to removing them. It seems odd, especially given we use refmt. As I read the code, we don't actually have access to the underlying CBOR in the marshal/unmarshal code, just the token stream, which also appears, by my read, not to really support the negative 2^64 range.
Finally, I'm pretty sure cbor-gen doesn't support 2^64 negative ints, and compatibility with cbor-gen is the primary thing driving our use case here anyway.
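For readers unfamiliar with why the negative range is awkward, here is a minimal sketch of how CBOR's negative integers (major type 1) map to values: an unsigned argument n encodes the value -1 - n, so the range extends down to -2^64, well below what int64 can hold. The helper names `NegIntValue` and `FitsInt64` are hypothetical and are not part of refmt, cbor-gen, or this PR.

```go
package main

import (
	"fmt"
	"math/big"
)

// NegIntValue returns the value encoded by a CBOR major type 1 item
// whose unsigned argument is n, which is -1 - n.
// (Hypothetical helper for illustration only.)
func NegIntValue(n uint64) *big.Int {
	v := new(big.Int).SetUint64(n)
	v.Add(v, big.NewInt(1))
	return v.Neg(v)
}

// FitsInt64 reports whether that negative value can be held in an int64.
func FitsInt64(n uint64) bool {
	return NegIntValue(n).IsInt64()
}

func main() {
	fmt.Println(NegIntValue(0))         // -1
	fmt.Println(NegIntValue(1<<63 - 1)) // -9223372036854775808, the minimum int64
	fmt.Println(FitsInt64(1 << 63))     // false: one below the minimum int64
	fmt.Println(NegIntValue(1<<64 - 1)) // -18446744073709551616, i.e. -2^64
}
```

Any argument of 2^63 or above produces a value that needs either a wider type or a separate sign signal, which is the awkwardness discussed above.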
I removed the And now I have mixed feelings about this—it's not as universal as I hoped this would be, I can't treat any integer as a uint and just use the sign to cover the full range. There's now a much clearer bifurcation between integer types, even when you're dealing with a positive int that's within the uint64 range. There remains the option of interpreting every positive integer out of the codec into a |
@hannahhoward @willscott want to give a (hopefully final?) review here? I've made a trade-off choice here for decoding: we
An alternative is that an Int node implementer, such as So we get this set of conditions if we went the other way:
For now, I'm considering this mostly an internal concern, for bindnode, so keeping current code paths mostly unmolested is ideal. For other cases you're going to encounter errors out of range, at some point. e.g. in a codegen TypedPrototype build, the |
Too many branches are stacking up on each other and this is at the bottom of them, so I'm going to pull the trigger and merge 🤞
I don't think I'd dare do this without Eric's OK so it's a draft for now. This should allow us to access the full positive uint64 range of integers, although you can only get to it by type checking and accessing some custom methods.
I'm mainly thinking about this for bindnode: when you have a `uint64`, you can actually encode and decode the full range, rather than overflowing into the sign bit when casting to int64 for the highest range of integers. The awkward cast to the `uintNode`-shaped type wouldn't be a big deal tucked away in bindnode, and it's not really something we have to advertise publicly unless folks complain and want their full range.

This doesn't handle the negative uint64 range, which CBOR can do; that would require some plumbing through refmt, which currently errors when it encounters a negint below int64 range.
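As a rough illustration of the "type check and custom method" approach described above, the sketch below shows the overflow from a naive cast and how a caller might recover the full range. It is self-contained and does not use the real go-ipld-prime types: the `Node` and `UintNode` interfaces, the `uintNode` and `intNode` types, the `ReadUint64` helper, and the choice to return an error from `AsInt` when the value is out of range are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// Node is a stand-in for an integer node that only exposes int64.
type Node interface {
	AsInt() (int64, error)
}

// UintNode is a hypothetical optional interface for nodes that can
// also report their value across the full uint64 range.
type UintNode interface {
	Node
	AsUint() (uint64, error)
}

// uintNode holds a uint64 and implements UintNode.
type uintNode uint64

func (u uintNode) AsInt() (int64, error) {
	if uint64(u) > math.MaxInt64 {
		// Assumed behaviour: refuse rather than wrap into a negative value.
		return 0, fmt.Errorf("unsigned integer %d out of range of int64", uint64(u))
	}
	return int64(u), nil
}

func (u uintNode) AsUint() (uint64, error) { return uint64(u), nil }

// intNode is an ordinary signed node without the extra method.
type intNode int64

func (i intNode) AsInt() (int64, error) { return int64(i), nil }

// ReadUint64 prefers AsUint when the node offers it, else falls back to AsInt.
func ReadUint64(n Node) (uint64, error) {
	if un, ok := n.(UintNode); ok {
		return un.AsUint()
	}
	i, err := n.AsInt()
	if err != nil {
		return 0, err
	}
	if i < 0 {
		return 0, fmt.Errorf("negative integer %d cannot be a uint64", i)
	}
	return uint64(i), nil
}

func main() {
	v := uint64(math.MaxUint64)
	fmt.Println(int64(v)) // naive cast wraps around to -1

	got, err := ReadUint64(uintNode(v))
	fmt.Println(got, err) // 18446744073709551615 <nil>
}
```

Keeping the type assertion inside one helper means callers that only ever see values within the int64 range can keep using `AsInt` unchanged.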