Improve performance of anomaEncode / anomaDecode in the Core evaluator #2975
Can't we solve this just by eliminating the use of shifts on Integers? They are the culprits here, but just bit-testing should be constant time. I changed two functions in

    -- | Binary encode an integer to a vector of bits, ordered from least to most significant bits.
    -- NB: 0 is encoded as the empty bit vector, as specified by the Hoon serialization spec
    writeIntegral :: forall a r. (Integral a, Member BitWriter r) => a -> Sem r ()
    writeIntegral x
      | x < 0 = error "integerToVectorBits: negative integers are not supported in this implementation"
      | otherwise = unfoldBits 0 (fromIntegral x)
      where
        len = bitLength x

        unfoldBits :: Int -> Integer -> Sem r ()
        unfoldBits idx n
          | idx == len = return ()
          | otherwise = writeBit (Bit (testBit n idx)) <> unfoldBits (idx + 1) n

    -- | Computes the number of bits required to store the argument in binary
    -- NB: 0 is encoded to the empty bit vector (as specified by the Hoon serialization spec), so 0 has bit length 0.
    bitLength :: (Integral a) => a -> Int
    bitLength n
      | n == 0 = 0
      | otherwise = fromIntegral (integerLog2 (abs (fromIntegral n))) + 1

and afterwards I think this chunk-division thing might be complicating things too much. Perhaps the most efficient way of doing this conversion would be to use one of the low-level library functions, e.g., |
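For readers who want to try the bit-testing idea outside the evaluator, here is a minimal standalone sketch that depends only on `base`. It is not the code from this PR: `integerToBits` is a hypothetical pure stand-in for `writeIntegral` (it returns a list instead of writing through `BitWriter`), and this `bitLength` replaces `integerLog2` with a doubling-then-bisecting search so that no extra package is needed.

```haskell
import Data.Bits (bit, testBit)

-- | Number of bits needed to store a non-negative Integer; 0 has bit length 0.
-- Hypothetical helper: finds the smallest k with n < 2^k, first by doubling k,
-- then by bisecting between the last failing and the first succeeding k.
bitLength :: Integer -> Int
bitLength n
  | n <= 0 = 0
  | otherwise = go (hi `div` 2) hi
  where
    -- First power-of-two exponent k such that n < 2^k.
    hi = until (\k -> n < bit k) (* 2) 1
    -- Invariant: 2^l <= n < 2^h.
    go l h
      | h - l == 1 = h
      | n < bit m = go l m
      | otherwise = go m h
      where
        m = (l + h) `div` 2

-- | Bits of a non-negative Integer, least significant first.
-- Each bit is read with 'testBit' at an index; the Integer itself is never shifted.
integerToBits :: Integer -> [Bool]
integerToBits n
  | n < 0 = error "integerToBits: negative integers are not supported in this sketch"
  | otherwise = [testBit n i | i <- [0 .. bitLength n - 1]]
```

For example, `integerToBits 5` yields `[True, False, True]` and `integerToBits 0` yields the empty list, matching the convention that 0 is encoded as the empty bit vector.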
Thank you, this is a better approach! I also looked at the low-level library functions |
@lukaszcz I've reverted to the bitvec implementation using your suggestion for decoding. For me the |
The encoding function using Ultimately, we should implement this in linear time using low-level bit/byte manipulation. Maybe it's possible to just do it "by hand" given the definition of |
Or maybe we could just use |
anomaEncode 'serializes' an arbitrary ByteString to an Integer. So I'm not sure we can use Data.Serialize because |
This reduces the number of shiftL performed on a large Integer
…n `byteStringToIntegerBE`
For decoding, use an implementation from @lukaszcz to avoid calling shiftR when writing the bits of an Integer. For encoding I continue to use the chunked encoding of ByteString to Integer. Co-authored-by: Lukasz Czajka <lukasz@heliax.dev>
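The chunked encoding mentioned in these commit messages can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the names `naiveBE`, `chunkedBE` and `chunksOf` are hypothetical, big-endian byte order is assumed from the `BE` suffix of `byteStringToIntegerBE`, and the chunk size is a parameter (the PR description uses 1024 bytes). The sketch only shows how chunking reduces the number of `shiftL` calls applied to the large accumulator.

```haskell
import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import Data.List (foldl')

-- | Byte-at-a-time big-endian conversion: one shiftL on the accumulator per
-- input byte, so the growing Integer is shifted once for every byte.
naiveBE :: BS.ByteString -> Integer
naiveBE = BS.foldl' (\acc w -> (acc `shiftL` 8) .|. fromIntegral w) 0

-- | Split a ByteString into pieces of at most k bytes (assumes k > 0).
chunksOf :: Int -> BS.ByteString -> [BS.ByteString]
chunksOf k bs
  | BS.null bs = []
  | otherwise = let (h, t) = BS.splitAt k bs in h : chunksOf k t

-- | Chunked big-endian conversion: each chunk is converted on its own (a small
-- Integer), and the large accumulator is shifted only once per chunk.
chunkedBE :: Int -> BS.ByteString -> Integer
chunkedBE k = foldl' step 0 . chunksOf k
  where
    step acc c = (acc `shiftL` (8 * BS.length c)) .|. naiveBE c
```

Both functions compute the same Integer; with a chunk size of 1024, `chunkedBE 1024` performs roughly one shift of the large accumulator per 1024 input bytes instead of one per byte.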
This PR:
The old implementation used bitvec to manipulate the ByteString. This was far too slow. The new implementation uses bit operations directly on the input integer and ByteArray.
It's now possible to run anoma-app-patterns: `Tests/Swap.juvix` to completion.

For encoding, if the size of the output integer exceeds 64 bits (and therefore a BigInt must be used) then the new implementation has quadratic time complexity in the number of input bytes if an implementation of `ByteString -> Integer` is used as follows:

I think this is because `shiftL` is expensive for large Integers. To mitigate this I'm splitting the input ByteString into 1024-byte chunks and processing each separately. Using this we get a 100x speed-up at ~0.25 MB input over the non-chunked approach, and linear time complexity thereafter.

Benchmarks
The benchmarks for encoding and decoding 250000 bytes:
The previous implementation would never complete for this input.
Benchmarks for encoding and decoding 2 * 250000 bytes:
Benchmarks for encoding and decoding 4 * 250000 bytes: