Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of anomaEncode / anomaDecode in the Core evaluator #2975

Merged
merged 5 commits into from
Aug 30, 2024

Conversation

paulcadman
Copy link
Collaborator

@paulcadman paulcadman commented Aug 27, 2024

This PR:

  • Adds a new implementation of {decode, encode}ByteString functions, used by anomaEncode and anomaDecode in the Core evaluator
  • Adds property tests for roundtripping and benchmarks for the new functions.

The old implementation used bitvec to manipulate the ByteString. This was far too slow. The new implementation uses bit operations directly on the input integer and ByteArray.

It's now possible to run anoma-app-patterns:Tests/Swap.juvix to completion.

For encoding, if the size of the output integer exceeds 64 bits (and therefore a BigInt must be used) then the new implementation has quadratic time complexity in the number of input bytes if an implementation of ByteString -> Integer is used as follows:

byteStringToIntegerLE :: ByteString -> Integer
byteStringToIntegerLE = BS.foldr (\b acc -> acc `shiftL` 8 .|. fromIntegral b) 0
byteStringToInteger' :: ByteString -> Integer
byteStringToInteger' = BS.foldl' (\acc b -> acc `shiftL` 8 .|. fromIntegral b) 0

I think this is because shiftL is expensive for large Integers. To mitigate this I'm splitting the input ByteString into 1024 byte chunks and processing each separately. Using this we get 100x speed up at ~0.25Mb input over the non-chunked approach and linear time-complexity thereafter.

Benchmarks

The benchmarks for encoding and decoding 250000 bytes:

 ByteString Encoding to/from integer
      encode bytes to integer:   OK
        59.1 ms ± 5.3 ms
      decode bytes from integer: OK
        338  ms ±  16 ms

The previous implementation would never complete for this input.

Benchmarks for encoding and decoding 2 * 250000 bytes:

    ByteString Encoding to/from integer
      encode bytes to integer:   OK
        121  ms ± 8.3 ms
      decode bytes from integer: OK
        651  ms ±  27 ms

Benchmarks for encoding and decoding 4 * 250000 bytes:

    ByteString Encoding to/from integer
      encode bytes to integer:   OK
        249  ms ±  17 ms
      decode bytes from integer: OK
        1.317 s ±  16 ms

@paulcadman paulcadman self-assigned this Aug 27, 2024
@paulcadman paulcadman added this to the 0.6.6 milestone Aug 27, 2024
@paulcadman paulcadman marked this pull request as draft August 28, 2024 07:54
@lukaszcz
Copy link
Collaborator

lukaszcz commented Aug 28, 2024

Can't we solve this just by eliminating the use of shifts on Integers? They are the culprits here, but just bit-testing should be constant time. I changed two functions in Nockma.Encoding.Base to:

-- | Binary encode an integer to a vector of bits, ordered from least to most significant bits.
-- NB: 0 is encoded as the empty bit vector is specified by the Hoon serialization spec
writeIntegral :: forall a r. (Integral a, Member BitWriter r) => a -> Sem r ()
writeIntegral x
  | x < 0 = error "integerToVectorBits: negative integers are not supported in this implementation"
  | otherwise = unfoldBits 0 (fromIntegral x)
  where
    len = bitLength x

    unfoldBits :: Int -> Integer -> Sem r ()
    unfoldBits idx n
      | idx == len = return ()
      | otherwise = writeBit (Bit (testBit n idx)) <> unfoldBits (idx + 1) n

-- | Computes the number of bits required to store the argument in binary
-- NB: 0 is encoded to the empty bit vector (as specified by the Hoon serialization spec), so 0 has bit length 0.
bitLength :: (Integral a) => a -> Int
bitLength n
  | n == 0 = 0
  | otherwise = fromIntegral (integerLog2 (abs (fromIntegral n))) + 1

and afterwards Tests/Swap.juvix in anoma-app-patterns finishes in reasonable time. The decodes seem to be more-or-less instantenous (as they should), encodes still take time, because I didn't change anything there.

I think this chunk-division thing might be complicating things too much.

Perhaps the most efficient way of doing this conversion would be to use one of the low-level library functions, e.g., integerToMutableByteArray (https://hackage.haskell.org/package/base-4.20.0.1/docs/GHC-Num.html#v:integerToMutableByteArray). There's also integerFromByteArray. Unfortunately, my search with Hoogle didn't turn up any higher-level wrappers around those functions.

@paulcadman
Copy link
Collaborator Author

Can't we solve this just by eliminating the use of shifts on Integers? They are the culprits here, but just bit-testing should be constant time. I changed two functions in Nockma.Encoding.Base to:

Thank you, this is a better approach!

I also looked at the low-level library functions integerToMutableByteArray et al. They seem quite complicated to use as they involve doing memory allocations and pointer manipulation.

@paulcadman paulcadman force-pushed the bytestring-integer-encoding-optimized-2 branch from fe59ade to 2a60339 Compare August 28, 2024 17:09
@paulcadman
Copy link
Collaborator Author

Can't we solve this just by eliminating the use of shifts on Integers? They are the culprits here, but just bit-testing should be constant time. I changed two functions in Nockma.Encoding.Base to:

Thank you, this is a better approach!

@lukaszcz I've reverted to the bitvec implementation using your suggestion for decoding. For me the juvix eval Tests/Swap.juvix test is still too slow without the chunking of the bytestring when encoding to an Integer (1m45s without chunking vs 20s with chunking). So I have kept the chunking bytestring -> integer function for encoding.

@paulcadman paulcadman marked this pull request as ready for review August 28, 2024 17:17
@paulcadman paulcadman force-pushed the bytestring-integer-encoding-optimized-2 branch from 2a60339 to 04a042a Compare August 28, 2024 17:17
@lukaszcz
Copy link
Collaborator

The encoding function using shiftL is still quadratic, only the constant factor is reduced by chunking. This is not long-term sustainable -- the encoding should just be linear.

Ultimately, we should implement this in linear time using low-level bit/byte manipulation. Maybe it's possible to just do it "by hand" given the definition of Integer: https://hackage.haskell.org/package/base-4.20.0.1/docs/GHC-Num.html#t:Integer
One can just pattern match on an Integer, box the values, convert to ByteArray, prepend one byte indicating which constructor was used. Decode by reading the first byte (tag), converting the rest to ByteArray, matching on it to get the unboxed value, and creating appropriate Integer using the constructors.

@lukaszcz lukaszcz force-pushed the bytestring-integer-encoding-optimized-2 branch from 04a042a to 0d2758f Compare August 29, 2024 14:55
@lukaszcz
Copy link
Collaborator

Or maybe we could just use Data.Serialize? The serialization/deserialization implementation for Integer should be linear there.

@paulcadman
Copy link
Collaborator Author

Or maybe we could just use Data.Serialize? The serialization/deserialization implementation for Integer should be linear there.

anomaEncode 'serializes' an arbitrary ByteString to an Integer. So I'm not sure we can use Data.Serialize because serialize : Integer -> ByteString is not surjective?

@paulcadman paulcadman force-pushed the bytestring-integer-encoding-optimized-2 branch 2 times, most recently from 4cd4f8a to 0005f1c Compare August 30, 2024 13:41
paulcadman and others added 5 commits August 30, 2024 17:05
This reduces the number of shiftL performed on a large Integer
For decoding, use an implementation from @lukaszcz to avoid calling shiftR when
writing the bits of an Integer.

For encoding I continue to use the chunked encoding of ByteString to
Integer.

Co-authored-by: Lukasz Czajka <lukasz@heliax.dev>
@paulcadman paulcadman force-pushed the bytestring-integer-encoding-optimized-2 branch from 0005f1c to b725f2e Compare August 30, 2024 16:05
@paulcadman paulcadman merged commit 87c5f0a into main Aug 30, 2024
4 checks passed
@paulcadman paulcadman deleted the bytestring-integer-encoding-optimized-2 branch August 30, 2024 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants