Faster signed multiplication #103

Closed
oscbyspro opened this issue May 5, 2023 · 3 comments
Labels
brrr such code, much wow

oscbyspro commented May 5, 2023

Signed multiplication is currently much slower than unsigned, mostly because the signed overflow check is snail-paced.

if !Self.isSigned {
    overflow = !product.high.isZero // 🚀
} else if self.isLessThanZero != amount.isLessThanZero {
    overflow = product < DoubleWidth(descending: HL(~Self(), Magnitude(bitPattern: Self.min))) // 🐌
} else {
    overflow = product > DoubleWidth(descending: HL( Self(), Magnitude(bitPattern: Self.max))) // 🐌
}

I believe the following is a faster, allocation-free version of the same comparison, although I may have to add tests for it.

if !Self.isSigned {
    overflow = !product.high.isZero // 🚀
}   else if self.isLessThanZero == amount.isLessThanZero {
    // overflow = product > Self.max (but more efficient)
    overflow = !product.high.isZero || product.low.mostSignificantBit // 🏎️
}   else {
    // overflow = product < Self.min (but more efficient)
    overflow = product.high.isLessThanZero && (product.high.nonzeroBitCount != product.high.bitWidth || !product.low.mostSignificantBit) // 🏎️
}

The idea is to only check bits that are significant for detecting overflow.
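
To make the bit-level reasoning concrete, here is a minimal sketch that applies the same checks to Int8 using only the standard library. The helper name overflowsByBitInspection is hypothetical and not part of this package; it assumes the full-width product has a signed high half and an unsigned low half, as multipliedFullWidth(by:) returns them.

```swift
// Hypothetical helper, not part of the library: the same bit checks on Int8.
func overflowsByBitInspection(_ lhs: Int8, _ rhs: Int8) -> Bool {
    // Standard library full-width product: high is Int8, low is UInt8.
    let product = lhs.multipliedFullWidth(by: rhs)
    let lowMostSignificantBit = (product.low & 0x80) != 0
    if (lhs < 0) == (rhs < 0) {
        // Same signs: the product is non-negative, so it fits only when the
        // high half is all zeros and the low half's top bit is clear.
        return !(product.high == 0 && !lowMostSignificantBit)
    } else {
        // Opposite signs: the product is zero or negative. A negative product
        // fits only when the high half is all ones and the low top bit is set.
        return product.high < 0 && !(product.high == -1 && lowMostSignificantBit)
    }
}

// Exhaustively compare against the standard library's overflow flag.
var mismatches = 0
for lhs in Int8.min ... Int8.max {
    for rhs in Int8.min ... Int8.max {
        let expected = lhs.multipliedReportingOverflow(by: rhs).overflow
        if overflowsByBitInspection(lhs, rhs) != expected { mismatches += 1 }
    }
}
print("mismatches:", mismatches)
```

Int8 is small enough to test every operand pair, which is a cheap way to gain confidence in the two branches before relying on them for wider types.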

oscbyspro added the "brrr such code, much wow" label on May 5, 2023
oscbyspro added this to the v2.1.0 milestone on May 5, 2023
oscbyspro changed the title from "Better signed multipliedReportingOverflow(by:)" to "Faster signed multiplication" on May 5, 2023

oscbyspro commented May 5, 2023

1. Test Case '-[ANKFullWidthKitBenchmarks. Int256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.023 seconds).
1. Test Case '-[ANKFullWidthKitBenchmarks.UInt256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.014 seconds).

2. Test Case '-[ANKFullWidthKitBenchmarks. Int256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.017 seconds).
2. Test Case '-[ANKFullWidthKitBenchmarks.UInt256BenchmarksOnMultiplication testMultipliedReportingOverflow]' passed (0.014 seconds).

@oscbyspro

The following assertions test all overflowing signed paths:

ANKAssertMultiplication(T.max, T( 1), T.max,        T( 0), false)
ANKAssertMultiplication(T.max, T(-1), T.min + T(1), T(-1), false)
ANKAssertMultiplication(T.min, T( 1), T.min,        T(-1), false)
ANKAssertMultiplication(T.min, T(-1), T.min,        T( 0), true )

ANKAssertMultiplication(T.max, T( 2), T(-2),        T( 0), true )
ANKAssertMultiplication(T.max, T(-2), T( 2),        T(-1), true )
ANKAssertMultiplication(T.min, T( 2), T( 0),        T(-1), true )
ANKAssertMultiplication(T.min, T(-2), T( 0),        T( 1), true )
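
As a sanity check on these expectations, the same eight cases can be replayed on Int8 with the standard library. This sketch assumes the ANKAssertMultiplication arguments are (lhs, rhs, low, high, overflow), with low written as a signed bit pattern; that reading is inferred from the values above and not confirmed here. The check function is a hypothetical stand-in.

```swift
// Hypothetical stand-in for ANKAssertMultiplication, assuming the argument
// order (lhs, rhs, low, high, overflow) with low as a signed bit pattern.
func check(_ lhs: Int8, _ rhs: Int8, _ low: Int8, _ high: Int8, _ overflow: Bool) -> Bool {
    let full = lhs.multipliedFullWidth(by: rhs)
    let wrapped = lhs.multipliedReportingOverflow(by: rhs)
    return Int8(bitPattern: full.low) == low
        && full.high == high
        && wrapped.partialValue == low
        && wrapped.overflow == overflow
}

let results: [Bool] = [
    // Boundary cases around multiplication by 1 and -1.
    check(Int8.max,  1, Int8.max,      0, false),
    check(Int8.max, -1, Int8.min + 1, -1, false),
    check(Int8.min,  1, Int8.min,     -1, false),
    check(Int8.min, -1, Int8.min,      0, true ),
    // Cases where the magnitude doubles and no longer fits.
    check(Int8.max,  2, -2,            0, true ),
    check(Int8.max, -2,  2,           -1, true ),
    check(Int8.min,  2,  0,           -1, true ),
    check(Int8.min, -2,  0,            1, true ),
]
print(results.allSatisfy { $0 })
```

Under that assumed argument order, each row exercises a distinct combination of high-half pattern and low-half top bit.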


oscbyspro commented May 6, 2023

I find this rewrite, which leverages matches(repeating:) (#104), amazingly straightforward:

if !Self.isSigned {
    overflow = !(product.high.matches(repeating: false)) // 🚀
}   else if self.isLessThanZero == amount.isLessThanZero {
    // overflow = product > Self.max, but more efficient
    overflow = !(product.high.matches(repeating: false) && !product.low.mostSignificantBit) // 🏎️
}   else {
    // overflow = product < Self.min, but more efficient
    overflow = !(product.high.matches(repeating: true ) &&  product.low.mostSignificantBit) && product.high.mostSignificantBit // 🏎️
}

Alternatively:

if !Self.isSigned {
    overflow = !(product.high.isZero) // 🚀
}   else if self.isLessThanZero == amount.isLessThanZero {
    // overflow = product > Self.max, but more efficient
    overflow = !(product.high.isZero && !product.low.mostSignificantBit) // 🏎️
}   else {
    // overflow = product < Self.min, but more efficient
    overflow = !(product.high.isFull &&  product.low.mostSignificantBit) && product.high.mostSignificantBit // 🏎️
}
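
Both variants ask the same question of the high half: does every bit equal a single value? A rough sketch of such a predicate on standard fixed-width integers follows. The name allBitsEqual is hypothetical, and the actual matches(repeating:), isZero, and isFull in this package may be implemented differently (for example, word by word).

```swift
extension FixedWidthInteger {
    // Hypothetical sketch, not the package's implementation:
    // true when every bit of self equals `bit`.
    func allBitsEqual(_ bit: Bool) -> Bool {
        self == (bit ? ~Self.zero : Self.zero)
    }
}

// All zeros corresponds to the isZero check, all ones to the isFull check.
let zeroHigh = Int8(0).allBitsEqual(false)
let fullHigh = Int8(-1).allBitsEqual(true)
let mixedHigh = Int8(-2).allBitsEqual(true)
print(zeroHigh, fullHigh, mixedHigh)
```

Framed this way, a signed result fits exactly when the high half is a pure sign extension of the low half's top bit.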
