Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-93476: Fraction.limit_denominator speed increase #93477

Closed
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 25 additions & 6 deletions Lib/fractions.py
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,13 @@ def __new__(cls, numerator=0, denominator=None, *, _normalize=True):
Fraction(147, 100)

"""
# private _normalize=False should only be set if the Fraction is
mscuthbert marked this conversation as resolved.
Show resolved Hide resolved
# already asserted to be normalized.
# (see discussion: at https://github.com/python/cpython/pull/93477)
# if a non-normalized Fraction is passed in with _normalize=False
# then API calls may give inconsistent results on equivalent
# Fraction objects.

self = super(Fraction, cls).__new__(cls)

if denominator is None:
Expand Down Expand Up @@ -234,10 +241,11 @@ def limit_denominator(self, max_denominator=1000000):
if max_denominator < 1:
raise ValueError("max_denominator should be at least 1")
if self._denominator <= max_denominator:
return Fraction(self)
return Fraction(self._numerator, self._denominator, _normalize=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change make a significant difference to performance? If not, I'd suggest reverting it, both for readability and for the purposes of keeping the PR focused. From looking at the code, a simple Fraction(self) shouldn't end up doing any gcd calculation here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually saves 30% on this case based on tests above -- however, I hadn't realized at the time that the return self was executed in this branch and no normalization was done. The speed increase is probably based on the isinstance() check, which just keeps getting faster and faster in subsequent versions, so it might eventually be the faster route. I've removed the change.

If Fractions are eventually to be truly immutable and hashable, then return self is the correct response here, but that is probably backwards incompatible with real code where people change _numerator or _denominator after instantiation.


p0, q0, p1, q1 = 0, 1, 1, 0
n, d = self._numerator, self._denominator
n_original, d_original = n, d
while True:
a = n//d
q2 = q0+a*q1
Expand All @@ -247,12 +255,23 @@ def limit_denominator(self, max_denominator=1000000):
n, d = d, n-a*d

k = (max_denominator-q0)//q1
bound1 = Fraction(p0+k*p1, q0+k*q1)
bound2 = Fraction(p1, q1)
if abs(bound2 - self) <= abs(bound1-self):
return bound2
bound1_n = p0 + k*p1
bound1_d = q0 + k*q1
bound2_n = p1
bound2_d = q1
mscuthbert marked this conversation as resolved.
Show resolved Hide resolved

# diff1_n = numerator of (bound1 minus self) as a Fraction;
# etc. for diff1_d, diff2_n, diff2_d
diff1_n = abs(bound1_n*d_original - n_original*bound1_d)
diff1_d = d_original * bound1_d
diff2_n = abs(bound2_n*d_original - n_original*bound2_d)
diff2_d = d_original * bound2_d

if diff1_n * diff2_d >= diff2_n * diff1_d:
# bound2 is closer (or equal) to original as bound 1
return Fraction(bound2_n, bound2_d, _normalize=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that I'm not 100% convinced that the result here is always normalised. There are two parts to being normalised: the numerator must be relatively prime to the denominator, and the denominator must be positive. I'd thought about the first part (which is fine), but not about the second.

Is it clear that bound2_d and bound1_d are positive at this point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answer: yes, it's true that bound2_d and bound1_d are positive at this point, though perhaps not 100% clear. In the Euclidean loop above, on the very first iteration a can be negative, but on all subsequent iterations a is nonnegative. And on the first iteration q1 = 0, so that potentially negative a does no harm, and q0 and q1 are guaranteed nonnegative at all times. Then it's easy to check that they must in fact be positive at the point where the loop exits.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my reading of the loop as well, but I didn't write the code (nor name the variables) so I didn't want to change this without another set of eyes; thanks!

else:
return bound1
return Fraction(bound1_n, bound1_d, _normalize=False)

@property
def numerator(a):
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
:meth:`fractions.Fraction.limit_denominator()` performance enhancements.
Patch by Michael Scott Asato Cuthbert.