Generalize sieving #64

Open: wants to merge 13 commits into master

fjarri
Member

@fjarri fjarri commented Nov 10, 2024

Fixes #61

Introduces a SieveFactory trait and the generic functions sieve_and_find() and par_sieve_and_find(), which combine an arbitrary sieve factory with an arbitrary predicate (such as is_prime_with_rng()).

I changed the multicore tests to be more randomized (no predefined RNG seed) to do a more reliable comparison. Perhaps it should be done for all the tests?

Why we need a factory: we need to be able to produce a new sieve if the previous one is exhausted.
Why sieve_and_find() returns an Option: with a custom sieve and predicate, the loop is not guaranteed to stop. The user should ensure termination by returning None from their SieveFactory::make_sieve() impl once no more sieves can be produced.
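
As a rough illustration of the factory and the `Option` return described above, here is a minimal sketch. It is not the code from this PR: the trait shape, the RNG type parameter `R`, and the helpers `OddChunks` and `is_prime_trial` are assumptions made up for the example, and plain `u64` values stand in for big integers.

```rust
/// Hypothetical stand-in for the trait discussed in this PR; the real
/// signatures in the crate may differ.
pub trait SieveFactory<R> {
    type Item;
    type Sieve: Iterator<Item = Self::Item>;

    /// Returns a fresh sieve, or `None` when no more sieves can be produced.
    /// `previous` is the exhausted sieve (`None` on the first call).
    fn make_sieve(&mut self, rng: &mut R, previous: Option<&Self::Sieve>) -> Option<Self::Sieve>;
}

/// Runs sieves from `factory` until `predicate` accepts an item,
/// or until the factory stops producing sieves.
pub fn sieve_and_find<R, F>(
    rng: &mut R,
    mut factory: F,
    predicate: impl Fn(&mut R, &F::Item) -> bool,
) -> Option<F::Item>
where
    F: SieveFactory<R>,
{
    let mut sieve = factory.make_sieve(rng, None)?;
    loop {
        if let Some(found) = sieve.find(|item| predicate(rng, item)) {
            return Some(found);
        }
        // The current sieve is exhausted: ask for a replacement, or give up.
        sieve = factory.make_sieve(rng, Some(&sieve))?;
    }
}

/// Toy factory (assumption): each sieve yields the odd numbers of one chunk.
/// `next_start` is expected to be odd and `chunk` even.
pub struct OddChunks {
    pub next_start: u64,
    pub chunk: u64,
    pub chunks_left: u32,
}

impl<R> SieveFactory<R> for OddChunks {
    type Item = u64;
    type Sieve = std::iter::StepBy<std::ops::Range<u64>>;

    fn make_sieve(&mut self, _rng: &mut R, _previous: Option<&Self::Sieve>) -> Option<Self::Sieve> {
        if self.chunks_left == 0 {
            return None; // this is what makes the outer loop terminate
        }
        self.chunks_left -= 1;
        let start = self.next_start;
        self.next_start += self.chunk;
        Some((start..self.next_start).step_by(2))
    }
}

/// Toy predicate (assumption): trial division instead of a real primality test.
pub fn is_prime_trial(n: u64) -> bool {
    n >= 2 && (2u64..).take_while(|&d| d * d <= n).all(|d| n % d != 0)
}
```

With `OddChunks { next_start: 91, chunk: 4, chunks_left: 3 }` and a unit value as the placeholder RNG, the first chunk (91, 93) contains no prime, so a second sieve is requested and the search finds 97. With `chunks_left: 1` the factory refuses to make a second sieve and the function returns `None`.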

@lleoha's use case is really covered by a custom predicate and the default sieve, although a custom sieve wrapper may be needed to cover corner cases.

Design choices to be made:

  • Use an RNG parameter explicitly in SieveFactory and sieve_and_find(), or leave it to the factory implementations and predicates. The latter requires an additional Clone bound on the RNG since it has to be both kept in the factory and used in the predicate. The former (currently implemented) allows us to avoid the Clone requirement for the single-threaded version.
  • Allow make_sieve() to take &mut self (to modify the factory)? Currently that's the case.
  • Take the factory by reference or by value in sieve_and_find()? Currently it's taken by value (which makes make_sieve(&mut self) less valuable - only further calls to make_sieve() will be able to use the mutated state).
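
To make the last two points concrete, here is a small sketch of the by-value versus by-reference trade-off. Everything in it is hypothetical (the simplified trait without an RNG, `Counting`, `find_by_value`, `find_by_ref`); it only shows that state mutated through `make_sieve(&mut self)` stays observable to the caller when the factory is passed by reference.

```rust
/// Simplified, hypothetical trait (RNG parameter omitted for brevity).
pub trait SieveFactory {
    type Sieve: Iterator<Item = u64>;
    fn make_sieve(&mut self) -> Option<Self::Sieve>;
}

/// Toy factory that counts how many sieves it has produced (at most 3).
pub struct Counting {
    pub sieves_made: u32,
    pub next: u64,
}

impl Counting {
    pub fn new() -> Self {
        Counting { sieves_made: 0, next: 0 }
    }
}

impl SieveFactory for Counting {
    type Sieve = std::ops::Range<u64>;

    fn make_sieve(&mut self) -> Option<Self::Sieve> {
        if self.sieves_made == 3 {
            return None;
        }
        self.sieves_made += 1;
        let start = self.next;
        self.next += 10;
        Some(start..self.next)
    }
}

/// By reference: the caller keeps the factory and can inspect it afterwards.
pub fn find_by_ref<F: SieveFactory>(factory: &mut F, pred: impl Fn(u64) -> bool) -> Option<u64> {
    while let Some(mut sieve) = factory.make_sieve() {
        if let Some(x) = sieve.find(|&x| pred(x)) {
            return Some(x);
        }
    }
    None
}

/// By value: the factory is consumed, so its mutated state is dropped on return.
pub fn find_by_value<F: SieveFactory>(mut factory: F, pred: impl Fn(u64) -> bool) -> Option<u64> {
    find_by_ref(&mut factory, pred)
}
```

After `find_by_ref(&mut f, |x| x == 15)` the caller can still read `f.sieves_made`; with `find_by_value` that information is gone once the call returns.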


codecov bot commented Nov 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.44%. Comparing base (4ea84ca) to head (3095e20).

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
+ Coverage   99.37%   99.44%   +0.07%     
==========================================
  Files           9       10       +1     
  Lines        1280     1444     +164     
==========================================
+ Hits         1272     1436     +164     
  Misses          8        8              


@fjarri fjarri marked this pull request as draft November 10, 2024 23:27
Contributor

@dvdplm dvdplm left a comment

The code lgtm.

A couple of thoughts though:

  1. Shouldn't searching for safe primes be implemented as a custom sieve? The extra bool argument seems a bit like a left-over from the current API. Like, if someone came and asked for a "twin primes" sieve, we'd point them to implementing a "TwinPrimeSieve" (or whatever), so we should perhaps do the same for safe primes.

  2. I'm a bit hesitant about how this PR assumes that users will want to create a new sieve when a candidate is not prime. Is that always the best strategy? Maybe? Is it always safe to do so? Probably? This objection is pretty much the same point I made on #61 (Add support for custom Sieves) when asking if the current (admittedly minimal) API is not enough.

  3. The make_sieve method taking the exhausted sieve. I assume this is to allow custom impls to inspect the state of the old sieve before making a new one? Perhaps that's worth documenting.

Comment on lines 258 to 259
/// If `safe_primes` is `true`, additionally filters out such `n` that `(n - 1) / 2` are divisible
/// by any of the small factors tested.
Contributor

Suggested change
- /// If `safe_primes` is `true`, additionally filters out such `n` that `(n - 1) / 2` are divisible
- /// by any of the small factors tested.
+ /// If `safe_primes` is `true`, filters out `n` such that `(n - 1) / 2` are divisible
+ /// by any of the small factors tested.

Member Author

I think "additionally" is justified here, since the original filtering of "n" is still in effect.

Contributor

Ok, but there's something wrong with the grammar around "additionally" (the reason I suggested we remove it was I couldn't quite figure out what was wrong and suggest a correction).
How about "If safe_primes is true, extend the filter to skip ns such that (n - 1) / 2 are divisible…"?
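
For readers following the wording debate, this toy sketch shows the behaviour the doc comment is describing. It is an assumption-laden illustration, not the crate's sieve: `SMALL_PRIMES` and `passes_sieve` are made-up names, and candidates are odd `u64` values of at least 3.

```rust
/// Small factors tested by this toy sieve (assumption: a short fixed list).
const SMALL_PRIMES: [u64; 6] = [2, 3, 5, 7, 11, 13];

/// Returns `true` if the odd candidate `n >= 3` survives the sieve.
///
/// `n` is always checked against the small factors. If `safe_primes` is `true`,
/// `(n - 1) / 2` is checked as well, so the filter is extended, not replaced.
pub fn passes_sieve(n: u64, safe_primes: bool) -> bool {
    let half = (n - 1) / 2;
    SMALL_PRIMES.iter().all(|&p| {
        // A number equal to the small prime itself is not a composite hit.
        let n_ok = n == p || n % p != 0;
        let half_ok = !safe_primes || half == p || half % p != 0;
        n_ok && half_ok
    })
}
```

For example, 29 passes the plain filter but is skipped when `safe_primes` is `true`, because (29 - 1) / 2 = 14 is divisible by 2 and 7. The candidate 23 passes in both modes, since (23 - 1) / 2 = 11 is itself one of the small primes.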

src/traits.rs (outdated)

/// Sieves through the results of `sieve_factory` and returns the first item for which `predicate` is `true`.
///
/// If `sieve_factory` signals that no more results can be created, returns `None`.
Contributor

When would this be the case?

Member Author

As I mentioned in another comment, for some algorithms you would only want to create one sieve (e.g. #65). After that goes through the whole range, there is nothing else to create.

@fjarri
Member Author

fjarri commented Nov 11, 2024

> Shouldn't searching for safe primes be implemented as a custom sieve?

It could be, but they'd share 90% of the (somewhat complicated) code.

> I'm a bit hesitant about how this PR assumes that users will want to create a new sieve when a candidate is not prime.

Could you elaborate? The way I see it the new sieve is created when the previous one is exhausted (for the default one, if it reached the increment limit). It does not depend on the predicate.

> The make_sieve method taking the exhausted sieve. I assume this is to allow custom impls to inspect the state of the old sieve before making a new one? Perhaps that's worth documenting.

Yes, but also to distinguish the case where the first sieve is created from the case where a new sieve replaces a previously exhausted one. For example, the algorithm in #65 would only ever want to create one sieve.
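
A minimal sketch of how the "previous sieve" argument could be used to create only one sieve. The trait here is a simplified, hypothetical version (no RNG, `u64` items), and `OneShot` and `find_first` are names invented for the example; it is not the algorithm from #65.

```rust
/// Simplified, hypothetical trait: `previous` is `None` on the first call
/// and `Some(exhausted_sieve)` on every later call.
pub trait SieveFactory {
    type Sieve: Iterator<Item = u64>;
    fn make_sieve(&mut self, previous: Option<&Self::Sieve>) -> Option<Self::Sieve>;
}

/// Factory that covers a fixed range once and never produces a second sieve.
pub struct OneShot {
    pub start: u64,
    pub end: u64,
}

impl SieveFactory for OneShot {
    type Sieve = std::ops::Range<u64>;

    fn make_sieve(&mut self, previous: Option<&Self::Sieve>) -> Option<Self::Sieve> {
        match previous {
            // First call: nothing has been sieved yet.
            None => Some(self.start..self.end),
            // A sieve was already exhausted: the whole range is done.
            Some(_) => None,
        }
    }
}

/// Driver loop: replaces the sieve only when the current one is exhausted.
pub fn find_first<F: SieveFactory>(mut factory: F, pred: impl Fn(u64) -> bool) -> Option<u64> {
    let mut sieve = factory.make_sieve(None)?;
    loop {
        if let Some(x) = sieve.find(|&x| pred(x)) {
            return Some(x);
        }
        sieve = factory.make_sieve(Some(&sieve))?;
    }
}
```

Here the factory needs no internal flag: seeing `Some(_)` is enough to know the single range has been exhausted, so the search ends with `None`.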

@dvdplm
Contributor

dvdplm commented Nov 12, 2024

> > I'm a bit hesitant about how this PR assumes that users will want to create a new sieve when a candidate is not prime.

> Could you elaborate? The way I see it the new sieve is created when the previous one is exhausted (for the default one, if it reached the increment limit). It does not depend on the predicate.

The context here is generalized sieving, with emphasis on "generalized". My worry is not that the code in this PR is bad or doesn't address the request from #61. It's more about not being sure what the API should look like for custom sieves. What is the correct abstraction that accommodates most reasonable sieves? I don't know enough about the topic to definitely tell but I do know there are plenty of sieving strategies. Do they all fit this mold?

In other words: are we ready to commit to an API for generalized sieving?

The code proposed here suggests that the sieving flow for all sieves is:

  1. Run the sieve until a candidate is found
  2. Test the candidate for primality (return if true)
  3. Construct a new sieve and repeat

My worry is that the above flow is fine for some strategies but sub-optimal for others.

@fjarri
Member Author

fjarri commented Nov 12, 2024

Sub-optimal in what sense: not allowing some optimizations, or not covering some use cases?

In any case, we could have asked ourselves the same questions at the v0.5 release. It is possible that someone else will file an issue mentioning a scenario we haven't anticipated, and we'll have to expand the API. I don't see a big problem with that. And it's not as if we are deciding here what should be available and what shouldn't: #61 can already be solved with the existing hazmat API; we're just adding some convenience shortcuts.

@fjarri fjarri marked this pull request as ready for review November 13, 2024 01:29
@fjarri fjarri self-assigned this Nov 15, 2024
Successfully merging this pull request may close these issues.

Add support for custom Sieves.
2 participants