-
Notifications
You must be signed in to change notification settings - Fork 629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(random/unstable): basic randomization functions #5626
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5626 +/- ##
========================================
Coverage 96.24% 96.24%
========================================
Files 485 491 +6
Lines 39269 39440 +171
Branches 5787 5811 +24
========================================
+ Hits 37793 37958 +165
- Misses 1432 1438 +6
Partials 44 44 ☔ View full report in Codecov by Sentry. |
Note that we already have const numbers = [1, 2, 3, 4];
const random = sample(numbers, { weights: [1, 9999, 2, 8888] });
Is there a particular reason why this algorithm is added?
I think this would be a cool addition to |
Only that it was the algorithm suggested in the linked issue, it has a period high enough for any sensible use case, and it seems to give a good distribution of results. One other plus is that arbitrarily chosen positive integer seeds still seem to give good entropy, even if they're apparently "low entropy" numbers like Edit 2: Upon looking into it further I think something like PCG64 (numpy's default) would be a better choice. |
I quite like the idea of making it an option. It does require "unzipping" the values from their weights, but it can still be pretty versatile in terms of usage: const weighted = new Map([["a", 5], ["b", 3], ["c", 2]]);
const result = sample([...weighted.keys()], { weights: [...weighted.values()] }); I'll keep the |
@lionel-rowe Can you move Also I'd prefer to see |
random/between_test.ts
Outdated
Deno.test("randomBetween() throws if min or max are NaN", () => { | ||
assertThrows(() => randomBetween(NaN, 1), AssertionError); | ||
assertThrows(() => randomBetween(1, NaN), AssertionError); | ||
}); | ||
|
||
Deno.test("randomBetween() throws if max is less than min", () => { | ||
assertThrows(() => randomBetween(10, 1), AssertionError); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice checks 👍
@kt3k Wouldn't that be confusing to have two
|
I think we can deprecate
Or how about the name like |
random/_types.ts
Outdated
* randomization. | ||
* @default {Math.random} | ||
*/ | ||
random: () => number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this option is not very web api-ish. I think it would be better to remove it and add some kind of algorithm
option if needed, that calls different functions internally. Doing customRandomFunction() * (max - min) + min
seems very trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this option is not very web api-ish. I think it would be better to remove it and add some kind of
algorithm
option if needed, that calls different functions internally. DoingcustomRandomFunction() * (max - min) + min
seems very trivial.
Callbacks seem pretty web API-ish to me, and they're more versatile than specifying a name of an algorithm — with a callback, it's easy to use with third-party randomization functions/sources. But perhaps more importantly, specifying an algorithm name would surely make any and all of the all implemented algorithms into non-tree-shakeable dependencies of any function taking such options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kt3k do we have a guide on how to name option functions? I thought thy have a suffix of Fn
-> randomFn
. Not sure though.
random/seeded.ts
Outdated
|
||
// For convenience, allowing destructuring and direct usage in callbacks | ||
// equivalently to Math.random() | ||
this.random = this.random.bind(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think destructuring can be done without this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think destructuring can be done without this.
Destructuring loses this
context of unbound methods (as does passing by reference in a callback).
random/seeded.ts
Outdated
* assertEquals(prng.state, [13062938915293834817n, 10846994826184652623n]); | ||
* ``` | ||
*/ | ||
get state(): Readonly<State> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does state
need to be accessible outside the class? I only see it for testing purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think state
should definitely be gettable, as it allows persisting a PRNG's current state for resumption later on. It's arguable whether or not it needs to be settable though, perhaps setting it should only be possible via the constructor or a from
static method.
As with seeding, I'm still not too settled on how this should be exposed, maybe it should have explicit state
and inc
properties (and probably an algorithm
property too, even though there's only one currently implemented).
Edit: I've kept it settable for now (in case it's desirable to have a global PRNG that can be mutated) and also added fromState
static method to get a new instance from state. The SeededRandomState
format is now { algorithm, state, inc }
. I don't really love that it's state.state
, open to suggestions on better naming (notable that the numpy+randomgen implementation is even worse in this respect as it has state.state.state
🫣)
Also notable is that the state
and inc
properties of SeededRandomState
are both bigints, representing uint64s. It's a little unfortunate that this means they can't directly be JSON.stringify
ed, as JSON.stringify
rejects bigints. However, they can still be serialized/deserialized in some other way or stored in Deno KV, indexedDB, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try get this landed soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a lot nicer. Right now, we hardcode the PCG32 algorithm. Should this instead be configurable? In other words, what if we allowed for BYO PRNG algorithm? As to not delay this PR, we can explore this in a follow-up issue/PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah there should be plenty of room to add other options in future. Possibly moving back to a class-based model for the algorithm itself will be desirable (but keep randomSeeded
as a HOF). Then roughly, algorithm classes compatible with randomSeeded
must implement an interface providing static methods like fromSeed
, instance methods like nextUint
, and static properties like numSeedBytes
and nextUintBitSize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last thing. Can you please go through all exceptions and ensure they following the contributing guidelines? See https://github.com/denoland/std/blob/main/archive/tar_stream.ts for an example.
@iuioiua I think they already do, with the exception that error messages beginning with a variable name don't capitalize the variable name: throw new RangeError("max must be greater than or equal to min"); How should those be handled? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested some updates in error messages based on the style guide at https://github.com/denoland/std/blob/1432018f0080422453103994513c782aaf180959/.github/CONTRIBUTING.md#error-messages
Co-authored-by: Yoshiya Hinosawa <stibium121@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Excellent work. Thank you, and thank you to the contributors that chimmed in. Let's track any possible points of improvements in a tracking issue for discussions, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Closes #4848
Currently includes the following:
@std/random/between
:randomBetween
function@std/random/integer-between
:randomIntegerBetween
function@std/random/sample
: enhanced version of (to-be-deprecated?)@std/collections/sample
'ssample
function, with newweights
andrandom
options@std/random/seeded
:randomSeeded
function implementing the PGC32 algorithm@std/random/shuffle
:shuffle
function implementing Fisher–Yates shuffle