Random for Int64 #41412

vorotynsky · 2020-08-26T18:13:16Z

I created methods to generate random floats and longs.

Can you help me to find tests for System.Random?

Created method in System.Random for generating longs and floats. Fix dotnet#26741

Dotnet-GitSync-Bot · 2020-08-26T18:13:19Z

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

dnfadmin · 2020-08-26T18:13:31Z

All CLA requirements met.

It should make the algorithm more stable. Fix dotnet#26741

Clockwork-Muse · 2020-08-26T19:06:23Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+                unchecked
+                {
+                    part = (short) InternalSample();
+                }


This has an effect similar to a modulus operation, but I haven't done the math to tell whether it will evenly generate all values in the range (due to InternalSample() not returning `int.MaxValue`). Possibly suspect.

Clockwork-Muse · 2020-08-26T19:14:36Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+                result |= (long) part;
+                result <<= 16;


Suspect. part can be negative (when InternalSample() returns a sufficiently large result), so when the cast happens this sets bits that were probably not intended (and doesn't set bits that probably were). Additionally, due to the fact that part can be negative, this means the overall result can be negative as well.

Note that if part was restricted to only be positive, bits 16, 32, or 48 would never be set, which would be a problem.

Is this case better?

    int i = InternalSample();
    long result = 0;
    result = result | (long) InternalSample();
    result = result | (1u << 31 & (i << 2));
    result = result | (1u & i); // upd: (i >> 2)
    result <<= 32;
    result = result | (long) InternalSample();
    result = result | (1u << 31 & (i << 3));
    result = result | (1u & (i >> 3));

Seems overly complicated, this would read better:

    long result = 0;
    result |= ((long) InternalSample()) << 32;
    result |= ((long) InternalSample()) << 1;
    result |= ((long) InternalSample()) & 1L;

(Not tested, and I'm not sure of the statistical implications of this)

Point 2 here (the one about 0 not being returned from Next()) seems somewhat off, given that of course there aren't going to be many values in the range they indicate (there are more numbers available above that range than below it; it's not half the range, as the writer seems to expect, but 1/2^32 of it, or so).

You can't use (as they suggest there):

long result = NextDouble() * long.MaxValue;

... because doubles only have 2^53 unique values in the range returned, and there are some integral values that a double can't represent (some values above 2^53-ish), which means some of them will be getting skipped.
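To make the precision limit concrete, here is a minimal sketch of a round trip through double. The class and method names are hypothetical and not part of the PR; the only assumption is the standard 53-bit significand of an IEEE 754 double.

```csharp
using System;

// Hypothetical helper, not part of System.Random: illustrates why
// scaling NextDouble() cannot reach every long value.
public static class DoublePrecisionDemo
{
    // Returns true when the value survives a long -> double -> long round trip.
    public static bool SurvivesRoundTrip(long value)
    {
        return (long)(double)value == value;
    }
}
```

`DoublePrecisionDemo.SurvivesRoundTrip(1L << 53)` is true, while `DoublePrecisionDemo.SurvivesRoundTrip((1L << 53) + 1)` is false, because the nearest double to 2^53 + 1 is 2^53. Any long produced by scaling a double therefore skips such values.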

result |= ((long) InternalSample() >> 1) & 1L;

It should be shifted due to unreachable int.MaxValue.

Do bits 1-30 of InternalSample() have equal probability?

> It should be shifted due to unreachable int.MaxValue

You're right. (I woke up this morning realizing this)

However, it turns out that won't be sufficient, because that leaves bits 32 and 1 with a distribution problem as well. It needs to be modified to:

    long result = 0;
    result |= ((long) InternalSample()) << 32;
    // InternalSample() returns _numbers_ evenly distributed in the range [0, int.MaxValue),
    // but this results in the 0th bit having a slight decrease in probability,
    // because there is one fewer odd value than even values.
    result |= ((long) InternalSample()) << 2;
    result |= ((long) InternalSample() >> 1) & 3L;

Do bits 1-31 of InternalSample() have equal probability?

Assuming bits are indexed [0, 31]:

- bit 0 has the aforementioned problem
- bits 1-30 should have equal probability, or the numbers won't be evenly distributed (.... I think....?)
- bit 31 will never be set

... this is, however, assuming that InternalSample() actually returns a value in the given range uniformly (that is, ignoring bugs). That I won't comment on.

I found that the number 0b0111_1111_1111_1111_1111_1111_1111_1110, with its first and last bits deleted, occurs only once in the range [0; int.MaxValue). All other numbers occur twice.

| last | mid | first | value |
|---|---|---|---|
| 0 | 111..111 | 0 | value in range [0; int.MaxValue) |
| 0 | 111..111 | 1 | int.MaxValue |
| 1 | 111..111 | 0 | -2 |
| 1 | 111..111 | 1 | -1 |

This means that InternalSample() is a poor source of random bits if its output is uniformly distributed over that range.

> I found that the number 0b0111_1111_1111_1111_1111_1111_1111_1110, with its first and last bits deleted, occurs only once in the range [0; int.MaxValue). All other numbers occur twice.

Argh, I think you're right. The result of this is that any individual bit is slightly less likely to be set than not set.
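The bias can be checked exhaustively on a scaled-down model. The sketch below assumes a hypothetical k-bit analogue of InternalSample() that is uniform over [0, 2^k - 1), so the all-ones value is excluded just as int.MaxValue is; the names are illustrative only.

```csharp
using System;

// Hypothetical scaled-down model: a k-bit analogue of InternalSample(),
// assumed to be uniform over [0, 2^k - 1), i.e. the all-ones value is excluded.
public static class BitBiasDemo
{
    // Counts how many values in [0, 2^k - 1) have the given bit set.
    public static int CountSet(int k, int bit)
    {
        int max = (1 << k) - 1; // exclusive upper bound, like int.MaxValue
        int count = 0;
        for (int value = 0; value < max; value++)
        {
            if (((value >> bit) & 1) != 0)
            {
                count++;
            }
        }
        return count;
    }
}
```

For k = 4 there are 15 possible values, and `BitBiasDemo.CountSet(4, bit)` returns 7 for every bit from 0 to 3. Each bit is therefore set in 7 of 15 cases and clear in 8 of 15, which matches the observation that every bit is affected, not only bit 0.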

This means our options are:

1. Acknowledge this issue, and accept it.
2. Still do this via bit-twiddling, but find a better source of randomness.
3. Do this mathematically instead of via bit-twiddling. I think that might be possible, but haven't worked out the operations that would be needed. I don't think you'd need to call InternalSample() any additional times.

Anybody else want to comment on the direction that should be taken here?

(This is another time I want to ~~Do Not~~ Spindle, Fold, ~~or~~ and Mutilate somebody over API choices. Like, argh)

Clockwork-Muse · 2020-08-26T19:15:16Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+        ==============================================================================*/
+        public virtual long NextInt64()
+        {
+            return FullLong();


See problems with FullLong(), but the current implementation does not obey the documented contract.

Clockwork-Muse · 2020-08-26T19:19:15Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+                return 0;
+
+            long fullLong = (long) FullLong();
+            return (long.MaxValue & fullLong) % maxValue;


This distorts results, for two reasons:

1. The bitwise-and, possibly in an attempt to make the range positive, subtly distorts the number of results, due to there being one more negative number than there are positive (~~or two, if~~ because long.MaxValue is never returned). If FullLong() returned the correct range, the bitwise-and should be unnecessary.
2. The modulus operator distorts the results, because usually maxValue will not be an integral factor of the range of the random. See this for a discussion, and the potential fix if modulus will continue to be used.

(comment edited because I was wrong about number distribution)
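One common way to keep the modulus without the bias is rejection sampling: discard draws that fall in the incomplete final block. The following is a minimal sketch, not the fix used in this PR. `next64` is a hypothetical source assumed to return uniformly distributed 64-bit values, which FullLong() as reviewed here does not yet provide.

```csharp
using System;

public static class UnbiasedRangeDemo
{
    // next64 is a hypothetical source assumed to return uniformly
    // distributed 64-bit values. Requires maxValue > 0.
    public static long NextBelow(Func<ulong> next64, long maxValue)
    {
        if (maxValue <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxValue));
        }

        ulong bound = (ulong)maxValue;
        // A multiple of bound no greater than ulong.MaxValue. Draws at or
        // above it are rejected so that every remainder is equally likely.
        ulong limit = ulong.MaxValue - (ulong.MaxValue % bound);
        ulong draw;
        do
        {
            draw = next64();
        } while (draw >= limit);

        return (long)(draw % bound);
    }
}
```

Because `limit` is a multiple of `bound`, the accepted draws split into equally sized blocks. The loop retries only when a draw lands at or above `limit`, so for any bound that fits in a long the expected number of retries is small.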

Clockwork-Muse · 2020-08-26T19:20:40Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+                throw new ArgumentOutOfRangeException(nameof(minValue), SR.Format(SR.Argument_MinMaxValue, nameof(minValue), nameof(maxValue)));
+            }
+
+            long range = maxValue - minValue;


Silent overflow. Compare with int Next(int, int).

Fixed in a new commit.

Max and min are both positive or both negative. They can't create overflow.
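The overflow risk in `maxValue - minValue` arises when the bounds have opposite signs, for example long.MinValue and long.MaxValue. The int overload can widen to long, but long has no wider built-in signed type. One option, sketched here under the assumption that minValue <= maxValue, is to compute the width as a ulong; the helper name is hypothetical.

```csharp
using System;

public static class RangeWidthDemo
{
    // Width of [minValue, maxValue) as an unsigned value. Assumes
    // minValue <= maxValue. The casts wrap in an unchecked context, and the
    // wrapped difference is the true width because it always fits in 64 bits.
    public static ulong Width(long minValue, long maxValue)
    {
        return unchecked((ulong)maxValue - (ulong)minValue);
    }
}
```

`RangeWidthDemo.Width(long.MinValue, long.MaxValue)` yields ulong.MaxValue, whereas the signed subtraction wraps to -1 in an unchecked context.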

Clockwork-Muse · 2020-08-26T19:28:27Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+            long range1 = range / 2;
+            long range2 = range - range1;
+
+            return NextInt64(range1 + 1) + NextInt64(range2) + minValue;


This is attempting to dodge additional overflow issues, but I'm unsure whether this will properly distribute values over the entire range. Possibly suspect.
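The distribution concern can be checked by enumeration. The sum of two independent uniform draws is not uniform in general. The sketch below assumes that NextInt64(n) returns each value in [0, n) with equal probability and simply counts how often each sum occurs; the names are hypothetical.

```csharp
using System;

public static class SumOfUniformsDemo
{
    // Exhaustively counts how often each sum a + b occurs for
    // a in [0, countA) and b in [0, countB), each pair being equally likely.
    public static int[] CountSums(int countA, int countB)
    {
        int[] counts = new int[countA + countB - 1];
        for (int a = 0; a < countA; a++)
        {
            for (int b = 0; b < countB; b++)
            {
                counts[a + b]++;
            }
        }
        return counts;
    }
}
```

With range = 4 the diff computes range1 = 2 and range2 = 2, so it adds a draw from [0, 3) to a draw from [0, 2). `SumOfUniformsDemo.CountSums(3, 2)` gives the counts 1, 2, 2, 1 for the sums 0 through 3, so the middle values come up twice as often as the ends.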

Clockwork-Muse · 2020-08-26T19:31:33Z

src/libraries/System.Private.CoreLib/src/System/Random.cs

+        ==============================================================================*/
+        public virtual float NextSingle()
+        {
+            return (float) Sample();


The conversion from double to float is going to round unrepresentable double values to representable float ones - and it's not clear that the distribution of that is going to be even. Is there a reference that can be added as a comment as to whether this is safe?
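One concrete consequence of the rounding, shown as a minimal sketch: a double strictly below 1.0 can narrow to exactly 1.0f. This assumes Sample() can return values within about 3e-8 of 1.0, which is not verified here; the helper name is hypothetical.

```csharp
using System;

public static class FloatRoundingDemo
{
    // Narrowing conversion as done by "(float) Sample()" in the diff.
    public static float Narrow(double sample)
    {
        return (float)sample;
    }
}
```

`FloatRoundingDemo.Narrow(0.99999999) == 1.0f` is true, because the largest float below 1.0 is roughly 0.99999994 and the input lies closer to 1.0. If the method is meant to return values in [0, 1), the plain cast can violate that bound.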

vcsjones · 2020-08-26T19:35:37Z

> Can you help me to find tests for System.Random?

https://github.com/dotnet/runtime/blob/master/src/libraries/System.Runtime.Extensions/tests/System/Random.cs

vcsjones · 2020-08-26T19:45:29Z

Re: the current build failures.

When you add a new public API, you also need to update the reference source. Details on how to do that are here. https://github.com/dotnet/runtime/blob/2a4284b9f3eb2ad95fdf324f0249d372abff96df/docs/coding-guidelines/updating-ref-source.md

If the ref source updated successfully, you should see the ref source modified locally. It should add your new public members somewhere here:

runtime/src/libraries/System.Runtime/ref/System.Runtime.cs

Lines 3126 to 3137 in afa4b2f

    
    public partial class Random
    {
        public Random() { }
        public Random(int Seed) { }
        public virtual int Next() { throw null; }
        public virtual int Next(int maxValue) { throw null; }
        public virtual int Next(int minValue, int maxValue) { throw null; }
        public virtual void NextBytes(byte[] buffer) { }
        public virtual void NextBytes(System.Span<byte> buffer) { }
        public virtual double NextDouble() { throw null; }
        protected virtual double Sample() { throw null; }
    }

Updating the reference source will probably change some other things that you didn't intend (new or removed attributes on unrelated changes) - that's okay, just undo those changes and keep the changes that changed the ref source for System.Random.

vorotynsky · 2020-08-26T19:53:39Z

@Clockwork-Muse, @vcsjones, thanks!
I'll think about the math problems.

Fix dotnet#26741 Signed-off-by: Vorotynsky Maxim <vorotynsky.maxim@gmail.com>

stephentoub · 2020-10-23T14:26:06Z

@vorotynsky, thanks for your efforts here. Are you still working on this?

vorotynsky · 2020-10-23T18:31:43Z

No, I had problems building the tests and gave up. I don't know how to fix them.

Random for Int64

96cb58a

Better long random generation

5f0f238

vorotynsky force-pushed the long-random-numbers branch from 3026e5e to 5f0f238 Compare August 26, 2020 18:52

Clockwork-Muse reviewed Aug 26, 2020

vorotynsky force-pushed the long-random-numbers branch from d1ee749 to 1bf5b92 Compare August 26, 2020 20:16

Update public API for System.Random

3088738

vorotynsky force-pushed the long-random-numbers branch from 1bf5b92 to 3088738 Compare August 26, 2020 20:17

vcsjones mentioned this pull request Aug 26, 2020

Http2_MultipleConnectionsEnabled_IdleConnectionTimeoutExpired_ConnectionRemovedAndNewCreated test failure #40115

Closed

Improve calculations in random long

0441e69

vorotynsky force-pushed the long-random-numbers branch from dae4918 to 0441e69 Compare August 27, 2020 22:28

vorotynsky closed this Oct 23, 2020

ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
