Skip to content

Latest commit

 

History

History
39 lines (33 loc) · 1.56 KB

uniform-distribution.md

File metadata and controls

39 lines (33 loc) · 1.56 KB

Generate a uniform distribution

Explore this snippet here

Description

Generating a uniform distribution of (pseudo-) random numbers is often useful when performing statistical tests on data. This snippet demonstrates a method for generating a deterministic sample from a uniform distribution:

  1. Create a column of row indices - here the GENERATE_ARRAY function is used, though the ROW_NUMBER window function may also be useful.
  2. Convert the row indices into pseudo-random numbers using the FARM_FINGERPRINT hashing function.
  3. Map the hashes into the correct range
with params as (
  select
    10000 as num_samples, -- How many samples to generate
    1000000 as precision, -- How much precision to maintain (e.g. 6 decimal places)
    2 as min,             -- Lower limit of distribution
    5 as max,             -- Upper limit of distribution
),
samples as (
  select
    -- Create hashes (between 0 and precision)
    -- Map [0, precision] -> [min, max]
    mod(abs(farm_fingerprint(cast(indices as string))),  precision) / precision * (max - min) + min as samples
  from
    params,
    unnest(generate_array(1, num_samples)) as indices
)

select * from samples
samples
3.55
3.316
2.146
...