Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add random distribution utilities from AnotherWorld #43

Closed
wants to merge 2 commits into from

Conversation

naalit
Copy link
Contributor

@naalit naalit commented Aug 10, 2021

This adds two utilities for working with different probability distributions which were previously part of AnotherWorld: PDist representing a probability distribution which can be sampled from, and ChanceRandomizer which picks from a list of weighted choices. I'm not sure if CoreWorlds is the best place for them, but they need to be somewhere since they're used by GrowingFlora, and they seem generally useful as well.

@skaldarnar
Copy link
Contributor

Since we've sunsetted our math library https://github.com/MovingBlocks/TeraMath and instead went full force on joml we've put our custom extensions and utility methods into https://github.com/MovingBlocks/joml-ext. However, since this is not really geometry-related I'm not sure whether that's a good fit.

There are some utility methods left in the engine itself, see org/terasology/math.

I'd think these random distribution utilities also make sense for non-worldgen modules, e.g., when defining loot tables (https://github.com/Terasology/Drops) or when handling breeding/evolution (https://github.com/Terasology/SoundyGenetics).

Given that' I'd even consider adding these classes to the engine.

@pollend what do you think?

@naalit
Copy link
Contributor Author

naalit commented Aug 10, 2021

I guess it makes most sense to me to put them in the same place as things like noise implementations, which would be the engine. I wonder if there are any plans to add a Noise module or something like that?

@pollend
Copy link
Member

pollend commented Aug 11, 2021

I don't really like these objects. ChanceRandomizer can be simply replaced with a fixed array there is no need to do this class abstraction. not sure what to think about PDist.

@skaldarnar
Copy link
Contributor

I don't really like these objects. ChanceRandomizer can be simply replaced with a fixed array there is no need to do this class abstraction. not sure what to think about PDist.

I really like the abstraction this class provides by hiding the fixed array. I can understand that we try to avoid unnecessary instantiations, but I don't think using some objects here is our main problem... In addition, I'd rather have streamlined way to handle randomization like this, and only optimize it if it actually becomes a problem.

Maybe we can compare the usage sides for a better assessment.

@pollend
Copy link
Member

pollend commented Aug 12, 2021

I don't really like these objects. ChanceRandomizer can be simply replaced with a fixed array there is no need to do this class abstraction. not sure what to think about PDist.

I really like the abstraction this class provides by hiding the fixed array. I can understand that we try to avoid unnecessary instantiations, but I don't think using some objects here is our main problem... In addition, I'd rather have streamlined way to handle randomization like this, and only optimize it if it actually becomes a problem.

Maybe we can compare the usage sides for a better assessment.

might be more useful to have something like this: https://en.cppreference.com/w/cpp/numeric/random/discrete_distribution

[10,12,13,4,15]

[10, 22, 35, 39, 54]

generate a random number from (0 to 54) and to a binary search to find the closest value. log n cost but most of our use cases don't need to be O(1) at the cost of space complexity.

@naalit
Copy link
Contributor Author

naalit commented Aug 12, 2021

Maybe we can compare the usage sides for a better assessment.

Here's how it's used in GrowingFlora:

Random random = new FastRandom(seed);
ChanceRandomizer<PlantSpawnDefinition> definitionsForBiome = getDefinitionsForBiome(biome, foliageDefinitionsCache, foliageDefinitions);
PlantSpawnDefinition foliageDefinition = definitionsForBiome.randomizeObject(random);
// getDefinitionsForBiome():

result = new ChanceRandomizer<>(100);
for (PlantSpawnDefinition floraDefinition : definitions.get(biome.getId())) {
    result.addChance(floraDefinition.getRarity(), floraDefinition);
}
result.initialize();

And without ChanceRandomizer:

Random random = new FastRandom(seed);
PlantSpawnDefinition[] definitionsForBiome = getDefinitionsForBiome(biome, foliageDefinitionsCache, foliageDefinitions);
PlantSpawnDefinition foliageDefinition = definitionsForBiome[random.nextInt(definitionsForBiome.length)];
// getDefinitionsForBiome():

result = new PlantSpawnDefinition[100];
Collection<PlantSpawnDefinition> biomeDefinitions = definitions.get(biome.getId());
float chanceSum = biomeDefinitions.stream().map(PlantSpawnDefinition::getRarity).reduce(0f, Float::sum);
int index = 0;
for (PlantSpawnDefinition floraDefinition : biomeDefinitions) {
    int maxIndex = index + ((int) ((floraDefinition.getRarity() / chanceSum) * result.length));
    for (; index < maxIndex; index++) {
        result[index] = floraDefinition;
    }
}

So not much more code, but it's somewhat less readable. I'm not sure whether it's worth it or not.

@pollend
Copy link
Member

pollend commented Aug 12, 2021

@tolziplohu

I was thinking about this for a bit and maybe a better abstraction would be

GrabBag<Object>
DiscreteGrapbBag<Objecct>
DiscreteDistribution

for a grabbag you can add items multiple times to the bag. A DiscreteDistribution will use that algorithm I mentioned in my previous post where you collect a cumulative list of values and do some kind of binary search to find the value. DiscreteGrabBag will use a DiscreteDistribution to pull items from a list. This should be a bit more generic and usable.

@naalit
Copy link
Contributor Author

naalit commented Aug 12, 2021

@pollend

I'm not sure what the advantage of a DiscreteDistribution that produces indices by default is? Why not just use a ChanceRandomizer<Integer>, or DiscreteGrabBag<Integer> or something? The alternative implementation strategy could be a good idea, though, since it wouldn't have to deal with granularity.

@pollend
Copy link
Member

pollend commented Aug 13, 2021

int arr[] = new int[10];
def dist = new DiscreteDistrubtion({2,1,2,2,2,1,1}) equivalent [0,0,1,2,2,3,3,4,4,5,6]
arr[dist.get()]++;

I was thinking we can abstract this across a couple different types of distributions. there is a pretty good list of implementations in the C++ standard library. wondering if we could just provide that in joml-ext @tolziplohu . It's generic enough that it it can be used in quite a few ways.

image

@skaldarnar
Copy link
Contributor

The alternative implementation strategy could be a good idea, though, since it wouldn't have to deal with granularity.

Yep, that's sounds like a good idea. I want to hide any complexity around managing arrays and chances and things like that behind a simple interface.

I'm not really sure what the granularity is in ChanceRandomizer to be honest...

@pollend I think we have distributions like that in some places in the code base. I do like that we can describe chances on a higher abstraction level than handling the distribution arrays ourselves - what we do under the hood I don't care too much about.

I do like code like this, where all the nitty-gritty details about the math are hidden in DiscreteDistribution - with all kinds of optimizations we can think of.

DiscreteDistribution<Item> itemDrops = 
  new DiscreteDistribution(random)
    .add(crudeAxe, 30)
    .add(crudeHammer, 40)
    .add(crudeShovel, 25)
    .add(treasure, 5);
    
// ...

Item drop = itemDrops.next();

We could think about re-using the Map.Entry class for key-value pairs to have better support for filling in those distributions programmatically.

Apache Commons does have some Probability Distributions. Can we re-use those instead of re-implementing them?

@naalit
Copy link
Contributor Author

naalit commented Aug 13, 2021

The ChanceRandomizer<T>/DiscreteDistribution<T> seems to be equivalent to Apache Commons' EnumeratedDistribution<T>, and the PDist class here is something like a RealDistribution, so I think it does cover these use cases - but a lot of the pages on the docs seem to be broken and some of it is outdated, and it would mean adding a dependency.

@naalit
Copy link
Contributor Author

naalit commented Aug 16, 2021

Superseded by MovingBlocks/Terasology#4849, with the rewritten DiscreteDistribution<T>.

@naalit naalit closed this Aug 16, 2021
@naalit naalit deleted the feat/random-distribution-utilities branch August 16, 2021 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants