Ambiguous system ordering #1312

alice-i-cecile · 2021-01-25T15:17:33Z

alice-i-cecile
Jan 25, 2021
Maintainer

Problem definition

In Bevy's initial scheduler, systems could automatically run in parallel, as long as they didn't access the same data. The previous scheduling rules ensured that, for each piece of data:

All mutable (read-or-write) accesses to that piece of data occur in a consistent, well-defined order.
All immutable accesses to that piece of data occur between two precise mutable accesses to the data.

In order to break ties in the ordering of systems within the same stage, the old scheduler relied on the insertion order of systems into the AppBuilder.

However, #1144 removes these strict guarantees, allowing (by default) any two systems to run in parallel as long as neither accesses a piece of data (on a per archetype-component / resource basis) while the other has mutable access to it.

(For context, an archetype-component is a contiguous chunk of the data for a single component associated with a single archetype, the collection of entities that have the same components. This allows us to mutably access the same component in two different systems at once, as long as we know that there are no archetypes shared by those systems.)

This was done for the following very good reasons:

System ordering should not be implicit, driven by insertion order, because this makes refactoring and understanding the code very hard.
If the system ordering is fixed, largely by accident, then unintended behavior can persist, invisible to tests, until a refactor breaks things.
Not all system orderings that violate the previous scheduling rules result in a logic bug in the game. This caused the previous scheduler to be too restrictive, limiting parallel execution of systems.

However, this resulted in the possibility of ambiguous system orderings, where a system that writes to a piece of data and a system that reads that same piece of data are not guaranteed to run in the same relative order every tick.

This can result in surprising, difficult-to-track logic bugs within the game, as changes may skip being processed (see #68) or important calculations may be done inconsistently. This problem is especially severe for some forms of networked code, where determinism of even inconsequential logic is important.

Note that, even with the previous scheduler, the absolute ordering of our systems was not guaranteed. This is generally fine, because we control data read / write access via SystemParameters. However, this is not always the case: systems that need a consistent execution order do not always share accessed data, which can result in latency or unreliable behavior.

alice-i-cecile · 2021-01-25T15:27:53Z

alice-i-cecile
Jan 25, 2021
Maintainer Author

Handling ambiguous system ordering

Broadly speaking, there are three possible ways to handle ambiguous system ordering:

Proceed silently.
Proceed with warnings.
Fail.

Option 1 is appealing, because it is trivial to implement, allows any behaviour the developer could want, and avoids creating a large number of warnings. However, it is also dangerous, because it does not provide any tools to detect these errors.

Option 3 is strict, and guarantees correctness. However, it limits parallelism, and seriously impacts prototyping speed due to the requirement to manually specify dependencies (for the reasons discussed in the post above).

Option 2 is the middle ground: detecting and warning about these ambiguities to allow, but not require, developers to fix them. Unless a further reason to forbid ambiguous system ordering completely arises (along with a better way to rapidly specify dependencies), this is the correct choice: all the power of option 1, but with some guide rails.

2 replies

aevyrie Jan 25, 2021
Collaborator

I agree that option 2 is the right way to go. In the current system ordering paradigm, I was able to prototype and get to something that "just works" pretty quickly. I didn't start working on system ordering until I felt like the system logic was correct, and I could focus on optimizing system execution. At that point, having warnings to work from would've been a boon for my productivity, instead of piecing together the implicit rules that govern system ordering, or using traces. In short: "all the power of option 1, but with some guide rails"

damccull Jan 25, 2021

Can we get linter support options here for a choice between the 3? Default to 2 #[warn()] but set it up to allow or deny as well.

alice-i-cecile · 2021-01-25T15:39:28Z

alice-i-cecile
Jan 25, 2021
Maintainer Author

Specifying system dependencies

In order to solve ambiguities, we need some way to specify system dependencies (within a stage). Of note: dependencies are directed, and systems can have more than one dependency, but the final graph of system dependencies must be acyclic in order to be resolved, creating a directed acyclic graph. There are three categories of system dependencies:

Soft: The dependent system is blocked if the prerequisite system is scheduled to run and has not yet been evaluated. If the prerequisite system is not run due to run criteria failing, the dependent system can still proceed. These are specified with the after method.
Hard: As above, except the dependent system is also blocked if the run criteria on the prerequisite system are not yet met. These are specified with the only_after family of methods. This needs to be broken down into only_after_check_once and only_after_check_always in order to support looping system sets.
Explicit ambiguities: Allow two (or more) systems to run in an ambiguous order, even though doing so is not usually safe. This is important for silencing warnings. The syntax for this is not yet decided.

Soft dependencies are the default, as they avoid intermingling "should this system run" with "when does this system run", and prevent tangled non-local run criteria from making a mess of your systems in the way that hard dependencies do.

Use soft dependencies when running B before A is a logic error. Use hard dependencies when B depends on A's results. Use explicit ambiguities when you don't care about the order in which A and B run, even though at least one of them has mutable access to data in common.
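The difference between soft and hard dependencies can be sketched in plain Rust. This is only an illustration of the rules described above: `DependencyKind`, `PrerequisiteState` and `is_blocked` are hypothetical names invented for this example and are not part of Bevy's API.

```rust
/// The two orderable dependency categories from the post above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DependencyKind {
    Soft,
    Hard,
}

/// Hypothetical state of the prerequisite system during the current pass.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PrerequisiteState {
    /// Run criteria passed and the system has already been evaluated.
    Finished,
    /// Run criteria passed and the system is scheduled, but has not run yet.
    Scheduled,
    /// Run criteria are not met, so the system is not going to run.
    NotRunning,
}

/// Returns true if the dependent system must keep waiting.
fn is_blocked(kind: DependencyKind, prerequisite: PrerequisiteState) -> bool {
    match (kind, prerequisite) {
        // Once the prerequisite has run, nothing is blocked.
        (_, PrerequisiteState::Finished) => false,
        // Both kinds wait for a prerequisite that is scheduled but not yet done.
        (_, PrerequisiteState::Scheduled) => true,
        // Soft: a prerequisite that is not running does not hold us back.
        (DependencyKind::Soft, PrerequisiteState::NotRunning) => false,
        // Hard: we only run after the prerequisite actually ran.
        (DependencyKind::Hard, PrerequisiteState::NotRunning) => true,
    }
}
```

The only row where the two kinds differ is the last state, which is exactly the "should this system run" versus "when does this system run" split.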

3 replies

alice-i-cecile Jan 26, 2021
Maintainer Author

Commutativity is a powerful tool for thinking about ambiguities:

If and only if your systems are commutative, then their order is guaranteed not to matter (from a logical consistency perspective), even if they trigger the data-access ambiguity checker.

Because commutativity is transitive (if you don't care what order A and B run in, or what order B and C run in, you also don't care what order A and C run in), this inspires a good API for declaring explicit ambiguities. Rather than specifying them pairwise, we should take advantage of sets (ala HashSet), and add systems one at a time to various sets of mutually ambiguous systems. These sets should be mutually exclusive, in order to avoid accidentally merging them.

kaoet Feb 8, 2021

I don't quite understand why commutativity is transitive. Suppose I have three systems A, B, and C. The system C depends on the result of A, so it must run after A. The system B is doing a totally unrelated task and can run at any time.

start ─> A ─> C ─> end
      └──> B ───┘

In this case, I don't care what order A and B run in, or what order B and C run in. But I have to ensure C runs after A.

alice-i-cecile Feb 8, 2021
Maintainer Author

@kaoet I'm slowly working away on a better understanding of how exactly this puzzle works, but you're right. There are some cases where the pairwise-commutativity that I mentioned works well (think applying physics forces in multiple systems) but others like your example where it doesn't hold.

Here's a really simple counterexample. Suppose we have A, which must run before B. Then, we add the "identity system" I, which does literally nothing. Clearly, we don't care about the order it runs in relative to either A or B. But this does not break our previous ordering on A and B, so not all forms of commutative relations are transitive.
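The counterexamples in this sub-thread can be checked mechanically. The sketch below models systems as plain functions over a toy `World` struct (all names here are hypothetical and unrelated to Bevy's real `World`) and tests whether swapping two systems changes the final state for one starting state.

```rust
/// Toy world with one output slot per system; purely illustrative.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct World {
    a_out: i32,
    b_out: i32,
    c_out: i32,
}

type System = fn(&mut World);

/// A produces a value that C consumes.
fn system_a(world: &mut World) {
    world.a_out = 1;
}

/// B does unrelated work.
fn system_b(world: &mut World) {
    world.b_out += 10;
}

/// C depends on the result of A, so it must run after A.
fn system_c(world: &mut World) {
    world.c_out = world.a_out * 2;
}

/// True if running the two systems in either order gives the same final
/// state for this particular starting state (not a proof for all states).
fn order_irrelevant(first: System, second: System, start: World) -> bool {
    let mut forward = start;
    first(&mut forward);
    second(&mut forward);

    let mut backward = start;
    second(&mut backward);
    first(&mut backward);

    forward == backward
}
```

With a default `World`, A/B and B/C come out as order-independent while A/C does not, which matches the diagram: "order doesn't matter" holds pairwise through B without holding between A and C.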

alice-i-cecile · 2021-01-25T15:40:14Z

alice-i-cecile
Jan 25, 2021
Maintainer Author

Reporting ambiguities

A good warning system for ambiguities has the following technical properties:

It can be completed statically, without needing to run the stages to gather information.
Conflicts are detected at the component level, rather than the archetype-component level, because archetype-component independence can readily be changed both at runtime (via adding components or spawning entities) and through small changes to gameplay code.

It also has the following ergonomic properties:

It does not prevent users from running code that has unresolved ambiguities. This is important for prototyping.
It clearly specifies the systems involved, and ideally where to find them.
It provides information about the data in conflict.
It allows you to silence intended ambiguities, as discussed in the post above. This reduces the noise level.
It is reasonably fast.
Specifying dependencies and ambiguities in complex projects doesn't result in excessive amounts of manual boilerplate scattered around the code base.
All possible issues should be detected, to avoid providing a false sense of security (see technical criterion 2 above).
There should be a way to differentiate between multiple copies of a system with the same name. This may just be by encouraging / enforcing labels for ambiguous cases if needed.
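A minimal sketch of a checker with these properties, written in plain Rust rather than against Bevy's internals. It works statically from declared access, compares at the component level, names both systems, and lists the data in conflict. `SystemInfo`, `Ambiguity` and `find_ambiguities` are hypothetical names, and the `ordered` set is assumed to already contain every pair that is ordered through (transitive) explicit dependencies.

```rust
use std::collections::BTreeSet;

/// Hypothetical static description of one system's component access.
struct SystemInfo {
    name: &'static str,
    reads: BTreeSet<&'static str>,
    writes: BTreeSet<&'static str>,
}

impl SystemInfo {
    fn new(name: &'static str, reads: &[&'static str], writes: &[&'static str]) -> Self {
        SystemInfo {
            name,
            reads: reads.iter().copied().collect(),
            writes: writes.iter().copied().collect(),
        }
    }
}

/// One reported ambiguity: the two systems and the components in conflict.
#[derive(Debug, PartialEq)]
struct Ambiguity {
    first: &'static str,
    second: &'static str,
    components: Vec<&'static str>,
}

/// Components both systems access where at least one access is a write.
fn conflicting_components(a: &SystemInfo, b: &SystemInfo) -> Vec<&'static str> {
    let mut conflicts = BTreeSet::new();
    for component in &a.writes {
        if b.reads.contains(component) || b.writes.contains(component) {
            conflicts.insert(*component);
        }
    }
    for component in &b.writes {
        if a.reads.contains(component) {
            conflicts.insert(*component);
        }
    }
    conflicts.into_iter().collect()
}

/// Reports every pair of systems that conflicts on data and has no ordering.
/// `ordered` holds index pairs (i, j) meaning "system i runs before system j".
fn find_ambiguities(
    systems: &[SystemInfo],
    ordered: &BTreeSet<(usize, usize)>,
) -> Vec<Ambiguity> {
    let mut report = Vec::new();
    for i in 0..systems.len() {
        for j in (i + 1)..systems.len() {
            if ordered.contains(&(i, j)) || ordered.contains(&(j, i)) {
                continue; // explicitly ordered, so not ambiguous
            }
            let components = conflicting_components(&systems[i], &systems[j]);
            if !components.is_empty() {
                report.push(Ambiguity {
                    first: systems[i].name,
                    second: systems[j].name,
                    components,
                });
            }
        }
    }
    report
}
```

This pairwise scan is quadratic in the number of systems per stage, and it does nothing about noise: silencing intended ambiguities would be a separate filtering step on the returned report.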

Initial experiments

@Ratysz has done very early experiments on reporting ambiguities.

This approach reports ambiguities to the command line, separated by stage, and uses the type name to report conflicting systems when labels are unavailable.

It provides some useful results, but fails the two technical criteria listed above and struggles badly with noisiness due to the lack of explicit ambiguities.

Visually reporting ambiguities

This is universally acknowledged as "would be great" at some point in the future. We need a solid way to create data for it first though, and a better understanding of the patterns involved.

Commutativity checker

Because commutativity is such a powerful property for safely allowing non-obvious ambiguities, we could use property-based testing to check for commutativity automatically, notifying the user if two systems that are commutative are not allowed to run in parallel, or vice versa.

This is very useful guidance, but typically cannot be done via static analysis at compile time (except perhaps in the simplest cases), so it is likely better suited to a manually triggered tool, run as part of integration testing.

P.S. There's some nastiness around float math not being commutative: using a crate like float_eq and restricting to values seen in actual gameplay will help improve practical reliability.
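To illustrate the float issue: two "add a value" operations applied in different orders can disagree in the last bit, because each intermediate sum is rounded. The sketch below uses only the standard library; `apply` and `approx_eq` are hypothetical helpers, and the tolerance is an arbitrary choice for this example rather than a recommendation.

```rust
/// Stand-in for a system that adds a contribution to a shared value.
fn apply(value: f64, delta: f64) -> f64 {
    value + delta
}

/// Tolerance-based comparison, in the spirit of what float_eq offers.
fn approx_eq(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

/// Applies two contributions to `start` in both orders and returns both results.
fn both_orders(start: f64, first: f64, second: f64) -> (f64, f64) {
    let forward = apply(apply(start, first), second);
    let backward = apply(apply(start, second), first);
    (forward, backward)
}
```

For example, `both_orders(0.3, 0.2, 0.1)` returns two values that are not bitwise equal but that `approx_eq` accepts with a tolerance of `1e-12`, so a commutativity checker that compares with exact equality would flag these two operations as order-dependent.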

0 replies

alice-i-cecile · 2021-01-25T15:40:41Z

alice-i-cecile
Jan 25, 2021
Maintainer Author

More ergonomic dependency specification

Placeholder post to summarize future suggestions.

3 replies

TheRawMeatball Feb 8, 2021
Collaborator

Here's my vision for a better API which removes the 1-1 mapping between systems and labels to increase ergonomics:

| A_________________________________________________________| H_____________________|
|                                                           |                       |
| B__________________________| Sync__| C____________________+_______________________|
|                            |       | D____________________+_______________________|
| E________| F_______________|                              |
|          | G______________________________________________|

Graph description:
+ means no intersection
| X_______| means a label
If multiple | are lined up, this means they become a "sync point".

the graph above could be expressed as

.add_label("Sync")
.add_label("A")
.add_label("B").before("Sync")
.add_label("C").after("Sync")
.add_label("D").after("Sync")
.add_label("E").before("F")
.add_label("F").before("Sync")
.add_label("G").after("E").before("H")
.add_label("H").after("A")

Then, systems can be added to the graph in three ways.

// This adds the system to an anonymous label which only has the constraint `.before("D")`
.add_system(system_1.before("D"))
// This adds the system to an anonymous label which only has the constraint `.after("D")`
.add_system(system_1.after("D"))
// This is what the other two methods simplify to:
// if a system has a label, the label will grow to accommodate the system, and the system will wait for
// all systems which are part of a label that runs before this. A system can have multiple labels.
.add_system(system_1.label("D"))
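For a label graph like this to be schedulable, its before/after constraints have to form a directed acyclic graph. A minimal sketch of that check using Kahn's algorithm, in plain Rust: `topological_order` is a hypothetical helper, not part of the proposed API, and each edge is written as `(runs_first, runs_later)`.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Returns one valid run order for the labels, or `None` if the
/// before/after constraints contain a cycle.
fn topological_order(
    labels: &[&'static str],
    edges: &[(&'static str, &'static str)],
) -> Option<Vec<&'static str>> {
    // Number of not-yet-satisfied prerequisites per label.
    let mut indegree: BTreeMap<&'static str, usize> =
        labels.iter().map(|label| (*label, 0)).collect();
    let mut successors: BTreeMap<&'static str, BTreeSet<&'static str>> = BTreeMap::new();

    for &(before, after) in edges {
        indegree.entry(before).or_insert(0);
        let is_new_edge = successors.entry(before).or_default().insert(after);
        let count = indegree.entry(after).or_insert(0);
        if is_new_edge {
            *count += 1;
        }
    }

    // Start with every label that has no prerequisites.
    let mut ready: VecDeque<&'static str> = indegree
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(label, _)| *label)
        .collect();

    let mut order = Vec::new();
    while let Some(label) = ready.pop_front() {
        order.push(label);
        if let Some(next) = successors.get(label) {
            for &successor in next {
                let count = indegree
                    .get_mut(successor)
                    .expect("every edge endpoint was registered above");
                *count -= 1;
                if *count == 0 {
                    ready.push_back(successor);
                }
            }
        }
    }

    // Labels left over with unsatisfied prerequisites are part of a cycle.
    if order.len() == indegree.len() {
        Some(order)
    } else {
        None
    }
}
```

Feeding in the nine labels and the constraints listed above (`("B", "Sync")`, `("Sync", "C")`, and so on) yields an order that respects every constraint, while adding a contradictory edge such as `("C", "B")` makes the function return `None`.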

alice-i-cecile Feb 8, 2021
Maintainer Author

This question, and especially the comment above, also ties into #1375.

TheRawMeatball Feb 8, 2021
Collaborator

A future extension of this proposed API could be to drop "labels" completely, and merge their functionality into labeled SystemSets, which manage the if and when of how systems run.

Ratysz · 2021-01-25T16:16:48Z

Ratysz
Jan 25, 2021
Collaborator

A couple of corrections:

[...] the old scheduler only considers data on a per-component basis [...]

It does take archetypes into account, too. At least, it looked like it does - I did not write the archetype-component stuff, and it wouldn't have existed just because, so...

[...] because we control data read / write access via SystemParameters, we don't care about the ordering of two systems

Not entirely true: not all systems that need to have a consistent execution order share common accessed data. Current executor assumes that they all do, it doesn't actually know.

2 replies

alice-i-cecile Jan 25, 2021
Maintainer Author

Removed

(Unlike the new scheduler, the old scheduler only considers data on a per-component basis, rather than breaking it apart by archetype as well.)

alice-i-cecile Jan 25, 2021
Maintainer Author

Not entirely true: not all systems that need to have a consistent execution order share common accessed data. Current executor assumes that they all do, it doesn't actually know.

Right, I saw an example of this the other day from @aevyrie's bevy_mod_picking code where it was causing latency issues.

alice-i-cecile · 2021-02-07T20:46:00Z

alice-i-cecile
Feb 7, 2021
Maintainer Author

Specifying Explicit Ambiguities

The ambiguity checker introduced in #1144 produces very noisy outputs. Many of these are false positives, and should be silenced.

Explicit ambiguities between systems should be specified the same way #1144 lets you specify explicit dependencies: via system descriptors.

An initial solution should be quick to implement and simple, in order to land in the 0.5 release.

Proposal: Pairwise Ambiguities
The obvious choice would be to silence warnings on a line-by-line basis, by marking an ambiguity between two systems as spurious using the system's label.

Sample syntax

.add_system(system_1.system().ambiguous("system_2_label"))

This uses system labels, just like .after.
Thoughts:

natural and simple
results in a quadratic explosion of boilerplate as you get more systems that are marked as ambiguous

Proposal: Ambiguity Sets via Set Labels

IMO, explicit ordering ambiguities should be specified as a set for the following reasons:

By declaring that the order of two systems doesn't matter, you're making a promise about the (consequential) commutativity of your operations.
Commutativity is a transitive property. If the order of A and B doesn't matter, and the order of B and C doesn't matter, it is necessarily true that the order of A and C doesn't matter.

During reporting, ignore any ambiguity between systems that are part of the same ambiguity set.

During AppBuilder construction, error (or warn?) if systems belong to more than one label of this sort, since the sets should agglomerate into each other when there are any common members. Produce an error if two systems share a label but have an explicit dependency between them.

Sample syntax for a "set label"-based approach:

.add_system(my_system.system().ambiguous("ambiguity_set_label"))

Thoughts:

a natural fit to be eventually extended by System organization overhaul and the road to a stageless schedule #1375.
encourages clear, explicit reasoning about the identity of systems in each ambiguity set
requires creation of a large number of label names that lack much semantic content
hard to extend across crates / modules due to namespacing issues
all of the problems with stringly-typed labels discussed in Why use strings as labels? #103

Proposal: Ambiguity Sets via Agglomeration

Use the same underlying construct of ambiguity sets, but specify the relationship on a system-to-system basis. Agglomerate any sets produced in this way.

During reporting, ignore any ambiguity between systems that are part of the same ambiguity set.
Produce an error if two systems are added to the same ambiguity set but have an explicit dependency between them.

Sample syntax

.add_system(system_1.system().ambiguous("system_2_label"))

Thoughts:

this uses the same syntax as (and looks identical to) the obvious pairwise solution, and works as expected in simple cases
you can very quickly construct large sets in this way as you prototype / build
it's much harder to debug the exact identity of systems in an ambiguity set than with labels
errors when constructing inappropriate ambiguity sets are a really valuable safety tool to prevent accidentally merging two huge otherwise-disjoint sets
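Agglomerating pairwise declarations into sets is a natural fit for a union-find (disjoint-set) structure. The sketch below is one possible shape for it in plain Rust, keyed by system label. `AmbiguitySets` and its methods are hypothetical names, and the check against explicit dependencies described above is left out.

```rust
use std::collections::HashMap;

/// Union-find over system labels. Each label points at a parent label,
/// and a label that is its own parent is the representative of its set.
#[derive(Default)]
struct AmbiguitySets {
    parent: HashMap<&'static str, &'static str>,
}

impl AmbiguitySets {
    /// Finds the representative of the set containing `label`,
    /// registering the label as its own singleton set if it is new.
    fn find(&mut self, label: &'static str) -> &'static str {
        let parent = *self.parent.entry(label).or_insert(label);
        if parent == label {
            return label;
        }
        let root = self.find(parent);
        // Path compression: point directly at the representative.
        self.parent.insert(label, root);
        root
    }

    /// Records a pairwise declaration and merges the two sets.
    fn mark_ambiguous(&mut self, a: &'static str, b: &'static str) {
        let (root_a, root_b) = (self.find(a), self.find(b));
        if root_a != root_b {
            self.parent.insert(root_a, root_b);
        }
    }

    /// True if an ambiguity between `a` and `b` would be silenced.
    fn same_set(&mut self, a: &'static str, b: &'static str) -> bool {
        self.find(a) == self.find(b)
    }
}
```

Note what this does with two declarations, A with B and B with C: A and C end up in the same set even though nobody declared that pair. That is the convenience of this proposal and also the debugging hazard listed in the thoughts above.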

0 replies

alice-i-cecile · 2021-02-07T21:13:54Z

alice-i-cecile
Feb 7, 2021
Maintainer Author

Archetype invariants

The basic idea was proposed by @TheRawMeatball on Discord.

Specify some invariant about the archetypes available. E.g. the Player and Enemy components will never occur together.
Use this guarantee to inform the ambiguity checker, silencing warnings that only occur due to this type of conflict.
At runtime during debug mode, dump the current list of archetypes to allow a check to occur that these constraints were satisfied.

7 replies

TheRawMeatball Feb 8, 2021
Collaborator

B and C are disjoint
C always occurs with B

So C cannot exist, because its existence would have to violate at least one of the rules.

alice-i-cecile Feb 8, 2021
Maintainer Author

My attempts at refining this a bit further:

ambiguities are detected on a worst-case basis by default when checking at the component level
if an overlap can occur, we must conclude that it does occur
thus, simple archetype invariants of the form "A always occurs with B" are not useful
the most useful (only?) invariant that we care about is "entities never have both A and B": aka "A and B are disjoint"
~~disjoint sets are transitive: if A and B are disjoint, and B and C are disjoint, then A and C are similarly disjoint~~
you can similarly define "A only occurs with B", implicitly creating a disjoint relationship between A and all other components
this allows us to quickly specify k-partite subgraphs in our system dependencies

Refined:

In AppBuilder, specify either .disjoint(A: Component, B: Component) for each pair, or .only_with(A, (B, C, ...)).
Components that are .only_with are disjoint from all other components.
In the ambiguity checker, create phantom archetypes that contain the largest possible archetypes.
Check for ambiguities on an ArchetypeComponent level using these phantom archetypes.
At runtime during debug mode, record the complete list of archetypes that were created to allow a check to occur that these constraints were satisfied.

Edit: Since disjointedness isn't fully transitive, like @BoxyUwU says below, we need a more sophisticated algorithm to generate the worst case scenario.

TheRawMeatball Feb 8, 2021
Collaborator

In the ambiguity checker, create phantom archetypes that contain the largest possible archetypes.

Would this even be possible? We don't have a list of all possible components. I feel like generating the warnings based on what component gets accessed, and then trying to prove that it's impossible, would be easier.

alice-i-cecile Feb 8, 2021
Maintainer Author

We don't have a list of all possible components.

Ah, right... If we did you could construct them easily enough, but I don't see any way around that limitation.

So: generate the warnings via component access, then filter out impossible collisions based on the defined invariants.
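A rough sketch of that filtering step, in plain Rust. It assumes each side of a reported collision can be summarised by the components its query requires on an entity, and that the only invariant available is "A and B are disjoint". `DisjointInvariants` and `can_overlap` are hypothetical names, and no transitive inference is attempted, in line with the edit above.

```rust
use std::collections::BTreeSet;

/// Declared "these two components never occur on the same entity" pairs.
struct DisjointInvariants {
    pairs: BTreeSet<(&'static str, &'static str)>,
}

impl DisjointInvariants {
    fn new(pairs: &[(&'static str, &'static str)]) -> Self {
        let pairs = pairs.iter().map(|&(a, b)| Self::key(a, b)).collect();
        DisjointInvariants { pairs }
    }

    /// Stores each pair in sorted order so lookups are symmetric.
    fn key(a: &'static str, b: &'static str) -> (&'static str, &'static str) {
        if a <= b {
            (a, b)
        } else {
            (b, a)
        }
    }

    fn are_disjoint(&self, a: &'static str, b: &'static str) -> bool {
        self.pairs.contains(&Self::key(a, b))
    }
}

/// Worst-case check: two queries may match the same entity unless some
/// component required by one is declared disjoint from one required by the other.
fn can_overlap(
    required_a: &[&'static str],
    required_b: &[&'static str],
    invariants: &DisjointInvariants,
) -> bool {
    !required_a
        .iter()
        .any(|&a| required_b.iter().any(|&b| invariants.are_disjoint(a, b)))
}
```

A warning produced by component access would then be dropped only when `can_overlap` returns false for the two queries involved. With `Player` and `Enemy` declared disjoint, a query requiring `Player` and `Transform` cannot collide with one requiring `Enemy` and `Transform`, but it can still collide with one that requires only `Transform`.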

aevyrie Feb 10, 2021
Collaborator

Adding this suggestion from Discord, as @alice-i-cecile suggested it might make for a useful debug lint.

Has there been any talk about explicit component dependencies? For example, it's common that a component relies on the entity having some other components, or its queries will return empty. Similar to implicit system ordering, I find myself documenting component requirements in comments. I'd love to have the ability to express this in code.

e.g.:

struct MyComponent {
    _deps: Dependencies<(Camera, GlobalTransform)>,
}

This might then generate an error or warning at runtime if a component is added but its prerequisite components don't exist.
