
Constant combinator set_signal function is very slow (50% of the execution time on big blueprints with lots of combinators) #28

Open · louga31 opened this issue Jun 15, 2022 · 3 comments
Labels: enhancement (New feature or request), performance (This issue is performance related)

louga31 (Contributor) commented Jun 15, 2022

I'm doing some profiling on the project to find out why creating my big blueprints takes so long, and I found that the set_signal function is very slow because of the check for signal validity.
I have not found what makes this check so slow, so I'm posting this as an issue instead of fixing it directly.

The check for signal validity takes an enormous 24-25s out of the 28s the function takes to run (87%)
[profiler screenshot: set_signal runtime breakdown]

When trying to find the slow part of it, I checked the normalize_signal_id function, but it does not seem to be responsible for the slowdown as it took only 117 ms to run.
[profiler screenshot: normalize_signal_id timing]
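
For reference, a profile like this can be reproduced with something along the lines of the sketch below (the entity setup and iteration count are illustrative assumptions; my actual blueprints are far larger):

import cProfile
import pstats

from draftsman.entity import ConstantCombinator

entity = ConstantCombinator()

def workload(n_iter=100000):
    # Repeatedly set a signal, cycling through the combinator's slots
    for i in range(n_iter):
        entity.set_signal(i % entity.item_slot_count, "signal-A", 100)

cProfile.run("workload()", "set_signal.prof")
# Show the 10 functions with the highest cumulative time
pstats.Stats("set_signal.prof").sort_stats("cumulative").print_stats(10)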

@redruin1 added the enhancement and performance labels on Jun 16, 2022
redruin1 (Owner) commented

This is somewhat unsurprising, as I had a hunch that validating the signal on every call would get slow quickly. I am surprised, though, that the signal_dict function is reported as responsible for the majority of the time taken, and not something like Schema itself. Is there a minimal reproducible example to illustrate the volume/type of signals you're setting?

redruin1 (Owner) commented

After coming back to this after a while, this seems to be mostly the fault of Schema's validate function. Given a minimal testbed:

from draftsman.entity import ConstantCombinator

entity = ConstantCombinator()
n_iter = 1000000
for i in range(n_iter):
    # Cycle through every signal slot, setting each to ("signal-A", 100)
    entity.set_signal(i % entity.item_slot_count, "signal-A", 100)

The current culprit appears to be this line:

signal = signatures.SIGNAL_ID_OR_NONE.validate(signal)

Running the above script with that line present in set_signal results in a runtime of ~127 seconds for the 1,000,000 iterations. If that line is swapped with the (almost) equivalent statement:

signal = signals.signal_dict(signal) if signal is not None else None

This finishes the same script in ~14 seconds, or about a 90% decrease in runtime.
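
For reference, timings in this ballpark can be reproduced with a plain wall-clock measurement around the loop; the harness below is only a sketch, not the exact script used:

import time

from draftsman.entity import ConstantCombinator

entity = ConstantCombinator()
n_iter = 1000000

start = time.perf_counter()
for i in range(n_iter):
    entity.set_signal(i % entity.item_slot_count, "signal-A", 100)
elapsed = time.perf_counter() - start
print(f"{n_iter} set_signal calls took {elapsed:.1f}s")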

Given that this is primarily a Schema issue, I'm not sure what the best approach moving forward should be. Draftsman was always designed with an "API first, performance second" philosophy; due to the number of safety checks that Draftsman performs, it will never be as fast as an optimized, special-purpose routine. This case is particularly ridiculous, though, so I might investigate more performant data validation libraries, or consider writing one myself; another particularly slow point in the code seems to be the evaluation of the CONTROL_BEHAVIOR schema.

Note that if performance is what you're really after, there's nothing stopping you from manipulating each underlying attribute manually, though this discards all of the safety/sanity checks that Draftsman enforces:

from draftsman.entity import ConstantCombinator

entity = ConstantCombinator()

n_iter = 1000000
# Pre-allocate one filter entry per signal slot
entity._control_behavior["filters"] = [None] * entity.item_slot_count
for i in range(n_iter):
    # entity.set_signal(i % entity.item_slot_count, "signal-A", 100)
    index = i % entity.item_slot_count
    entity._control_behavior["filters"][index] = {
        "index": index + 1,  # filter indices are 1-based in blueprint strings
        "signal": {"name": "signal-A", "type": "virtual"},
        "count": 100
    }

This finishes the same 1 million iterations in just under a second, at the cost of readability.
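
A middle ground, if you want some safety back on this fast path, is to normalize the signal once outside the loop and reuse the result. A sketch, assuming signal_dict is importable from draftsman.data.signals (the import path here is inferred from the snippet above, not confirmed):

from draftsman.entity import ConstantCombinator
# Assumed import path for the signal_dict helper referenced above
from draftsman.data.signals import signal_dict

entity = ConstantCombinator()
n_iter = 1000000

# Normalize the signal name a single time, outside the hot loop
signal = signal_dict("signal-A")

entity._control_behavior["filters"] = [None] * entity.item_slot_count
for i in range(n_iter):
    index = i % entity.item_slot_count
    entity._control_behavior["filters"][index] = {
        "index": index + 1,
        "signal": signal,  # every slot shares the one pre-built dict
        "count": 100
    }

This keeps the per-iteration cost down to plain dictionary writes, while the signal name still passes through the same helper as the faster one-line replacement above.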

louga31 (Contributor, Author) commented Jul 25, 2022

Wonderful findings! I'm not that much after raw performance (I prefer a safe library, and that's why I chose Draftsman). The problem is that I have massive blueprints that take an insane amount of time to process (my two main projects are an automated base and a computer, and they take around 30 minutes to process). I'm now looking for any way I can globally optimize them down to a reasonable amount of time. I might start manipulating the underlying attributes like you showed in your example, but only after I have tested everything with all the safety checks enabled (I may start working on some sort of conversion program, as manually doing it everywhere is very error prone).

@redruin1 added this to the 2.0 milestone on Mar 13, 2023