[BUG] Flyte execution IDs collide with past executions #2800

Closed
2 tasks done
annadcunningham opened this issue Aug 23, 2022 · 2 comments
Labels
bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers

Comments

@annadcunningham

Describe the bug

When running a Flyte workflow that generates subworkflows, the execution IDs of the subworkflows can collide with past executions of the same subworkflow. This causes the results to be read from the cache even though it is a new and unrelated execution.

Expected behavior

Subworkflow execution IDs should not repeat, and each new execution of a subworkflow should actually run rather than be read from the cache.

Additional context to reproduce

We are using the following FlytePropeller image:
cr.flyte.org/flyteorg/flytepropeller:v1.1.15@sha256:6630864d9adc1e6ecd980376b244f826cbee5962a36fb9b7760a840bad70447b.

Because this issue happens very rarely, and only after a large number of executions, I am not sure of the best steps to reproduce it.
I believe the most reliable way to reproduce it would be:

  • Write a workflow that launches many instances of the same subworkflow
  • Run it many many times
  • Eventually an execution ID will repeat
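
The steps above can be approximated offline, without a cluster. The following is a minimal sketch, not FlytePropeller's actual code: it assumes the derived IDs come from a 32-bit hash (FNV-1a is used here purely as a stand-in, since this issue does not say which hash is used), and `synthetic_ids` and `count_collisions` are hypothetical helpers that mimic the `<execution>-n2-0-n3-0-dn<i>-0` naming seen in our logs.

```python
import random
import string

FNV_OFFSET_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Standard 32-bit FNV-1a; an assumed stand-in for the real ID hash."""
    h = FNV_OFFSET_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def synthetic_ids(n_executions: int, fanout: int, seed: int = 0):
    """Yield made-up node IDs shaped like the ones in this issue."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    for _ in range(n_executions):
        parent = "f" + "".join(rng.choice(alphabet) for _ in range(7))
        for i in range(fanout):
            yield f"{parent}-n2-0-n3-0-dn{i}-0"


def count_collisions(ids, hash_fn=fnv1a_32):
    """Return pairs of distinct input strings that share a hash value."""
    seen = {}
    collisions = []
    for s in ids:
        h = hash_fn(s.encode())
        if h in seen and seen[h] != s:
            collisions.append((seen[h], s))
        else:
            seen[h] = s
    return collisions


if __name__ == "__main__":
    pairs = count_collisions(synthetic_ids(n_executions=20_000, fanout=10))
    print(f"{len(pairs)} colliding pairs among 200000 synthetic IDs")
```

Raising `n_executions` or `fanout` makes collisions more frequent, which matches what we see only after a long history of runs.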

I can try to help come up with something more solid if needed.

Screenshots

Some additional information from the thread on Slack:

after some major digging, we found the colliding input strings:
fbffih2q-n2-0-n3-0-dn0-0 and f24nu5ka-n2-0-n3-0-dn4-0 hash to the same value: fldyczra
fbffih2q-n2-0-n3-0-dn1-0 and f24nu5ka-n2-0-n3-0-dn5-0 hash to: fkyq4zuy
so between these 2 executions fbffih2q and f24nu5ka we had like 7 or so collisions
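
For a rough sense of how likely such collisions are, the birthday bound can be evaluated directly. This is only an estimate under stated assumptions: `hash_bits=32` is a guess based on the eight-character IDs above, not something confirmed in this issue, and the formula treats the hash as uniformly random, which the correlated pairs above (dn0 with dn4, dn1 with dn5) suggest may be optimistic.

```python
import math


def collision_probability(n_ids: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_ids uniformly random
    hash values of width hash_bits coincide (birthday bound)."""
    space = 2 ** hash_bits
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2 * space))


def ids_for_probability(p: float, hash_bits: int) -> int:
    """Approximate number of IDs needed to reach collision probability p."""
    space = 2 ** hash_bits
    return math.ceil(math.sqrt(2 * space * math.log(1 / (1 - p))))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(n, round(collision_probability(n, hash_bits=32), 4))
    print("IDs for a 50% chance:", ids_for_probability(0.5, hash_bits=32))
```

Under these assumptions, a collision becomes more likely than not after somewhere under a hundred thousand distinct IDs, which is easy to reach when every execution fans out into many subworkflows.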

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@kumare3
Contributor

kumare3 commented Aug 23, 2022

Cc @EngHabu

@hamersaw
Contributor

tracked in duplicate issue - #2778
