
reduce setting global variables to reduce torch compile graph breaks #6541

Conversation

NirSonnenschein
Contributor

Setting global variables during training creates graph breaks when using torch.compile (reading global variables does not). This commit attempts to reduce the setting of global variables in the checkpointing flows.
There are two main uses of setting global variables:

  1. Sharing data between functions
  2. Establishing that this is the first call to the code

In most cases the data held in global variables can be computed on demand, or set once as initial state in a configure function; a minimal sketch of this pattern is shown below. For the "check that this is the first run" use case, the code was moved into the configure function.

Auto-merge was automatically disabled October 1, 2024 11:39

Head branch was pushed to by a user without write access

@tjruwase tjruwase added this pull request to the merge queue Oct 10, 2024
Merged via the queue into microsoft:master with commit d7ca3d8 Oct 10, 2024
11 checks passed