This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

Commit: cleaning
anaprietonem committed Dec 9, 2024
1 parent 6f4e025 commit 963e482
Showing 1 changed file with 1 addition and 10 deletions.
11 changes: 1 addition & 10 deletions src/anemoi/training/diagnostics/callbacks/checkpoint.py
@@ -77,20 +77,11 @@ def model_metadata(self, model: torch.nn.Module) -> dict:

         return self._model_metadata

-    def on_validation_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule) -> None:
-        """Save a checkpoint at the end of the validation stage."""
-        del pl_module
-        if not self._should_skip_saving_checkpoint(trainer) and not self._should_save_on_train_epoch_end(trainer):

Rilwan-Adewoyin (Member) commented on Dec 10, 2024:
@anaprietonem
This all looks good. Do we know the exact conditions under which
"not self._should_skip_saving_checkpoint(trainer) and not self._should_save_on_train_epoch_end(trainer)" evaluates to true?
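On the question above, a rough sketch of when the guard is true. It assumes the two private helpers behave as they do in the Lightning 2.x `ModelCheckpoint`; that is an assumption, since they are private and may differ in the version used here. `TrainerState`, `should_skip_saving`, `should_save_on_train_epoch_end`, and `guard` are hypothetical stand-ins, not Lightning objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainerState:
    """Hypothetical stand-in for the pl.Trainer fields the guard is assumed to read."""

    fast_dev_run: bool = False
    fitting: bool = True  # assumed to mean trainer.state.fn == TrainerFn.FITTING
    sanity_checking: bool = False
    global_step: int = 100
    check_val_every_n_epoch: int = 1
    num_val_batches: int = 10
    val_check_interval: float = 1.0


def should_skip_saving(t: TrainerState, last_global_step_saved: int) -> bool:
    # Assumed logic: skip in fast_dev_run, outside fit, during the sanity check,
    # or when a checkpoint was already written at this global step.
    return (
        t.fast_dev_run
        or not t.fitting
        or t.sanity_checking
        or last_global_step_saved == t.global_step
    )


def should_save_on_train_epoch_end(t: TrainerState, save_on_train_epoch_end: Optional[bool]) -> bool:
    # Assumed logic: an explicit user setting wins; otherwise save on train
    # epoch end only when validation cannot be relied on to trigger the save.
    if save_on_train_epoch_end is not None:
        return save_on_train_epoch_end
    if t.check_val_every_n_epoch != 1:
        return False
    if t.num_val_batches == 0:
        return True
    return t.val_check_interval == 1.0


def guard(
    t: TrainerState,
    last_global_step_saved: int = 0,
    save_on_train_epoch_end: Optional[bool] = None,
) -> bool:
    # The expression quoted in the review comment.
    return not should_skip_saving(t, last_global_step_saved) and not should_save_on_train_epoch_end(
        t, save_on_train_epoch_end
    )
```

Under these assumptions, the guard is true during a regular fit step that has not yet been checkpointed, provided `save_on_train_epoch_end` is explicitly `False`, or it is unset and validation does not run exactly once per training epoch.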

-            monitor_candidates = self._monitor_candidates(trainer)
-            if self._every_n_epochs >= 1 and (trainer.current_epoch + 1) % self._every_n_epochs == 0:
-                self._save_topk_checkpoint(trainer, monitor_candidates)
-            self._save_last_checkpoint(trainer, monitor_candidates)
-
     def on_fit_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule) -> None:
         del pl_module
         if not self._should_skip_saving_checkpoint(trainer) and not self._should_save_on_train_epoch_end(trainer):
             monitor_candidates = self._monitor_candidates(trainer)
-            # Need to correct the checkpoint epoch to the last epoch
+            # PTL advances one epoch at end of training, Need to correct the checkpoint epoch to the last epoch
             monitor_candidates["epoch"] = trainer.current_epoch - 1
             self._save_topk_checkpoint(trainer, monitor_candidates)
             self._save_last_checkpoint(trainer, monitor_candidates)
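
The epoch arithmetic in this hunk can be illustrated in isolation. This is a minimal sketch with hypothetical helper names (`saves_after_epoch`, `corrected_final_epoch`). It assumes `current_epoch` is zero-based and, as the added comment states, that the counter sits one past the last completed epoch by the time `on_fit_end` runs.

```python
def saves_after_epoch(current_epoch: int, every_n_epochs: int) -> bool:
    """Mirror of the removed on_validation_end check (zero-based epoch assumed)."""
    return every_n_epochs >= 1 and (current_epoch + 1) % every_n_epochs == 0


def corrected_final_epoch(current_epoch_at_fit_end: int) -> int:
    """Epoch recorded by on_fit_end: the counter minus the extra advance."""
    return current_epoch_at_fit_end - 1


# With every_n_epochs=2, list the zero-based epochs (out of 0..5) that trigger a save.
saving_epochs = [e for e in range(6) if saves_after_epoch(e, every_n_epochs=2)]

# After 10 epochs (0..9), the counter is assumed to read 10 at fit end.
final_epoch = corrected_final_epoch(10)
```

Without the `- 1` correction, the checkpoint written at fit end would be labelled with an epoch that was never trained.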
