
Lightning Tuner functions #1503

Closed
solalatus opened this issue Jan 21, 2023 · 0 comments · Fixed by #1609
Labels
triage Issue waiting for triaging

Comments

@solalatus
Contributor

Is your feature request related to a current problem? Please describe.
The Tuner functions of Lightning (lr_find() and scale_batch_size()) would come in handy and could be relatively easy to implement.

Describe proposed solution
Currently it is possible to access them, but only in a rather hacky way, in two steps:

  1. Start a model fit and interrupt it (this initializes the trainer, the datasets, and all the other necessary bells and whistles).
  2. Then access the tuner via some very awkward syntax:

lr_finder = model_nhits.model.trainer.tuner.lr_find(
    model_nhits.model,
    train_dataloaders=model_nhits.model.trainer._data_connector._train_dataloader_source.instance,
)

(see here)
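For reference, a minimal sketch of what these tuner calls look like in plain PyTorch Lightning, assuming the Lightning 1.x API used above (a Trainer exposing a tuner attribute); the tiny module and random data below are placeholders, not Darts code:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class TinyModule(pl.LightningModule):
    # Minimal regression module, only here to make the tuner call concrete.
    def __init__(self, lr=1e-3):
        super().__init__()
        self.lr = lr
        self.net = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.net(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)


train_loader = DataLoader(TensorDataset(torch.randn(256, 8), torch.randn(256, 1)), batch_size=32)
model = TinyModule()
trainer = pl.Trainer(max_epochs=1)

# LR range test: runs a short training with increasing LR and returns an object
# whose .suggestion() is the proposed (max) learning rate.
lr_finder = trainer.tuner.lr_find(model, train_dataloaders=train_loader)
print(lr_finder.suggestion())

# trainer.tuner.scale_batch_size() works analogously, but additionally expects a
# `batch_size` attribute on the model or datamodule that it can mutate.
```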

The observation is that fit(), and through it fit_from_dataset(), already does essentially all of the setup; the only part not needed is a single line here, the one that actually starts the training.

Suggestion:

  • Add a flag like setup_only=True to fit() (default False), thus letting people use the fit function for setup only.
  • Make the lr_find() and scale_batch_size() functions available from the Darts model, so this awkward way of accessing them is no longer needed (see the hypothetical sketch after this list).
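A purely hypothetical sketch of how the proposed user-facing API could look (neither the setup_only flag nor the model-level tuner methods exist in Darts at the time of writing; the dataset is only there to make the snippet concrete):

```python
from darts.datasets import AirPassengersDataset
from darts.models import NHiTSModel

series = AirPassengersDataset().load()
model_nhits = NHiTSModel(input_chunk_length=24, output_chunk_length=12)

# 1. fit() with the proposed setup_only flag would build the trainer, datasets,
#    and dataloaders without actually running the training loop.
model_nhits.fit(series, setup_only=True)  # hypothetical flag

# 2. The Lightning tuner functions would then be exposed on the Darts model,
#    instead of being dug out of the trainer internals as shown above.
lr_finder = model_nhits.lr_find()                  # hypothetical method
best_batch_size = model_nhits.scale_batch_size()   # hypothetical method
```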

Describe potential alternatives
An alternative would be to refactor fit() so that it makes an internal setup() call and then continues with training, and to let people call this setup() externally, but I think this is way more effort for basically no benefit.

Additional context
For more demanding model trainings, both tuner options are quite beneficial: scale_batch_size() for maximizing GPU utilization, and lr_find() for more efficient LR policies such as 1cycle, which depend heavily on a good max-LR estimate.
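To illustrate the 1cycle point with plain PyTorch (OneCycleLR is parameterized by a max_lr, which is exactly what the LR range test estimates; the numbers below are placeholders):

```python
import torch

net = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)

# In practice max_lr would come from lr_finder.suggestion(); 0.1 is a placeholder.
max_lr = 0.1
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=max_lr, epochs=10, steps_per_epoch=100
)
```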
