Towards New Trainer API #458

Closed
ppwwyyxx opened this issue Oct 28, 2017 · 0 comments
Labels
enhancement feature or enhancement

Collaborator

ppwwyyxx commented Oct 28, 2017

The new trainer API has been pushed to master and the examples have been updated. I expect this to be the final API and to stay stable; if things go well, we will move to tensorpack 1.0. This issue tracks related problems.

What's new

New docs about this change are in Trainer, Training Interface, and Write a Trainer. The API docs have also been updated.

Why

The new trainer APIs are isolated from ModelDesc and TrainConfig, which arguably pack arbitrary training options together and are therefore not a good design. There is some related discussion in #318 (comment).
ModelDesc and TrainConfig are now used only in wrappers on top of trainers, to keep the trainer interface clean.

What will happen

Run export TENSORPACK_TRAIN_API=v2 to opt in to the new API.
For backward compatibility, the move to the new API will be gradual:

  1. (now ~ +1 month) v1 is still the default. Users should set the v2 option manually, e.g. in .bashrc. All old code should run unchanged, because it will import the old trainer. All examples, however, set the envvar to v2 and use the v2 API.
  2. (+1 month ~ +6 months) v2 will be the default. Old code will still run, thanks to some hacks that maintain compatibility.
  3. (+6 months ~ ) Old trainer code may be cleaned up.

Also, new features such as easier Keras model training and horovod will be available only in v2.
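
The opt-in can also be done per script rather than in .bashrc, which is what setting the envvar inside an example amounts to. The following is a minimal sketch, assuming tensorpack reads TENSORPACK_TRAIN_API when it is imported, so the variable has to be set before the import:

```python
import os

# Opt in to the v2 trainer API for this process only.
# Assumption: tensorpack checks this variable at import time,
# so it must be set before the first `import tensorpack`.
os.environ["TENSORPACK_TRAIN_API"] = "v2"

# import tensorpack  # would go here, after the variable is set
```

Setting the variable in the script keeps the choice next to the code that depends on it, while the shell export applies to every script run from that shell.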

What to do

Use v2 today for new code!

  1. export TENSORPACK_TRAIN_API=v2 (using the v2 API directly without setting this should also work for single-cost training, but with warnings)
  2. If you use old trainers, replace SomeTrainer(config, ...).train() with launch_train_with_config(config, SomeTrainer(...)).
  3. If you use a custom trainer, check out the new docs, as well as the GAN trainer.
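
The shape of the change in step 2 can be illustrated with toy stand-ins. This is only a sketch of the pattern, not tensorpack's implementation: the classes below, their constructor arguments, and the setup/train method names are hypothetical, and only the call form launch_train_with_config(config, SomeTrainer(...)) comes from the text above.

```python
class TrainConfig:
    """Hypothetical stand-in: a bundle of training options."""

    def __init__(self, model, dataflow, max_epoch):
        self.model = model
        self.dataflow = dataflow
        self.max_epoch = max_epoch


class SomeTrainer:
    """Hypothetical stand-in: a trainer that never sees TrainConfig."""

    def __init__(self, num_gpu=1):
        self.num_gpu = num_gpu
        self.log = []

    def setup(self, model, dataflow):
        # Record what the trainer was asked to build.
        self.log.append("setup %s" % model)

    def train(self, max_epoch):
        # Pretend to run the training loop.
        for epoch in range(1, max_epoch + 1):
            self.log.append("epoch %d" % epoch)


def launch_train_with_config(config, trainer):
    """Toy wrapper: unpacks the config and drives the trainer."""
    trainer.setup(config.model, config.dataflow)
    trainer.train(config.max_epoch)


# Old style (v1):  SomeTrainer(config, ...).train()
# New style (v2):  the trainer is built without the config.
config = TrainConfig(model="toy-model", dataflow=[1, 2, 3], max_epoch=2)
trainer = SomeTrainer(num_gpu=1)
launch_train_with_config(config, trainer)
```

The point of the pattern is that only the wrapper knows about the config object, so a trainer can be constructed and reused without it.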