
gradient clipping doesn't work with dict params #6

Open
EdwardTyantov opened this issue Jul 5, 2017 · 2 comments

@EdwardTyantov

When using per-layer learning rates I get an exception:

  File "/home/tyantov/workspace/kaggle-planet/planet/train.py", line 375, in main
    tr.run(config)
  File "/home/tyantov/workspace/kaggle-planet/planet/train.py", line 183, in run
    train_score = boilerplate.train(train_loader, self._model, criterion, optimizer, epoch)
  File "/home/tyantov/workspace/kaggle-planet/planet/boilerplate.py", line 217, in train
    optimizer.step()
  File "/home/tyantov/workspace/kaggle-planet/planet/generic_models/yellowfin.py", line 202, in step
    torch.nn.utils.clip_grad_norm(self._var_list, self._clip_thresh)
  File "/home/tyantov/anaconda2/lib/python2.7/site-packages/torch/nn/utils/clip_grad.py", line 17, in clip_grad_norm
    parameters = list(filter(lambda p: p.grad is not None, parameters))
  File "/home/tyantov/anaconda2/lib/python2.7/site-packages/torch/nn/utils/clip_grad.py", line 17, in <lambda>
    parameters = list(filter(lambda p: p.grad is not None, parameters))
AttributeError: 'dict' object has no attribute 'grad'

Code:

    if exact_layers:
        logger.info('Learning exact layers, number=%d', len(exact_layers))
        parameters = []
        for i, layer in enumerate(exact_layers):
            if isinstance(layer, tuple) and len(layer) == 2:
                layer, multiplier = layer
                init_multiplier = 1
            elif isinstance(layer, tuple) and len(layer) == 3:
                layer, init_multiplier, multiplier = layer
            else:
                multiplier = 1
                init_multiplier = 1
            lr = config.lr * multiplier
            init_lr = config.lr * multiplier * init_multiplier
            logger.info('Layer=%d, lr=%.5f', i, init_lr)
            parameters.append({'params': layer.parameters(), 'lr': init_lr, 'after_warmup_lr': lr})
    else:
        logger.info('Optimizing all parameters, lr=%.5f', config.lr)
        parameters = model.parameters()

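For context, standard torch.optim optimizers consume this list of dicts directly as parameter groups. A minimal usage sketch (the choice of SGD here is illustrative, not taken from the original code):

    import torch.optim as optim

    # Each dict becomes a parameter group; the per-group 'lr' overrides the
    # default, and extra keys such as 'after_warmup_lr' stay on the group.
    optimizer = optim.SGD(parameters, lr=config.lr)
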
Exact line: parameters.append({'params': layer.parameters(), 'lr': init_lr, 'after_warmup_lr': lr})
Standard PyTorch optimizers work with dict params (parameter groups); YellowFin does not.
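As a rough illustration only (not necessarily how the fix in the PR works, and the helper name is made up), the clipping call in YFOptimizer.step() could flatten the parameter groups before calling clip_grad_norm:

    import torch
    from torch.nn.utils import clip_grad_norm  # renamed clip_grad_norm_ in later PyTorch releases

    def _flatten_param_groups(var_list):
        # Accept either a flat iterable of parameters or a list of
        # param-group dicts like {'params': ..., 'lr': ...}, and return a
        # flat list of parameters, mirroring what torch.optim.Optimizer
        # does internally.
        params = []
        for group in var_list:
            if isinstance(group, dict):
                params.extend(list(group['params']))
            else:
                params.append(group)
        return params

    # In YFOptimizer.step(), clip the flattened list instead of the raw groups:
    # clip_grad_norm(_flatten_param_groups(self._var_list), self._clip_thresh)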

@EdwardTyantov (Author)

I've created a PR to fix this: #7

@mitliagkas

Thank you for your pull request! The changes seem reasonable to me and definitely useful to have. We will test it soon and incorporate it.
