- Dynamic Routing Between Capsules by Sara Sabour, Nicholas Frosst and Geoffrey Hinton
- Official implementation (TensorFlow) by Sara Sabour
Image source: Mike Ross, A Visual Representation of Capsule Network Computations
- For details, run `python main.py --help`
- PyTorch (http://www.pytorch.org)
- NumPy (http://www.numpy.org/)
- GPU
- Per-GPU `batch_size` = 128
- Initial `learning_rate` = 0.001
- Exponential `lr_decay` = 0.96
- Number of routing iterations (`num_routing`) = 3
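As a rough sketch of how these training hyper-parameters could fit together in PyTorch, the snippet below wires an Adam optimizer (the optimizer used in the original paper) to an exponential learning-rate schedule. The function name, the `model` argument, and the assumption that the decay is applied once per epoch are illustrative, not taken from this repository:

```python
import torch.optim as optim

# Illustrative only: one way to wire up the hyper-parameters listed above.
# `model` is any nn.Module; applying the decay once per epoch is an assumption.
def build_optimizer(model, learning_rate=0.001, lr_decay=0.96):
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    # Multiplies the learning rate by `lr_decay` on each scheduler.step()
    # (intended here to be called once per epoch).
    scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=lr_decay)
    return optimizer, scheduler
```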
Loss function hyper-parameters (see `loss.py`):
- Lambda for Margin Loss = 0.5
- Scaling factor for reconstruction loss = 0.0005
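For reference, below is a minimal sketch of the CapsNet objective from the paper (margin loss plus down-scaled reconstruction loss) using the values above. Function and argument names are placeholders, and the actual `loss.py` may differ in details such as batch reduction:

```python
import torch
import torch.nn.functional as F

# Sketch of the loss described in the paper, not the repository's exact code.
# class_lengths: (batch, num_classes) capsule lengths ||v_k||
# labels:        (batch, num_classes) one-hot targets T_k
# reconstructions / images: decoder output and original input images
def capsule_loss(class_lengths, labels, reconstructions, images,
                 lam=0.5, recon_scale=0.0005, m_plus=0.9, m_minus=0.1):
    # Margin loss: T_k * max(0, m+ - ||v_k||)^2 + lambda * (1 - T_k) * max(0, ||v_k|| - m-)^2
    present = labels * F.relu(m_plus - class_lengths) ** 2
    absent = lam * (1.0 - labels) * F.relu(class_lengths - m_minus) ** 2
    margin_loss = (present + absent).sum(dim=1).mean()

    # Sum-of-squares reconstruction loss, scaled by 0.0005 so it does not
    # dominate the margin loss during training.
    flat_images = images.view(images.size(0), -1)
    recon_loss = ((reconstructions - flat_images) ** 2).sum(dim=1).mean()

    return margin_loss + recon_scale * recon_loss
```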
(with the above-mentioned hyper-parameters)
- Single GeForce GTX 1080 Ti: 35.6 s per epoch
- Two GeForce GTX 1080 Ti: 35.8 s per epoch (twice the total batch size, so half as many iterations per epoch)