Hi,
I was wondering if anyone has tried adapting the code for multi-GPU training instead of TPUs.
The current code works on a single GPU without many modifications (set use_tpu = False), but I am having trouble running it on multiple GPUs.
I changed the configuration as follows (tensorflow-gpu 1.13.1):
distribution = tf.contrib.distribute.MirroredStrategy(num_gpus=FLAGS.num_gpus)
run_config = tf.estimator.RunConfig(
    log_step_count_steps=10,
    save_summary_steps=10,
    model_dir=FLAGS.output_dir,
    save_checkpoints_steps=FLAGS.iterations_per_loop,
    keep_checkpoint_max=5,
    train_distribute=distribution)
estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    config=run_config,
    model_dir=FLAGS.output_dir,
    params={'batch_size': FLAGS.batch_size})
estimator.train(input_fn=train_input_fn, steps=num_train_steps)
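For reference, here is the same wiring as a self-contained configuration sketch, assuming TF 1.13 (where MirroredStrategy still lives under tf.contrib.distribute). The model_fn here is a hypothetical stand-in for the repo's actual model_fn, shown only to make the fragment complete; note that it routes the variable updates through optimizer.minimize() with the global step, so the distribution strategy can aggregate updates across replicas:

```python
import tensorflow as tf  # tensorflow-gpu 1.13.1

# Hypothetical minimal model_fn, standing in for the repo's model_fn.
def model_fn(features, labels, mode, params):
    logits = tf.layers.dense(features['x'], 2)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    optimizer = tf.train.AdamOptimizer(1e-3)
    # minimize() calls apply_gradients() under the hood; passing global_step
    # lets MirroredStrategy handle the cross-replica aggregation of updates.
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

distribution = tf.contrib.distribute.MirroredStrategy(num_gpus=2)
run_config = tf.estimator.RunConfig(
    model_dir='/tmp/model',  # placeholder path
    train_distribute=distribution)
estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)
```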
However, I get the following error:

raise ValueError("You must specify an aggregation method to update a "
ValueError: You must specify an aggregation method to update a MirroredVariable in Replica Context.
Has anyone maybe found a solution to this?
Thanks.