
Marking a plasma manager as dead does not mark its local scheduler as dead. #569

Closed
robertnishihara opened this issue May 19, 2017 · 1 comment

@robertnishihara (Collaborator)

The file `monitor-008015.err` on the head node looks like this.

WARNING:root:Timed out b'plasma_manager'
WARNING:root:Removed b'plasma_manager', client ID 00fb29d393f227ce044542f05065560325fb72fd
WARNING:root:Marked 1274 objects as lost.

The entry in `ray.global_state.client_table()` for this node is the following.

'172.31.30.57': [
  {'ClientType': 'plasma_manager',
   'DBClientID': '00fb29d393f227ce044542f05065560325fb72fd',
   'Deleted': True},
  {'AuxAddress': '172.31.30.57:11227',
   'ClientType': 'local_scheduler',
   'DBClientID': '46139b8d82494ce2480dfd37d98b05fea6da1984',
   'Deleted': False,
   'LocalSchedulerSocketName': '/tmp/scheduler40743926',
   'NumCPUs': 8.0,
   'NumGPUs': 0.0}]

So the plasma manager has been marked as dead, but the local scheduler on the same node has not.
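The inconsistency above can be detected mechanically. Here is a minimal sketch that scans a dict shaped like the `ray.global_state.client_table()` output shown above and reports nodes where a `plasma_manager` is marked `Deleted` while a `local_scheduler` on the same node is not; the helper name `find_inconsistent_nodes` is ours, not part of Ray's API.

```python
def find_inconsistent_nodes(client_table):
    """Return node IPs whose plasma_manager is marked Deleted while a
    local_scheduler on the same node is still marked alive.

    `client_table` is assumed to have the shape shown above:
    {ip: [ {ClientType, DBClientID, Deleted, ...}, ... ]}.
    """
    inconsistent = []
    for ip, clients in client_table.items():
        dead_manager = any(
            c["ClientType"] == "plasma_manager" and c.get("Deleted")
            for c in clients
        )
        live_scheduler = any(
            c["ClientType"] == "local_scheduler" and not c.get("Deleted")
            for c in clients
        )
        if dead_manager and live_scheduler:
            inconsistent.append(ip)
    return inconsistent


# Data copied from the client_table() entry in this issue (trimmed).
client_table = {
    "172.31.30.57": [
        {"ClientType": "plasma_manager",
         "DBClientID": "00fb29d393f227ce044542f05065560325fb72fd",
         "Deleted": True},
        {"ClientType": "local_scheduler",
         "DBClientID": "46139b8d82494ce2480dfd37d98b05fea6da1984",
         "Deleted": False},
    ]
}

print(find_inconsistent_nodes(client_table))  # → ['172.31.30.57']
```

Such a check could run in the monitor after it marks a manager dead, so the matching local scheduler gets marked dead in the same pass.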

When I run new workloads, tasks appear to be scheduled on the node with the "dead" plasma manager. Note that when I run `ps aux | grep "plasma_manager "` on the relevant node, the manager process still seems to be alive.

What is the intended behavior here? If Ray thinks the manager is dead, shouldn't we stop assigning work to that node?

@robertnishihara (Collaborator, Author)

No longer relevant.
