expanded workflow status deltas #212

Merged
merged 1 commit into from
May 26, 2021

Conversation

dwsutherland
Member

@dwsutherland dwsutherland commented Apr 25, 2021

These changes partially address cylc/cylc-ui#543

Sibling to cylc/cylc-flow#4206
(that one should go in first)

This PR:

  • Adds statuses installed/uninstalled as seen at the UI Server in the form of deltas.
  • Waits until all deltas are sent to subscribers when unregistering the workflow, before purging.

Needed for the UI to know when it can remove old/uninstalled workflows.

From Element (Oliver) (note: held => paused):

There are a number of other exotic state transitions that are possible between scans (ran into this with the scan work).

Here are the rules I would expect and the full state transition matrix (basically the same as the scan code).

The Flow States

  • none (not installed)
  • held (running, task pool held)
  • running (running, task pool unheld)
  • stopping (running, will shutdown soon)
  • stopped (previously run, no active scheduler)

Delta Rules

<a> -> <b>
When the state changes from <a> to <b> (over any time period)
the UI would expect to receive...

none -> *
An added delta.

* -> none
A pruned delta.

{held,running,stopping} -> stopped
A shutdown and an update delta.

{held,running,stopping,stopped} -> {held,running,stopping,stopped}
An update delta.

*(UUID1) -> *(UUID2)
(scheduler UUID has changed, i.e. the flow has been restarted)
A shutdown? and an update delta.

All State Transitions

<a> -> <b>
(description)

  • delta component 1
  • delta component 2
  • ...

none -> stopped
(installed)

  • added(state=stopped)

none -> held
(installed and run in held mode)

  • added(state=held)

none -> running
(installed and run)

  • added(state=running)

none -> stopping
(installed, run and requested to stop)

  • added(state=stopping)

held -> none
(was running, deleted)

  • shutdown
  • pruned

held -> running
(held held, running)

  • updated(state=running)

held -> stopping
(requested to stop)

  • updated(state=stopping)

held -> stopped
(stopped)

  • shutdown
  • updated(state=stopped)

running -> none
(was running, deleted)

  • shutdown
  • pruned

running -> held
(paused)

  • updated(state=held)

running -> stopping
(requested to stop)

  • updated(state=stopping)

running -> stopped
(shutdown)

  • shutdown
  • updated(state=stopped)

stopping -> none
(was running, deleted)

  • shutdown
  • pruned

stopping -> held
(pending shutdown canceled)

  • updated(state=held)

stopping -> running
(pending shutdown canceled)

  • updated(state=running)

stopping -> stopped
(stopped)

  • updated(state=stopped)

stopped -> none
(uninstalled)

  • pruned

stopped -> held
(running, held mode)

  • updated(state=held)

stopped -> running
(running)

  • updated(state=running)

stopped -> stopping
(running, requested to stop)

  • updated(state=stopping)
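
The matrix above can be condensed into a few rules. The following is only a sketch, not code from this PR: the function name `expected_deltas` and the string form of the delta components are assumptions made for illustration. It follows the matrix as written, so `stopping -> stopped` yields only an update, even though the general rule earlier also lists a shutdown for that case.

```python
from typing import List

ACTIVE = {"held", "running", "stopping"}  # states with a live scheduler
STATES = ACTIVE | {"none", "stopped"}


def expected_deltas(old: str, new: str) -> List[str]:
    """Return the delta components the matrix lists for old -> new."""
    if old not in STATES or new not in STATES:
        raise ValueError(f"unknown state in {old!r} -> {new!r}")
    if old == new:
        raise ValueError("the matrix has no entry for an unchanged state")
    if old == "none":
        # Anything appearing for the first time is an added delta.
        return [f"added(state={new})"]
    if new == "none":
        # A live scheduler shuts down before the workflow is pruned.
        return (["shutdown"] if old in ACTIVE else []) + ["pruned"]
    deltas = [f"updated(state={new})"]
    # The matrix lists a shutdown for held/running -> stopped but not for
    # stopping -> stopped; this sketch follows the matrix as written.
    if new == "stopped" and old in {"held", "running"}:
        deltas.insert(0, "shutdown")
    return deltas
```

For example, `expected_deltas("running", "none")` gives a shutdown followed by a pruned component, while `expected_deltas("stopped", "none")` gives only the pruned component.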

To test this I ran the following:

#!/bin/bash

cylc install fox
sleep 6
rm -rf ~/cylc-run/fox
sleep 3
cylc install fox
sleep 6
cylc play --pause fox/run1
sleep 3
cylc play fox/run1
sleep 3
rm -rf ~/cylc-run/fox
sleep 1
pkill -f '/home/sutherlander/.envs/flow/bin/python /home/sutherlander/.envs/flow/bin/cylc play --pause fox/run1'
sleep 5
cylc install fox
sleep 4
cylc play --pause fox/run1
sleep 3
cylc play fox/run1
sleep 3
cylc stop fox/run1
sleep 8
while true; do
    if ! cylc ping fox/run1 2>/dev/null || \
       (( $(cylc scan | wc -l) == 0 )); then
        echo 'workflow not running!!'
        break
    else
        echo 'workflow still running'
        sleep 3
    fi
done
# if the suite is in the process of
sleep 4
rm -rf ~/cylc-run/fox
pkill -f '/home/sutherlander/.envs/flow/bin/python /home/sutherlander/.envs/flow/bin/cylc play --pause fox/run1'

workflow_status

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Already covered by existing tests.
  • No change log entry required (invisible to users).
  • No documentation update required.
  • No dependency changes.

Member

@kinow kinow left a comment

Only had time to review the code today. Going home earlier due to our evening meeting. Couple comments. One about any… I think we don't need the has_queues array. It could be a bool instead, and just be set to True if any delta_queue is not empty? Brain is in the javascript-mode, so I couldn't write Python to test it now, but can try it another day with ☕ feel free to ignore if that doesn't make sense 👍
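
A rough sketch of the suggestion above, in Python: the list of flags is replaced by a single `any(...)` check, and the purge waits until every queue is empty. The names `has_pending` and `wait_until_drained`, the polling interval, and the use of `asyncio.Queue` are assumptions for illustration, not the code under review.

```python
import asyncio
from typing import Iterable


def has_pending(delta_queues: Iterable[asyncio.Queue]) -> bool:
    """True if any subscriber queue still holds an unsent delta."""
    return any(not queue.empty() for queue in delta_queues)


async def wait_until_drained(delta_queues, poll_interval: float = 0.01) -> None:
    """Block (cooperatively) until every queue is empty, e.g. before purging."""
    queues = list(delta_queues)
    while has_pending(queues):
        await asyncio.sleep(poll_interval)
```

Polling keeps the sketch short; `asyncio.Queue.join()` would be an alternative if the consumers call `task_done()`.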

cylc/uiserver/data_store_mgr.py (outdated, resolved)
cylc/uiserver/workflows_mgr.py (outdated, resolved)
@dwsutherland
Member Author

Tests won't pass until cylc/cylc-flow#4206 is in.

@oliver-sanders
Member

oliver-sanders commented May 7, 2021

I don't quite get the uninstalled state, once uninstalled there is nothing there to track, shouldn't it be a pruned delta?

@kinow kinow mentioned this pull request May 9, 2021
9 tasks
@kinow
Member

kinow commented May 9, 2021

Using the Cylc UI GScan branch to test this PR now. The first thing I noticed was that the only workflows displayed using this branch are the running ones. Quickly debugging it, it looks like the GScan filter isn't happy with the new states. I'll update that branch to work with this PR so that it displays the same as on Cylc UI's master branch first. Then I will try to use the new states, which should fix the issue highlighted in that pull request, when a workflow is renamed/removed.

@kinow
Member

kinow commented May 9, 2021

Adding new states for the UI, with new icons. Chose folder-plus and folder-remove, but these icons may change later; we can also change states here later but updating the Cylc UI branch will be very simple.

image

Member

@kinow kinow left a comment

@dwsutherland I had a workflow/run five-running/run19 which I simply removed with rm -rf ~/cylc-run/five-running/run19. Then I received two deltas:

{"id": "1", "type": "data", "payload": {"data": {"deltas": {"id": "kinow|five-running/run19", "shutdown": false, "added": {"workflow": {"id": "kinow|five-running/run19", "name": "five-running/run19", "status": "uninstalled", "owner": "kinow", "host": "", "port": 0, "__typename": "Workflow"}, "__typename": "Added"}, "updated": {"workflow": {"id": "kinow|five-running/run19", "__typename": "Workflow"}, "__typename": "Updated"}, "__typename": "Deltas"}}}}

{"id": "1", "type": "data", "payload": {"data": {"deltas": {"id": "kinow|five-running/run19", "shutdown": false, "added": {"workflow": {"id": "kinow|five-running/run19", "name": "five-running/run19", "status": "uninstalled", "owner": "kinow", "host": "", "port": 0, "__typename": "Workflow"}, "__typename": "Added"}, "updated": {"workflow": {"id": "kinow|five-running/run19", "__typename": "Workflow"}, "__typename": "Updated"}, "__typename": "Deltas"}}}}

They appear to have the exact same content. Any idea why it would result in two identical deltas for the same query? (You can see it's the same query as it has the same UI subscription id 1.)

Also, I was expecting to see the workflow updated to have the state uninstalled. But instead I have the workflow added with the state uninstalled. Is that expected?

@dwsutherland
Member Author

dwsutherland commented May 10, 2021

Also, I was expecting to see the workflow updated to have the state uninstalled. But instead I have the workflow added with the state uninstalled. Is that expected?

No point in showing uninstalled state.

@kinow
Member

kinow commented May 10, 2021

So instead of the uninstalled state we will get a pruned delta? I just need something that I can use to remove the workflow from gscan if it was deleted or renamed.

@dwsutherland
Member Author

So instead of the uninstalled state we will get a pruned delta? I just need something that I can use to remove the workflow from gscan if it was deleted or renamed.

No, use it as the prune indicator.. but just don't show it.

@kinow
Member

kinow commented May 10, 2021

So instead of the uninstalled state we will get a pruned delta? I just need something that I can use to remove the workflow from gscan if it was deleted or renamed.

No, use it as the prune indicator.. but just don't show it.

Aaaahh! D'oh moment 🤣 will update that branch tomorrow handling the uninstalled state. Great spot David! 🤦‍♂️

@dwsutherland
Member Author

I don't quite get the uninstalled state, once uninstalled there is nothing there to track, shouldn't it be a pruned delta?

It does make sense (in my head anyway) to say "the workflow is uninstalled" in the same way you'd say "the workflow is running/installed/paused"..

It essentially is the pruned delta (what's the difference?), and it is the state of the workflow... (we can put it anyway)
And besides, the delta.pruned/delta is only one type of subscription, so this way it'll show up in the gscan subscription also.

@oliver-sanders
Member

It essentially is the pruned delta (what's the difference?)

Once a workflow is uninstalled it doesn't exist any more so there is nothing for the UIS/UI to track. So there shouldn't be an entry in the UIS or UI data store for it any more. So I think we want to send a "pruned" message telling remote data stores that the workflow has been removed.

The UI shouldn't have to interpret the updated deltas to find out whether or not it needs to purge its data store:

if ((delta.updated && delta.status && delta.status == WorkflowState.UNINSTALLED) || delta.pruned) {
    prune(delta.id)
}

@kinow
Member

kinow commented May 10, 2021

Updated the Cylc UI PR to ignore the uninstalled state.

The UI shouldn't have to interpret the updated deltas to find out whether or not it needs to purge its data store:

Having a delta would definitely be simpler. Also, since we have pruned tasks, jobs, etc... I think it would be OK to have a workflow being pruned too. But I can implement either way in the UI, so I will leave the decision up to you guys, @hjoliver , and others 👍

@dwsutherland
Member Author

Updated the Cylc UI PR to ignore the uninstalled state.

The UI shouldn't have to interpret the updated deltas to find out whether or not it needs to purge its data store:

Having a delta would definitely be simpler. Also, since we have pruned tasks, jobs, etc... I think it would be OK to have a workflow being pruned too. But I can implement either way in the UI, so I will leave the decision up to you guys, @hjoliver , and others 👍

Ok.. I was originally trying to avoid touching the scheduler end with code it knows nothing about... Will add a boolean field to the pruned deltas ...

However, I'm tempted to keep uninstalled in the workflow status (but I won't), because it means we will need an extra subscription just to track deltas.pruned.workflow if we are subscribed to workflows (not delta)... In fact, I may just add an extra field in workflow also for this purpose.

@dwsutherland dwsutherland force-pushed the expanded-workflow-status-deltas branch from 1642d49 to afbb17e Compare May 12, 2021 04:57
@dwsutherland
Member Author

dwsutherland commented May 12, 2021

Update includes:

  • Removed uninstalled workflow status.
  • Added pruned flag in both deltas.pruned.workflow and workflow.pruned boolean fields
  • Fixed the issue of flickering workflow.id deltas.
  • Fixed the issue where an extra delta was sent post workflow prune/unregister

Animation 1: installed, play, stop, delete cylc-run dir (pruned).
workflow_pruned_sub1

Animation 2: installed, delete cylc-run dir (pruned).
workflow_pruned_sub2

@hjoliver
Member

Looks good!

@dwsutherland dwsutherland force-pushed the expanded-workflow-status-deltas branch from afbb17e to 418e2f0 Compare May 13, 2021 04:13
@dwsutherland
Member Author

Changed the delta.pruned entry to be an ID:
image

@oliver-sanders
Member

oliver-sanders commented May 13, 2021

Code looks good.

Ran a few tests:

  • The workflow pruned deltas are looking good!
  • I haven't seen the workflow status flickering 👍.
  • There are still a couple of oddities in the deltas but I think this is for future work.

Here's the result of a test I performed using the following subscription:

subscription {
  deltas {
    added {
      workflow(stripNull:true) {
        id
        status
      }
    }
    updated {
      workflow(stripNull:true) {
        id
        status
      }
    }
    pruned {
      workflow
    }
  }
}

Installing/playing/cleaning a workflow resulted in the following deltas:

$ cylc install one --no-run-name
added(one, installed)

$ cylc play one
added(one, running)  # should be updated(one, running)
updated(one)  # empty delta not needed for this sub caused by task pool changes
updated(one, stopped)

$ cylc clean one
updated(one)  # should not be an updated delta here
pruned(one)

Here's the returned JSON in full:

$ cylc install one --no-run-name
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {"workflow": {"id": "oliver|one", "status": "installed"}}, "updated": {}, "pruned": {}}}}}

$ cylc play one
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {"workflow": {"id": "oliver|one", "status": "running"}}, "updated": {}, "pruned": {}}}}}
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {"workflow": {"id": "oliver|one"}}, "pruned": {}}}}}
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {}, "pruned": {}}}}}
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {"workflow": {"id": "oliver|one", "status": "stopped"}}, "pruned": {}}}}}

$ cylc clean one
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {"workflow": {"id": "oliver|one"}}, "pruned": {"workflow": "oliver|one"}}}}}
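
To illustrate how a client might consume these messages, here is a minimal sketch of a local store kept in sync from the JSON above. The function `apply_delta` and the dict-based store are hypothetical; this is not the Cylc UI implementation. It assumes an added delta overwrites any existing entry, an updated delta is merged into it, and a pruned delta removes it.

```python
import json
from typing import Dict


def apply_delta(store: Dict[str, dict], message: str) -> Dict[str, dict]:
    """Apply one subscription message (JSON text) to a local workflow store."""
    deltas = json.loads(message)["payload"]["data"]["deltas"] or {}

    added = (deltas.get("added") or {}).get("workflow")
    if added:
        # Overwrite: an added delta (re)creates the entry.
        store[added["id"]] = dict(added)

    updated = (deltas.get("updated") or {}).get("workflow")
    if updated:
        # Merge: ID-only updates leave the stored fields unchanged.
        store.setdefault(updated["id"], {}).update(updated)

    pruned_id = (deltas.get("pruned") or {}).get("workflow")
    if pruned_id:
        # Pruned carries just the workflow ID; drop the entry.
        store.pop(pruned_id, None)
    return store
```

Pruning is applied last, so the final message above (an ID-only update plus a pruned ID) still leaves the store without an entry for the workflow.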

@hjoliver
Member

hjoliver commented May 14, 2021

Tested with same result as @oliver-sanders

I applied the subscription above in GraphiQL, then did an install, play, pause, play, stop, clean while watching the network response in Firefox Dev Tools (because you can't tell in GraphiQL if the same response comes back multiple times).

  • got an empty updated delta whenever the task pool changes, containing just the workflow ID.
  • got another empty updated delta on cleaning the workflow, along with the correct pruned delta.

@dwsutherland - can the server strip / not send these empty (ID only) deltas?

@kinow
Member

kinow commented May 14, 2021

Just wondering if stripNull would help here @hjoliver? In @oliver-sanders' example query I think it doesn't have stripNull: true at the top level. Did you try that?

@oliver-sanders
Member

Just wondering if stripNull would help here

Worth a shot sticking it a level up on the deltas bit:

subscription {
  deltas(stripNull:true) {
    added {
      workflow(stripNull:true) {
        id
        status
      }
    }
    updated {
      workflow(stripNull:true) {
        id
        status
      }
    }
    pruned {
      workflow
    }
  }
}

$ cylc install one --no-run-name
added(installed)

$ cylc play one
added(running)       # should be updated(running)
updated()            # empty updated delta
null                 # empty delta
updated(stopped)

$ cylc clean one
updated()            # empty updated delta
pruned()

$ cylc install one --no-run-name
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {"workflow": {"id": "oliver|one", "status": "installed"}}, "updated": {}, "pruned": {}}}}}

$ cylc play one
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {"workflow": {"id": "oliver|one", "status": "running"}}, "updated": {}, "pruned": {}}}}}
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {"workflow": {"id": "oliver|one"}}, "pruned": {}}}}}
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {}, "pruned": {}}}}}
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {"workflow": {"id": "oliver|one", "status": "stopped"}}, "pruned": {}}}}}

$ cylc clean one
{"id": "1", "type": "data", "payload": {"data": {"deltas": {"added": {}, "updated": {"workflow": {"id": "oliver|one"}}, "pruned": {"workflow": "oliver|one"}}}}}

@dwsutherland
Member Author

It's inherited so you can declare at the top:

subscription {
  deltas (stripNull: true) {
    added {
      workflow {
        id
        status
      }
    }
    updated {
      workflow {
        id
        lastUpdated
        status
      }
    }
    pruned {
      workflow
    }
  }
}

However, the reason added, updated, and pruned aren't stripped (yet) is because they aren't protobuf fields (this data structure is a mix of dictionary and protobuf)..
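
For the plain-dictionary part of the structure, stripping could in principle be done with a small recursive helper. This is only a sketch under that assumption: `strip_empty` is a hypothetical name, and it is not how the protobuf-backed stripNull argument is implemented.

```python
from typing import Any


def strip_empty(value: Any) -> Any:
    """Recursively drop None values and empty dicts/lists from nested data."""
    if isinstance(value, dict):
        stripped = {key: strip_empty(item) for key, item in value.items()}
        return {
            key: item
            for key, item in stripped.items()
            if item is not None and item != {} and item != []
        }
    if isinstance(value, list):
        items = [strip_empty(item) for item in value]
        return [item for item in items if item is not None and item != {} and item != []]
    # Scalars (including 0, False and "") are kept as they are.
    return value
```

With this, a delta whose added, updated and pruned sections are all empty collapses to an empty dict, which a caller could then choose not to send. An ID-only updated delta would survive, since the ID is a real value.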

@dwsutherland - can the server strip / not send these empty (ID only) deltas?

(as mentioned on riot) It should be the topic of a separate PR; this PR only provides the installed and pruned messages to the client. The fixes included in this PR are there to make sure it's a clean signal.

It may take some work to figure out how to make some graphql fields show up only when other fields are not empty.

Additionally, I don't think I can have strip null as an argument to the parent/root subscription or query..
Which leads to the next question; can we not push after the stripping? (i.e. we strip down to nothing to send)

@hjoliver
Member

Which leads to the next question; can we not push after the stripping? (i.e. we strip down to nothing to send)

Presumably we do push after stripping? (There's no point in pushing before stripping!) ... I must be misunderstanding you?

@hjoliver
Member

@dwsutherland so you've explained the empty (ID only) deltas. But I don't think you commented on this?:

$ cylc install one --no-run-name
added(installed)     # <--- good

$ cylc play one
added(running)       # <---- should be updated(running)

@kinow
Member

kinow commented May 25, 2021

Started reviewing this one today with the sibling cylc-flow, and the UI gscan deltas branches, but it will take a few more days to complete the review & tests 👍

@dwsutherland
Member Author

@dwsutherland so you've explained the empty (ID only) deltas. But I don't think you commented on this?:

$ cylc install one --no-run-name
added(installed)     # <--- good

$ cylc play one
added(running)       # <---- should be updated(running)

Tricky one, because to the scheduler it is added..
Hence, that running would have come through with the initial burst, any subsequent status changes will be in updated..

The only status changes in added should be on initial burst.

So I'd suggest leaving it as is and have the UI treat this transition the same as a reload (where you'd also get a new added to overwrite/recreate the previous UI store)...

@dwsutherland
Member Author

Which leads to the next question; can we not push after the stripping? (i.e. we strip down to nothing to send)

Presumably we do push after stripping? (There's no point in pushing before stripping!) ... I must be misunderstanding you?

Yes, what I meant was: If there's nothing to push after stripping, can we back out of yielding a result to the web socket?

@oliver-sanders
Member

oliver-sanders commented May 25, 2021

So I'd suggest leaving it as is and have the UI treat this transition the same as a reload (where you'd also get a new added to overwrite/recreate the previous UI store)...

I wasn't aware of the added delta on reload (and I doubt the UI is either). We should be able to turn that added into an updated before broadcast somehow? Really don't want to have to make the UI interpret the information it's being provided in order to sync its internal data store (the UI doesn't know much about Cylc).

Going to bump to a new issue to avoid convoluting this one - #221

Member

@oliver-sanders oliver-sanders left a comment

Tested the pruned delta, works fine for me, bumped the other stuff into an issue for debate to allow us to focus on one thing at a time.

cylc/uiserver/workflows_mgr.py (outdated, resolved)
@oliver-sanders
Member

Yes, what I meant was: If there's nothing to push after stripping, can we back out of yielding a result to the web socket?

Is this in the async generator while True loop? If so I think continue would do it?
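
A sketch of what that could look like, assuming the deltas arrive on an `asyncio.Queue` and are yielded by an async generator. The names `carries_data`, `subscribe` and the `END` sentinel are hypothetical and not taken from the UI Server code.

```python
import asyncio
from typing import AsyncIterator, Optional

END = object()  # hypothetical sentinel marking the end of the stream


def carries_data(delta: Optional[dict]) -> bool:
    """True if the delta holds more than bare workflow IDs."""
    if not delta:
        return False
    if (delta.get("pruned") or {}).get("workflow"):
        return True
    for section in ("added", "updated"):
        workflow = (delta.get(section) or {}).get("workflow") or {}
        if set(workflow) - {"id"}:
            return True
    return False


async def subscribe(queue: asyncio.Queue) -> AsyncIterator[dict]:
    """Yield deltas from the queue, skipping those with nothing to report."""
    while True:
        delta = await queue.get()
        if delta is END:
            return
        if not carries_data(delta):
            # Nothing left after stripping: do not push to the web socket.
            continue
        yield delta
```

Because the check sits before the `yield`, an empty or ID-only delta never reaches the subscriber, while pruned deltas are always passed through.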

@hjoliver
Member

@dwsutherland - a few tests failing here.

@dwsutherland
Member Author

@dwsutherland - a few tests failing here.

Yes, it will fail until the cylc-flow sibling is in.

@hjoliver
Member

Oh, of course - sorry 🤕

Member

@hjoliver hjoliver left a comment

No problems here on re-testing. 👍 Let's get this in ...

@hjoliver hjoliver merged commit ec753a4 into cylc:master May 26, 2021
4 participants