Fix submission failed handler on bad host select #2631
Recreated the KeyError outlined on master, which is resolved on the PR branch so that the failed handler is flagged as an ERROR, stalling the suite instead of shutting it down, as (I think?) is the desired/correct behaviour. New test suitable & passes locally.
New log viewer report:
2018-04-20T15:07:56+01 ERROR - [remote-host-select cmd] timeout 10 bash -c false
[remote-host-select ret_code] 1
2018-04-20T15:07:56+01 ERROR - false: host selection failed:
COMMAND FAILED (1): false
2018-04-20T15:07:56+01 ERROR - [jobs-submit cmd] (remote host select)
[jobs-submit ret_code] 1
[jobs-submit err]
false: host selection failed:
COMMAND FAILED (1): false
2018-04-20T15:07:56+01 ERROR - [t1.1] -submission failed
2018-04-20T15:07:56+01 WARNING - suite stalled
Oops, scratch that! The functionality is fine as per my original comment, but there is a subtlety in the GUI which means one gets a ValueError when trying to access an item on the 'View Job Logs (Viewer)' menu for the submit-failed task with the failed handler (t1.1 in this example suite):
Traceback (most recent call last):
  File "/net/home/h06/sbarth/cylc.git/lib/cylc/gui/app_gcylc.py", line 1271, in view_task_logs
    self._popup_logview(task_id, task_state_summary, choice)
  File "/net/home/h06/sbarth/cylc.git/lib/cylc/gui/app_gcylc.py", line 2210, in _popup_logview
    nsubmits, self.get_remote_run_opts())
  File "/net/home/h06/sbarth/cylc.git/lib/cylc/gui/combo_logviewer.py", line 47, in __init__
    logviewer.__init__(self)
  File "/net/home/h06/sbarth/cylc.git/lib/cylc/gui/logviewer.py", line 37, in __init__
    self.create_gui_panel()
  File "/net/home/h06/sbarth/cylc.git/lib/cylc/gui/combo_logviewer.py", line 69, in create_gui_panel
    combobox2.set_active(snums.index(self.nsubmit))
ValueError: list.index(x): x not in list
Tracing this through the code, it is due to nsubmits = len(task_state_summary.get('job_hosts', {})) (app_gcylc.py, line 2207) falling back to the default empty dict, so that in the ComboLogViewer class in combo_logviewer.py the argument of the same name sets self.nsubmit = nsubmits = 0, creating snums as an empty list on line 65 of that file. The lookup snums.index(self.nsubmit) then fails because 0 is not in an empty list. So a small change will be needed to fix this issue.
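The failure mode described above can be reproduced outside the GUI. The following is only a minimal sketch: how snums is actually built in combo_logviewer.py is not shown in this thread, so the 1..nsubmits range is an assumption, and submit_numbers and active_index are hypothetical helper names, not cylc functions. It is not necessarily the change that was made on the PR branch.

```python
def submit_numbers(nsubmits):
    """Assumed construction of snums: submit numbers 1..nsubmits."""
    return list(range(1, nsubmits + 1))


def active_index(snums, nsubmit):
    """Hypothetical guard: return the index to select, or None if absent."""
    if nsubmit in snums:
        return snums.index(nsubmit)
    return None


# A submit-failed task has no 'job_hosts' entry, so the default {} is used.
task_state_summary = {}
nsubmits = len(task_state_summary.get('job_hosts', {}))
snums = submit_numbers(nsubmits)  # empty when nsubmits == 0

# Unguarded lookup, as in the traceback: raises ValueError on an empty list.
try:
    snums.index(nsubmits)
    unguarded_error = None
except ValueError as exc:
    unguarded_error = exc

# Guarded lookup: nothing to select, so no exception is raised.
guarded = active_index(snums, nsubmits)
```

With a guard like this, the caller would only call set_active when the result is not None, leaving the combo box unselected for a task that never had a job submitted.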
The KeyError would take down the suite.
3aad403 to a89bd2b
GUI issue addressed. (You will still be unable to see any log files, but at least we are not going to bring down the GUI any more.)
GUI issue resolved by the squashed change. Attempted log file access from the 'viewer' menu now displays the same as in the 'editor', i.e. ERROR: file not found: <file>.
All good now.
Good.
To reproduce.
On current master, the failed remote host select will cause a KeyError that will take down the suite with something like this: