Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cat-log: remote cylc sub-command #2503

Merged
merged 27 commits into from
Apr 16, 2018
Merged

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Dec 5, 2017

Address first [update: and second] check-box of #2302

Not for review yet.

[UPDATED description]: Refactored and improved the cat-log command in order to put remote host functionality behind a cylc command (for ssh whitelisting):

  • remote host interaction, which was a bunch of exposed shell commands, is now behind cylc cat-log (the command re-invokes itself on remote hosts after doing what's needed on the local host - i.e. option and global config processing)
  • a single function implements all view mode options, and is called on the file host (depending on whether the target log file is local or remote)
  • rationalized command line options: one option for view mode (cat, tail, edit, print-path, print-dir, list-dir; can cat or tail with "qcat"-like commands if available) and another for the target file (out, err, xtrace, activity, status, diff, or custom; or suite logs). Not backward compatible with the previous zoo of options, but I don't think that matters for a CLI tool.
  • tail-follow processes, via CLI and GUI, local and remote, now die properly when the parent process exits (this is not the case on master)
  • also addressed a bunch of issues flagged by codacy, and Ignored two subprocess-related issues after adding explanatory comments to the source.

@hjoliver
Copy link
Member Author

hjoliver commented Dec 5, 2017

I have this working for all combos of mode (print, list, cat, tail) and file (job.out etc.), local and remote, and custom batch system out viewer and err viewer.

But ... I noticed we also have global config for custom batch system out tailer and err tailer - which seems not to be used anywhere. @matthewrmshin - (a) I think you introduced these ~2 years ago, presumably they should be used; (b) to use them in cylc cat-log for remote jobs I will presumably need to ensure the remote out/err tailer processes die when the ssh connection is killed (as per use of the -pid option in the standard tail command) - do you know how to do this easily?

@matthewrmshin
Copy link
Contributor

@matthewrmshin
Copy link
Contributor

(The SSH tail problem does not appear to be handled correctly here. The custom tailer will eventually die when the batch job completes, so it has not been too problematic so far.)

@matthewrmshin matthewrmshin added this to the next release milestone Dec 5, 2017
@hjoliver
Copy link
Member Author

hjoliver commented Dec 6, 2017

Looks like the 2 options are used by the GUI only.

Ah, my quick grep was a little simplistic!

@hjoliver
Copy link
Member Author

hjoliver commented Dec 6, 2017

The SSH tail problem does not appear to be handled correctly here.

In the GUI? In what way is it not handled correctly? It would be better to do it properly than wait for the remote job to end before the tail process dies.

@matthewrmshin
Copy link
Contributor

We have a custom command supplied by another team, so not absolutely straightforward at the time. With your wrapper, I suppose we can handle this the same way as you handle tail-F.

@matthewrmshin
Copy link
Contributor

matthewrmshin commented Dec 6, 2017

We can easily set up the wrapper (while in tail mode) to exit if it becomes its own session leader or if its parent's PID changes.

@hjoliver
Copy link
Member Author

hjoliver commented Dec 8, 2017

Yep, that works well.

@hjoliver hjoliver force-pushed the cat-log-refactor branch 2 times, most recently from 80bf0e0 to 13827ac Compare December 13, 2017 04:09
@hjoliver hjoliver modified the milestones: next release, soon Jan 16, 2018
@hjoliver hjoliver modified the milestones: soon, next release Feb 7, 2018
@hjoliver
Copy link
Member Author

(this was close to done when left off in Dec.; I expect to finish it off quickly on returning from Melbourne in 1.5 weeks).

@hjoliver hjoliver force-pushed the cat-log-refactor branch 2 times, most recently from e35fa9c to 1951509 Compare March 4, 2018 09:19
@hjoliver hjoliver force-pushed the cat-log-refactor branch 4 times, most recently from 93d8a3e to f5ad644 Compare March 24, 2018 09:28
@hjoliver hjoliver force-pushed the cat-log-refactor branch 4 times, most recently from 70b2434 to ac060f0 Compare March 26, 2018 01:35
hjoliver and others added 2 commits April 14, 2018 12:09
Allow --debug.
Use argument list instead of argument string for SSH command.
@hjoliver
Copy link
Member Author

(Branch rebased - was a trivial conflict)

@hjoliver
Copy link
Member Author

Hold off a bit: I found a problem viewing logs of local tasks while they are still running. I'll fix then add a another test...

Start all tests from global-tests.rc.
test_header functions to set host and owner vars.
Get rid of old CYLC_TEST_TASK_(OWNER|HOST) vars.
@hjoliver
Copy link
Member Author

hjoliver commented Apr 16, 2018

@matthewrmshin @sadielbartholomew - you might want to review just the three new commits.
Main things to note:

  • fb76690 test battery changes are somewhat off-topic - see the commit log for what it does - but there was overlap with changes in this PR so I couldn't easily separate it. The only possibly controversial change is create_test_globalrc is now automatically called at end of test_header for all tests (I can't think of any good reason not to have a clean separation from the normal site/user global config, for tests - I was often getting myself into trouble after modifying one or the other while playing around with the test battery).
  • 7c721c2 refactors some remaining legacy code in the cat-log command that was overly confusing (namely how to determine when "qcat"-type commands are needed, and misleading use of "user_at_host")

@hjoliver
Copy link
Member Author

hjoliver commented Apr 16, 2018

Two more new commits, after observing the output of a suite run to update the "Remote Job Host Interaction" transcript in the user guide: 1) only send the tailer template to the remote side if tail mode is requested [reversed] 2) export CYLC_VERSION on the remote, not local, side of the ssh command.

@matthewrmshin
Copy link
Contributor

Still 2 breakages.

  • key is not defined in the first case.
  • batchview_cmd must be in the correct argument position for the logic in line 111 to skip checking of existence of logpath to work.

Applying this diff should do the trick, but sort of reverse the spirit of f6e00bd.

diff --git a/bin/cylc-cat-log b/bin/cylc-cat-log
index 375d8cdea..d72f0f77e 100755
--- a/bin/cylc-cat-log
+++ b/bin/cylc-cat-log
@@ -375,7 +375,7 @@ def main():
                 conf = glbl_cfg().get_host_item("batch systems", host, user)
                 batchview_cmd_tmpl = None
                 try:
-                    batchview_cmd_tmpl = conf[batch_sys_name][key]
+                    batchview_cmd_tmpl = conf[batch_sys_name][conf_key]
                 except KeyError:
                     pass
                 if batchview_cmd_tmpl is not None:
@@ -397,8 +397,10 @@ def main():
             if cylc.flags.debug:
                 cmd.append('--debug')
             for item in [logpath, mode, tail_tmpl, batchview_cmd]:
-                if item is not None:
+                if item:
                     cmd.append('--remote-arg=%s' % quote(item))
+                else:
+                    cmd.append('--remote-arg=')
             cmd.append(suite_name)
             capture = (mode == 'edit')
             try:

@sadielbartholomew sadielbartholomew added bug and removed bug labels Apr 16, 2018
@sadielbartholomew
Copy link
Collaborator

sadielbartholomew commented Apr 16, 2018

Starting re-review. Matt has suggested I put review on hold while he has a look.

My apologies for mistakenly adding the 'bug' label - I keep thinking I am typing in my terminal when I am actually active in the browser instead, and it is too easy to add some label by typing a few keys!

@hjoliver
Copy link
Member Author

Still 2 breakages.

Ah yes, sorry, I rushed that one (school holiday chaos here today).

@hjoliver
Copy link
Member Author

@matthewrmshin - I've rebased again to remove my --remote-args commit. Your original logic is easier to understand than sending a null arg value to keep the list order on the remote side, IMO.

@matthewrmshin
Copy link
Contributor

Should probably add conf/global-tests.rc to .gitignore as well?

@hjoliver
Copy link
Member Author

Done.

@matthewrmshin
Copy link
Contributor

Two final issues:

  1. tests/authentication/08-shared-fs.t should be reverted, because the new set_test_remote_host function does not cover the remote host with shared fs case.

  2. Still need this change:

diff --git a/bin/cylc-cat-log b/bin/cylc-cat-log
index 121fd7642..287f59362 100755
--- a/bin/cylc-cat-log
+++ b/bin/cylc-cat-log
@@ -368,7 +368,7 @@ def main():
                 conf = glbl_cfg().get_host_item("batch systems", host, user)
                 batchview_cmd_tmpl = None
                 try:
-                    batchview_cmd_tmpl = conf[batch_sys_name][key]
+                    batchview_cmd_tmpl = conf[batch_sys_name][conf_key]
                 except KeyError:
                     pass
                 if batchview_cmd_tmpl is not None:

@hjoliver
Copy link
Member Author

Done. Thanks again (I can't run shared-fs tests today, working at home).

Copy link
Contributor

@matthewrmshin matthewrmshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests all good in my environment.

@sadielbartholomew
Copy link
Collaborator

Re-approving after review of commits from 7c721c2 inclusive. All the tests pass locally for me.

@hjoliver I will leave you to merge in case you want to conduct tests on your own work environment (or would like to wait for Oliver to also review), though I think we should merge or else create a follow-on PR before we make many more comments as comments numbering the hundreds can render a PR page inaccessible, as I discovered with Rose docs reviewing!

@hjoliver hjoliver merged commit f33703c into cylc:master Apr 16, 2018
@hjoliver hjoliver deleted the cat-log-refactor branch April 16, 2018 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants