-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
job file: improve style #2141
job file: improve style #2141
Conversation
|
cc80989
to
b76c9de
Compare
Brilliant 💥 |
b76c9de
to
186609a
Compare
Please feel free to raise any concerns, objections, etc that you may feel with this change. In particular:
|
Having another thought about this.
I have not heard of a use case for this, but real users often do the unexpected. I have separated
This is now essential because I have positioned error trapping in the static main function. This means bash syntax error in user scripts will not be trapped. A quick Please comment on preference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I think that syntax checking is a good idea. Will approve once finalised.
lib/cylc/job_file.py
Outdated
@@ -89,21 +84,32 @@ def write(self, local_job_file_path, job_conf): | |||
os.chmod(local_job_file_path, mode) | |||
|
|||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These class methods could be static methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
7850a68
to
5f60d7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good, but I'm not quite done yet, sorry (including thinking about your preference question above...)
lib/cylc/job.sh
Outdated
if declare -F "cylc::job::inst::${NAME}" 1>'/dev/null' 2>&1; then | ||
"cylc::job::inst::${NAME}" | ||
fi | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Insert blank line, for consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
5f60d7b
to
e6b784a
Compare
(I have now added |
I can still set |
I can't think of a use case either, so I'm fine with this so long as we document that the script items have to be valid on their own. |
I like this idea, but perhaps only as a non-default option - it might really slow down validation of very large suites? Actually we might want run time checking anyway, to handle changes via broadcast or edit runs. |
I was concerned about this, but it seems you do mean literally just syntax errors, which are now handled by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the ksh question above, otherwise all good here.
On ksh, we have advertised to our users that cylc job script should all be written in bash, so I hope we don't have any suites doing On trapping syntax error... Yes, only bash syntax error is essential until the error trap is switched on - because everything is wrapped in functions, nothing actually gets executed until the main function is called. However, if bash fails to parse the script, it will not get to the main function. |
ACK! We still are using Korn shell, although most of the scripts should validate as bash scripts as we're not doing anything really fancy. For the sake of our operational partners, we also have the -S set in the job submission tasks. Apologies that this is late to the game - I've been behind and decided to drop in and see what was new with pull requests! |
@matthewrmshin - looks like @trwhitcomb has thrown a spanner in the works! Our own fault for support ksh in the first place I suppose. Tim, if you look at a job script generated by a suite running with this branch, it's a beautiful thing - all the boiler-plate code is gone, leaving just what users want to see. But as the job file becomes more sophisticated it gets harder to support both shells because they have less in common in terms of function syntax etc. If your users don't use screeds of inlined shell scripting (?do they?) it's very likely their tasks will still work as-is under bash. What do you think - is this a justifiable change to force on users? |
@trwhitcomb Thanks for the info. Some quick questions:
|
Reduce number of environment variables in job file. * Mark old variables as duplicates. * Move communication related variables to contact file. * Remove variables that are pointless. Improve diagnostics. * Job script and `cylc message` no longer repeat one another. * Report only relevant items for jobs. E.g. don't report suite host, port, etc. Quote shell and environment variable values. Remove space from keys in "job_conf".
Static part of the job script now pull out, and loaded into the job script in runtime. This allows the generated part of the job script to be reasonably small.
Anything is possible. If we only need to consider ksh93 and not ksh88, I'll only need to change:
|
I have now brought back support for ksh (but only ksh93). I have also introduced |
doc/suiterc.tex
Outdated
This is any custom script to be invoked at the end the error trap. | ||
The output of this will always be sent to STDERR and \lstinline=$1= is set to | ||
the name of the signal caught by the error trap. The script should fast and | ||
uses very little system resource to ensure that the error trap can return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
graph = foo | ||
[runtime] | ||
[[foo]] | ||
script = test "${CYLC_TASK_SUBMIT_NUMBER}" -ge 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could potentially use some ksh only syntax e.g. print
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
doc/cug.tex
Outdated
|
||
When a task is ready cylc generates a {\em task job script} that encapsulates | ||
the runtime settings (environment and scripts) defined for it in the suite.rc | ||
file. The job script is submitted to run by the {\em job submission method} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This old terminology should be updated ("job submission method" --> "batch system")
doc/cug.tex
Outdated
|
||
Therefore, commands in user scripts should abort with non-zero exit status if a | ||
fatal error occurs (this is just good coding practice anyway). To avoid a | ||
command that expects a non-zero return code from triggering a trap, you can use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should to "to prevent a command..."
doc/cug.tex
Outdated
Like other runtime properties, you can set a suite default job submission | ||
method and override it for specific tasks or families: | ||
When a task is ready cylc generates a job script (see~\ref{JobScripts}). The | ||
job script is submitted to run by the {\em job submission method} chosen for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"batch system"
doc/suiterc.tex
Outdated
|
||
\begin{myitemize} | ||
\item {\em type:} string | ||
\item {\em default:} (none) | ||
\item {\em example:} \lstinline@env-script = "echo Hello World"@ | ||
\end{myitemize} | ||
|
||
\paragraph[err-script]{[runtime] \textrightarrow [[\_\_NAME\_\_]] \textrightarrow err-script} | ||
|
||
This is any custom script to be invoked at the end the error trap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"... at the end of the error trap, if the task fails."?
doc/suiterc.tex
Outdated
location. | ||
{\em Note: It has no bearing on any sub-shells that may be called by the job script.} | ||
|
||
Setting this to the path of a ksh93 interpretter is supported but not recommended. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One 't' in "interpreter".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we give a deprecation warning for ksh93 (here, and print one if detected on file parsing) - so we can drop support sometime in the future?
lib/cylc/job.sh
Outdated
export CYLC_TASK_CYCLE_POINT="$(cut -d '/' -f 1 <<<"${CYLC_TASK_JOB}")" | ||
export CYLC_TASK_NAME="$(cut -d '/' -f 2 <<<"${CYLC_TASK_JOB}")" | ||
# The "10#" part ensures that the submit number is interpretted in base 10. | ||
# Otherwise, a zero padded number will be interpretted as an octal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One 't' in 'interpreted'.
lib/cylc/job.sh
Outdated
} | ||
|
||
############################################################################### | ||
# Run a function in the task job instant file, if possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"instance"?
A few minor doc and comment typos, otherwise this is all good from my perspective. |
I have reinstated |
would you mind explaining that (so I don't have to figure it out)? Also, could you list in one place any other of the old env vars removed? |
A minimal local test suite on this branch now has task messaging failures:
|
Here's a diff of Questions:
1,11c1,3
< JOB SCRIPT STARTING
< cylc Suite and Task Identity:
< Suite Name : foo
< Suite Host : localhost
< Suite Port : 43076
< Suite Owner : vagrant
< Task ID : foo.1
< Task Host : vagrant-ubuntu-trusty-64
< Task Owner : vagrant
< Task Submit No.: 1
< Task Try No.: 1
---
> Suite : foo
> Task Job : 1/foo/01 (try 1)
> User@Host: vagrant@vagrant-ubuntu-trusty-64
14,17c6
< cylc (scheduler - 2017-02-17T22:18:45Z): started at 2017-02-17T22:18:45Z
< CYLC_SUITE_RUN_DIR_ON_SUITE_HOST=/home/vagrant/cylc-run/foo
< CYLC_TASK_SSH_LOGIN_SHELL=True
< CYLC_DEBUG=False
---
> 2017-02-17T22:17:44Z NORMAL - started
21d9
< CYLC_TASK_MSG_TIMEOUT=30.000000
29d16
< CYLC_MODE=scheduler
40d26
< CYLC_TASK_MSG_RETRY_INTVL=5.000000
43d28
< CYLC_JOB_PID=9475
45,46d29
< CYLC_DIR_ON_SUITE_HOST=/home/vagrant/cylc
< CYLC_TASK_MSG_MAX_TRIES=7
49,50d31
< CYLC_SUITE_PORT=43076
< CYLC_TASK_COMMS_METHOD=default
56,57c37,38
< cylc (scheduler - 2017-02-17T22:18:49Z): succeeded at 2017-02-17T22:18:49Z
< JOB SCRIPT EXITING (TASK SUCCEEDED)
---
> CYLC_TASK_JOB=1/foo/01
> 2017-02-17T22:17:54Z NORMAL - succeeded |
To answer my own question, I think it is clear that none of those variables would be useful in user scripting (although perhaps occasionally useful diagnostic info?). So just internal use of |
Sorry. I'll try to fix the issue on Monday. I don't think things like port number needs to be in the job environment. They are now in the contact file. Other variables such as job pid are now in the job status file. |
The contact file is now not written to the remote job host if communication method is "poll". It does not make sense to have a contact file when backward communication is not possible. |
(Travis CI is happy again.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, all good now.
Hi guys - sorry for the delay coming back around on this. Was all set then we lost power for a few days! I'm OK with this now that I've had a chance to look at what we're doing. For posterity, here are the answers to questions from @matthewrmshin and why I think we're fine now:
I don't think we'll be able to convert all our scripts over to Bash due to the aforementioned policy. However, almost all our tasks have a "script" command that just points out to run another script, so you'd just be in the situation where the Bash script written out by cylc will call a Korn shell script written by us. Any inline scripting we have looks to be bash and Korn compatible. Nice work, guys! |
@trwhitcomb you should now be okay as we have brought back ksh support (ksh93 only, however). I can understand policy like what you have described. It hasn't been long since we moved away, and there are still many users uncomfortable with the idea of moving away from ksh. |
Separate the job script into static part and generated part.
lib/cylc/job.sh
, which will be loaded by the job file in run time.cylc.job_file
. It is now quite small and contains:Reduce number of environment variables in job file.
Improve diagnostics.
cylc message
no longer repeat one another.err-script
allows user defined scripts to run at the end of the error trap. Close Dump environment to stderr on job fail? #2150.Quote and brace shell and environment variable values.
Remove space from keys in "job_conf".
(Rationale behind this change is that I want to review the logic for job script generation before I move on to look at #2043.)