Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for array jobs and include errors and (portion of) log file in the email #7

Closed
wants to merge 5 commits into from

Conversation

mkhodabandeh
Copy link

No description provided.

@mkhodabandeh
Copy link
Author

If you accept PRs and also think that this PR is worth merging, I can spend some time cleaning it up :)

@neilmunday
Copy link
Owner

Hi,

Thanks for the PR.

If you accept PRs and also think that this PR is worth merging, I can spend some time cleaning it up :)

Yes, I do accept pull requests. I'll try to take a look at your modifications later this week.

Cheers,

Neil.

Comment on lines +368 to +375
# if this server is an SMTP server:
s = smtplib.SMTP('localhost')
# Otherwise use a secondary server for example gmail:
# s = smtplib.SMTP('smtp.gmail.com', 587, timeout=60)
# s.ehlo()
# s.starttls()
# s.ehlo()
# s.login(USERNAME, APP_PASSWORD)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it might be useful to add additional configuration options in conf.d/slurm-mail.conf to allow the user to define which mail server to use.

I'll add this as a separate issue to be addressed.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now implemented in 1.7 branch.

Comment on lines -308 to +362
msg['subject'] = 'Job %s.%d: %s' % (cluster, jobId, state)
msg['subject'] = 'Job %s.%s: %s' % (cluster, job_name, state)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the job ID to be in the subject line of the e-mail. However, I could add an option to conf.d/slurm-mail.conf that allows the format of the subject to be customisable, a bit like the date format can. I'll add an issue for this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customisable e-mail subject now implemented in 1.7 branch.

@neilmunday neilmunday linked an issue Jul 12, 2020 that may be closed by this pull request
Comment on lines +269 to +309
try:
if '\n' in stdout:
stdout = stdout.split('\n')[job_count-1]
except:
stdout_last_100 += '\n\n Error in parsing this file \n \n'
rtnCode = 1
if rtnCode == 0:
jobDic = {}
if IS_PYTHON_3:
stdout = stdout.decode()
for i in stdout.split(' '):
x = i.split('=', 1)
if len(x) == 2:
jobDic[x[0]] = x[1]
stdoutFile = jobDic['StdOut']
stdoutFiles += stdoutFile
logging.info('stdoutFile: {}'.format(stdoutFile))
stderrFile = jobDic['StdErr']
stderrFile = '' if stderrFile == stdoutFile else stderrFile
stderrFiles += stderrFile
try:
with open(stdoutFile, 'r') as fread:
stdout_last_100 += stdoutFile+':\n'
for i, line in enumerate(fread):
stdout_last_100 += line
if i == 100:
break
stdout_last_100 += '\n\n'
except:
logging.exception("couldn't read the stdout file")
if stderrFile != '':
try:
with open(stderrFile, 'r') as fread:
stderr_content += stderrFile+':\n'
stderr_content += fread.read()+'\n\n'
except:
logging.exception("couldn't read the stderr file")
else:
logging.error('failed to run: %s' % cmd)
logging.error(stdout)
logging.error(stderr)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea adding some of the standard out to the e-mail. One slight issue though is that it is not always the case that the server running slurmctld also mounts the file system where the user's job output has been written to. For example, on supercomputers that use Slurm, often the parallel file system (e.g. Lustre) where the user's job output is written to is not mounted on the Slurm servers. I think it would better to remove this code for now in this particular pull request and just focus on job arrays.

@neilmunday
Copy link
Owner

Job arrays now implemented in 1.7 branch.

@neilmunday
Copy link
Owner

All features now implemented in version 1.7 plus additional enhancements

@neilmunday neilmunday closed this Sep 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for job arrays
2 participants