Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in BaseRestartWorkChain.inspect_process #4166

Merged

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Jun 8, 2020

Fixes #4165

If at the end of the method, last_report is defined, meaning at least
one handler returned a report, it should return the corresponding exit
code, except it was erroneously accessing that of the report variable,
which could very well be None if the last called handler returned
nothing.

@sphuber sphuber added the pr/blocked PR is blocked by another PR that should be merged first label Jun 8, 2020
@codecov
Copy link

codecov bot commented Jun 8, 2020

Codecov Report

Merging #4166 into develop will decrease coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4166      +/-   ##
===========================================
- Coverage    78.90%   78.89%   -0.01%     
===========================================
  Files          467      467              
  Lines        34481    34481              
===========================================
- Hits         27203    27199       -4     
- Misses        7278     7282       +4     
Flag Coverage Δ
#django 70.82% <100.00%> (-0.01%) ⬇️
#sqlalchemy 71.69% <100.00%> (-<0.01%) ⬇️
Impacted Files Coverage Δ
aiida/engine/processes/workchains/restart.py 54.06% <100.00%> (-0.67%) ⬇️
aiida/engine/daemon/client.py 72.58% <0.00%> (-1.14%) ⬇️
aiida/transports/plugins/local.py 80.21% <0.00%> (-0.25%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f029df...f8a26ae. Read the comment docs.

@sphuber sphuber force-pushed the fix/4165/base-restart-report-exit-code branch from ff0b1e1 to 5d8f1d4 Compare June 9, 2020 10:12
@sphuber sphuber requested a review from ramirezfranciscof June 9, 2020 10:13
Copy link
Member

@ramirezfranciscof ramirezfranciscof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! As a more general design question, what is the reason that you decided to return the last non-null report instead of all non-null reports?

Comment on lines 105 to 114
# Exit status 400 of the last calculation job will be handled and so should reset the flag
process.ctx.children.append(generate_calculation_node(exit_status=400))
result = process.inspect_process()

# Even though `handler_a` was followed by `handler_b`, we should retrieve the exit_code from `handler_a` because
# `handler_b` returned `None` which should not overwrite the last report.
assert isinstance(result, ExitCode)
assert result.status == 0
assert result.status == 418
assert result.message == 'IMATEAPOT'
assert process.ctx.unhandled_failure is False
Copy link
Member

@ramirezfranciscof ramirezfranciscof Jun 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are testing the reset, and although it might be redundant because you already check that a calculation with an exit status != 0 sets off the process.ctx.unhandled_failure, shouldn't it be anyways better to check that the status actually changed and was set back (instead of maybe just always staying False)? (so perhaps assert process.ctx.unhandled_failure is True just before calling result = process.inspect_process())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may misunderstand your point, but the unhandled_failure is only changed by the workchain in the inspect_process call. So adding the assert process.ctx.unhandled_failure is True check before the call will fail.

Copy link
Member

@ramirezfranciscof ramirezfranciscof Jun 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think you understood correctly. But then maybe the docstring of the test should say "Test ctx.unhandled_failure is set to False in inspect_process when a failed process is handled." or something like this? I mean, this is probably a nitpick, but using "reset" makes me think that at some point the ctx.unhandled_failure is set to True (like it happens in the previous test), which is not the case here from what I am seeing in the code right? Or perhaps I am not understanding what you mean by "reset"...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see now, you are right. There is a mistake. The idea of the test is that there is first a failed calculation, which sets unhandled_failure=True followed by another failed calculation, but that one returns a report, so it is not considered unhandled and so should reset the flag. The first calculation shouldn't be 0 then but some non-zero code. Will push a fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, that makes more sense, haaha. This was why I was asking why you needed this previous calculation in the other PR. Good catch (Y)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch (Y)

Right back at ya. Thanks for the review 👍

@sphuber sphuber force-pushed the fix/4165/base-restart-report-exit-code branch from 5d8f1d4 to b87e2fd Compare June 9, 2020 11:27
If at the end of the method, `last_report` is defined, meaning at least
one handler returned a report, it should return the corresponding exit
code, except it was erroneously accessing that of the `report` variable,
which could very well be `None` if the last called handler returned
nothing.
@sphuber sphuber force-pushed the fix/4165/base-restart-report-exit-code branch from b87e2fd to f8a26ae Compare June 9, 2020 11:42
@sphuber sphuber removed the pr/blocked PR is blocked by another PR that should be merged first label Jun 9, 2020
@sphuber sphuber merged commit 5b89d6f into aiidateam:develop Jun 9, 2020
@sphuber sphuber deleted the fix/4165/base-restart-report-exit-code branch June 9, 2020 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BaseRestartWorkChain.inspect_process erroneously references report instead of last_report for final exit code
2 participants