Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZocaloResults: add parameter to use results from GPU #763

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

olliesilvester
Copy link
Collaborator

@olliesilvester olliesilvester commented Aug 29, 2024

Fixes #559

See related mx_bluesky PR: DiamondLightSource/mx-bluesky#445

Instructions to reviewer on how to test:

Check this PR adds and tests the following behavior:

If the GPU is toggled on, then:

  • Wait for results twice
  • If CPU arrives before GPU, then warn and continue
  • if CPU results differ from GPU within a tolerance, log the two results and continue
  • if GPU times out but CPU results arrive, warn and continue
  • if GPU results arrive but CPU results timeout, error
  • Always use CPU results after comparing the two
    If it's off:
  • Ignore results from GPU

Checks for reviewer

  • Would the PR title make sense to a scientist on a set of release notes
  • If a new device has been added does it follow the standards
  • If changing the API for a pre-existing device, ensure that any beamlines using this device have updated their Bluesky plans accordingly
  • Have the connection tests for the relevant beamline(s) been run via dodal connect ${BEAMLINE}

olliesilvester and others added 2 commits August 29, 2024 16:25
Co-authored-by: Nicholas Devenish <ndevenish@gmail.com>
Copy link

codecov bot commented Aug 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.98%. Comparing base (3f59ce0) to head (ff848a6).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #763      +/-   ##
==========================================
+ Coverage   94.91%   94.98%   +0.07%     
==========================================
  Files         115      115              
  Lines        4677     4709      +32     
==========================================
+ Hits         4439     4473      +34     
+ Misses        238      236       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ndevenish
Copy link

Do you have pre-commits installed/activated? It looks like it hasn't been linted/auto-formatted (the unspaced comment #Only is a tell)

@olliesilvester olliesilvester marked this pull request as ready for review September 3, 2024 08:32
await zocalo_results.stage()
zocalo_results._raw_results_received.get = MagicMock(side_effect=Empty)
with pytest.raises(NoResultsFromZocalo):
await zocalo_results.trigger()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a strange thing here where this test would fail if I used RE(bps.trigger(zocalo_results) instead, even though debugging the pytest showed the exception was being raised

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RE might swallow the exception or transform it?

Copy link
Contributor

@DominicOram DominicOram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you. Couple of comments in the code.

src/dodal/devices/zocalo/zocalo_results.py Show resolved Hide resolved
@@ -79,6 +92,7 @@ def __init__(
self._prefix = prefix
self._raw_results_received: Queue = Queue()
self.transport: CommonTransport | None = None
self.use_cpu_and_gpu_zocalo = use_cpu_and_gpu_zocalo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This device is already has zocalo in the name so having this variable with zocalo in is a bit redundant. It would also be good to have a docstring on it

src/dodal/devices/zocalo/zocalo_results.py Outdated Show resolved Hide resolved
)
else:
# Only add to queue if results are from CPU
if not recipe_parameters.get("gpu"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should: A unit test covering this would be good.


if self.use_cpu_and_gpu_zocalo:
self._raw_results_received.put(
{"results": results, "ispyb_ids": recipe_parameters}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could: ispyb_ids isn't a great name any more as it contains other stuff.

name="zocalo", zocalo_environment="dev_artemis", timeout_s=2
name="zocalo",
zocalo_environment="dev_artemis",
timeout_s=2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could: It's not great that we're waiting on this timeout for the test. Ideally we would mock it out so there's no wait on it but if not could we at least drop it down to as low as possible?

# Wait for results from CPU and GPU, warn and continue if one timed out, error if both time out
if self.use_cpu_and_gpu_zocalo:
if source_of_first_results == "CPU":
LOGGER.warning("Recieved zocalo results from CPU before GPU")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
LOGGER.warning("Recieved zocalo results from CPU before GPU")
LOGGER.warning("Received zocalo results from CPU before GPU")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dperl-dls and I combined have misspelt this word 10 times across mx_bluesky and dodal!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So have I tbf


except Empty:
LOGGER.warning(
f"Zocalo results from {source_of_second_results} timed out. Using results from {source_of_first_results}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neil-i03 To confirm here, this will mean that if we have the GPU results turned on and we do not get a result from CPU zocalo within half the usual timeout time (180/2=90s) we will use the GPU result. Do we feel confident enough in trusting this or do we still want to only ever use the CPU result for now, even if it means we will timeout?

await zocalo_results.stage()
zocalo_results._raw_results_received.get = MagicMock(side_effect=Empty)
with pytest.raises(NoResultsFromZocalo):
await zocalo_results.trigger()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RE might swallow the exception or transform it?

zocalo_results = ZocaloResults(
name="zocalo",
zocalo_environment="dev_artemis",
timeout_s=2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, not ideal to be actually waiting on this timeout in tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prepare ZocaloResults to use multiple results sources
3 participants