
Move the worker into a separate microservice #504

Open
stan-dot opened this issue Jun 13, 2024 · 13 comments
Labels
dependencies (Pull requests that update a dependency file), enhancement (New feature or request), python (Pull requests that update Python code), question (Further information is requested)

Comments

@stan-dot
Contributor

stan-dot commented Jun 13, 2024

We should strive to avoid bloat like we have with GDA, and these microservices could possibly be deployed separately.

graph TD
    CLI[CLI Tool] -->|Submits jobs| A[FastAPI Frontend]
    CLI -->|Previews IoT devices| IoT[IoT Device Representation in ophyd-async]
    CLI -->|Listens to| D[Message Bus]
    A -->|Reads jobs| B[Job Queue]
    B --> C[Worker Service]
    C -->|Emits results| D
    D --> E[Result Processing Service]
    C -->|Protobuf| F[Other Microservices]
    style CLI fill:#fff,stroke:#000,stroke-width:2px
    style A fill:#fff,stroke:#000,stroke-width:2px
    style B fill:#fff,stroke:#000,stroke-width:2px
    style C fill:#fff,stroke:#000,stroke-width:2px
    style D fill:#fff,stroke:#000,stroke-width:2px
    style E fill:#fff,stroke:#000,stroke-width:2px
    style F fill:#fff,stroke:#000,stroke-width:2px
    style IoT fill:#fff,stroke:#000,stroke-width:2px

@stan-dot added the enhancement, question, dependencies and python labels on Jun 13, 2024
@callumforrester
Contributor

The subprocess already does some of this. We could separate the management and worker processes out completely; I'm not opposed to that. See also #371

What is the IoT device service for?

@stan-dot
Contributor Author

Yeah, that's inaccurate, but it's basically ophyd-async.

@callumforrester
Contributor

Are the ophyd-async devices instantiated twice? Once in the worker service and once in the IoT service? There's no arrow between them.

@stan-dot
Contributor Author

I was not precise; I've now changed it from 'service' to 'representation' to be more abstract. In essence, this would entail splitting the context:


from dataclasses import dataclass, field

from bluesky import RunEngine

# Plan, Device and PlanGenerator below are blueapi's core types.


@dataclass
class BlueskyContext:
    """
    Context for building a Bluesky application.

    The context holds the RunEngine and any plans/devices that you may want to use.
    """

    run_engine: RunEngine = field(
        default_factory=lambda: RunEngine(context_managers=[])
    )
    plans: dict[str, Plan] = field(default_factory=dict)
    devices: dict[str, Device] = field(default_factory=dict)
    plan_functions: dict[str, PlanGenerator] = field(default_factory=dict)

    _reference_cache: dict[type, type] = field(default_factory=dict)

We would keep the devices and plans (everything available on the beamline) in the gateway and only send one plan, plus the devices it needs, to the worker service. The worker service would keep the RunEngine.
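
To make that concrete, here is a rough sketch of what the split could look like. The class and field names (GatewayContext, WorkerContext, device_models, current_plan) are illustrative only, not existing blueapi types, and use Any to avoid pinning down import paths:

from dataclasses import dataclass, field
from typing import Any

from bluesky import RunEngine


@dataclass
class GatewayContext:
    """Lives in the REST gateway: everything available on the beamline."""

    plans: dict[str, Any] = field(default_factory=dict)           # plan metadata/models
    plan_functions: dict[str, Any] = field(default_factory=dict)  # PlanGenerator callables
    device_models: dict[str, Any] = field(default_factory=dict)   # DeviceModel per device


@dataclass
class WorkerContext:
    """Lives in the worker service: the RunEngine plus only what the current plan needs."""

    run_engine: RunEngine = field(
        default_factory=lambda: RunEngine(context_managers=[])
    )
    current_plan: Any | None = None                        # the one plan sent by the gateway
    devices: dict[str, Any] = field(default_factory=dict)  # devices instantiated for that plan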

@callumforrester
Contributor

I think I'm going to need even more precision: which of the following objects lives in which process?

  • RunEngine
  • Plan functions
  • Plan metadata
  • Device objects
  • Device metadata

@stan-dot
Contributor Author

Worker service:

  • RunEngine
  • the current plan function
  • plan metadata
  • devices instantiated for this plan

REST gateway:

  • all the plans for a beamline
  • DeviceModels for devices on a beamline

The metadata part is less familiar to me, so I'm not sure. Roughly, the handoff between the two could look like the sketch below.
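
For illustration only, the gateway-to-worker handoff implied by that split might be a small message like this. PlanSubmission and its field names are hypothetical; in practice it would presumably be a protobuf or pydantic model:

from dataclasses import dataclass, field
from typing import Any


@dataclass
class PlanSubmission:
    """What the REST gateway would send to the worker service for one run."""

    plan_name: str                                            # which registered plan to run
    parameters: dict[str, Any] = field(default_factory=dict)  # validated against plan metadata
    device_names: list[str] = field(default_factory=list)     # devices the worker should instantiate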

@callumforrester
Contributor

So devices are instantiated lazily (when a plan that needs them is run)? And then destroyed when the plan finishes?

@stan-dot
Contributor Author

Not sure; it's a parameter independent of whether the RE is separate or not. I haven't thought it through to that level of detail.

@callumforrester
Contributor

@stan-dot How about this?

graph TD
    CLI[Client] -->|Submits jobs| A[FastAPI Frontend]
    D[Message Bus] -->|Provides results| CLI
    A -->|Protobuf| C[Worker Service]
    C -->|Emits results| D
    D --> E[Result Processing Service]
    style CLI fill:#fff,stroke:#000,stroke-width:2px
    style A fill:#fff,stroke:#000,stroke-width:2px
    style C fill:#fff,stroke:#000,stroke-width:2px
    style D fill:#fff,stroke:#000,stroke-width:2px
    style E fill:#fff,stroke:#000,stroke-width:2px

I've chopped out the IoT service because it is out of scope, but it can be added later.
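
As a toy illustration of the revised diagram: the FastAPI frontend would just validate and forward submissions, with the transport to the worker hidden behind a client interface. Nothing below exists in blueapi as-is; WorkerClient, StubWorkerClient and /tasks are stand-ins for whatever transport is chosen (protobuf over the bus, gRPC, ...):

import uuid
from typing import Any, Protocol

from fastapi import FastAPI


class WorkerClient(Protocol):
    """Stand-in for whatever transport ends up between frontend and worker."""

    async def submit(self, plan_name: str, parameters: dict[str, Any]) -> str: ...


class StubWorkerClient:
    async def submit(self, plan_name: str, parameters: dict[str, Any]) -> str:
        # A real client would publish the job to the worker service here.
        return str(uuid.uuid4())


app = FastAPI()
worker_client: WorkerClient = StubWorkerClient()


@app.post("/tasks")
async def submit_task(plan_name: str, parameters: dict[str, Any]) -> dict[str, str]:
    # Validate against the gateway's plan registry, then hand off to the worker;
    # results reach the client via the message bus, not via this endpoint.
    task_id = await worker_client.submit(plan_name, parameters)
    return {"task_id": task_id}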

@stan-dot
Contributor Author

Yeah, that's good. One place that could use more detail is the "message bus -> provides results" edge.

Are there raw results there, plus an additional subscription from the CLI client to get the processed results?

Unless the result processing service sends processed results back to the message bus. If so, I'd split the message bus into two channels: one for raw data and the other for processed data.

@callumforrester
Contributor

That's left vague on purpose because it is a case-by-case thing. In reality there will be many downstream services that do things to the data, and that is out of scope for this project; our job is just to provide data in a nice, well-known format (a.k.a. event-model).

I think we're beginning to get a design out of this thing though, which is good. Another important detail is hot reloading. We've found this to be a very useful feature in the current blueapi: when we change a plan, we just need to poke an endpoint to reload the code into blueapi (~5s), whereas before we had to restart the pod (~30s). We get this via the subprocess arrangement at the moment and are keen not to lose it.

@stan-dot
Contributor Author

I mean, we wouldn't lose that; the logic would just be carried over like everything else.

@stan-dot self-assigned this on Aug 27, 2024
@stan-dot changed the title from "Consider moving the worker around (run engine) into a separate microservice" to "Move the worker into a separate microservice" on Oct 1, 2024
@stan-dot
Contributor Author

stan-dot commented Oct 3, 2024

Focusing on different things at the moment.

@stan-dot removed their assignment on Oct 3, 2024