
Move the worker into a separate microservice #504

Open
stan-dot opened this issue Jun 13, 2024 · 13 comments
Labels
dependencies (Pull requests that update a dependency file), enhancement (New feature or request), python (Pull requests that update Python code), question (Further information is requested)

Comments

@stan-dot
Contributor

stan-dot commented Jun 13, 2024

We should strive to avoid bloat like we have with GDA, and these microservices could possibly be deployed separately.

graph TD
    CLI[CLI Tool] -->|Submits jobs| A[FastAPI Frontend]
    CLI -->|Previews IoT devices| IoT[IoT Device Representation in ophyd-async]
    CLI -->|Listens to| D[Message Bus]
    A -->|Reads jobs| B[Job Queue]
    B --> C[Worker Service]
    C -->|Emits results| D
    D --> E[Result Processing Service]
    C -->|Protobuf| F[Other Microservices]
    style CLI fill:#fff,stroke:#000,stroke-width:2px
    style A fill:#fff,stroke:#000,stroke-width:2px
    style B fill:#fff,stroke:#000,stroke-width:2px
    style C fill:#fff,stroke:#000,stroke-width:2px
    style D fill:#fff,stroke:#000,stroke-width:2px
    style E fill:#fff,stroke:#000,stroke-width:2px
    style F fill:#fff,stroke:#000,stroke-width:2px
    style IoT fill:#fff,stroke:#000,stroke-width:2px

@stan-dot added the enhancement, question, dependencies and python labels on Jun 13, 2024
@callumforrester
Contributor

The subprocess already does some of this. We could separate the management and worker processes out completely; I'm not opposed to that. See also #371

What is the IoT device service for?

@stan-dot
Contributor Author

Yeah, that's inaccurate, but it's basically ophyd-async.

@callumforrester
Contributor

Are the ophyd-async devices instantiated twice? Once in the worker service and once in the IoT service? There's no arrow between them.

@stan-dot
Contributor Author

I was not precise; I've now changed it from 'service' to 'representation' to be more abstract. In essence, this would entail splitting the context:


from dataclasses import dataclass, field

from bluesky import RunEngine

# Plan, Device and PlanGenerator below are blueapi's core types.


@dataclass
class BlueskyContext:
    """
    Context for building a Bluesky application.

    The context holds the RunEngine and any plans/devices that you may want to use.
    """

    run_engine: RunEngine = field(
        default_factory=lambda: RunEngine(context_managers=[])
    )
    plans: dict[str, Plan] = field(default_factory=dict)
    devices: dict[str, Device] = field(default_factory=dict)
    plan_functions: dict[str, PlanGenerator] = field(default_factory=dict)

    _reference_cache: dict[type, type] = field(default_factory=dict)

We would keep the devices and plans (everything available on the beamline) in the gateway and only send one plan, plus the devices it needs, to the worker service. The worker service would keep the RunEngine.
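
To make that concrete, here is a rough sketch of what the split could look like. The class and field names (GatewayContext, WorkerContext, device_models, current_plan) are illustrative only, not existing blueapi types, and use Any to avoid pinning down import paths:

from dataclasses import dataclass, field
from typing import Any

from bluesky import RunEngine


@dataclass
class GatewayContext:
    """Lives in the REST gateway: everything available on the beamline."""

    plans: dict[str, Any] = field(default_factory=dict)           # plan metadata/models
    plan_functions: dict[str, Any] = field(default_factory=dict)  # PlanGenerator callables
    device_models: dict[str, Any] = field(default_factory=dict)   # DeviceModel per device


@dataclass
class WorkerContext:
    """Lives in the worker service: the RunEngine plus only what the current plan needs."""

    run_engine: RunEngine = field(
        default_factory=lambda: RunEngine(context_managers=[])
    )
    current_plan: Any | None = None                        # the one plan sent by the gateway
    devices: dict[str, Any] = field(default_factory=dict)  # devices instantiated for that plan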

@callumforrester
Contributor

I think I'm going to need even more precision: which of the following objects lives in which process?

  • RunEngine
  • Plan functions
  • Plan metadata
  • Device objects
  • Device metadata

@stan-dot
Contributor Author

Worker service:

  • RunEngine
  • the current plan function
  • plan metadata
  • devices instantiated for this plan

REST gateway:

  • all the plans for a beamline
  • DeviceModels for devices on a beamline

The metadata part is less familiar to me, so I'm not sure. Roughly, the handoff between the two could look like the sketch below.
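
For illustration only, the gateway-to-worker handoff implied by that split might be a small message like this. PlanSubmission and its field names are hypothetical; in practice it would presumably be a protobuf or pydantic model:

from dataclasses import dataclass, field
from typing import Any


@dataclass
class PlanSubmission:
    """What the REST gateway would send to the worker service for one run."""

    plan_name: str                                            # which registered plan to run
    parameters: dict[str, Any] = field(default_factory=dict)  # validated against plan metadata
    device_names: list[str] = field(default_factory=list)     # devices the worker should instantiate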

@callumforrester
Contributor

So devices are instantiated lazily (when a plan that needs them is run)? And then destroyed when the plan finishes?

@stan-dot
Contributor Author

Not sure; it's a parameter independent of whether the RE is separate or not. I haven't thought it through to that level of detail.

@callumforrester
Contributor

@stan-dot How about this?

graph TD
    CLI[Client] -->|Submits jobs| A[FastAPI Frontend]
    D[Message Bus] -->|Provides results| CLI
    A -->|Protobuf| C[Worker Service]
    C -->|Emits results| D
    D --> E[Result Processing Service]
    style CLI fill:#fff,stroke:#000,stroke-width:2px
    style A fill:#fff,stroke:#000,stroke-width:2px
    style C fill:#fff,stroke:#000,stroke-width:2px
    style D fill:#fff,stroke:#000,stroke-width:2px
    style E fill:#fff,stroke:#000,stroke-width:2px

I've chopped out the IoT service because it is out of scope, but it can be added later.
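
As a toy illustration of the revised diagram: the FastAPI frontend would just validate and forward submissions, with the transport to the worker hidden behind a client interface. Nothing below exists in blueapi as-is; WorkerClient, StubWorkerClient and /tasks are stand-ins for whatever transport is chosen (protobuf over the bus, gRPC, ...):

import uuid
from typing import Any, Protocol

from fastapi import FastAPI


class WorkerClient(Protocol):
    """Stand-in for whatever transport ends up between frontend and worker."""

    async def submit(self, plan_name: str, parameters: dict[str, Any]) -> str: ...


class StubWorkerClient:
    async def submit(self, plan_name: str, parameters: dict[str, Any]) -> str:
        # A real client would publish the job to the worker service here.
        return str(uuid.uuid4())


app = FastAPI()
worker_client: WorkerClient = StubWorkerClient()


@app.post("/tasks")
async def submit_task(plan_name: str, parameters: dict[str, Any]) -> dict[str, str]:
    # Validate against the gateway's plan registry, then hand off to the worker;
    # results reach the client via the message bus, not via this endpoint.
    task_id = await worker_client.submit(plan_name, parameters)
    return {"task_id": task_id}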

@stan-dot
Contributor Author

Yeah, that's good. One place that could use more detail is the "message bus -> provides results" edge.

Are there raw results there, plus an additional subscription from the CLI client to get the processed results?

Unless the result processing service sends processed results back to the message bus. If so, I'd split the message bus into two channels: one for raw data and the other for processed data.

@callumforrester
Contributor

That's left vague on purpose because it is a case-by-case thing. In reality there will be many downstream services that do things to the data, and that is out of scope for this project; our job is just to provide data in a nice, well-known format (a.k.a. event-model).

I think we're beginning to get a design out of this thing though, which is good. Another important detail is hot reloading. We've found this to be a very useful feature in the current blueapi: when we change a plan, we just need to poke an endpoint to reload the code into blueapi (~5s), whereas before we had to restart the pod (~30s). We get this via the subprocess arrangement at the moment and are keen not to lose it.

@stan-dot
Contributor Author

I mean, we wouldn't lose that; the logic would just be carried over like everything else.

@stan-dot self-assigned this on Aug 27, 2024
@stan-dot changed the title from "Consider moving the worker around (run engine) into a separate microservice" to "Move the worker into a separate microservice" on Oct 1, 2024
@stan-dot
Contributor Author

stan-dot commented Oct 3, 2024

Focusing on different things at the moment.

@stan-dot removed their assignment on Oct 3, 2024