Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IO-1176][internal] WorkflowQuery #627

Merged
merged 30 commits into from
Jul 10, 2023
Merged

[IO-1176][internal] WorkflowQuery #627

merged 30 commits into from
Jul 10, 2023

Conversation

owencjones
Copy link
Contributor

Problem

Currently we can't query workflows

Solution

Have implemented a Workflow query object in standard style

Changelog

  • Added WorkflowQuery and tests
  • Refactored get_workflow, and list_workflows to be monadic
  • Some other improvements of a minor nature.

@linear
Copy link

linear bot commented Jul 6, 2023

IO-1176 Meta: Workflow Query

Class WorkflowQuery(Query):
    ...

Client side filtering as it doesn't look like workflows api has filtering, is a simple GET request with the api-key in header.

@@ -12,7 +12,7 @@
from darwin.future.core.types.common import QueryString
from darwin.future.exceptions.client import NotFound, Unauthorized

JSONType = Dict[str, Any] # type: ignore
JSONType = Union[Dict[str, Any], List[Dict[str, Any]]] # type: ignore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON that is returned from the workflows endpoint is a JSON array of objects, so this needed updating.

@@ -53,8 +53,15 @@ class WFEdge(DefaultDarwin):

id: UUID
name: str
source_stage_id: UUID
target_stage_id: UUID
source_stage_id: Optional[UUID]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WFEdges only have both source_stage_id and target_stage_id if they are in the middle of a workflow. Those at the ends only have one, so added Optionals and a validator requiring one or the other.


from rich.console import Console


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a potential exception handler, after working out how to discover whether the lib is being used from CLI or SDK.

@@ -2,9 +2,8 @@

from darwin.future.core.client import Client
from darwin.future.core.datasets.list_datasets import list_datasets
from darwin.future.core.types.query import Modifier, Param, Query, QueryFilter
from darwin.future.core.types.query import Param, Query, QueryFilter
from darwin.future.data_objects.dataset import Dataset
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidental fix along the way.

@owencjones owencjones marked this pull request as ready for review July 7, 2023 10:25
exception : Optional[Union[Exception, List[Exception]]]
The exception(s) to handle
"""
IS_INTERACTIVE_SESSION = sys.stdout and sys.stdout.isatty()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat, I didn't know you could infer this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, nor did I, but I realised it was likely possible, and did some Googling.

console = Console()
handler = console.print
else:
logger = logging.getLogger(__name__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely use more logging in our code, but we also have to define a good strategy of where, what to capture. I think our current strategy just captures the CLI command being run which is kind of useless as the customer can just tell us that anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed, but it feels like a piecemeal job like this, rather than a big all in one job?


return workflows

def _execute_filters(self, workflows: List[Workflow], filter: QueryFilter) -> List[Workflow]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is my devising but for these complex objects I'm less enamoured with this filtering approach... but I can't think of anything better that still allows for the customizability for all the different (and custom) types we will have. Open to any suggestions if you've got them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best I can think of is maybe a member taking a list of filtering functions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order should be reversed, so we only iterate over the list once rather than once per filter. Doesn't really matter here but for big datasets for instance it might become an issue. With the inheritance and typing though it can easily be implemented later for objects that need it. Just have to overwrite (or not even use) generic_execute_filter with one that is O(n) instead.

Maybe filters becomes a list of functions that get called so something like

filtered = [x for x in unfiltered if any([y(x) for y in filters])]

is possible

from darwin.future.meta.objects.base import MetaBase


class WorkflowMeta(MetaBase[Workflow]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should swap this ordering around, people are going to be using meta objects more than they are Core, so think it makes more sense to have WorfklowCore and the Meta is just Workflow. Doesn't have to be done now though but I might do it at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, that makes sense. Feels like a all of them at once job, otherwise it doesn't make much sense - they fall out of sync with each other.

Copy link
Contributor

@Nathanjp91 Nathanjp91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't have any changes requested. Some comments we should discuss and think about though.

@owencjones owencjones merged commit e82240d into master Jul 10, 2023
@Nathanjp91 Nathanjp91 deleted the io-1176 branch November 8, 2023 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants