-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IO-1196][internal] Workflow data models #622
Conversation
@@ -0,0 +1,172 @@ | |||
from datetime import datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ease of review, I'm going to mark those of these that are pretty much what they need to be, and those which are prone to change.
from darwin.future.pydantic_base import DefaultDarwin | ||
|
||
|
||
class WFDataset(DefaultDarwin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WFDataset
is a simple model, and pretty much exactly what we need it to be already.
name: str | ||
instructions: str = Field(min_length=0) | ||
|
||
def __int__(self) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added these methods for ease, they may prove useful, they are better that running the Pydantic initial dunders in this case, and if they don't prove useful, they took very little time to do.
|
||
|
||
class WFEdge(DefaultDarwin): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WFEdge
seems largely complete, we may find more to add to it later, but it works as a graph node already.
Was considering the validity of a graph solver function. E.g. a function that confirms the graph is valid within our parameters, this would be simple (it's similar to linked list traversal), but until we need it, lets not bother.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would constitute an unsolvable system? disconnected stages that can't be visited? That would seem an easy check with recursion and a 'visited' array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a vague reference, but for instance a workflow with unconnected nodes, or nodes connected in one infinite loop, two loops, etc - these are essentially a graph, but we have imposed logic on that.
I don't know if it's necessary for us to do anything with it, as the backend may already validate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation exists somewhere, but might only be on the frontend. If you do try to create disconnected graphs it complains (only when you click save though, so I assume it's on the backend). It might not be that relevant to double up validation.
|
||
|
||
class WFType(Enum): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is complete, but we may find that we need to add an extra stage for a hidden type that isn't apparent, or that I've not thought of.
|
||
|
||
class WFUser(DefaultDarwin): | ||
stage_id: UUID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems complete by the data structures in the JSON, which are concise and clear in this case
|
||
|
||
class WFStageConfig(DefaultDarwin): | ||
# ! NB: We may be able to remove many of these attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included this to allow validation with completeness, but many of these values had no assigned type at any stage in any of the data I could get.
I suspect we don't need many of these for backend functionality, but validating them won't hurt, and we can evolve this object more freely as the comments reflect
|
||
|
||
class WFStage(DefaultDarwin): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confident in this one, we may find some are either unnecessary or we need extra info added.
|
||
|
||
class Workflow(DefaultDarwin): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complex extraneous items made the final workflow object simple, and I'm confident in this one.
@@ -1,8 +1,9 @@ | |||
import unittest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incidental tidying, unrelated to this
@@ -0,0 +1,7 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example file used for test - NB: all IDs are randomly generated/madeup, unrelated to production system IDs
@@ -0,0 +1,6 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example file used for test - NB: all IDs are randomly generated/madeup, unrelated to production system IDs
@@ -0,0 +1,45 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example file used for test - NB: all IDs are randomly generated/madeup, unrelated to production system IDs
@@ -0,0 +1,26 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example file used for test - NB: all IDs are randomly generated/madeup, unrelated to production system IDs
@@ -0,0 +1,4 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example file used for test - NB: all IDs are randomly generated/madeup, unrelated to production system IDs
@@ -0,0 +1,75 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example file used for test - NB: all IDs are randomly generated/madeup, unrelated to production system IDs
"type": "annotate" | ||
} | ||
], | ||
"team_id": 1337, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Told you they were made up
|
||
|
||
def test_sad_paths() -> None: | ||
dataset = WFDataset.parse_file(validate_dataset_json) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sad path testing is largely unnecessary for pydantic, and when successful conversely ends up E2E testing Pydantic itself.
It was handy in writing the objects correctly with the behaviour I expected, and I used sad path tests less for successive objects.
assert str(parsed_stage.id) == "e69d3ebe-6ab9-4159-b44f-2bf84d29bb20" | ||
|
||
|
||
def test_raises_with_invalid_uuid() -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was originally a social test for a UUID validator that I wrote, before finding out Pydantic had introduced proper support for this.
Leaving it in place provides both a sanity test and a test that the pydantic support remains for all the objects that use this, so I left it in.
def test_Workflow_validates_from_valid_json() -> None: | ||
parsed_set = Workflow.parse_file(validate_json) | ||
|
||
assert isinstance(parsed_set, Workflow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Included a significant amount of testing here, as the Workflow
object is the object that encompasses all the other objects into one place.
This test passing ensures essentially that the other objects also parse correctly, at least in the main.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the amount of testing, although with pydantic objects I've just been trusting that they validate correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks pretty good, no changes required I think. Data objects are very fleshed out.
def test_Workflow_validates_from_valid_json() -> None: | ||
parsed_set = Workflow.parse_file(validate_json) | ||
|
||
assert isinstance(parsed_set, Workflow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the amount of testing, although with pydantic objects I've just been trusting that they validate correctly.
Problem
Introducing workflows will require data transfer of workflows data structures. We don't have these
Solution
Have taken the JSON objects the workflows endpoints return (provided by @Nathanjp91) and created pydantic objects of all applicable objects, for validation and parsing
Changelog
Introduced 5 objects in
darwin/future/data_objects
and relevant tests