[IO-1196][internal] Workflow data models #622

owencjones · 2023-06-30T08:50:39Z

Problem

Introducing workflows requires transferring workflow data structures, and we don't yet have models for them.

Solution

I have taken the JSON objects that the workflows endpoints return (provided by @Nathanjp91) and created pydantic models for all applicable objects, for validation and parsing.

Changelog

Introduced 5 objects in darwin/future/data_objects and relevant tests

linear · 2023-06-30T08:50:41Z

IO-1196 Core: Workflow object

owencjones · 2023-06-30T09:36:55Z

darwin/future/data_objects/workflow.py

@@ -0,0 +1,172 @@
+from datetime import datetime


For ease of review, I'm going to mark which of these are pretty much what they need to be, and which are prone to change.

owencjones · 2023-06-30T09:37:32Z

darwin/future/data_objects/workflow.py

+from darwin.future.pydantic_base import DefaultDarwin
+
+
+class WFDataset(DefaultDarwin):


WFDataset is a simple model, and pretty much exactly what we need it to be already.

owencjones · 2023-06-30T09:38:45Z

darwin/future/data_objects/workflow.py

+    name: str
+    instructions: str = Field(min_length=0)
+
+    def __int__(self) -> int:


I added these methods for convenience. They may prove useful, they are better than running the Pydantic initial dunders in this case, and if they don't prove useful, they took very little time to write.
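As a rough illustration of the kind of convenience dunder discussed here, the sketch below uses plain pydantic `BaseModel` rather than `DefaultDarwin`. The class name `WFDatasetSketch`, the `id` field, and what `__int__` and `__str__` return are all assumptions made for the example, not taken from the PR.

```python
from pydantic import BaseModel, Field


class WFDatasetSketch(BaseModel):
    # Hypothetical stand-in for WFDataset; the real class extends DefaultDarwin.
    id: int  # assumed field, not shown in the diff excerpt
    name: str
    instructions: str = Field(min_length=0)

    def __int__(self) -> int:
        # Assumption: the integer form of a dataset is its id.
        return self.id

    def __str__(self) -> str:
        # Assumption: the string form of a dataset is its name.
        return self.name


dataset = WFDatasetSketch(id=101, name="Test Dataset", instructions="")
print(int(dataset), str(dataset))
```

With methods like these, callers can write `int(dataset)` where an ID is expected instead of reaching into the model's fields.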

owencjones · 2023-06-30T09:40:31Z

darwin/future/data_objects/workflow.py

+
+
+class WFEdge(DefaultDarwin):
+    """


WFEdge seems largely complete. We may find more to add to it later, but it works as a graph node already.

I was considering whether a graph solver function would be worthwhile, e.g. a function that confirms the graph is valid within our parameters. This would be simple (it's similar to linked list traversal), but until we need it, let's not bother.

What would constitute an unsolvable system? Disconnected stages that can't be visited? That would seem an easy check with recursion and a 'visited' array.

It's a vague reference, but for instance a workflow with unconnected nodes, or nodes connected in one infinite loop, two loops, etc. These are essentially a graph, but we have imposed logic on that.

I don't know if it's necessary for us to do anything with it, as the backend may already validate it.

Validation exists somewhere, but might only be on the frontend. If you do try to create disconnected graphs it complains (only when you click save though, so I assume it's on the backend). It might not be that relevant to double up validation.
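For reference, the 'visited' check discussed in this thread could be sketched roughly as below. This is not code from the PR: the function name, the use of plain string stage IDs, and the `(source, target)` edge tuples are assumptions, and it walks the graph iteratively rather than recursively.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def unreachable_stages(
    stage_ids: Iterable[str],
    edges: Iterable[Tuple[str, str]],
    start: str,
) -> Set[str]:
    """Return the stages that cannot be reached from `start` by following edges."""
    stage_set = set(stage_ids)
    adjacency: Dict[str, List[str]] = {stage: [] for stage in stage_set}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    visited: Set[str] = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, []):
            # Skipping visited stages also keeps loops from running forever.
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)

    return stage_set - visited


# Hypothetical stage names; "orphan" has no incoming edge.
stages = ["dataset", "annotate", "review", "complete", "orphan"]
links = [
    ("dataset", "annotate"),
    ("annotate", "review"),
    ("review", "annotate"),  # a loop back for rejected work
    ("review", "complete"),
]
print(unreachable_stages(stages, links, start="dataset"))
```

An empty result would mean every stage is reachable from the start stage. Anything stricter (for example, rejecting loops that never reach a final stage) would need rules that this thread does not define.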

owencjones · 2023-06-30T09:41:32Z

darwin/future/data_objects/workflow.py

+
+
+class WFType(Enum):
+    """


I believe this is complete, but we may find that we need to add an extra stage for a hidden type that isn't apparent, or that I've not thought of.

owencjones · 2023-06-30T09:42:00Z

darwin/future/data_objects/workflow.py

+
+
+class WFUser(DefaultDarwin):
+    stage_id: UUID


This seems complete by the data structures in the JSON, which are concise and clear in this case

owencjones · 2023-06-30T09:43:50Z

darwin/future/data_objects/workflow.py

+
+
+class WFStageConfig(DefaultDarwin):
+    # ! NB: We may be able to remove many of these attributes


I included this to allow validation with completeness, but many of these values had no assigned type at any stage in any of the data I could get.

I suspect we don't need many of these for backend functionality, but validating them won't hurt, and we can evolve this object more freely as the comments reflect

owencjones · 2023-06-30T09:45:10Z

darwin/future/data_objects/workflow.py

+
+
+class WFStage(DefaultDarwin):
+    """


I'm confident in this one, though we may find that some fields are unnecessary or that we need extra info added.

owencjones · 2023-06-30T09:45:42Z

darwin/future/data_objects/workflow.py

+
+
+class Workflow(DefaultDarwin):
+    """


Moving the complex parts out into their own objects made the final Workflow object simple, and I'm confident in this one.

owencjones · 2023-06-30T09:48:16Z

darwin/future/tests/data_objects/test_validators.py

@@ -1,8 +1,9 @@
-import unittest


Incidental tidying, unrelated to this

owencjones · 2023-06-30T09:49:17Z

darwin/future/tests/data_objects/workflow/data/dataset.json

@@ -0,0 +1,7 @@
+{


Example file used for the test. NB: all IDs are randomly generated or made up, and unrelated to production system IDs.

owencjones · 2023-06-30T09:49:28Z

darwin/future/tests/data_objects/workflow/data/edge.json

@@ -0,0 +1,6 @@
+{


Example file used for the test. NB: all IDs are randomly generated or made up, and unrelated to production system IDs.

owencjones · 2023-06-30T09:49:34Z

darwin/future/tests/data_objects/workflow/data/stage.json

@@ -0,0 +1,45 @@
+{


Example file used for the test. NB: all IDs are randomly generated or made up, and unrelated to production system IDs.

owencjones · 2023-06-30T09:49:45Z

darwin/future/tests/data_objects/workflow/data/stage_config.json

@@ -0,0 +1,26 @@
+{


Example file used for the test. NB: all IDs are randomly generated or made up, and unrelated to production system IDs.

owencjones · 2023-06-30T09:49:52Z

darwin/future/tests/data_objects/workflow/data/user.json

@@ -0,0 +1,4 @@
+{


Example file used for the test. NB: all IDs are randomly generated or made up, and unrelated to production system IDs.

owencjones · 2023-06-30T09:49:58Z

darwin/future/tests/data_objects/workflow/data/workflow.json

@@ -0,0 +1,75 @@
+{


Example file used for the test. NB: all IDs are randomly generated or made up, and unrelated to production system IDs.

owencjones · 2023-06-30T09:50:22Z

darwin/future/tests/data_objects/workflow/data/workflow.json

+            "type": "annotate"
+        }
+    ],
+    "team_id": 1337,


Told you they were made up

owencjones · 2023-06-30T09:51:49Z

darwin/future/tests/data_objects/workflow/test_wfdataset.py

+
+
+def test_sad_paths() -> None:
+    dataset = WFDataset.parse_file(validate_dataset_json)


This sad path testing is largely unnecessary for pydantic models and, when it passes, effectively ends up E2E testing Pydantic itself.

It was handy in writing the objects correctly with the behaviour I expected, and I used sad path tests less for successive objects.
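To show what a sad path check of this kind exercises, here is a minimal sketch using plain pydantic. `DatasetSketch`, its `id` field, and the `validation_fails` helper are hypothetical stand-ins and not the PR's actual test code.

```python
from pydantic import BaseModel, Field, ValidationError


class DatasetSketch(BaseModel):
    # Hypothetical stand-in for WFDataset, with assumed fields.
    id: int
    name: str
    instructions: str = Field(min_length=0)


def validation_fails(payload: dict) -> bool:
    """Return True when pydantic rejects the payload."""
    try:
        DatasetSketch(**payload)
    except ValidationError:
        return True
    return False


good = {"id": 1, "name": "Test Dataset", "instructions": ""}
wrong_type = {"id": "not-a-number", "name": "Test Dataset", "instructions": ""}
missing_field = {"name": "Test Dataset", "instructions": ""}

print(validation_fails(good), validation_fails(wrong_type), validation_fails(missing_field))
```

Both failing payloads are rejected by pydantic's own type and required-field checks, which is why tests like this mostly confirm that the model was declared as intended.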

owencjones · 2023-06-30T09:53:12Z

darwin/future/tests/data_objects/workflow/test_wfstage.py

+    assert str(parsed_stage.id) == "e69d3ebe-6ab9-4159-b44f-2bf84d29bb20"
+
+
+def test_raises_with_invalid_uuid() -> None:


This was originally a test for a UUID validator that I wrote, before I found out that Pydantic had introduced proper support for this.

Leaving it in place gives us both a sanity test and a check that the pydantic support remains for all the objects that use it, so I left it in.
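For context, a hand-written UUID check of the kind this test was first written for might look like the sketch below. It uses only the standard library `uuid` module; the helper name `is_valid_uuid` is an assumption and not the validator from the PR.

```python
from uuid import UUID


def is_valid_uuid(value: str) -> bool:
    """Return True if `value` parses as a UUID."""
    try:
        UUID(value)
    except ValueError:
        # uuid.UUID raises ValueError for badly formed hexadecimal strings.
        return False
    return True


print(is_valid_uuid("e69d3ebe-6ab9-4159-b44f-2bf84d29bb20"))
print(is_valid_uuid("not-a-uuid"))
```

Note that `uuid.UUID` is fairly lenient: it also accepts forms without hyphens or wrapped in braces, so a stricter format check would need extra logic.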

owencjones · 2023-06-30T09:56:32Z

darwin/future/tests/data_objects/workflow/test_workflow.py

+def test_Workflow_validates_from_valid_json() -> None:
+    parsed_set = Workflow.parse_file(validate_json)
+
+    assert isinstance(parsed_set, Workflow)


I included a significant amount of testing here, as the Workflow object is the one that brings all the other objects together in one place.

If this test passes, the other objects essentially parse correctly too, at least in the main.
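The reason one top-level parse covers the nested objects can be sketched as follows. The models and payload are simplified, hypothetical stand-ins: only `team_id`, the `"annotate"` type, and the stage UUID echo values visible in this PR, while the workflow `id`, the names, and the field set are made up.

```python
from typing import List
from uuid import UUID

from pydantic import BaseModel


class StageSketch(BaseModel):
    # Hypothetical, much reduced stand-in for WFStage.
    id: UUID
    name: str
    type: str


class WorkflowSketch(BaseModel):
    # Hypothetical, much reduced stand-in for Workflow.
    id: UUID
    name: str
    team_id: int
    stages: List[StageSketch]


payload = {
    "id": "0b8a9d3c-5f1e-4c7a-9b2d-3e4f5a6b7c8d",  # made-up ID
    "name": "Example workflow",
    "team_id": 1337,
    "stages": [
        {
            "id": "e69d3ebe-6ab9-4159-b44f-2bf84d29bb20",
            "name": "Annotate",
            "type": "annotate",
        }
    ],
}

# Parsing the outer model validates and converts every nested stage as well.
workflow = WorkflowSketch(**payload)
print(type(workflow.stages[0]).__name__, workflow.team_id)
```

If any nested stage were malformed, constructing the outer model would raise a `ValidationError`, which is why a passing Workflow test gives reasonable confidence in the inner models.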

Nathanjp91

All looks pretty good, no changes required I think. Data objects are very fleshed out.

Nathanjp91 · 2023-07-03T10:22:38Z

darwin/future/tests/data_objects/workflow/test_workflow.py

+def test_Workflow_validates_from_valid_json() -> None:
+    parsed_set = Workflow.parse_file(validate_json)
+
+    assert isinstance(parsed_set, Workflow)


Like the amount of testing, although with pydantic objects I've just been trusting that they validate correctly.

Owen Jones added 7 commits June 27, 2023 15:11

WIP objects and validators

331fd09

Validator with tests

219c355

Initial test structure

a164e3d

Example JSON files for testing

c9a1b70

WIP: Pydantic tests

a51c07c

WIP Tests nearly done

139b597

Tests complete

e58e6c8

Owen Jones added 2 commits June 30, 2023 09:57

Tidy up unincluded imports

650538e

Isorting imports

fcfc1ea

Some tidying to tests

12c0659

owencjones marked this pull request as ready for review June 30, 2023 09:57

Removed CodeQL as it currently doesn't do anything

80c727d

owencjones changed the title ~~Io 1196~~ [IO-1196][internal] Workflow data models Jun 30, 2023

Removed CodeQL file because commenting it didn't work

a659618

Nathanjp91 approved these changes Jul 3, 2023

owencjones merged commit 3ff6c03 into master Jul 3, 2023
