[IO-1046][IO-1035] Team method updates and lazy loading #598

Merged
merged 8 commits into from
Jul 5, 2023

Nathanjp91
Contributor

@Nathanjp91 Nathanjp91 commented May 12, 2023

Problem

Update darwin-py v2 to include Team Meta objects, lazy loading and querying.

Solution

Refactor of existing code from Core to Meta objects to enable lazy loading and query chaining.
Changes:

  • Query objects now return Meta objects so that calls can be chained
  • Meta objects use property fields to call Query objects for subfields
  • Query now requires a second generic parameter: T is bound to MetaBase and R to DefaultDarwin, where R is the Core data object that T manages, e.g. T = TeamMeta, R = Team:
class Query(Generic[T, R], ABC):
  • MetaBase is the base object that manages the underlying data for Meta objects, including the results once loaded. It must be parametrised with the Core type as above, e.g. DatasetMeta(MetaBase[Dataset]), or generically class T(MetaBase[R])
  • MetaBase handles most of the iteration, list and collection logic, but subclasses must override the method that retrieves the objects in question. See DatasetMeta.__next__() for an example.
  • Point of note: a Meta object should never call the Query object for its own type, otherwise circular logic occurs. It can use Queries for subfields, but to get its own underlying object it should call the core methods. For example:
class TeamMeta(MetaBase[Team]):
    client: Client

    def __init__(self, client: Client, teams: Optional[List[Team]]=None) -> None:
        # TODO: Initialise from chaining within MetaClient
        self.client = client
        if not teams:
            teams = [get_team(self.client)] # <- Using TeamQuery introduces circular logic, use core method instead
        super().__init__(teams)
    

    @property
    def members(self) -> TeamMemberQuery: # <- Fine to use a query object here 
        return TeamMemberQuery(self.client)
  • Client now loads the Team Meta object, enabling chained access such as client.team[0].datasets[0]...
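
The rule about avoiding circular logic can be illustrated with a minimal, self-contained sketch. Everything here is a stand-in rather than the darwin-py code: the Client is omitted, Team and TeamMember are plain dataclasses, get_team and get_team_members are hypothetical core functions that record their calls in a list, and MetaBase is reduced to list access. It only aims to show that a Meta object fetches its own data through a core function, while a subfield property hands back a Query that does no work until collect() is called.

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

calls: List[str] = []  # records simulated API calls so laziness is visible


@dataclass
class Team:  # stand-in for the Core data object
    slug: str


@dataclass
class TeamMember:  # stand-in for the Core data object
    email: str


def get_team() -> Team:
    """Hypothetical core function: fetches the team directly."""
    calls.append("get_team")
    return Team(slug="demo-team")


def get_team_members() -> List[TeamMember]:
    """Hypothetical core function: fetches the members directly."""
    calls.append("get_team_members")
    return [TeamMember("a@example.com"), TeamMember("b@example.com")]


R = TypeVar("R")


class MetaBase(Generic[R]):
    """Simplified base: stores the Core objects and exposes list access."""

    def __init__(self, items: List[R]) -> None:
        self._items = items

    def __getitem__(self, index: int) -> R:
        return self._items[index]

    def __len__(self) -> int:
        return len(self._items)


class TeamMembersMeta(MetaBase[TeamMember]):
    pass


class TeamMemberQuery:
    """Creating the query is free; only collect() touches the core function."""

    def collect(self) -> TeamMembersMeta:
        return TeamMembersMeta(get_team_members())


class TeamMeta(MetaBase[Team]):
    def __init__(self, teams: Optional[List[Team]] = None) -> None:
        # Core function, not a TeamQuery, so no circular logic is introduced.
        super().__init__(teams or [get_team()])

    @property
    def members(self) -> TeamMemberQuery:
        return TeamMemberQuery()  # fine to use a Query for a subfield


team = TeamMeta()
query = team.members
calls_before_collect = list(calls)  # members have not been fetched yet
members = query.collect()
```

After this runs, calls_before_collect holds only the get_team call, and the member fetch appears in calls only once collect() has been invoked.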

Changelog

Team and other Meta objects can now be chained and lazily loaded.

@linear
linear bot commented May 12, 2023

IO-1046 Get team

Contributor

@owencjones owencjones left a comment

Approved - there's a couple of changes, but nothing I'd want to block this moving on

darwin/future/data_objects/team.py (outdated)
@@ -51,5 +55,32 @@ class Team(DefaultDarwin):
    # Data Validation
    _slug_validator = validator("slug", allow_reuse=True)(parse_name)

    @staticmethod
    def from_client(client: Client, team_slug: Optional[str] = None) -> Team:
        """Returns the team with the given slug"""
Contributor

Could we do these with full docblocks, so they can be properly doc'd when it comes to it.

Contributor Author

Yep, I'll go back and write up docs for all the stuff I've added

            members.append(TeamMember.parse_obj(item))
        except Exception as e:
            errors.append(e)
    return (members, errors)
Contributor

Think the brackets are unnecessary, although totally stylistic, and your choice on this one.

result = self.filters[self.n]
    def __next__(self) -> R:
        if self.results is None:
            self.results = list(self.collect())
Contributor Author

The interaction here becomes:
self.collect() -> MetaObject
The MetaObject then stores a list of R and implements the iteration dunders.
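
A rough sketch of that lazy interaction, under stated assumptions: the Query below is a simplified stand-in, not the class from this PR. T is bound to any iterable rather than to MetaBase, and NumberQuery with its collect_calls counter is a hypothetical subclass that exists only to make the caching observable. The point is that collect() runs on first access and its results are reused afterwards.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, Iterable, List, Optional, TypeVar

T = TypeVar("T", bound=Iterable[Any])  # container returned by collect()
R = TypeVar("R")  # type of the items inside that container


class Query(Generic[T, R], ABC):
    def __init__(self) -> None:
        self.results: Optional[List[R]] = None  # nothing is loaded yet
        self.n = 0

    @abstractmethod
    def collect(self) -> T:
        """Fetch the data; subclasses decide how."""

    def _load(self) -> List[R]:
        # Run collect() only on first access, then reuse the cached list.
        if self.results is None:
            self.results = list(self.collect())
        return self.results

    def __iter__(self) -> "Query[T, R]":
        self.n = 0
        return self

    def __next__(self) -> R:
        results = self._load()
        if self.n < len(results):
            result = results[self.n]
            self.n += 1
            return result
        raise StopIteration

    def __getitem__(self, index: int) -> R:
        return self._load()[index]


class NumberQuery(Query[List[int], int]):
    """Hypothetical query that counts how often collect() is invoked."""

    def __init__(self) -> None:
        super().__init__()
        self.collect_calls = 0

    def collect(self) -> List[int]:
        self.collect_calls += 1
        return [10, 20, 30]


query = NumberQuery()
calls_before = query.collect_calls  # no access yet, so nothing was fetched
first = query[0]                    # first access triggers collect()
everything = list(query)            # iteration reuses the cached results
```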

@Nathanjp91 Nathanjp91 changed the title [IO-1046] Team from_api method [IO-1046] Team method updates and lazy loading Jun 27, 2023
@@ -24,5 +24,6 @@
     "black"
   ],
   "python.testing.pytestEnabled": true,
-  "python.linting.enabled": true
+  "python.linting.enabled": true,
+  "python.analysis.typeCheckingMode": "basic"
Contributor Author

Have absolutely no idea what introduced this

@Nathanjp91 Nathanjp91 changed the title [IO-1046] Team method updates and lazy loading [IO-1046][IO-1035] Team method updates and lazy loading Jun 27, 2023
Contributor

@owencjones owencjones left a comment

Comments to definitely consider, but no outright demanded changes

@@ -24,5 +24,6 @@
     "black"
   ],
   "python.testing.pytestEnabled": true,
-  "python.linting.enabled": true
+  "python.linting.enabled": true,
+  "python.analysis.typeCheckingMode": "basic"
Contributor

What does this change?

Contributor Author

By the looks of it, turns on pylance highlighting and syntax checking

Contributor Author

Pylance being the built-in checker that comes with VS Code

 from darwin.future.pydantic_base import DefaultDarwin

-T = TypeVar("T", bound=DefaultDarwin)
+T = TypeVar("T", bound=MetaBase)
+R = TypeVar("R", bound=DefaultDarwin)
Contributor

Nice Genericing, 💯

         return self + filter

-    def __add__(self, filter: QueryFilter) -> Query[T]:
+    def __add__(self, filter: QueryFilter) -> Query[T, R]:
         assert filter is not None
Contributor

I'll add a note to replace these asserts with my assert_is function, which behaves the same whether or not Python is running in debug mode.
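
For context, a bare assert statement is removed when Python runs with the -O flag, whereas a function call is not. The following is only a guess at what such a helper might look like: the two-argument call shape matches the assert_is(...) call quoted elsewhere in this PR, but the body and the choice of AssertionError are assumptions, and check_dataset_id is a hypothetical caller added for illustration.

```python
def assert_is(condition: bool, message: str) -> None:
    """Raise AssertionError when condition is false.

    Unlike a bare `assert`, this is an ordinary function call, so it is
    still executed when Python runs with -O.
    """
    if not condition:
        raise AssertionError(message)


def check_dataset_id(dataset_id: object) -> str:
    """Hypothetical caller: reports the validation outcome as a string."""
    try:
        assert_is(isinstance(dataset_id, int), "dataset_id must be an integer")
    except AssertionError as exc:
        return str(exc)
    return "ok"
```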

            self.n += 1
            return result
        else:
            raise StopIteration

    def __getitem__(self, index: int) -> R:
Contributor

Are our keys definitely always integers? __setitem__ and __getitem__ also take strings as native, but I can't immediately think whether we have any issue with that.

Contributor Author

@Nathanjp91 Nathanjp91 Jun 27, 2023

As in just implicitly converting a string to an int here, like query['6']? Or would there be some other meaningful interaction with a string as an index that we'd want, like query['name'] returning the dataset name? I would worry that overlaps too much with query.where('name=...'), and we'd be maintaining multiple sets of functionality that do the same job. IMO it's better to have and maintain just the one.



def get_team(client: Client, team_slug: Optional[str] = None) -> Team:
    """Returns the team with the given slug"""
Contributor

Same comment

    """Returns the team with the given slug"""
    if not team_slug:
        team_slug = client.config.default_team
    response = client.get(f"/teams/{team_slug}/")
Contributor

We're starting to get a mix of functions that return their exceptions monad-style alongside the response and functions that raise. Not something for now, but we should work out what our approach is for using one vs the other.

Contributor Author

@Nathanjp91 Nathanjp91 Jun 27, 2023

I think for me it comes down to when we get a single object back or a list of objects. If we're only returning a single 'block', ie something that's all related to each other and if one thing goes wrong it's all poisoned, then it makes sense to just return the object + raise if any issues, but if we have a function that returns a list of objects and they're all separate, then it makes sense to package up the good data and return it with the exceptions monad style.
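
The two conventions described above might be sketched side by side as follows. This is an illustration under assumptions, not the PR's code: TeamMember is a plain dataclass instead of a pydantic model, and parse_member, parse_members and the sample payload are hypothetical. The single-object function raises on any problem, while the list function packages the good items together with the exceptions it collected.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class TeamMember:  # stand-in for the pydantic model used in the PR
    id: int
    email: str


def parse_member(item: Dict[str, Any]) -> TeamMember:
    """Single 'block': if anything is wrong, the whole object is unusable, so raise."""
    return TeamMember(id=int(item["id"]), email=str(item["email"]))


def parse_members(
    items: List[Dict[str, Any]],
) -> Tuple[List[TeamMember], List[Exception]]:
    """List of independent objects: keep the good ones, return errors alongside."""
    members: List[TeamMember] = []
    errors: List[Exception] = []
    for item in items:
        try:
            members.append(parse_member(item))
        except Exception as e:
            errors.append(e)
    return members, errors


payload = [
    {"id": 1, "email": "a@example.com"},
    {"email": "missing-id@example.com"},  # malformed: no "id" key
    {"id": 3, "email": "c@example.com"},
]
members, errors = parse_members(payload)
```

The malformed entry ends up in errors without preventing the other two members from being returned.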

@@ -27,3 +29,9 @@ def from_api_key(cls, api_key: str, datasets_dir: Optional[Path] = None) -> Meta
        if datasets_dir:
            config.datasets_dir = datasets_dir
        return cls(config)

    @property
    def team(self) -> TeamMeta:
Contributor

Nice, pretty straightforward in the end.


R = TypeVar("R", bound=DefaultDarwin)

class MetaBase(Generic[R]):
Contributor

I was 50/50 whether to put this, but "metabase" is the name of a popular BI/Dashboarding tool. I wouldn't normally care, but we do actually use it elsewhere in the stack, so this might be worthy of renaming.

Contributor Author

The Meta Base object is definitely required for the generics, but don't mind what it's called.

        assert_is(isinstance(dataset_id, int), "dataset_id must be an integer")

        dataset_deleted = remove_dataset(client, dataset_id)
        return dataset_deleted


    def __next__(self) -> Dataset:
Contributor

Cheers!

-    def collect(self, client: Client) -> List[TeamMember]:
-        members, exceptions = get_team_members(client)
+    def collect(self) -> TeamMembersMeta:
+        members, exceptions = get_team_members(self.client)
         if exceptions:
             # TODO: print and or raise exceptions, tbd how we want to handle this
Contributor

We should have a think about this before we have many more of these.

@Nathanjp91 Nathanjp91 merged commit 0443392 into master Jul 5, 2023
9 checks passed
@Nathanjp91 Nathanjp91 deleted the IO-1046 branch November 8, 2023 10:42
2 participants