Hide FeatureViewProjections from user interface & have FeatureViews carry FVProjections that carries the modified info of the FeatureView #1899

Merged
merged 17 commits into from
Oct 5, 2021
Changes from 9 commits
15 changes: 13 additions & 2 deletions protos/feast/core/FeatureService.proto
@@ -6,7 +6,9 @@ option java_outer_classname = "FeatureServiceProto";
option java_package = "feast.proto.core";

import "google/protobuf/timestamp.proto";
import "feast/core/FeatureViewProjection.proto";
import "feast/core/OnDemandFeatureView.proto";
import "feast/core/FeatureTable.proto";
import "feast/core/FeatureView.proto";

message FeatureService {
// User-specified specifications of this feature service.
@@ -17,6 +19,9 @@ message FeatureService {
}

message FeatureServiceSpec {
// was previously used with 'repeated FeatureViewProjection features'
reserved 3;

// Name of the Feature Service. Must be unique. Not updated.
string name = 1;

@@ -25,7 +30,13 @@ message FeatureServiceSpec {

// List of features that this feature service encapsulates.
// Stored as a list of references to other features views and the features from those views.
repeated FeatureViewProjection features = 3;
repeated string features = 6;
Member

So is there a convention here, like feature_table:feature?

Member

Also, would it make more sense to use a feature reference instead of a string?

Collaborator Author

Yep, that's the convention. I'll add a note on that expectation in the docstring. It should be a string rather than a feature reference, since we need to know which feature view the feature name is associated with.

Member

A feature reference contains a reference to its associated view/table https://github.com/feast-dev/feast/blob/master/protos/feast/serving/ServingService.proto#L52. I actually think the FeatureViewProjection object has a good structure for this use case, even if we remove it from the rest of the user facing API. Wouldn't it be safer and more convenient to split views/tables from feature names vs having to rely on string conventions?
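
To make the trade-off in this thread concrete, here is a minimal sketch contrasting a bare string with a structured reference. `FeatureRef` and `parse_feature_ref` are hypothetical names used only for illustration and are not Feast APIs; the only thing taken from the PR is the `view_name:feature_name` string convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRef:
    """Hypothetical structured reference: view and feature are separate fields."""

    view_name: str
    feature_name: str

    def to_string(self) -> str:
        # Render back to the 'view_name:feature_name' convention used in the PR.
        return f"{self.view_name}:{self.feature_name}"


def parse_feature_ref(ref: str) -> FeatureRef:
    """Parse a 'view_name:feature_name' string, rejecting malformed input."""
    view_name, sep, feature_name = ref.partition(":")
    if not sep or not view_name or not feature_name or ":" in feature_name:
        raise ValueError(f"Expected 'view_name:feature_name', got {ref!r}")
    return FeatureRef(view_name, feature_name)
```

With the structured form, validation happens once at the boundary, and downstream code reads `ref.view_name` instead of re-splitting strings.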


repeated FeatureTable feature_tables = 7;
Member

I don't understand what these new objects are supposed to contain. Don't we just need features?

Collaborator Author

FeatureViews and OnDemandFeatureViews will carry the modified fields, such as a modified name. I also added a FeatureTable field for consistency.

Member

Looking at this now, it doesn't look like the right approach. Wouldn't we run into consistency problems if we store copies of feature view objects with different configurations?

The original problem was:

The trouble right now is that at query time, even though the FV is one concept in the user's mind when using get_historical_features or get_online_features, it actually turns into a different class when the __getitem__ magic method is triggered by my_fv[[<my-feature-selections>]]: it becomes a FVProjection. This means that if users wanted to transform their FV, we'd have to support these methods on both the FV class and the FVProjection class.

This is both a user facing API problem and a code duplication problem, but I am beginning to think that having an object like a FeatureViewProjection within the protos might be the right approach. Assuming the user could modify a FeatureView at their own whims, and assuming we could convert that final feature view into something like

message FeatureViewProjection {
  string feature_view_name = 1;
  repeated FeatureSpecV2 feature_columns = 2;
  string feature_view_name_alias = 3;
  something entity_mapping_etc = 4;
}

Wouldn't that both solve the consistency problem in the registry (normalized instead of denormalized), as well as the user facing API problem where we have to maintain FeatureViewProjections?
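
As a rough sketch of the normalized approach suggested above: the feature service would store only a small projection record, and the full view would be looked up by name when needed. `View`, `Projection`, and `resolve` are hypothetical stand-ins rather than Feast classes, and using the alias as the output prefix is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class View:
    """Stand-in for a registered feature view (hypothetical, not the Feast class)."""

    name: str
    features: List[str]


@dataclass
class Projection:
    """Stand-in for the proposed projection: a reference plus its modifications."""

    feature_view_name: str
    feature_columns: List[str]
    feature_view_name_alias: Optional[str] = None


def resolve(projection: Projection, registry: Dict[str, View]) -> List[str]:
    """Look the view up by name at read time and return prefixed feature names."""
    view = registry[projection.feature_view_name]
    missing = [f for f in projection.feature_columns if f not in view.features]
    if missing:
        raise KeyError(f"{view.name} has no features {missing}")
    # Assumption: the alias, when set, replaces the view name in the output.
    prefix = projection.feature_view_name_alias or view.name
    return [f"{prefix}:{f}" for f in projection.feature_columns]
```

Because the projection holds a name instead of a copy, there is only one definition of each view in the registry, which is the consistency property the comment is asking about.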


repeated FeatureView feature_views = 8;

repeated OnDemandFeatureView on_demand_feature_views = 9;

// User defined metadata
map<string,string> tags = 4;
18 changes: 0 additions & 18 deletions protos/feast/core/FeatureViewProjection.proto

This file was deleted.

12 changes: 4 additions & 8 deletions sdk/python/feast/cli.py
@@ -172,14 +172,10 @@ def feature_service_list(ctx: click.Context):
repo = ctx.obj["CHDIR"]
cli_check_repo(repo)
store = FeatureStore(repo_path=str(repo))
feature_services = []
for feature_service in store.list_feature_services():
feature_names = []
for projection in feature_service.features:
feature_names.extend(
[f"{projection.name}:{feature.name}" for feature in projection.features]
)
feature_services.append([feature_service.name, ", ".join(feature_names)])
feature_services = [
[feature_service.name, ", ".join(feature_service.features)]
for feature_service in store.list_feature_services()
]

from tabulate import tabulate

81 changes: 48 additions & 33 deletions sdk/python/feast/feature_service.py
@@ -5,7 +5,6 @@

from feast.feature_table import FeatureTable
from feast.feature_view import FeatureView
from feast.feature_view_projection import FeatureViewProjection
from feast.on_demand_feature_view import OnDemandFeatureView
from feast.protos.feast.core.FeatureService_pb2 import (
FeatureService as FeatureServiceProto,
@@ -30,7 +29,10 @@ class FeatureService:
"""

name: str
features: List[FeatureViewProjection]
features: List[str]
feature_tables: List[FeatureTable]
feature_views: List[FeatureView]
on_demand_feature_views: List[OnDemandFeatureView]
tags: Dict[str, str]
description: Optional[str] = None
created_timestamp: Optional[datetime] = None
@@ -39,9 +41,7 @@ class FeatureService:
def __init__(
self,
name: str,
features: List[
Union[FeatureTable, FeatureView, OnDemandFeatureView, FeatureViewProjection]
],
features: List[Union[FeatureTable, FeatureView, OnDemandFeatureView]],
tags: Optional[Dict[str, str]] = None,
description: Optional[str] = None,
):
@@ -53,17 +53,26 @@ def __init__(
"""
self.name = name
self.features = []
for feature in features:
if (
isinstance(feature, FeatureTable)
or isinstance(feature, FeatureView)
or isinstance(feature, OnDemandFeatureView)
):
self.features.append(FeatureViewProjection.from_definition(feature))
elif isinstance(feature, FeatureViewProjection):
self.features.append(feature)
self.feature_tables, self.feature_views, self.on_demand_feature_views = (
[],
[],
[],
)

for feature_grouping in features:
if isinstance(feature_grouping, FeatureTable):
self.feature_tables.append(feature_grouping)
elif isinstance(feature_grouping, FeatureView):
self.feature_views.append(feature_grouping)
elif isinstance(feature_grouping, OnDemandFeatureView):
self.on_demand_feature_views.append(feature_grouping)
else:
raise ValueError(f"Unexpected type: {type(feature)}")
raise ValueError(f"Unexpected type: {type(feature_grouping)}")

self.features.extend(
[f"{feature_grouping.name}:{f.name}" for f in feature_grouping.features]
)

self.tags = tags or {}
self.description = description
self.created_timestamp = None
@@ -102,10 +111,7 @@ def from_proto(feature_service_proto: FeatureServiceProto):
"""
fs = FeatureService(
name=feature_service_proto.spec.name,
features=[
FeatureViewProjection.from_proto(fp)
for fp in feature_service_proto.spec.features
],
features=[],
tags=dict(feature_service_proto.spec.tags),
description=(
feature_service_proto.spec.description
@@ -114,6 +120,20 @@ def from_proto(feature_service_proto: FeatureServiceProto):
),
)

fs.features = [feature for feature in feature_service_proto.spec.features]
fs.feature_tables = [
FeatureTable.from_proto(table)
for table in feature_service_proto.spec.feature_tables
]
fs.feature_views = [
FeatureView.from_proto(view)
for view in feature_service_proto.spec.feature_views
]
fs.on_demand_feature_views = [
OnDemandFeatureView.from_proto(view)
for view in feature_service_proto.spec.on_demand_feature_views
]

if feature_service_proto.meta.HasField("created_timestamp"):
fs.created_timestamp = (
feature_service_proto.meta.created_timestamp.ToDatetime()
@@ -136,20 +156,15 @@ def to_proto(self) -> FeatureServiceProto:
if self.created_timestamp:
meta.created_timestamp.FromDatetime(self.created_timestamp)

spec = FeatureServiceSpec()
spec.name = self.name
for definition in self.features:
if isinstance(definition, FeatureTable) or isinstance(
definition, FeatureView
):
feature_ref = FeatureViewProjection(
definition.name, definition.features
)
else:
feature_ref = definition

spec.features.append(feature_ref.to_proto())

spec = FeatureServiceSpec(
name=self.name,
features=self.features,
feature_tables=[table.to_proto() for table in self.feature_tables],
feature_views=[view.to_proto() for view in self.feature_views],
on_demand_feature_views=[
view.to_proto() for view in self.on_demand_feature_views
],
)
if self.tags:
spec.tags.update(self.tags)
if self.description:
72 changes: 37 additions & 35 deletions sdk/python/feast/feature_store.py
@@ -318,10 +318,7 @@ def _get_features(

_feature_refs: List[str]
if isinstance(_features, FeatureService):
# Get the latest value of the feature service, in case the object passed in has been updated underneath us.
_feature_refs = _get_feature_refs_from_feature_services(
self.get_feature_service(_features.name)
)
_feature_refs = _features.features
else:
_feature_refs = _features
return _feature_refs
@@ -541,8 +538,7 @@ def get_historical_features(
)

_feature_refs = self._get_features(features, feature_refs)

all_feature_views = self.list_feature_views()
all_feature_views = self._get_feature_views_to_use(features)
all_on_demand_feature_views = self._registry.list_on_demand_feature_views(
project=self.project
)
@@ -804,8 +800,8 @@ def get_online_features(
>>> online_response_dict = online_response.to_dict()
"""
_feature_refs = self._get_features(features, feature_refs)
all_feature_views = self._list_feature_views(
allow_cache=True, hide_dummy_entity=False
all_feature_views = self._get_feature_views_to_use(
features=features, allow_cache=True, hide_dummy_entity=False
)
all_on_demand_feature_views = self._registry.list_on_demand_feature_views(
project=self.project, allow_cache=True
@@ -1017,6 +1013,30 @@ def _augment_response_with_on_demand_transforms(
] = GetOnlineFeaturesResponse.FieldStatus.PRESENT
return OnlineResponse(GetOnlineFeaturesResponse(field_values=result_rows))

def _get_feature_views_to_use(
self,
features: Optional[Union[List[str], FeatureService]],
allow_cache=False,
hide_dummy_entity: bool = True,
) -> List[FeatureView]:

passed_in_feature_views = (
{view.name: view for view in features.feature_views}
if isinstance(features, FeatureService)
else {}
)

all_feature_views = [
Member

This name all_feature_views seems inaccurate if we are filtering.

Collaborator Author

The logic before may have been a bit confusing. The logic in the new commits should make it clear that we're not actually filtering: all the FVs from the registry are present.

*filter(
lambda view: view.name not in [*passed_in_feature_views.keys()],
self._list_feature_views(
allow_cache=allow_cache, hide_dummy_entity=hide_dummy_entity
),
)
] + [*passed_in_feature_views.values()]

return all_feature_views
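
The merge performed by `_get_feature_views_to_use` can be illustrated with a small standalone sketch. `View` and `views_to_use` are hypothetical names for this example; the intent, as far as the diff shows, is that a view passed in through the feature service replaces the registry view of the same name, while every other registry view is kept.

```python
from typing import Dict, Iterable, List, NamedTuple


class View(NamedTuple):
    """Minimal stand-in for a feature view (hypothetical)."""

    name: str
    features: List[str]


def views_to_use(
    registry_views: Iterable[View], passed_in_views: Iterable[View]
) -> List[View]:
    """Return registry views, with passed-in views overriding by name."""
    overrides: Dict[str, View] = {view.name: view for view in passed_in_views}
    # Keep registry views that are not overridden, then append the overrides.
    kept = [view for view in registry_views if view.name not in overrides]
    return kept + list(overrides.values())
```

Building the `overrides` dictionary once also avoids rebuilding the list of names for every registry view, which the `lambda` in the diff does.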

@log_exceptions_and_usage
def serve(self, port: int) -> None:
"""Start the feature consumption server locally on a given port."""
@@ -1069,7 +1089,7 @@ def _validate_feature_refs(feature_refs: List[str], full_feature_names: bool = F


def _group_feature_refs(
features: Union[List[str], FeatureService],
features: List[str],
all_feature_views: List[FeatureView],
all_on_demand_feature_views: List[OnDemandFeatureView],
) -> Tuple[
@@ -1089,21 +1109,14 @@ def _group_feature_refs(
# on demand view name to feature names
on_demand_view_features = defaultdict(list)

if isinstance(features, list) and isinstance(features[0], str):
for ref in features:
view_name, feat_name = ref.split(":")
if view_name in view_index:
views_features[view_name].append(feat_name)
elif view_name in on_demand_view_index:
on_demand_view_features[view_name].append(feat_name)
else:
raise FeatureViewNotFoundException(view_name)
elif isinstance(features, FeatureService):
for feature_projection in features.features:
projected_features = feature_projection.features
views_features[feature_projection.name].extend(
[f.name for f in projected_features]
)
for ref in features:
view_name, feat_name = ref.split(":")
if view_name in view_index:
views_features[view_name].append(feat_name)
elif view_name in on_demand_view_index:
on_demand_view_features[view_name].append(feat_name)
else:
raise FeatureViewNotFoundException(view_name)

fvs_result: List[Tuple[FeatureView, List[str]]] = []
odfvs_result: List[Tuple[OnDemandFeatureView, List[str]]] = []
@@ -1115,17 +1128,6 @@ def _group_feature_refs(
return fvs_result, odfvs_result
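
For reference, the grouping step that `_group_feature_refs` now applies to every input can be sketched on its own. `group_refs` is a hypothetical helper, and `KeyError` stands in for `FeatureViewNotFoundException` so the example has no Feast dependency.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_refs(
    refs: Iterable[str], view_names: Set[str], on_demand_view_names: Set[str]
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Group 'view:feature' refs by view, split into regular and on-demand views."""
    views: Dict[str, List[str]] = defaultdict(list)
    on_demand: Dict[str, List[str]] = defaultdict(list)
    for ref in refs:
        # Unpacking fails with ValueError unless the ref has exactly one ':'.
        view_name, feat_name = ref.split(":")
        if view_name in view_names:
            views[view_name].append(feat_name)
        elif view_name in on_demand_view_names:
            on_demand[view_name].append(feat_name)
        else:
            raise KeyError(view_name)  # stand-in for FeatureViewNotFoundException
    return dict(views), dict(on_demand)
```

Since a feature service now exposes its features as plain strings, the same loop serves both a list of refs and a feature service, which is why the `isinstance` branches could be dropped.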


def _get_feature_refs_from_feature_services(
feature_service: FeatureService,
) -> List[str]:
feature_refs = []
for projection in feature_service.features:
feature_refs.extend(
[f"{projection.name}:{f.name}" for f in projection.features]
)
return feature_refs


def _get_table_entity_keys(
table: FeatureView, entity_keys: List[EntityKeyProto], join_key_map: Dict[str, str],
) -> List[EntityKeyProto]:
15 changes: 12 additions & 3 deletions sdk/python/feast/feature_view.py
@@ -23,7 +23,6 @@
from feast.data_source import DataSource
from feast.errors import RegistryInferenceFailure
from feast.feature import Feature
from feast.feature_view_projection import FeatureViewProjection
from feast.protos.feast.core.FeatureView_pb2 import FeatureView as FeatureViewProto
from feast.protos.feast.core.FeatureView_pb2 import (
FeatureViewMeta as FeatureViewMetaProto,
@@ -151,15 +150,25 @@ def __str__(self):
def __hash__(self):
return hash(self.name)

def __getitem__(self, item) -> FeatureViewProjection:
def __getitem__(self, item):
assert isinstance(item, list)

referenced_features = []
for feature in self.features:
if feature.name in item:
referenced_features.append(feature)

return FeatureViewProjection(self.name, referenced_features)
return FeatureView(
name=self.name,
entities=self.entities,
ttl=self.ttl,
input=self.input,
batch_source=self.batch_source,
stream_source=self.stream_source,
features=referenced_features,
tags=self.tags,
online=self.online,
)
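
The new `__getitem__` behaviour (return a copy of the view restricted to the selected features, instead of a separate projection type) can be sketched with a simplified class. `MiniView` is a hypothetical stand-in, and `dataclasses.replace` is used here in place of the explicit constructor call in the diff.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class MiniView:
    """Simplified stand-in for FeatureView (hypothetical)."""

    name: str
    features: List[str]
    online: bool = True

    def __getitem__(self, item: List[str]) -> "MiniView":
        if not isinstance(item, list):
            raise TypeError("Select features with a list, e.g. view[['a', 'b']]")
        # Preserve the view's own feature order, as the loop in the diff does.
        selected = [f for f in self.features if f in item]
        # Copy every other field unchanged; the original view is not mutated.
        return replace(self, features=selected)
```

One consequence worth noting is that the result has the same name as the original view, so anything keyed on `name` alone (such as the `__hash__` shown in the diff) will not tell the two apart.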

def __eq__(self, other):
if not isinstance(other, FeatureView):
37 changes: 0 additions & 37 deletions sdk/python/feast/feature_view_projection.py

This file was deleted.
