Implement Redshift historical retrieval (#1720)
* Implement Redshift historical retrieval

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Fix imports

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Fixed get_historical_features where entity_df is a SQL query

Fixed `get_historical_features` for the case where `entity_df` is a SQL query, while keeping the utility functions common between Redshift and BigQuery. `infer_event_timestamp_from_entity_df` and `assert_expected_columns_in_entity_df` now operate on the entity schema rather than the dataframe.
I also removed the min/max timestamp inference entirely, since the two versions could not be merged (each needed to query its own backend, BigQuery or Redshift). That logic now lives inside the SQL templates, which reduces code complexity.

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Address most of the comments

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Update sdk/python/feast/infra/offline_stores/redshift.py

Co-authored-by: Willem Pienaar <6728866+woop@users.noreply.github.com>
Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Merge common_utils and helpers into utils.py

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Add test_historical_retrieval test for Redshift, fix some bugs

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Use features instead of feature_refs

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

* Rename utils to offline_utils and add created_timestamp_column

Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>

Co-authored-by: Willem Pienaar <6728866+woop@users.noreply.github.com>
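
The schema-based checks described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: the function names, the string-based dtype representation, and the default column name `event_timestamp` are all assumptions made for the example.

```python
from typing import Dict, Iterable

# Assumed default column name (hypothetical stand-in for the constant
# DEFAULT_ENTITY_DF_EVENT_TIMESTAMP_COL referenced in the diff).
DEFAULT_TIMESTAMP_COL = "event_timestamp"


def infer_timestamp_column(entity_schema: Dict[str, str]) -> str:
    """Pick the event timestamp column from a {column: dtype-name} schema.

    Working on a schema instead of a dataframe means the same check can be
    used whether the entity rows are in memory or behind a SQL query.
    """
    if DEFAULT_TIMESTAMP_COL in entity_schema:
        return DEFAULT_TIMESTAMP_COL
    # Fall back to the only datetime-like column, if there is exactly one.
    datetime_cols = [
        name for name, dtype in entity_schema.items()
        if dtype.startswith("datetime64")
    ]
    if len(datetime_cols) == 1:
        return datetime_cols[0]
    raise ValueError(
        f"Expected a column named {DEFAULT_TIMESTAMP_COL} or exactly one "
        f"datetime column, found: {datetime_cols}"
    )


def assert_columns_present(
    entity_schema: Dict[str, str],
    join_keys: Iterable[str],
    timestamp_col: str,
) -> None:
    """Raise if any join key or the timestamp column is missing."""
    expected = set(join_keys) | {timestamp_col}
    missing = expected - set(entity_schema)
    if missing:
        raise KeyError(f"entity_df is missing columns: {sorted(missing)}")
```

For a SQL-query `entity_df`, the schema would have to be obtained from the warehouse first; how that is done is backend-specific and is not shown here.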
Tsotne Tabidze and woop authored Jul 23, 2021
1 parent 9c5a961 commit bf557bc
Showing 15 changed files with 949 additions and 383 deletions.
4 changes: 4 additions & 0 deletions sdk/python/feast/data_source.py
@@ -376,6 +376,10 @@ def get_table_column_names_and_types(
         """
         raise NotImplementedError

+    def get_table_query_string(self) -> str:
+        """Returns a string that can directly be used to reference this table in SQL"""
+        raise NotImplementedError
+

 class KafkaSource(DataSource):
     def validate(self, config: RepoConfig):
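
As a rough illustration of how a concrete source might satisfy the new `get_table_query_string` hook, the sketch below uses hypothetical classes (`BaseSource`, `QuotedTableSource`) and a hypothetical helper (`build_select`); none of them are part of this diff. It assumes a source is defined either by a table name or by a query, and that double quotes are the right identifier quoting for the target SQL dialect.

```python
from typing import Optional


class BaseSource:
    """Hypothetical stand-in for the abstract data source in the diff."""

    def get_table_query_string(self) -> str:
        raise NotImplementedError


class QuotedTableSource(BaseSource):
    """Hypothetical source backed by either a table name or a SQL query."""

    def __init__(self, table: Optional[str] = None, query: Optional[str] = None):
        # Exactly one of the two must be given.
        if (table is None) == (query is None):
            raise ValueError("Provide exactly one of table or query")
        self.table = table
        self.query = query

    def get_table_query_string(self) -> str:
        if self.table is not None:
            # Quote the identifier (assumed dialect: double quotes).
            return f'"{self.table}"'
        # Wrap the query so it can be used as a subquery in a FROM clause.
        return f"({self.query})"


def build_select(source: BaseSource) -> str:
    """Template code can reference any source without knowing its kind."""
    return f"SELECT * FROM {source.get_table_query_string()}"
```

The point of the hook is that SQL templates can treat table-backed and query-backed sources uniformly.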
4 changes: 3 additions & 1 deletion sdk/python/feast/driver_test_data.py
@@ -5,7 +5,9 @@
 import pandas as pd
 from pytz import FixedOffset, timezone, utc

-from feast.infra.provider import DEFAULT_ENTITY_DF_EVENT_TIMESTAMP_COL
+from feast.infra.offline_stores.offline_utils import (
+    DEFAULT_ENTITY_DF_EVENT_TIMESTAMP_COL,
+)


 class EventTimestampType(Enum):
15 changes: 15 additions & 0 deletions sdk/python/feast/errors.py
@@ -185,3 +185,18 @@ def __init__(self):
 class RedshiftQueryError(Exception):
     def __init__(self, details):
         super().__init__(f"Redshift SQL Query failed to finish. Details: {details}")
+
+
+class EntityTimestampInferenceException(Exception):
+    def __init__(self, expected_column_name: str):
+        super().__init__(
+            f"Please provide an entity_df with a column named {expected_column_name} representing the time of events."
+        )
+
+
+class InvalidEntityType(Exception):
+    def __init__(self, entity_type: type):
+        super().__init__(
+            f"The entity dataframe you have provided must be a Pandas DataFrame or a SQL query, "
+            f"but we found: {entity_type} "
+        )
2 changes: 1 addition & 1 deletion sdk/python/feast/infra/aws.py
@@ -7,7 +7,7 @@
 from feast import FeatureTable
 from feast.entity import Entity
 from feast.feature_view import FeatureView
-from feast.infra.offline_stores.helpers import get_offline_store_from_config
+from feast.infra.offline_stores.offline_utils import get_offline_store_from_config
 from feast.infra.online_stores.helpers import get_online_store_from_config
 from feast.infra.provider import (
     Provider,
2 changes: 1 addition & 1 deletion sdk/python/feast/infra/gcp.py
@@ -7,7 +7,7 @@
 from feast import FeatureTable
 from feast.entity import Entity
 from feast.feature_view import FeatureView
-from feast.infra.offline_stores.helpers import get_offline_store_from_config
+from feast.infra.offline_stores.offline_utils import get_offline_store_from_config
 from feast.infra.online_stores.helpers import get_online_store_from_config
 from feast.infra.provider import (
     Provider,
2 changes: 1 addition & 1 deletion sdk/python/feast/infra/local.py
@@ -8,7 +8,7 @@
 from feast import FeatureTable
 from feast.entity import Entity
 from feast.feature_view import FeatureView
-from feast.infra.offline_stores.helpers import get_offline_store_from_config
+from feast.infra.offline_stores.offline_utils import get_offline_store_from_config
 from feast.infra.online_stores.helpers import get_online_store_from_config
 from feast.infra.provider import (
     Provider,