OnDemandFeatureView.feature_transformation.infer_features does not pass UDF outputs to python_type_to_feast_value_type #4308

Closed
alexmirrington opened this issue Jun 24, 2024 · 1 comment · Fixed by #4310

alexmirrington commented Jun 24, 2024

Expected Behavior

OnDemandFeatureView.feature_transformation.infer_features should be able to infer features from primitive python types for all supported feast data types, for all transformation backends.

Current Behavior

All on demand feature views are currently broken for list types, as there is no way to bypass schema inference.

Details

OnDemandFeatureView.feature_transformation.infer_features can only infer features whose types appear in the type map inside python_type_to_feast_value_type, i.e.

type_map = {
    "int": ValueType.INT64,
    "str": ValueType.STRING,
    "string": ValueType.STRING,  # pandas.StringDtype
    "float": ValueType.DOUBLE,
    "bytes": ValueType.BYTES,
    "float64": ValueType.DOUBLE,
    "float32": ValueType.FLOAT,
    "int64": ValueType.INT64,
    "uint64": ValueType.INT64,
    "int32": ValueType.INT32,
    "uint32": ValueType.INT32,
    "int16": ValueType.INT32,
    "uint16": ValueType.INT32,
    "uint8": ValueType.INT32,
    "int8": ValueType.INT32,
    "bool": ValueType.BOOL,
    "boolean": ValueType.BOOL,
    "timedelta": ValueType.UNIX_TIMESTAMP,
    "timestamp": ValueType.UNIX_TIMESTAMP,
    "datetime": ValueType.UNIX_TIMESTAMP,
    "datetime64[ns]": ValueType.UNIX_TIMESTAMP,
    "datetime64[ns, tz]": ValueType.UNIX_TIMESTAMP,
    "category": ValueType.STRING,
}

This is because a type such as ValueType.FLOAT_LIST has no entry in the dictionary above. When value is also None, none of the isinstance(value, dtype) checks match, and execution falls through to the ValueError in python_type_to_feast_value_type.
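
A minimal sketch of that fall-through, assuming a simplified stand-in for the type mapper. The `infer_value_type` function and the trimmed `TYPE_MAP` below are hypothetical and are not Feast's actual code; they only illustrate why a missing sample value matters for list types:

```python
from typing import Any, Optional

# Trimmed, hypothetical stand-in for the name-based type map shown above.
TYPE_MAP = {"int64": "INT64", "float64": "DOUBLE", "str": "STRING", "bool": "BOOL"}


def infer_value_type(value: Optional[Any], type_name: str) -> str:
    """Resolve a type by name first, then fall back to inspecting the value."""
    if type_name in TYPE_MAP:
        return TYPE_MAP[type_name]
    # List types have no name-based entry, so they can only be inferred
    # from a concrete sample value.
    if isinstance(value, list) and value and all(isinstance(v, float) for v in value):
        return "DOUBLE_LIST"
    raise ValueError(
        f"Value with native type {type_name} cannot be converted into a value type"
    )


print(infer_value_type(None, "float64"))       # resolved by name alone
print(infer_value_type([0.1, 0.2], "object"))  # resolved from the sample value
try:
    infer_value_type(None, "object")           # no name match and no value
except ValueError as exc:
    print(exc)
```

With a scalar dtype the name is enough, but for `object` columns the mapper has nothing to inspect unless the caller supplies a value.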

Steps to reproduce

Initialize a new repository:

feast init

Modify the sample on_demand_feature_view to return an array of floats instead of just floats, e.g.

diff --git a/true_garfish/feature_repo/example_repo.py b/true_garfish/feature_repo/example_repo.py
index 1f5b946..59d4501 100644
--- a/true_garfish/feature_repo/example_repo.py
+++ b/true_garfish/feature_repo/example_repo.py
@@ -16,7 +16,7 @@ from feast import (
 from feast.feature_logging import LoggingConfig
 from feast.infra.offline_stores.file_source import FileLoggingDestination
 from feast.on_demand_feature_view import on_demand_feature_view
-from feast.types import Float32, Float64, Int64
+from feast.types import Float32, Float64, Int64, Array
 
 # Define an entity for the driver. You can think of an entity as a primary key used to
 # fetch features.
@@ -72,15 +72,16 @@ input_request = RequestSource(
 @on_demand_feature_view(
     sources=[driver_stats_fv, input_request],
     schema=[
-        Field(name="conv_rate_plus_val1", dtype=Float64),
-        Field(name="conv_rate_plus_val2", dtype=Float64),
+        Field(name="conv_rate_plus_vals", dtype=Array(Float64)),
     ],
 )
 def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
-    df = pd.DataFrame()
-    df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
-    df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
-    return df
+    result = {"conv_rate_plus_vals": []}
+    for _, row in inputs.iterrows():
+        result["conv_rate_plus_vals"].append(
+            [row["conv_rate"] + row["val_to_add"], row["conv_rate"] + row["val_to_add_2"]]
+        )
+    return pd.DataFrame(data=result)

Run feast apply, and you should get the following error:

Traceback (most recent call last):
  File "~/.../.venv/bin/feast", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/feast/cli.py", line 506, in apply_total_command
    apply_total(repo_config, repo, skip_source_validation)
  File "~/.../.venv/lib/python3.12/site-packages/feast/repo_operations.py", line 347, in apply_total
    apply_total_with_repo_instance(
  File "~/.../.venv/lib/python3.12/site-packages/feast/repo_operations.py", line 299, in apply_total_with_repo_instance
    registry_diff, infra_diff, new_infra = store.plan(repo)
                                           ^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/feast/feature_store.py", line 745, in plan
    self._make_inferences(
  File "~/.../.venv/lib/python3.12/site-packages/feast/feature_store.py", line 640, in _make_inferences
    odfv.infer_features()
  File "~/.../.venv/lib/python3.12/site-packages/feast/on_demand_feature_view.py", line 521, in infer_features
    inferred_features = self.feature_transformation.infer_features(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/....venv/lib/python3.12/site-packages/feast/transformation/pandas_transformation.py", line 47, in infer_features
    python_type_to_feast_value_type(f, type_name=str(dt))
  File "~/.../.venv/lib/python3.12/site-packages/feast/type_map.py", line 215, in python_type_to_feast_value_type
    raise ValueError(
ValueError: Value with native type object cannot be converted into Feast value type

Adding some debug statements inside python_type_to_feast_value_type, we get the following locals before the error was raised:

name='conv_rate_plus_vals'
value=None
recurse=True
type_name='object'
type(value)=<class 'NoneType'>

As mentioned above, this is because none of the transformation backends pass values to the type mapper. The pandas backend, used in this case, passes only the feature name and the dtype name.
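
The pandas side of this can be checked in isolation. The sketch below uses only pandas and the column names from the reproduction; it does not call any Feast code. A column of Python lists reports the dtype `object`, so the dtype name alone cannot tell the type mapper that the values are lists of floats:

```python
import pandas as pd

# One sample row, mirroring the shape of the UDF output in the reproduction.
df = pd.DataFrame({"conv_rate_plus_vals": [[0.5, 1.5]], "conv_rate": [0.5]})

# The dtype names are all that a name-only type lookup gets to see.
dtype_names = {name: str(dt) for name, dt in df.dtypes.items()}
print(dtype_names)

# The sample value itself still carries the information needed for inference.
sample = df["conv_rate_plus_vals"].iloc[0]
print(type(sample).__name__, sample)
```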

Specifications

  • Version: 0.39.0
  • Platform: arm64
  • Subsystem: MacOS

Possible Solution

  • Pass the sample values generated for type inference through to the type mapper
  • Update the type mapper to handle lists that are two levels deep. Primitive UDF outputs are wrapped in either an np.array or a list of length 1, so list-valued outputs are two levels deep, with the inner list holding the feature values.
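
The two suggestions above could be sketched roughly as follows. This is an illustration only, not the change made in the linked PR: `infer_list_type` and `LIST_TYPE_BY_ELEMENT` are hypothetical names, the type names are plain strings modelled on Feast's ValueType members, and only plain Python lists are handled (an np.array outer wrapper would need extra handling):

```python
from typing import Any, Sequence

# Hypothetical mapping from element type to a list value type name.
LIST_TYPE_BY_ELEMENT = {
    float: "DOUBLE_LIST",
    int: "INT64_LIST",
    str: "STRING_LIST",
    bool: "BOOL_LIST",
}


def infer_list_type(sample_output: Sequence[Any]) -> str:
    """Infer a list type from a UDF output wrapped in a length-1 outer list."""
    if len(sample_output) != 1:
        raise ValueError("expected exactly one sample row")
    inner = sample_output[0]  # the inner list holds the feature values
    if not isinstance(inner, (list, tuple)) or len(inner) == 0:
        raise ValueError("cannot infer an element type from this value")
    element_types = {type(v) for v in inner}
    if len(element_types) != 1:
        raise ValueError("mixed element types are not supported in this sketch")
    (element_type,) = element_types
    if element_type not in LIST_TYPE_BY_ELEMENT:
        raise ValueError(f"unsupported element type {element_type.__name__}")
    return LIST_TYPE_BY_ELEMENT[element_type]


# The outer list is the single sample row; the inner list is the feature value.
print(infer_list_type([[0.5, 1.5]]))
print(infer_list_type([["a", "b"]]))
```

Keying the lookup on the exact element type keeps `bool` and `int` distinct, which an `isinstance` check would not, since `bool` is a subclass of `int` in Python.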

alexmirrington commented Jun 24, 2024

PR here for those following along: #4310
