Inferencing of Features in FeatureView and timestamp column of DataSource #1523

mavysavydav · 2021-04-29T16:30:38Z

Test cases have been written, and the change was also tested by running it with the CLI.

What this PR does / why we need it:
Milestone 1 of https://docs.google.com/document/d/1MkWvexE4e5nYWcQLELFnJ5o9OlJDKC2rn_USHMDT9dg/edit

Which issue(s) this PR fixes:

Fixes #1500

Does this PR introduce a user-facing change?:

Feast will be able to infer the features to register if no features are included in the FeatureView definition, and also infer the event timestamp column of DataSources if that parameter is left out. This is milestone 1 of https://docs.google.com/document/d/1MkWvexE4e5nYWcQLELFnJ5o9OlJDKC2rn_USHMDT9dg/edit#heading=h.pqhio4s5uw2s

feast-ci-bot · 2021-04-29T16:30:48Z

Hi @mavysavydav. Thanks for your PR.

I'm waiting for a feast-dev member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

woop · 2021-04-29T23:23:03Z

/ok-to-test

woop · 2021-04-29T23:23:19Z

Thanks @mavysavydav, we will have a look.

woop · 2021-04-30T01:33:12Z

Don't turn columns that start with or end with "__" (double underscore) into a Feature. (any objections?)

Seems reasonable.
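For illustration, the double-underscore rule could be sketched as below. This is a minimal sketch and not the code in this PR: `is_inferable_feature_column` and `infer_feature_names` are hypothetical helpers, and the `excluded` argument (for example entity and timestamp columns) is an assumption about what else inference would skip.

```python
from typing import Iterable, List, Set


def is_inferable_feature_column(name: str) -> bool:
    """Return False for columns that start or end with a double underscore."""
    return not (name.startswith("__") or name.endswith("__"))


def infer_feature_names(columns: Iterable[str], excluded: Set[str]) -> List[str]:
    """Keep columns that are not excluded and pass the double-underscore rule.

    `excluded` is an assumption made for this sketch: it stands in for
    columns such as entity or timestamp columns that are not features.
    """
    return [
        column
        for column in columns
        if column not in excluded and is_inferable_feature_column(column)
    ]
```

With this rule, a pandas-style helper column such as `__index_level_0__` would be skipped while `float_col` would be kept.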

mavysavydav · 2021-04-30T20:13:38Z

About to push a new commit with test cases and some other minor fixes, @jklegar. Feel free to hold off on the review until that commit arrives.

mavysavydav · 2021-04-30T22:01:36Z

Ok, tests have been added and the ball is in your court :D 🏀

jklegar

Thank you David, this is great! Left a few comments, most of them small. Let me know if anything doesn't make sense.

sdk/python/feast/data_source.py

sdk/python/feast/feature_view.py

sdk/python/feast/type_map.py

sdk/python/tests/fixtures/data_source_fixtures.py

mavysavydav · 2021-05-02T00:04:30Z

Thanks for the review! I agree with your suggestions and have made the changes (including adding a test that checks BigQuerySource infers properly from the query when table_ref is absent).

jklegar

This is great - a couple small followup comments and then should be good to go

jklegar · 2021-05-03T21:17:41Z

sdk/python/feast/data_source.py

+                    list(zip(df["COLUMN_NAME"].to_list(), df["DATA_TYPE"].to_list()))
+                )
+        else:
+            bq_columns_query = f"SELECT * FROM {self.query} LIMIT 1"


I think we may need parentheses around {self.query} here
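To make the suggestion concrete: when `query` holds a full SELECT statement, it has to be wrapped in parentheses to be usable as a subquery. A minimal sketch, assuming a hypothetical helper `build_schema_probe_query` (not part of this PR); stripping a trailing semicolon is an extra assumption added here because a semicolon would be invalid inside the parentheses.

```python
def build_schema_probe_query(query: str) -> str:
    """Wrap a user-supplied SELECT statement so it can be used as a subquery."""
    # Drop surrounding whitespace and any trailing semicolon before wrapping.
    inner = query.strip().rstrip(";").strip()
    return f"SELECT * FROM ({inner}) LIMIT 1"
```

Without the parentheses, the generated statement would read `SELECT * FROM SELECT ... LIMIT 1`, which is not valid SQL.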

jklegar · 2021-05-03T21:19:26Z

sdk/python/tests/fixtures/data_source_fixtures.py

+        df, event_timestamp_column
+    )
+    return BigQuerySource(
+        query=bq_source_using_table_ref.table_ref,


The query is supposed to be a select statement, so I think something like f"SELECT * FROM {bq_source_using_table_ref.table_ref}" would be better

jklegar · 2021-05-03T21:23:13Z

sdk/python/tests/test_feature_store.py

        expected = {
            ("float_col", ValueType.DOUBLE),
            ("int64_col", ValueType.INT64),
            ("string_col", ValueType.STRING),
        }
+        expected2 = {  # parsing with the "query" param in bq sources uses different code path


I think we want both code paths to end up producing the same types, see also comment on types above

jklegar · 2021-05-03T21:29:52Z

sdk/python/feast/type_map.py

        "INT64": ValueType.INT64,
        "STRING": ValueType.STRING,
+        "FLOAT": ValueType.FLOAT,


It's pretty weird on BigQuery's part that it gives different types depending on whether you query the INFORMATION_SCHEMA table or look at the field_type on schema_field. However, it seems INTEGER and INT64 are equivalent, and the same goes for FLOAT and FLOAT64 - see https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema.FIELDS.type.
So I think that here you can just have "INTEGER" map to the int64 value type and "FLOAT" map to the double value type, and this should also fix the mapping to different types in the test below.
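A rough sketch of what that mapping could look like. The `ValueType` enum below is a local stand-in reduced to three members so the snippet runs on its own, and `BQ_TYPE_TO_VALUE_TYPE` and `bq_type_to_value_type` are hypothetical names, not the ones in type_map.py.

```python
from enum import Enum


class ValueType(Enum):
    """Local stand-in for feast's ValueType, reduced to the members used here."""

    INT64 = "INT64"
    DOUBLE = "DOUBLE"
    STRING = "STRING"


# Both spellings of each BigQuery type name map to the same value type,
# so the INFORMATION_SCHEMA path and the schema_field path agree.
BQ_TYPE_TO_VALUE_TYPE = {
    "INT64": ValueType.INT64,
    "INTEGER": ValueType.INT64,
    "FLOAT64": ValueType.DOUBLE,
    "FLOAT": ValueType.DOUBLE,
    "STRING": ValueType.STRING,
}


def bq_type_to_value_type(bq_type: str) -> ValueType:
    """Look up the value type for a BigQuery type name, ignoring case."""
    return BQ_TYPE_TO_VALUE_TYPE[bq_type.upper()]
```

With both aliases pointing at one value type, a single `expected` set would cover the table_ref and query code paths in the test.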

feast-ci-bot · 2021-05-04T16:28:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jklegar, mavysavydav

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jklegar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jklegar · 2021-05-04T16:29:50Z

/kind feature

jklegar · 2021-05-04T16:29:56Z

/lgtm

…urce (feast-dev#1523)

* Implemented the inferencing. Did cursory runs to make sure it works. More through testing needed. Signed-off-by: David Liu <davidl@twitter.com>
* fixed issue with mutable default argument in FeatureView Signed-off-by: David Liu <davidl@twitter.com>
* Fix in example_feature_repo_with_inference.py file Signed-off-by: David Liu <davidl@twitter.com>
* Added test cases and small fixes. Signed-off-by: David Liu <davidl@twitter.com>
* fixed missing import with handling for lint error Signed-off-by: David Liu <davidl@twitter.com>
* marked a test that needs bigquery client requesting to be an integration test & added __ rule in inference Signed-off-by: David Liu <davidl@twitter.com>
* Code review corrections + BQSource Query arg handling + corresponding test case for it Signed-off-by: David Liu <davidl@twitter.com>
* CR corrections Signed-off-by: David Y Liu <davidyliuliu@gmail.com>

Co-authored-by: David Liu <davidl@twitter.com>

feast-ci-bot added do-not-merge/work-in-progress release-note needs-kind needs-ok-to-test labels Apr 29, 2021

feast-ci-bot added the size/L label Apr 29, 2021

Implemented the inferencing. Did cursory runs to make sure it works. …

df1e9a3

…More through testing needed. Signed-off-by: David Liu <davidl@twitter.com>

mavysavydav force-pushed the featureRepoSchemaInferencing branch from b397c1a to df1e9a3 Compare April 29, 2021 16:32

mavysavydav marked this pull request as ready for review April 29, 2021 17:32

mavysavydav requested review from jklegar, tsotnet, woop and a team as code owners April 29, 2021 17:32

feast-ci-bot removed the do-not-merge/work-in-progress label Apr 29, 2021

feast-ci-bot added ok-to-test and removed needs-ok-to-test labels Apr 29, 2021

woop assigned jklegar Apr 30, 2021

fixed issue with mutable default argument in FeatureView

e59c561

Signed-off-by: David Liu <davidl@twitter.com>

mavysavydav force-pushed the featureRepoSchemaInferencing branch from aba7fea to e59c561 Compare April 30, 2021 04:01

Fix in example_feature_repo_with_inference.py file

8791013

Signed-off-by: David Liu <davidl@twitter.com>

David Liu added 3 commits April 30, 2021 14:16

Added test cases and small fixes.

c9484a2

Signed-off-by: David Liu <davidl@twitter.com>

fixed missing import with handling for lint error

6ef2afd

Signed-off-by: David Liu <davidl@twitter.com>

marked a test that needs bigquery client requesting to be an integrat…

91bf53b

…ion test & added __ rule in inference Signed-off-by: David Liu <davidl@twitter.com>

jklegar reviewed May 1, 2021

Code review corrections + BQSource Query arg handling + corresponding…

01fd199

… test case for it Signed-off-by: David Liu <davidl@twitter.com>

mavysavydav force-pushed the featureRepoSchemaInferencing branch from d097efe to 01fd199 Compare May 1, 2021 23:29

jklegar reviewed May 3, 2021

CR corrections

052abe9

Signed-off-by: David Y Liu <davidyliuliu@gmail.com>

jklegar approved these changes May 4, 2021

feast-ci-bot added the approved label May 4, 2021

feast-ci-bot added kind/feature New feature or request and removed needs-kind labels May 4, 2021

feast-ci-bot added the lgtm label May 4, 2021

feast-ci-bot merged commit f55b51c into feast-dev:master May 4, 2021

woop mentioned this pull request May 11, 2021

Add support for DynamoDB and S3 registry #1483

Merged
