-
Notifications
You must be signed in to change notification settings - Fork 996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Update Feast object metadata in the registry #4257
fix: Update Feast object metadata in the registry #4257
Conversation
Hey, thanks for raising this. Just curious, maybe you have taken a look, is this the case only for sql registry? why isn't the same happening for file-based registries? |
@tokoko I noticed this for the SQL registry as our team uses it, but I'm not sure if this is happening for other registry types as well. If other registries do require this change, it would be separate changes to each of their apply methods from what I've seen so far. |
assert len(feature_services) == 1 | ||
assert feature_services[0] == fs | ||
|
||
test_registry.delete_feature_service("my_feature_service_1", project) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you be doing this? If you delete and apply again, of course created_timestamp
will be regenerated. Isn't the whole point to assert that it's unchanged even if new object is applied when during apply an object already exists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, this was supposed to be present for test_apply_feature_service_success
only. I'll remove this.
assert feature_services[0] == fs | ||
# The created_timestamp for the feature service should be set to the created_timestamp value stored from the | ||
# previous apply | ||
assert feature_services[0].created_timestamp is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you drop delete
above, then I think this should be an equality check to assert that it's not changed. is not None
doesn't seem to be checking much, unless I'm missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I initially wrote the code with the mindset that it shouldn't return a None when you do an apply, but it would be better to check this against the previous feature view's created timestamp as well.
@msistla96 sure, I quickly took a look and this doesn't seem to be handled anywhere. That's why I find it strange that the newly added tests in |
Signed-off-by: msistla96 <msistla@expediagroup.com>
ad0ed8f
to
f8d3c85
Compare
So the tests worked because for the other registries, the created_timestamp value does get set technically but it shouldn't for each time a Feast object gets updated. I've updated the code for all the other registries failing the test and also the tests as well. |
Signed-off-by: msistla96 <msistla@expediagroup.com>
Signed-off-by: msistla96 <msistla@expediagroup.com>
af1c94d
to
ca88553
Compare
@msistla96 thanks. this is looking good. Can you add |
Signed-off-by: msistla96 <msistla@expediagroup.com>
@tokoko, while adding the tests you mentioned, I noticed two bugs with Remote Registry:
To reproduce Timestamp's behavior:
|
and len(updated_feature_view.materialization_intervals) == 0 | ||
) | ||
|
||
# Simulate materialization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should put this before test_registry.apply_feature_view(updated_fv1, project)
to make sure that an apply operation doesn't mess with materialization intervals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: msistla96 <msistla@expediagroup.com>
I had the same bug in |
…mpty only Signed-off-by: msistla96 <msistla@expediagroup.com>
@tokoko Could I have another review please? I realised that I missed a test case when apply_materialization adds the materialization_intervals which shouldn't call update_materialization_intervals, which I added in the last commit. |
@msistla96 that's an interesting corner case. Why do you think that should be the desired behavior? That actually would sort of address an issue raised here #4222, but I'm still unsure. We would effectively be allowing materialization intervals to be updates in two ways (apply_materialization and apply_feature_view). |
apply_materialization takes care of updating materialization_intervals, but apply_feature_view will ensure the materialization_interval stored doesn't get erased( the only time we would erase it is if the feature view was created for the first time and that becomes a separate feature view anyway). I think renaming update_materialization_intervals to copy_materialization_intervals would avoid this confusion. Does that make sense? |
@tokoko What is the purpose/ use case of calling apply_materialization for StreamFeatureViews? I'm assuming it mostly make sense for Feature Views only. |
@EXPEbdodla yeah, you're right, my bad. I've had a similar use (not with feast) case when I needed to update online store both directly and from offline with different datasets, but until we can practically support something like that, there's no point. |
@tokoko I see Registry support Stream Feature Views for apply_materialization currently, so I'm a bit confused about your comment on not having it. Are there any changes required for this PR? |
@msistla96 Technically it is, but as feast workflow for stream feature views doesn't really contain materialization from offline to online, it's never used. It would be impossible to restrict it in signature with types only as I was still unsure about the update we discussed above, but you're probably right, overwriting with new materialization intervals if provided seems as good as any other option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the fix
@tokoko Let me know when this would be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
# [0.40.0](v0.39.0...v0.40.0) (2024-07-31) ### Bug Fixes * Added missing type ([#4315](#4315)) ([86af60a](86af60a)) * Avoid XSS attack from Jinjin2's Environment(). ([#4355](#4355)) ([40270e7](40270e7)) * CGO Memory leak issue in GO Feature server ([#4291](#4291)) ([43e198f](43e198f)) * Deprecated the datetime.utcfromtimestamp(). ([#4306](#4306)) ([21deec8](21deec8)) * Fix SQLite import issue ([#4294](#4294)) ([398ea3b](398ea3b)) * Increment operator to v0.39.0 ([#4368](#4368)) ([3ddb4fb](3ddb4fb)) * Minor typo in the unit test. ([#4296](#4296)) ([6c75e84](6c75e84)) * OnDemandFeatureView type inference for array types ([#4310](#4310)) ([c45ff72](c45ff72)) * Remove redundant batching in PostgreSQLOnlineStore.online_write_batch and fix progress bar ([#4331](#4331)) ([0d89d15](0d89d15)) * Remove typo. ([#4351](#4351)) ([92d17de](92d17de)) * Retire the datetime.utcnow(). ([#4352](#4352)) ([a8bc696](a8bc696)) * Update dask version to support pandas 1.x ([#4326](#4326)) ([a639d61](a639d61)) * Update Feast object metadata in the registry ([#4257](#4257)) ([8028ae0](8028ae0)) * Using one single function call for utcnow(). ([#4307](#4307)) ([98ff63c](98ff63c)) ### Features * Add async feature retrieval for Postgres Online Store ([#4327](#4327)) ([cea52e9](cea52e9)) * Add Async refresh to Sql Registry ([#4251](#4251)) ([f569786](f569786)) * Add SingleStore as an OnlineStore ([#4285](#4285)) ([2c38946](2c38946)) * Add Tornike to maintainers.md ([#4339](#4339)) ([8e8c1f2](8e8c1f2)) * Bump psycopg2 to psycopg3 for all Postgres components ([#4303](#4303)) ([9451d9c](9451d9c)) * Entity key deserialization ([#4284](#4284)) ([83fad15](83fad15)) * Ignore paths feast apply ([#4276](#4276)) ([b4d54af](b4d54af)) * Move get_online_features to OnlineStore interface ([#4319](#4319)) ([7072fd0](7072fd0)) * Port mssql contrib offline store to ibis ([#4360](#4360)) ([7914cbd](7914cbd)) ### Reverts * Revert "fix: Avoid XSS attack from Jinjin2's Environment()." ([#4357](#4357)) ([cdeab48](cdeab48)), closes [#4355](#4355)
# [0.40.0](feast-dev/feast@v0.39.0...v0.40.0) (2024-07-31) ### Bug Fixes * Added missing type ([feast-dev#4315](feast-dev#4315)) ([86af60a](feast-dev@86af60a)) * Avoid XSS attack from Jinjin2's Environment(). ([feast-dev#4355](feast-dev#4355)) ([40270e7](feast-dev@40270e7)) * CGO Memory leak issue in GO Feature server ([feast-dev#4291](feast-dev#4291)) ([43e198f](feast-dev@43e198f)) * Deprecated the datetime.utcfromtimestamp(). ([feast-dev#4306](feast-dev#4306)) ([21deec8](feast-dev@21deec8)) * Fix SQLite import issue ([feast-dev#4294](feast-dev#4294)) ([398ea3b](feast-dev@398ea3b)) * Increment operator to v0.39.0 ([feast-dev#4368](feast-dev#4368)) ([3ddb4fb](feast-dev@3ddb4fb)) * Minor typo in the unit test. ([feast-dev#4296](feast-dev#4296)) ([6c75e84](feast-dev@6c75e84)) * OnDemandFeatureView type inference for array types ([feast-dev#4310](feast-dev#4310)) ([c45ff72](feast-dev@c45ff72)) * Remove redundant batching in PostgreSQLOnlineStore.online_write_batch and fix progress bar ([feast-dev#4331](feast-dev#4331)) ([0d89d15](feast-dev@0d89d15)) * Remove typo. ([feast-dev#4351](feast-dev#4351)) ([92d17de](feast-dev@92d17de)) * Retire the datetime.utcnow(). ([feast-dev#4352](feast-dev#4352)) ([a8bc696](feast-dev@a8bc696)) * Update dask version to support pandas 1.x ([feast-dev#4326](feast-dev#4326)) ([a639d61](feast-dev@a639d61)) * Update Feast object metadata in the registry ([feast-dev#4257](feast-dev#4257)) ([8028ae0](feast-dev@8028ae0)) * Using one single function call for utcnow(). ([feast-dev#4307](feast-dev#4307)) ([98ff63c](feast-dev@98ff63c)) ### Features * Add async feature retrieval for Postgres Online Store ([feast-dev#4327](feast-dev#4327)) ([cea52e9](feast-dev@cea52e9)) * Add Async refresh to Sql Registry ([feast-dev#4251](feast-dev#4251)) ([f569786](feast-dev@f569786)) * Add SingleStore as an OnlineStore ([feast-dev#4285](feast-dev#4285)) ([2c38946](feast-dev@2c38946)) * Add Tornike to maintainers.md ([feast-dev#4339](feast-dev#4339)) ([8e8c1f2](feast-dev@8e8c1f2)) * Bump psycopg2 to psycopg3 for all Postgres components ([feast-dev#4303](feast-dev#4303)) ([9451d9c](feast-dev@9451d9c)) * Entity key deserialization ([feast-dev#4284](feast-dev#4284)) ([83fad15](feast-dev@83fad15)) * Ignore paths feast apply ([feast-dev#4276](feast-dev#4276)) ([b4d54af](feast-dev@b4d54af)) * Move get_online_features to OnlineStore interface ([feast-dev#4319](feast-dev#4319)) ([7072fd0](feast-dev@7072fd0)) * Port mssql contrib offline store to ibis ([feast-dev#4360](feast-dev#4360)) ([7914cbd](feast-dev@7914cbd)) ### Reverts * Revert "fix: Avoid XSS attack from Jinjin2's Environment()." ([feast-dev#4357](feast-dev#4357)) ([cdeab48](feast-dev@cdeab48)), closes [feast-dev#4355](feast-dev#4355)
# [0.40.0](feast-dev/feast@v0.39.0...v0.40.0) (2024-07-31) ### Bug Fixes * Added missing type ([feast-dev#4315](feast-dev#4315)) ([86af60a](feast-dev@86af60a)) * Avoid XSS attack from Jinjin2's Environment(). ([feast-dev#4355](feast-dev#4355)) ([40270e7](feast-dev@40270e7)) * CGO Memory leak issue in GO Feature server ([feast-dev#4291](feast-dev#4291)) ([43e198f](feast-dev@43e198f)) * Deprecated the datetime.utcfromtimestamp(). ([feast-dev#4306](feast-dev#4306)) ([21deec8](feast-dev@21deec8)) * Fix SQLite import issue ([feast-dev#4294](feast-dev#4294)) ([398ea3b](feast-dev@398ea3b)) * Increment operator to v0.39.0 ([feast-dev#4368](feast-dev#4368)) ([3ddb4fb](feast-dev@3ddb4fb)) * Minor typo in the unit test. ([feast-dev#4296](feast-dev#4296)) ([6c75e84](feast-dev@6c75e84)) * OnDemandFeatureView type inference for array types ([feast-dev#4310](feast-dev#4310)) ([c45ff72](feast-dev@c45ff72)) * Remove redundant batching in PostgreSQLOnlineStore.online_write_batch and fix progress bar ([feast-dev#4331](feast-dev#4331)) ([0d89d15](feast-dev@0d89d15)) * Remove typo. ([feast-dev#4351](feast-dev#4351)) ([92d17de](feast-dev@92d17de)) * Retire the datetime.utcnow(). ([feast-dev#4352](feast-dev#4352)) ([a8bc696](feast-dev@a8bc696)) * Update dask version to support pandas 1.x ([feast-dev#4326](feast-dev#4326)) ([a639d61](feast-dev@a639d61)) * Update Feast object metadata in the registry ([feast-dev#4257](feast-dev#4257)) ([8028ae0](feast-dev@8028ae0)) * Using one single function call for utcnow(). ([feast-dev#4307](feast-dev#4307)) ([98ff63c](feast-dev@98ff63c)) ### Features * Add async feature retrieval for Postgres Online Store ([feast-dev#4327](feast-dev#4327)) ([cea52e9](feast-dev@cea52e9)) * Add Async refresh to Sql Registry ([feast-dev#4251](feast-dev#4251)) ([f569786](feast-dev@f569786)) * Add SingleStore as an OnlineStore ([feast-dev#4285](feast-dev#4285)) ([2c38946](feast-dev@2c38946)) * Add Tornike to maintainers.md ([feast-dev#4339](feast-dev#4339)) ([8e8c1f2](feast-dev@8e8c1f2)) * Bump psycopg2 to psycopg3 for all Postgres components ([feast-dev#4303](feast-dev#4303)) ([9451d9c](feast-dev@9451d9c)) * Entity key deserialization ([feast-dev#4284](feast-dev#4284)) ([83fad15](feast-dev@83fad15)) * Ignore paths feast apply ([feast-dev#4276](feast-dev#4276)) ([b4d54af](feast-dev@b4d54af)) * Move get_online_features to OnlineStore interface ([feast-dev#4319](feast-dev#4319)) ([7072fd0](feast-dev@7072fd0)) * Port mssql contrib offline store to ibis ([feast-dev#4360](feast-dev#4360)) ([7914cbd](feast-dev@7914cbd)) ### Reverts * Revert "fix: Avoid XSS attack from Jinjin2's Environment()." ([feast-dev#4357](feast-dev#4357)) ([cdeab48](feast-dev@cdeab48)), closes [feast-dev#4355](feast-dev#4355)
What this PR does / why we need it:
This PR is to fix an issue observed when updating a feature view: The
created_timestamp
andmaterialization_intervals
values already present in the SQL registry for the feature view are overwritten by the respective values created when feature view is updated, which isNone
. So when one queries the feature view from the registry, it returnNone
for both these fields.The fix requires updating the
apply_object
function to have the previously stored values of the metadata fields like created_timestamp etc assigned to the feast objects respective fields, before it gets updated by the SQL registry.Hence this change has been made for the relevant Feast objects like
Entity
,SavedDataset
,FeatureService
,FeatureView
,OnDemandFeatureView
,StreamFeatureView
.This change has also been applied for all other registries other than SQL.
Which issue(s) this PR fixes:
Fixes