Add support for DynamoDB and S3 registry #1483
Hi @leonid133. Thanks for your PR. I'm waiting for a feast-dev member to verify that this patch is reasonable to test. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
Thanks for this @leonid133!
Also, we should certainly add the data model for Dynamo here: https://github.com/feast-dev/feast/tree/master/docs/specs
Also, we will definitely need integration tests with DynamoDB in order to have confidence in this code. We're happy to use our own AWS account; just let me know when the code is ready and I can help set things up.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: leonid133, tsotnet
Huge. Thanks @leonid133 @blvp @tsotnet!
* Add support for DynamoDB and S3 registry
* rcu and wcu as a parameter of dynamodb online store
* fix linter
* aws dependency to extras
* FEAST_S3_ENDPOINT_URL
* tests
* fix signature, after merge
* aws default region name configurable
* add offlinestore config type to test
* review changes
* review requested changes
* integration test for Dynamo
* change the rest of table_name to table_instance (where table_name is actually an instance of DynamoDB Table object)
* fix DynamoDBOnlineStore commit
* move client to _initialize_dynamodb
* rename document_id to entity_id and Row to entity_id
* The default value is None
* Remove Datastore from the docstring.
* get rid of the return call from S3RegistryStore
* merge two exceptions
* For ci requirement
* remove configuration from test
* feast-integration-tests for tests
* change test path
* add fixture feature_store_with_s3_registry to test
* region required
* Address the rest of the comments (Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>)
* Update to_table to to_arrow (Signed-off-by: Tsotne Tabidze <tsotne@tecton.ai>)

Signed-off-by: lblokhin <lenin133@yandex.ru> (all commits unless noted otherwise)
Co-authored-by: Tsotne Tabidze <tsotne@tecton.ai>
Signed-off-by: Mwad22 <51929507+Mwad22@users.noreply.github.com>
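
The commit list mentions `rcu` and `wcu` as parameters of the DynamoDB online store and a key renamed to `entity_id`. As a rough illustration of what those settings could translate to, here is a minimal sketch that builds the keyword arguments for boto3's DynamoDB `create_table` call. The function name `dynamodb_table_spec`, the `project.table` naming scheme, the string key type, and the default capacity of 5 are assumptions made for this example; they are not taken from the merged code.

```python
from typing import Any, Dict


def dynamodb_table_spec(
    project: str, table: str, rcu: int = 5, wcu: int = 5
) -> Dict[str, Any]:
    """Build keyword arguments for a DynamoDB ``create_table`` request.

    Hypothetical helper: the name, defaults, and table naming scheme are
    illustrative assumptions, not the Feast implementation.
    """
    if rcu < 1 or wcu < 1:
        raise ValueError("read and write capacity units must be at least 1")
    return {
        # Assumed naming scheme: one table per (project, feature view).
        "TableName": f"{project}.{table}",
        # A single hash key holding the serialized entity key.
        "KeySchema": [{"AttributeName": "entity_id", "KeyType": "HASH"}],
        "AttributeDefinitions": [
            {"AttributeName": "entity_id", "AttributeType": "S"}
        ],
        # rcu / wcu map onto DynamoDB's provisioned throughput settings.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        },
    }
```

With boto3 installed and AWS credentials configured, the resulting dictionary could be passed as `boto3.resource("dynamodb", region_name=region).create_table(**spec)`. Keeping the spec as a plain dictionary means it can be unit-tested without touching AWS.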
What this PR does / why we need it:
We need an online serving backend on DynamoDB to support customer deployments on AWS.
Which issue(s) this PR fixes:
Fixes #1409
Does this PR introduce a user-facing change?:
NONE
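
For readers trying out the S3 registry side of this change, the following is a minimal sketch of how an `s3://` registry path and the `FEAST_S3_ENDPOINT_URL` environment variable named in the commit list might be handled. The helper names `parse_s3_registry_path` and `s3_client_kwargs` are hypothetical, and the way the merged code actually reads the variable is not shown in this thread, so treat the behavior below as an assumption.

```python
import os
from typing import Dict, Mapping, Tuple
from urllib.parse import urlparse


def parse_s3_registry_path(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def s3_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return extra keyword arguments for an S3 client.

    Assumption: when FEAST_S3_ENDPOINT_URL is set, it overrides the default
    AWS endpoint (useful for S3-compatible stores or local test servers).
    """
    endpoint = env.get("FEAST_S3_ENDPOINT_URL")
    return {"endpoint_url": endpoint} if endpoint else {}
```

With boto3 available, these pieces could be combined as `boto3.client("s3", **s3_client_kwargs())` followed by `get_object(Bucket=bucket, Key=key)`; `endpoint_url` is a standard boto3 client argument.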