feat: Feast AWS Athena offline store (again) #3044
I noticed there were a bunch of weird merge artifacts so I fixed them. Also removed Athena from the integration tests that run as part of CI since this is a contrib plugin.
Can you sign your commits? None of them seem signed. Also: I have no way of reproducing the test results because I don't have Athena set up. It's ok for this PR, but maybe leave a comment in the Makefile to specify that it needs the user to have their own custom Athena setup?
Codecov Report
| Coverage Diff | master | #3044 | +/- |
|---|---|---|---|
| Coverage | 67.64% | 75.72% | +8.08% |
| Files | 167 | 202 | +35 |
| Lines | 14696 | 16776 | +2080 |
| Hits | 9941 | 12704 | +2763 |
| Misses | 4755 | 4072 | -683 |
/ok-to-test
sdk/python/feast/infra/offline_stores/contrib/athena_offline_store/athena_source.py
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: adchia, toping4445.
It’d be good to have some documentation, but this is good for an initial cut. Thanks!
/lgtm
Is there a documentation guide? If you share it, I will refer to it and fill it out when I have time. |
* fixed bugs, cleaned code, added AthenaDataSourceCreator
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* fixed bugs, cleaned code, added some methods. test_universal_historical_retrieval - 100% passed
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* fixed bugs to pass test_validation
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* changed boolean data type mapping
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* 1.added test-python-universal-athena in Makefile 2.replaced database,bucket_name hardcoding to variable in AthenaDataSourceCreator
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* format,run lint
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* revert merge changes
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* add entity_key_serialization
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* restore deleted file
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* modified confusing environment variable names, added how to use Athena
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>
* enforce AthenaSource to have a name
  Signed-off-by: Youngkyu OH <toping4445@gmail.com>

Co-authored-by: toping4445 <yelo.blood@kakaopaycorp.com>
Co-authored-by: Danny Chiao <danny@tecton.ai>
Signed-off-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
# [0.24.0](v0.23.0...v0.24.0) (2022-08-25)

### Bug Fixes

* Check if on_demand_feature_views is an empty list rather than None for snowflake provider ([#3046](#3046)) ([9b05e65](9b05e65))
* FeatureStore.apply applies BatchFeatureView correctly ([#3098](#3098)) ([41be511](41be511))
* Fix Feast Java inconsistency with int64 serialization vs python ([#3031](#3031)) ([4bba787](4bba787))
* Fix feature service inference logic ([#3089](#3089)) ([4310ed7](4310ed7))
* Fix field mapping logic during feature inference ([#3067](#3067)) ([cdfa761](cdfa761))
* Fix incorrect on demand feature view diffing and improve Java tests ([#3074](#3074)) ([0702310](0702310))
* Fix Java helm charts to work with refactored logic. Fix FTS image ([#3105](#3105)) ([2b493e0](2b493e0))
* Fix on demand feature view output in feast plan + Web UI crash ([#3057](#3057)) ([bfae6ac](bfae6ac))
* Fix release workflow to release 0.24.0 ([#3138](#3138)) ([a69aaae](a69aaae))
* Fix Spark offline store type conversion to arrow ([#3071](#3071)) ([b26566d](b26566d))
* Fixing Web UI, which fails for the SQL registry ([#3028](#3028)) ([64603b6](64603b6))
* Force Snowflake Session to Timezone UTC ([#3083](#3083)) ([9f221e6](9f221e6))
* Make infer dummy entity join key idempotent ([#3115](#3115)) ([1f5b1e0](1f5b1e0))
* More explicit error messages ([#2708](#2708)) ([e4d7afd](e4d7afd))
* Parse inline data sources ([#3036](#3036)) ([c7ba370](c7ba370))
* Prevent overwriting existing file during `persist` ([#3088](#3088)) ([69af21f](69af21f))
* Register BatchFeatureView in feature repos correctly ([#3092](#3092)) ([b8e39ea](b8e39ea))
* Return an empty infra object from sql registry when it doesn't exist ([#3022](#3022)) ([8ba87d1](8ba87d1))
* Teardown tables for Snowflake Materialization testing ([#3106](#3106)) ([0a0c974](0a0c974))
* UI error when saved dataset is present in registry. ([#3124](#3124)) ([83cf753](83cf753))
* Update sql.py ([#3096](#3096)) ([2646a86](2646a86))
* Updated snowflake template ([#3130](#3130)) ([f0594e1](f0594e1))

### Features

* Add authentication option for snowflake connector ([#3039](#3039)) ([74c75f1](74c75f1))
* Add Cassandra/AstraDB online store contribution ([#2873](#2873)) ([feb6cb8](feb6cb8))
* Add Snowflake materialization engine ([#2948](#2948)) ([f3b522b](f3b522b))
* Adding saved dataset capabilities for Postgres ([#3070](#3070)) ([d3253c3](d3253c3))
* Allow passing repo config path via flag ([#3077](#3077)) ([0d2d951](0d2d951))
* Contrib azure provider with synapse/mssql offline store and Azure registry store ([#3072](#3072)) ([9f7e557](9f7e557))
* Custom Docker image for Bytewax batch materialization ([#3099](#3099)) ([cdd1b07](cdd1b07))
* Feast AWS Athena offline store (again) ([#3044](#3044)) ([989ce08](989ce08))
* Implement spark offline store `offline_write_batch` method ([#3076](#3076)) ([5b0cc87](5b0cc87))
* Initial Bytewax materialization engine ([#2974](#2974)) ([55c61f9](55c61f9))
* Refactor feature server helm charts to allow passing feature_store.yaml in environment variables ([#3113](#3113)) ([85ee789](85ee789))
What this PR does / why we need it:
It enables Feast users to use S3+AWS Athena as an offline store.
The tests above failed in my environment because of compatibility issues between the MySQLdb package and M1.
I didn't implement some of the methods related to feature writes, and the tests that use a fixed S3 bucket name or depend on GCP failed.
However, all tests in test_universal_historical_retrieval.py and test_universal_types.py related to feature extraction passed.
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?:
AWS users can choose between Redshift and S3+Athena as an offline store.
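For anyone trying this out, here is a rough sketch of what selecting the Athena offline store in a repo config might look like. The helper `athena_offline_store_section` is hypothetical and not part of Feast, and the key names (`region`, `database`, `data_source`, `s3_staging_location`, `workgroup`) and their default values are assumptions to verify against the plugin's config class. It only builds and prints the mapping; it does not talk to AWS.

```python
import json


def athena_offline_store_section(region, database, s3_staging_location,
                                 data_source="AwsDataCatalog", workgroup="primary"):
    """Hypothetical helper (not part of Feast): build an offline_store mapping.

    The key names and defaults are assumptions; check them against the
    Athena plugin's config class before relying on them.
    """
    # Athena writes query results to S3, so reject anything that is not an S3 URI.
    if not s3_staging_location.startswith("s3://"):
        raise ValueError("s3_staging_location must be an s3:// URI")
    return {
        "type": "athena",
        "region": region,
        "database": database,
        "data_source": data_source,
        "s3_staging_location": s3_staging_location,
        "workgroup": workgroup,
    }


if __name__ == "__main__":
    # Placeholder values; replace with your own region, database, and bucket.
    section = athena_offline_store_section(
        region="us-east-1",
        database="feast_db",
        s3_staging_location="s3://example-bucket/athena-staging",
    )
    # JSON is also valid YAML, so this output can be adapted into feature_store.yaml.
    print(json.dumps({"offline_store": section}, indent=2))
```

The region, database, and bucket above are placeholders; as noted in the review, running the Athena tests needs your own Athena setup.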