feat: Add to_remote_storage functionality to SparkOfflineStore #3175
Conversation
Codecov Report
Base: 67.03% // Head: 76.10% // Increases project coverage by +9.06%.
Additional details and impacted files
@@ Coverage Diff @@
## master #3175 +/- ##
==========================================
+ Coverage 67.03% 76.10% +9.06%
==========================================
Files 175 211 +36
Lines 15941 17925 +1984
==========================================
+ Hits 10686 13641 +2955
+ Misses 5255 4284 -971
Flags with carried forward coverage won't be shown.
☔ View full report at Codecov.
Force-pushed from 53fa0c2 to 0574b04
Signed-off-by: niklasvm <niklasvm@gmail.com>
Force-pushed from 0574b04 to 844bb83
feat: Add to_remote_storage functionality to SparkOfflineStore
Force-pushed from cf8b646 to 844bb83
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: kevjumba, niklasvm. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
# [0.25.0](v0.24.0...v0.25.0) (2022-09-20)

### Bug Fixes

* Broken Feature Service Link ([#3227](#3227)) ([e117082](e117082))
* Feature-server image is missing mysql dependency for mysql registry ([#3223](#3223)) ([ae37b20](ae37b20))
* Fix handling of TTL in Go server ([#3232](#3232)) ([f020630](f020630))
* Fix materialization when running on Spark cluster. ([#3166](#3166)) ([175fd25](175fd25))
* Fix push API to respect feature view's already inferred entity types ([#3172](#3172)) ([7c50ab5](7c50ab5))
* Fix release workflow ([#3144](#3144)) ([20a9dd9](20a9dd9))
* Fix Shopify timestamp bug and add warnings to help with debugging entity registration ([#3191](#3191)) ([de75971](de75971))
* Handle complex Spark data types in SparkSource ([#3154](#3154)) ([5ddb83b](5ddb83b))
* Local staging location provision ([#3195](#3195)) ([cdf0faf](cdf0faf))
* Remove bad snowflake offline store method ([#3204](#3204)) ([dfdd0ca](dfdd0ca))
* Remove opening file object when validating S3 parquet source ([#3217](#3217)) ([a906018](a906018))
* Snowflake config file search error ([#3193](#3193)) ([189afb9](189afb9))
* Update Snowflake Online docs ([#3206](#3206)) ([7bc1dff](7bc1dff))

### Features

* Add `to_remote_storage` functionality to `SparkOfflineStore` ([#3175](#3175)) ([2107ce2](2107ce2))
* Add ability to give boto extra args for registry config ([#3219](#3219)) ([fbc6a2c](fbc6a2c))
* Add health endpoint to py server ([#3202](#3202)) ([43222f2](43222f2))
* Add snowflake support for date & number with scale ([#3148](#3148)) ([50e8755](50e8755))
* Add tag kwarg to set Snowflake online store table path ([#3176](#3176)) ([39aeea3](39aeea3))
* Add workgroup to athena offline store config ([#3139](#3139)) ([a752211](a752211))
* Implement spark materialization engine ([#3184](#3184)) ([a59c33a](a59c33a))
What this PR does / why we need it:
Adds a `to_remote_storage` method to `SparkRetrievalJob` to write retrieval results to remote storage. Both a local file-based and an S3-based option have been implemented. This is facilitated by 2 new config parameters for the `SparkOfflineStore`:

- `staging_location`: should start with either `file://` or `s3://` to specify the URI accordingly
- `region`: AWS region, if applicable

Spark universal tests pass. This is untested with an S3-based `staging_location`.

This PR is required in preparation for implementing a `SparkBatchMaterializationEngine` in a later PR.

Which issue(s) this PR fixes: None
First step towards solving #3167
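
Below is a minimal usage sketch (not part of this PR) of how the new config options and `to_remote_storage` could be exercised together; the repo path, entity dataframe, feature names, and the example `staging_location`/`region` values are illustrative assumptions rather than anything taken from the change itself.

```python
import pandas as pd
from feast import FeatureStore

# Assumes a feature_store.yaml that selects the Spark offline store and sets the
# two new options introduced by this PR, e.g.:
#   offline_store:
#     type: spark
#     staging_location: s3://my-bucket/staging   # or a file:// URI
#     region: us-east-1
store = FeatureStore(repo_path=".")

# Illustrative entity dataframe; the entity column and feature view names are placeholders.
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": pd.to_datetime(["2022-09-01", "2022-09-02"]),
    }
)

job = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate"],
)

# Instead of collecting the result locally with to_df(), upload it to the
# configured staging_location and get back the URIs of the files written there.
remote_uris = job.to_remote_storage()
print(remote_uris)
```

Writing results to the staging location instead of the client is the capability a later `SparkBatchMaterializationEngine` would build on, since a materialization engine could then consume those files directly.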