-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Go SDK] RunInference wrapper supporting Sklearn Model Handler #24497
Conversation
Codecov Report
@@ Coverage Diff @@
## master #24497 +/- ##
========================================
Coverage 73.35% 73.35%
========================================
Files 719 719
Lines 97137 97246 +109
========================================
+ Hits 71251 71333 +82
- Misses 24539 24564 +25
- Partials 1347 1349 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Assigning reviewers. If you would like to opt out of this review, comment R: @lostluck for label go. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
|
||
services := integration.NewExpansionServices() | ||
defer func() { services.Shutdown() }() | ||
addr, err := services.GetAddr("python_transform") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still don't have this expansion service generated/set by the integration script, so neither this nor the dataframe test will be running.
exit_background_processes () { |
inputRow := [][]int64{{0, 0}, {1, 1}} | ||
input := beam.CreateList(s, inputRow) | ||
kwargs := inference.SklearnKwargs{ | ||
ModelURI: "/tmp/staged/sklearn_model", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where/how is this staged? where is this file?
Is it automatically done from the python code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting it to be staged by python code and then we could have used the semi_persist_dir
to load the model from there. Seems like java uses semi_persist_dir
. For the test, I manually uploaded the model file to gcs bucket and passed the gs://...
address there. Updated the PR description with job link.
Tagging @chamikaramj to know how java loads the model. If it is loaded from semi_persist_dir
do we have to change something on Go SDK side or is it staged by the python code?
I have updated the api to have generic options. I'm yet to try out remaining open comments. |
Did changes for named outputs, PTAL |
This reverts commit 083276b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a typo, but otherwise LGTM!
I like the "model basis" object for building the pipeline. I think it makes the usage clear.
In the last commit I added the functionality to infer extra packages when starting an automated expansion service. Please let me know any comments on that. Sorry for the late addition. Sample Job with Automated Expansion Service (It uses Python's release 2.43.0 and the dev version of Go SDK tagged as 2.43.0) |
This PR adds RunInference Wrapper with Sklearn Model Handler. At present it only supports non-keyed input PCollection. Support for Keyed PCollection and Pytorch Model will be added in a later PR.
The goal of this PR is to mostly finalize the api interaction from user point of view.
Sample Job
(I manually uploaded the tmp model to gcs bucket but I see that Java uses semi-persistent directory (if present) to load that.)
Part of #23382
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.