-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move the udf-examples module to the external repository spark-rapids-examples [databricks] #4619
Conversation
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of removing some, but not all, of the udf examples. They should be bundled together, along with their tests, and built separately from the plugin build to ensure it is compiling and testing against the plugin dist artifact.
…examples repository Signed-off-by: Chong Gao <res_life@163.com>
a2ca9e0
to
f9bdb79
Compare
build |
build |
Move the whole udf-examples module to external. |
For the spark-rapids-examples code, refer to NVIDIA/spark-rapids-examples#94 |
Convert to draft because of the build is not passed. |
implements a Hive simple UDF using | ||
[native code](../../udf-examples/src/main/cpp/src) to count words in strings | ||
|
||
in the [udf-examples](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/Spark-Rapids/udf-examples) project. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate there isn't a main
branch in the spark-rapids-examples repository we can point to so we're always pointing to the latest stable release code. Otherwise we're going to forget about this and then point to possibly very stale examples, and the user would miss any new examples that were added in later versions of the spark-rapids-examples code. @GaryShen2008 should we be doing a merge to main on the spark-rapids-examples repository as we do for this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we can create a main branch for the latest release version.
tests/src/test/scala/com/nvidia/spark/rapids/udf/URLEncode.scala
Outdated
Show resolved
Hide resolved
tests/src/test/scala/com/nvidia/spark/rapids/udf/URLDecode.scala
Outdated
Show resolved
Hide resolved
Updated the code for the rapids_build-it-UDF-native weekly Jenkins job.
After this PR merged, then should update the Jenkins job acorrdingly. @pxLi |
build |
build |
build |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want the premerge and nightly tests to also build and run the UDF example tests, are the scripts going to be updated by this PR or done separately? If separately, we should file an issue to track.
- [StringWordCount](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/StringWordCount.java) | ||
implements a Hive simple UDF using | ||
[native code](../../udf-examples/src/main/cpp/src) to count words in strings | ||
Source code for examples of RAPIDS accelerated UDFs is provided | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extra whitespace ends up disconnecting this sentence when it's rendered.
Filed an issue to track the premerge and nightly tests: #4704 |
build |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks OK, but it would be good to verify the nightly builds are ready for this before approving. cc: @pxLi
build |
The nightly builds are not ready, convert to draft to prevent it from being accidentally merged. |
I mentioned in this comment that we shouldn't lose premerge testing against at least some external UDFs for this repository. Having separate nightly tests is fine, I don't care much either way as long as we get nightly coverage of the UDF examples working against updated snapshot versions of the RAPIDS Accelerator, but as it is now I think we're losing premerge tests against truly external UDFs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving this for now since it looks like a bunch of nightly tests are failing because the udf example jars were removed. I still want to see premerge fixed to test truly external UDFs, but we can address that in a followup.
This hasn't been built in a while, so getting a fresh build to verify it's still ready to go. |
build |
Filed #4834 to track covering the external UDF tests in the premerge build |
build |
[BUG] udf-examples dependencies are incorrect #3971
Move the the whole udf-examples module to the external spark-rapids-examples repository.
The moved udf-examples depends on Spark Rapids repository and Spark Rapids repository should not depends on the external udf-examples. This will avoid circular dependency between the two repositories.
Done: Update the jenkens pipeline to depend on the external udf-examples for test purpose.
Background:
Contributes to #3971