-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make it so applying and removing patches are repeatable without errors #2502
Make it so applying and removing patches are repeatable without errors #2502
Conversation
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's here looks OK.
What about the dangerous and surprising behavior of specifying submodule.patch.skip=true as discussed in #2497 (comment)? If this is an artifact of how the validate build works, maybe we should change the name of the property to something less "inviting" for developers to try, as this won't do what they think it does.
That is a good point. At a minimum I need to document it better. Thinking about it I am trying to understand the use case fro it. Originally I put it in as a way to skip the apply/unapply steps. But then I used it as a part of the CI process to auto upmerge CUDF, which made @gerashegalov want to lock it down so we don't accidentally build the plugin without the patches being installed. But I don't see that being the case any more. Perhaps I can move it back to what it was originally intended to be and document it better to let people know that there is a foot/gun here no matter what it does. |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor typo, lgtm. Note that this will make the submodule sync take a bit longer because it will build everything only to rebuild it, but ccache may be able to avoid a lot of the rebuild.
I'm personally OK if we decide to keep the build-skipping as long as we update the name of the config to reflect that. submodule.patch.skip implies it's only going to skip patching submodules, not also skip all native builds (including those outside of the submodule).
if [ -n "$(git status --porcelain --untracked-files=no)" ] ; then | ||
CHANGED_FILES=$(git status --porcelain --untracked-files=no) | ||
|
||
if [ \( -s "$FULLY_PATCHED_FILE" \) -a \( -n "$CHANGED_FILES" \) ] ; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a direct test of patch applicability via git apply --check
for both forward and --reverse
. If it is neither of the states we are in, we can ask/warn the developer to commit changes and regenerate the patch if it is intentional
# is to save some state files about what happened. But a user could mess with CUDF directly | ||
# so we want to have ways to double check that they are indeed correct. | ||
|
||
FULLY_PATCHED_FILE="$CUDF_DIR/spark-rapids-jni.patch" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there may technically be a patch that applies to non-cudf portion of spark-rapids, maybe should go to the top-level dir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious what's the use-case for patching outside of the submodule? I thought we would checkin the change directly rather than a patch of that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only have a sole use case with the submodule patching right now. No doubt direct modifications of the source tree outside the module is preferrable. In terms of hypotheticals, there could be another submodule or some native dependency that we want consume bypassing cudf which needs patching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now all patches apply only to CUDF. I don't want to try and make that more generic right now. If we do hit that situation I would rather have sub-directories under patches for each third party sub directory, and then apply/remove patches for each subdirectory separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, separate subdirs work too
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
build |
@gerashegalov I added in the direct test, and you were also right that git apply is much cleaner than patch. Please take another look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
NVIDIA#2502) * Make it so applying and removing patches are repeatable without errors Signed-off-by: Robert (Bobby) Evans <bobby@apache.org> * Adjust config for skipping a patch * More fixes Signed-off-by: Robert (Bobby) Evans <bobby@apache.org> --------- Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
There are a number of issues with dealing with patches for the CUDF repo. This fixes some of them. It does not fix
submodule update --init
as that happens externally. It also does not fix other parts of upmerging a branch unless you have unapplied the patches before hand. Sadly it also does not provide any good way to do development work on CUDF using the submodule as your development environment.What this does do it make it so that if you apply patches to CUDF and try to build again without removing those patches it should work.