Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#251 remove custom spark operator compilation #253

Merged
merged 1 commit into from
Aug 1, 2024

Conversation

ewilkins-csi
Copy link
Contributor

  • remove the custom compilation of Spark Operator
  • update the Spark Operator chart dependency to the latest version
  • modify the Spark Operator chart to work with the stock kubeflow image
    • remove custom RBAC
    • change ivy cache mount location

Note that the only thing preventing the complete removal of our custom Spark Operator image is the fact that kubeflow's Spark Operator is on Spark 3.5 while we're on Spark 3.4. The upgrade is actually pretty easy without many breaking changes at all, however it started to require some dependency untangling with different versions being pulled in between Spark 3.5 and Quarkus 2.8. Since we know we'll be upgrading Quarkus soon, I'm leaving the Spark 3.5 upgrade to follow that. This should cut a significant amount of time off of the image build though, as the custom Go compilation was taking at least half of the build time.

@@ -31,7 +31,6 @@
<version.assembly.plugin>3.6.0</version.assembly.plugin>
<version.buildhelper.plugin>3.2.0</version.buildhelper.plugin>
<version.buildnumber.plugin>3.1.0</version.buildnumber.plugin>
<version.cucumber.reporting.plugin>5.7.5</version.cucumber.reporting.plugin>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Just consolidating the two properties into the common parent pom.

COPY --from=builder --chmod=755 /usr/bin/spark-operator /usr/bin/
COPY --from=builder --chmod=755 /usr/bin/entrypoint.sh /usr/bin/
RUN apt-get update \
&& apt-get install -y tini \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Spark Operator 1.6+ no longer uses the gencerts script or the webhook init/cleanup k8s jobs. So we only need tini as it's used in entrypoint.sh to launch the service.

@@ -10,7 +10,7 @@ sources:
- https://github.com/boozallen/aissemble
dependencies:
- name: spark-operator
version: 1.1.27
version: 1.4.6
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Latest version as of today

@@ -48,7 +43,7 @@ spark-operator:
- name: spark-logging
mountPath: "/tmp/spark-events"
- name: ivy-cache
mountPath: "/opt/spark/.ivy2"
mountPath: "/home/spark/.ivy2"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: The base Spark Operator image uses the default home location for the spark user. Technically we could have updated the home in the image build with RUN usermod, but this sets us up to remove the aissemble-spark-operator image entirely after the upgrade to Spark 3.5.

@ewilkins-csi ewilkins-csi force-pushed the 251-remove-custom-spark-operator branch from 9bfcabe to fd4a830 Compare August 1, 2024 15:25
pom.xml Show resolved Hide resolved
 - remove the custom compilation of Spark Operator
 - update the Spark Operator chart dependency to the latest version
 - modify the Spark Operator chart to work with the stock kubeflow image
   - remove custom RBAC
   - change ivy cache mount location

Note that the only thing preventing the complete removal of our custom
Spark Operator image is the fact that kubeflow's Spark Operator is on
Spark 3.5 while we're on Spark 3.4.  The upgrade is actually pretty easy
without many breaking changes at all, however it started to require some
dependency untangling with different versions being pulled in between
Spark 3.5 and Quarkus 2.8.  Since we know we'll be upgrading Quarkus
soon, I'm leaving the Spark 3.5 upgrade to follow that.  This should cut
a significant amount of time off of the image build though, as the
custom Go compilation was taking at least half of the build time.
@ewilkins-csi ewilkins-csi force-pushed the 251-remove-custom-spark-operator branch from fd4a830 to 8ff8bc9 Compare August 1, 2024 16:04
@ewilkins-csi ewilkins-csi merged commit 3603262 into dev Aug 1, 2024
@ewilkins-csi ewilkins-csi deleted the 251-remove-custom-spark-operator branch August 1, 2024 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants