[#398] update Delta Lake dependencies to 3.2.1 #412
Conversation
@@ -115,20 +114,6 @@ public void generate(GenerationContext context) {
        vc.put("versionNeo4j", config.getNeo4jVersion());
        vc.put("versionAwsSdkBundle", config.getAwsSdkBundleVersion());

        // Addresses a Spark bug in which the --packages argument was not respected in cluster mode (3.0.0 <= x < 3.4.0)
        // We don't currently support Spark <3.x.x in aiSSEMBLE, but we include a check for it just in case we ever do.
I: We don't support changing the Spark version from the default aiSSEMBLE version (currently 3.5.2). It's not really worth it to keep the complexity around supporting older versions since we don't allow it. If we revisit this due to demand, the logic isn't too bad to re-introduce. (And would likely be a very small portion of the overall work that would need to be done to support multiple Spark versions.)
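For reference, the removed gate amounted to a version-range check along these lines; this is a hypothetical reconstruction using Maven's ComparableVersion, not the original implementation:

import org.apache.maven.artifact.versioning.ComparableVersion;

// Hypothetical reconstruction of the removed check: --packages was ignored in
// cluster mode for 3.0.0 <= x < 3.4.0, so package-based dependencies were only
// usable outside that range. With Spark pinned at 3.5.2 this is always true.
private static boolean usePackageDeps(String sparkVersion) {
    ComparableVersion version = new ComparableVersion(sparkVersion);
    return version.compareTo(new ComparableVersion("3.0.0")) < 0
            || version.compareTo(new ComparableVersion("3.4.0")) >= 0;
}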
@@ -9,34 +9,6 @@ sparkApp:
#end
  mainApplicationFile: "local:///opt/spark/jobs/pipelines/${mainApplicationFile}"
  deps:
#if (!${usePackageDeps})
I: Because we pulled out the logic for supporting older versions of Spark, usePackageDeps can be assumed to be true.
Scenario: POM dependencies are migrated
  Given a POM that references the delta-core package
  When the 1.10.0 DeltaSpark POM migration executes
  Then delta-core is updated to delta-spark and any version numbers are bumped up to 3.2.1
A: Suggested change:
- Then delta-core is updated to delta-spark and any version numbers are bumped up to 3.2.1
+ Then delta-core is updated to delta-spark and the version is set to the build-parent property version.delta
It greatly simplified the logic to just always set the version to ${version.delta}, and it should be safe, as any project running these baton migrations would be inheriting from build-parent.
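As a minimal sketch of that rule using Maven's model API (the real migration records edits via Baton's PomModifications rather than mutating the model, and the method name here is illustrative):

import org.apache.maven.model.Dependency;

// Illustrative only: rename the artifact and pin the version to the
// build-parent property instead of computing a bumped version number.
private void migrateDeltaDependency(Dependency dep) {
    if ("io.delta".equals(dep.getGroupId())
            && dep.getArtifactId().startsWith("delta-core_")) {
        dep.setArtifactId(dep.getArtifactId().replace("delta-core_", "delta-spark_"));
        dep.setVersion("${version.delta}"); // resolved from build-parent
    }
}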
...src/main/java/com/boozallen/aissemble/upgrade/migration/v1_10_0/DeltaSparkYamlMigration.java (resolved)
.../src/main/java/com/boozallen/aissemble/upgrade/migration/v1_10_0/DeltaSparkPomMigration.java (resolved)
    return dep.getGroupId().equals("io.delta") && dep.getArtifactId().startsWith("delta-core_");
}

private static PomModifications.Replacement replaceInTag(InputLocationTracker container, String tag, String contents) {
I: Good candidate to bake into the PomModifications class in Baton.
LGTM
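For illustration, a Baton-side helper might take roughly this shape; the PomModifications.Replacement constructor shown here is assumed for the sketch, not Baton's actual API:

import org.apache.maven.model.InputLocation;
import org.apache.maven.model.InputLocationTracker;

// Sketch only: use Maven's input-location tracking to find the line where
// <tag> sits in the POM, then emit a replacement rewriting that tag's content.
// The Replacement constructor arguments are assumed for illustration.
private static PomModifications.Replacement replaceInTag(InputLocationTracker container,
        String tag, String contents) {
    InputLocation location = container.getLocation(tag);
    return new PomModifications.Replacement(location.getLineNumber(),
            "<" + tag + ">" + contents + "</" + tag + ">");
}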
if (yaml.hasObject("spec")) {
    yaml = yaml.getObject("spec");
    if (yaml.hasObject("deps")) {
        yaml = yaml.getObject("deps");
Maybe create a utility function to get the object from a path, e.g. YamlUtils.getObjectFromPath("sparkApp.spec.deps")?
I just updated the YamlObject hasX/getX methods to accept full paths instead of single keys. This prevents the nesting situation while not cluttering the YamlObject API.
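With that change, the nested block above collapses to a single path-based lookup, along these lines:

// One lookup per path instead of one nesting level per key.
if (yaml.hasObject("spec", "deps")) {
    YamlObject deps = yaml.getObject("spec", "deps");
    // ...migrate the delta-core coordinates found under deps...
}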
- com.amazonaws:aws-java-sdk-bundle:1.12.262
- org.neo4j:neo4j-connector-apache-spark_2.12:4.1.5_for_spark_3
- org.elasticsearch:elasticsearch-spark-30_2.12:8.9.0
- io.delta:delta-core_2.12:2.4.0 #comment
It's good that we tested with a comment 👍
    return parent.containsKey(key) && type.isInstance(parent.get(key));
}

private <T> T getValue(String... path) {
I: An argument could be made that we only need these two type-parameterized methods and the hasObject/getObject methods, but it can be difficult to understand how to work with an API that only has type-parameterized methods.
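A self-contained sketch of the shape being weighed, simplified to single keys (the real YamlObject walks a full varargs path):

import java.util.Map;
import java.util.NoSuchElementException;

// Simplified sketch: the type-parameterized pair can express any access,
// while thin typed wrappers keep call sites self-documenting.
public class YamlObjectSketch {
    private final Map<String, Object> data;

    public YamlObjectSketch(Map<String, Object> data) {
        this.data = data;
    }

    private boolean hasValue(Class<?> type, String key) {
        return data.containsKey(key) && type.isInstance(data.get(key));
    }

    @SuppressWarnings("unchecked")
    private <T> T getValue(String key) {
        if (!data.containsKey(key)) {
            throw new NoSuchElementException("No value at " + key);
        }
        return (T) data.get(key);
    }

    // Typed convenience wrappers delegate to the generic pair:
    public boolean hasObject(String key) {
        return hasValue(Map.class, key);
    }

    public Map<String, Object> getObject(String key) {
        return getValue(key); // T inferred as Map<String, Object>
    }
}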
        throw new NoSuchElementException("Parent object does not exist at " + Arrays.toString(path));
    }
    String key = path[path.length - 1];
    return (T) parent.get(key);
I: If we passed in a Class object for the type, we could provide a more helpful message when the type of the value doesn't match what's expected. E.g.:

if (!type.isInstance(parent.get(key))) {
    throw new IllegalArgumentException("Value for [" + key + "] is not of type " + type.getSimpleName());
}
However, I guessed that in nearly every case the call site will contain the literal path to the value being accessed, so taking a couple of extra steps to check the source code for the value that isn't of the expected type isn't a big deal, and it simplifies both the call site and the implementation of this method.
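For example, a hypothetical call site (the getString accessor and path here are assumed, not from this PR) shows why the literal path makes failures easy to trace:

// The literal path points straight at the YAML value being read, so even a
// bare ClassCastException is quick to trace back to the source file.
String mainFile = yaml.getString("sparkApp", "spec", "mainApplicationFile");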
Version 3.2.1 added support for Spark 3.5.3. Note the name of the delta-core artifacts changed to delta-spark.