Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#127 #116 Hive Metastore Service v2 chart and Hive upgrade #130

Merged

Conversation

peter-mcclonski
Copy link
Contributor

No description provided.

<properties>
<dockerbuild.jars.directory>target/dockerbuild/jars/</dockerbuild.jars.directory>
<version.hive.metastore>4.0.0</version.hive.metastore>
<version.hadoop>3.4.0</version.hadoop>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Taking the opportunity to bump the Hadoop version for this service. Out of sync with Spark (which uses 3.3.4) as that version is no longer available on https://dlcdn.apache.org/hadoop/common/.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to float the idea of using the current version rather than pegging a specific version, but I can see both pros and cons to this approach. Namely, it becomes non-obvious what version we're actually on.

@@ -15,4 +15,57 @@

<packaging>orphedomos</packaging>

<properties>
<dockerbuild.jars.directory>target/dockerbuild/jars/</dockerbuild.jars.directory>
<version.hive.metastore>4.0.0</version.hive.metastore>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Managing dependencies through Maven as a matter of good practice and centralization.

initContainers:
{{- toYaml . | nindent 12 }}
{{- end }}
- name: "init-or-upgrade-schema"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: This initcontainer will check the version of the schema in the underlying database, and either create it or upgrade it, as needed. This is future-proof for upgrades down the line. Notably, it does require specification of the db driver. aiSSEMBLE currently uses mysql and only mysql for this purpose, but the driver was still extracted to an intentionally undocumented property in the values file in case this ever changes.

@@ -33,26 +33,12 @@ tests:
<value>jdbc:mysql://hive-metastore-db:3306/metastore?createDatabaseIfNotExist=true&amp;allowPublicKeyRetrieval=true&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Moved these properties to the downstream values.yaml

@@ -12,11 +12,11 @@ aissemble-thrift-server-chart:
deployment:
envFromSecret:
AWS_ACCESS_KEY_ID:
secretName: aws-s3-credentials
key: aws_access_key_id
secretName: remote-auth-config
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: It was brought to my attention that we publish this secret as part of the localstack chart, so we use it for out of the box functionality and as a model of good practice.

@peter-mcclonski peter-mcclonski force-pushed the 127-hive-metastore-service-v2 branch 4 times, most recently from c7faf7c to 395cc7e Compare June 11, 2024 16:39
@@ -160,7 +160,6 @@
"**/jenkins/values.yaml",
"**/kafka-cluster/values.yaml",
"**/hive-metastore-db/values.yaml",
"**/hive-metastore-service/values.yaml",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: The v1 chart is deprecated after 1.7.0 due to significant breaking changes.

@@ -10,7 +10,7 @@ image:
imagePullPolicy: IfNotPresent
dockerRepo: ghcr.io/
# Overrides the image tag whose default is the chart appVersion.
tag: "${versionTag}"
tag: "1.7.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I: Locking the version as the v1 chart will be deprecated after 1.7.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should go ahead and deprecate the profile for the v1 chart in this commit as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had intended to deprecate it alongside the v1 SHS and STS charts after completing the full chart (along with writing migration steps, etc). I'm wary of hitting the deprecation button before the full chart is ready. I imagine a case where for whatever reason we have to cut a 1.7.1 release and the deprecation is in before the replacement is ready.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd cut 1.7.1 from 1.7.0 not dev. It's a fair point either way though. I guess I was just thinking if we're refusing to update the image it'd be nice to tell the user why. But also understand from a user perspective it might be frustrating to see a deprecation message and have no way to resolve it. Could we word it something like "due to ongoing effort to create a Hive v2 chart and subsequent incompatibilities, v1 cannot be upgraded beyond version 1.7" as a middle ground?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely-- I can do that.

@peter-mcclonski peter-mcclonski force-pushed the 127-hive-metastore-service-v2 branch 4 times, most recently from 2b0c64a to 128238c Compare June 12, 2024 16:27
# Sets labels for app.kubernetes.io/name Default is Chart.name (aissemble-hive-metastore-service-chart)
name: ''
# Sets labels for app.kubernetes.io/name. Default is Chart.name (aissemble-hive-metastore-service-chart)
name: 'hive-metastore-service'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A: I think the previous behavior makes more sense in the context of a downstream project, where the chart will naturally inherit the appName set in the deploy POM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't work in this case, since the appName set in the deploy pom is shared by STS, SHS, this, and soon hive-metastore-db. That's also why you'll see content like this in the STS and SHS templates:

metadata:
  name: {{ .Release.Name }}-sts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. Hmmmm. Ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do want to raise one interesting thing that this hints at: The hive charts and STS/SHS charts feel different to interact with from a downstream user perspective. This is fundamentally because they were built by different people at different times. If we want to do a more wholesale unification of the styles and approaches to each, we can do that. I wouldn't necessarily do it right now, but I think it's one of those oddities where we'll probably come back to it in the future. Even basic things like how metastore-config.xml is built versus how hive-site.xml is loaded into STS. One takes a raw XML string, while the other takes a list of objects that then gets transformed into an XML string. This naming difference is sort-of one of those oddities-- Though the main motivator for keeping the name specifically as-is was to minimize the breaking changes from previous versions.

Copy link
Contributor

@ewilkins-csi ewilkins-csi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S: Add hive-service v2 profile to containers.adoc and instruct users on migrating.

@peter-mcclonski
Copy link
Contributor Author

S: Add hive-service v2 profile to containers.adoc and instruct users on migrating.

I was under the impression that we would be waiting until the new spark-infrastructure chart was completed before making that change-- particularly since there are still so many manual steps required to keep the project functional when migrating.

@ewilkins-csi
Copy link
Contributor

I was under the impression that we would be waiting until the new spark-infrastructure chart was completed before making that change-- particularly since there are still so many manual steps required to keep the project functional when migrating.

Ok. That works for me. As long as we don't lose track of it.

@peter-mcclonski peter-mcclonski force-pushed the 127-hive-metastore-service-v2 branch 2 times, most recently from 0fc3907 to 5fb052e Compare June 12, 2024 20:36
Signed-off-by: Peter McClonski <mcclonski_peter@bah.com>
@peter-mcclonski peter-mcclonski force-pushed the 127-hive-metastore-service-v2 branch from 5fb052e to 92b423e Compare June 13, 2024 02:19
@peter-mcclonski peter-mcclonski merged commit f858d0e into boozallen:dev Jun 13, 2024
@peter-mcclonski peter-mcclonski deleted the 127-hive-metastore-service-v2 branch June 13, 2024 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants