Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent EMR cluster recreation when javax.jdo.option.ConnectionPassword is used in configuration_json #13

Merged
merged 3 commits into from
Apr 20, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,7 @@ Available targets:
| applications | A list of applications for the cluster. Valid values are: Flink, Ganglia, Hadoop, HBase, HCatalog, Hive, Hue, JupyterHub, Livy, Mahout, MXNet, Oozie, Phoenix, Pig, Presto, Spark, Sqoop, TensorFlow, Tez, Zeppelin, and ZooKeeper (as of EMR 5.25.0). Case insensitive | list(string) | - | yes |
| attributes | Additional attributes (_e.g._ "1") | list(string) | `<list>` | no |
| bootstrap_action | List of bootstrap actions that will be run before Hadoop is started on the cluster nodes | object | `<list>` | no |
| configurations_json | A JSON string for supplying list of configurations for the EMR cluster. See https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html for more details | string | `null` | no |
| configurations_json | A JSON string for supplying list of configurations for the EMR cluster. See https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html for more details | string | `` | no |
| core_instance_group_autoscaling_policy | String containing the EMR Auto Scaling Policy JSON for the Core instance group | string | `null` | no |
| core_instance_group_bid_price | Bid price for each EC2 instance in the Core instance group, expressed in USD. By setting this attribute, the instance group is being declared as a Spot Instance, and will implicitly create a Spot request. Leave this blank to use On-Demand Instances | string | `null` | no |
| core_instance_group_ebs_iops | The number of I/O operations per second (IOPS) that the Core volume supports | number | `null` | no |
Expand Down Expand Up @@ -310,6 +310,10 @@ We deliver 10x the value for a fraction of the cost of a full-time engineer. Our

Join our [Open Source Community][slack] on Slack. It's **FREE** for everyone! Our "SweetOps" community is where you get to talk with others who share a similar vision for how to rollout and manage infrastructure. This is the best place to talk shop, ask questions, solicit feedback, and work together as a community to build totally *sweet* infrastructure.

## Discourse Forums

Participate in our [Discourse Forums][discourse]. Here you'll find answers to commonly asked questions. Most questions will be related to the enormous number of projects we support on our GitHub. Come here to collaborate on answers, find solutions, and get ideas about the products and services we value. It only takes a minute to get started! Just sign in with SSO using your GitHub account.

## Newsletter

Sign up for [our newsletter][newsletter] that covers everything on our technology radar. Receive updates on what we're up to on GitHub as well as awesome new projects we discover.
Expand Down Expand Up @@ -423,6 +427,7 @@ Check out [our other projects][github], [follow us on twitter][twitter], [apply
[testimonial]: https://cpco.io/leave-testimonial?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=testimonial
[office_hours]: https://cloudposse.com/office-hours?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=office_hours
[newsletter]: https://cpco.io/newsletter?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=newsletter
[discourse]: https://ask.sweetops.com/?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=discourse
[email]: https://cpco.io/email?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=email
[commercial_support]: https://cpco.io/commercial-support?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=commercial_support
[we_love_open_source]: https://cpco.io/we-love-open-source?utm_source=github&utm_medium=readme&utm_campaign=cloudposse/terraform-aws-emr-cluster&utm_content=we_love_open_source
Expand Down
2 changes: 1 addition & 1 deletion docs/terraform.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
| applications | A list of applications for the cluster. Valid values are: Flink, Ganglia, Hadoop, HBase, HCatalog, Hive, Hue, JupyterHub, Livy, Mahout, MXNet, Oozie, Phoenix, Pig, Presto, Spark, Sqoop, TensorFlow, Tez, Zeppelin, and ZooKeeper (as of EMR 5.25.0). Case insensitive | list(string) | - | yes |
| attributes | Additional attributes (_e.g._ "1") | list(string) | `<list>` | no |
| bootstrap_action | List of bootstrap actions that will be run before Hadoop is started on the cluster nodes | object | `<list>` | no |
| configurations_json | A JSON string for supplying list of configurations for the EMR cluster. See https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html for more details | string | `null` | no |
| configurations_json | A JSON string for supplying list of configurations for the EMR cluster. See https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html for more details | string | `` | no |
| core_instance_group_autoscaling_policy | String containing the EMR Auto Scaling Policy JSON for the Core instance group | string | `null` | no |
| core_instance_group_bid_price | Bid price for each EC2 instance in the Core instance group, expressed in USD. By setting this attribute, the instance group is being declared as a Spot Instance, and will implicitly create a Spot request. Leave this blank to use On-Demand Instances | string | `null` | no |
| core_instance_group_ebs_iops | The number of I/O operations per second (IOPS) that the Core volume supports | number | `null` | no |
Expand Down
2 changes: 1 addition & 1 deletion examples/complete/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ variable "applications" {
variable "configurations_json" {
type = string
description = "A JSON string for supplying list of configurations for the EMR cluster"
default = null
default = ""
}

variable "core_instance_group_instance_type" {
Expand Down
20 changes: 18 additions & 2 deletions main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -349,6 +349,21 @@ resource "aws_iam_role_policy_attachment" "ec2_autoscaling" {
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforAutoScalingRole"
}

# This dummy bootstrap action is needed because of terraform bug https://github.com/terraform-providers/terraform-provider-aws/issues/12683
# When javax.jdo.option.ConnectionPassword is used in configuration_json then every plan will result in force recreation of EMR cluster.
# To mitigate this issue dummy bootstrap action `echo` was introduced. It is executed with an argument of a hash generated from configuration.
# This in tandem with lifecycle ignore_changes for `configurations_json` will only trigger EMR recreation when hash of configuration will change.
locals {
bootstrap_action = concat(
[{
path = "file:/bin/echo",
name = "Dummy bootstrap action to prevent EMR cluster recration when configuration_json has parameter javax.jdo.option.ConnectionPassword.",
args = [md5(jsonencode(var.configurations_json))]
}],
var.bootstrap_action
)
}

resource "aws_emr_cluster" "default" {
count = var.enabled ? 1 : 0
name = module.label.id
Expand Down Expand Up @@ -409,7 +424,7 @@ resource "aws_emr_cluster" "default" {
security_configuration = var.security_configuration

dynamic "bootstrap_action" {
for_each = var.bootstrap_action
for_each = local.bootstrap_action
content {
path = bootstrap_action.value.path
name = bootstrap_action.value.name
Expand All @@ -426,8 +441,9 @@ resource "aws_emr_cluster" "default" {

tags = module.label.tags

# configurations_json changes are ignored because of terraform bug. Configuration changes are applied via local.bootstrap_action.
lifecycle {
ignore_changes = ["kerberos_attributes", "step"]
ignore_changes = ["kerberos_attributes", "step", "configurations_json"]
}
}

Expand Down
2 changes: 1 addition & 1 deletion variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,7 @@ variable "applications" {
variable "configurations_json" {
type = string
description = "A JSON string for supplying list of configurations for the EMR cluster. See https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html for more details"
default = null
default = ""
}

variable "key_name" {
Expand Down