[BUG] - Terraform error when upgrading GCP deployment to latest RC using GHA #2732

Closed
marcelovilla opened this issue Sep 20, 2024 · 2 comments · Fixed by #2734

Comments

@marcelovilla
Member

Describe the bug

When updating a GCP deployment from 2024.7.1 to 2024.9.1rc2 via GHA, I'm seeing the following error:

Downloading https://get.helm.sh/helm-v3.15.3-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /tmp/helm/v3.15.3
helm installed into /tmp/helm/v3.15.3/helm
v5.4.3
kustomize installed to /tmp/kustomize/5.4.3/kustomize
['1.28.13-gke.1119000', '1.28.13-gke.1078000', '1.28.13-gke.1049000', '1.28.13-gke.1024000', '1.28.13-gke.1006000', '1.28.12-gke.1179000', '1.28.12-gke.1052000', '1.27.16-gke.1342000', '1.27.16-gke.1296000', '1.27.16-gke.1287000', '1.27.16-gke.1258000', '1.27.16-gke.1148001', '1.27.16-gke.1051001']
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ The following files will be created:                                         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ stages/07-kubernetes-services/modules/kubernetes/services/conda-store/confi… │
│ stages/07-kubernetes-services/modules/kubernetes/services/dask-gateway/file… │
│ stages/07-kubernetes-services/modules/kubernetes/services/dask-gateway/file… │
│ stages/07-kubernetes-services/modules/kubernetes/services/jupyterhub/files/… │
│ stages/07-kubernetes-services/modules/kubernetes/services/jupyterhub/files/… │
│ stages/07-kubernetes-services/modules/kubernetes/services/jupyterhub/files/… │
│ stages/07-kubernetes-services/modules/kubernetes/services/jupyterhub/files/… │
│ stages/07-kubernetes-services/modules/kubernetes/services/jupyterhub/files/… │
└──────────────────────────────────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ The following files will be updated:            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ stages/01-terraform-state/gcp/_nebari.tf.json   │
│ stages/02-infrastructure/gcp/_nebari.tf.json    │
│ stages/03-kubernetes-initialize/_nebari.tf.json │
│ stages/04-kubernetes-ingress/_nebari.tf.json    │
│ stages/05-kubernetes-keycloak/_nebari.tf.json   │
│ stages/07-kubernetes-services/_nebari.tf.json   │
│ stages/08-nebari-tf-extensions/_nebari.tf.json  │
└─────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ The following files are untracked (only exist in output directory): ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ stages/01-terraform-state/gcp/_nebari.tf.json                       │
│ stages/02-infrastructure/gcp/_nebari.tf.json                        │
│ stages/03-kubernetes-initialize/_nebari.tf.json                     │
│ stages/04-kubernetes-ingress/_nebari.tf.json                        │
│ stages/05-kubernetes-keycloak/_nebari.tf.json                       │
│ stages/07-kubernetes-services/_nebari.tf.json                       │
│ stages/08-nebari-tf-extensions/_nebari.tf.json                      │
└─────────────────────────────────────────────────────────────────────┘
[terraform]: ╷
[terraform]: │ Error: Required plugins are not installed
[terraform]: │ 
[terraform]: │ The installed provider plugins are not consistent with the packages
[terraform]: │ selected in the dependency lock file:
[terraform]: │   - registry.terraform.io/hashicorp/google: there is no package for registry.terraform.io/hashicorp/google 4.83.0 cached in .terraform/providers
[terraform]: │ 
[terraform]: │ Terraform uses external plugins to integrate with a variety of different
[terraform]: │ infrastructure services. To download the plugins required for this
[terraform]: │ configuration, run:
[terraform]: │   terraform init
[terraform]: ╵
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /subcommands/deploy.py:92 in deploy                                          │
│                                                                              │
│   89 │   │   │   msg = "Digital Ocean support is currently being deprecated  │
│   90 │   │   │   typer.confirm(msg)                                          │
│   91 │   │                                                                   │
│ ❱ 92 │   │   deploy_configuration(                                           │
│   93 │   │   │   config,                                                     │
│   94 │   │   │   stages,                                                     │
│   95 │   │   │   disable_prompt=disable_prompt,                              │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /deploy.py:55 in deploy_configuration                                        │
│                                                                              │
│   52 │   │   │   │   s: hookspecs.NebariStage = stage(                       │
│   53 │   │   │   │   │   output_directory=pathlib.Path.cwd(), config=config  │
│   54 │   │   │   │   )                                                       │
│ ❱ 55 │   │   │   │   stack.enter_context(s.deploy(stage_outputs, disable_pro │
│   56 │   │   │   │                                                           │
│   57 │   │   │   │   if not disable_checks:                                  │
│   58 │   │   │   │   │   s.check(stage_outputs, disable_prompt)              │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/contextlib.py:492 in  │
│ enter_context                                                                │
│                                                                              │
│   489 │   │   # statement.                                                   │
│   490 │   │   _cm_type = type(cm)                                            │
│   491 │   │   _exit = _cm_type.__exit__                                      │
│ ❱ 492 │   │   result = _cm_type.__enter__(cm)                                │
│   493 │   │   self._push_cm_exit(cm, _exit)                                  │
│   494 │   │   return result                                                  │
│   495                                                                        │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/contextlib.py:135 in  │
│ __enter__                                                                    │
│                                                                              │
│   132 │   │   # they are only needed for recreation, which is not possible a │
│   133 │   │   del self.args, self.kwds, self.func                            │
│   134 │   │   try:                                                           │
│ ❱ 135 │   │   │   return next(self.gen)                                      │
│   136 │   │   except StopIteration:                                          │
│   137 │   │   │   raise RuntimeError("generator didn't yield") from None     │
│   138                                                                        │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /stages/terraform_state/__init__.py:237 in deploy                            │
│                                                                              │
│   234 │   def deploy(                                                        │
│   235 │   │   self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt │
│   236 │   ):                                                                 │
│ ❱ 237 │   │   self.check_immutable_fields()                                  │
│   238 │   │                                                                  │
│   239 │   │   with super().deploy(stage_outputs, disable_prompt):            │
│   240 │   │   │   env_mapping = ***                                           │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /stages/terraform_state/__init__.py:255 in check_immutable_fields            │
│                                                                              │
│   252 │   │   │   │   yield                                                  │
│   253 │                                                                      │
│   254 │   def check_immutable_fields(self):                                  │
│ ❱ 255 │   │   nebari_config_state = self.get_nebari_config_state()           │
│   256 │   │   if not nebari_config_state:                                    │
│   257 │   │   │   return                                                     │
│   258                                                                        │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /stages/terraform_state/__init__.py:287 in get_nebari_config_state           │
│                                                                              │
│   284 │                                                                      │
│   285 │   def get_nebari_config_state(self):                                 │
│   286 │   │   directory = str(self.output_directory / self.stage_prefix)     │
│ ❱ 287 │   │   tf_state = terraform.show(directory)                           │
│   288 │   │   nebari_config_state = None                                     │
│   289 │   │                                                                  │
│   290 │   │   # get nebari config from state                                 │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /provider/terraform.py:204 in show                                           │
│                                                                              │
│   201 │   │   │   )                                                          │
│   202 │   │   │   return output                                              │
│   203 │   │   except TerraformException as e:                                │
│ ❱ 204 │   │   │   raise e                                                    │
│   205                                                                        │
│   206                                                                        │
│   207 def refresh(directory=None, var_files=None):                           │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /provider/terraform.py:194 in show                                           │
│                                                                              │
│   191 │   with timer(logger, "terraform show"):                              │
│   192 │   │   try:                                                           │
│   193 │   │   │   output = json.loads(                                       │
│ ❱ 194 │   │   │   │   run_terraform_subprocess(                              │
│   195 │   │   │   │   │   command,                                           │
│   196 │   │   │   │   │   cwd=directory,                                     │
│   197 │   │   │   │   │   prefix="terraform",                                │
│                                                                              │
│ /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/_nebari │
│ /provider/terraform.py:119 in run_terraform_subprocess                       │
│                                                                              │
│   116 │   logger.info(f" terraform at ***terraform_path***")                     │
│   117 │   exit_code, output = run_subprocess_cmd([terraform_path] + processa │
│   118 │   if exit_code != 0:                                                 │
│ ❱ 119 │   │   raise TerraformException("Terraform returned an error")        │
│   120 │   return output                                                      │
│   121                                                                        │
│   122                                                                        │
╰──────────────────────────────────────────────────────────────────────────────╯
TerraformException: Terraform returned an error
Error: Process completed with exit code 1.

Expected behavior

I'd expect the re-deployment after the upgrade to work fine.

OS and architecture in which you are running Nebari

GHA

How to Reproduce the problem?

Upgrade an existing GCP deployment from 2024.7.1 to 2024.9.1rc2 via GHA.

Command output

No response

Versions and dependencies used.

No response

Compute environment

None

Integrations

No response

Anything else?

  • I haven't been able to reproduce this when upgrading the existing GCP cluster and redeploying it from my local machine; it works fine there.
  • I saw this issue on two separate GCP deployments and re-triggering the workflows did not work.
  • I haven't tested it myself, but @BrianCashProf did not see this issue when upgrading an AWS cluster.
@viniciusdc
Contributor

I also didn't encounter this issue on AWS while deploying locally last week. I once saw something similar that was caused by a .tf file being left over after an upgrade (the render graph changed, so the file was not removed), but I doubt that's the cause here since this runs on GHA.
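To rule out the leftover-file scenario on a local checkout, something like the following could be used. It is only a sketch: `find_leftover_files` is a hypothetical helper, not part of Nebari, and it assumes you already have the list of paths the current render would produce.

```python
from pathlib import Path
from typing import Iterable, List


def find_leftover_files(output_dir: Path, rendered: Iterable[str]) -> List[str]:
    """Return Terraform files on disk that the current render no longer produces.

    `rendered` holds paths relative to `output_dir` (hypothetical input; how
    that list is obtained depends on the rendering tool).
    """
    expected = {Path(p).as_posix() for p in rendered}
    leftovers = []
    for path in output_dir.rglob("*"):
        rel = path.relative_to(output_dir)
        # Skip Terraform's own working directory and anything that is not a file.
        if ".terraform" in rel.parts or not path.is_file():
            continue
        if path.name.endswith((".tf", ".tf.json")) and rel.as_posix() not in expected:
            leftovers.append(rel.as_posix())
    return sorted(leftovers)
```

Any path it returns exists only in the output directory and would still be picked up by Terraform when it loads that stage.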

@viniciusdc
Contributor

For reference: the issue was caused by the Terraform provider cache not being updated after the recent infrastructure changes (in this case, the Google Cloud provider had been removed as part of recent updates to the package dependencies, which moved to using a library directly).

Thus, when terraform show was run as part of the immutable-field checks, it hit a mismatch between the providers recorded for the existing deployment and the ones actually cached in .terraform/providers. This was addressed by adding a terraform init call before the terraform show subprocess execution so that the cache is refreshed first.
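The shape of that fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in #2734: `show_with_init`, `TerraformError`, and the injectable `runner` parameter are hypothetical names introduced here, and only the standard `terraform init -input=false` and `terraform show -json` CLI invocations are relied on.

```python
import json
import subprocess
from typing import Any, Callable, Dict, List


class TerraformError(RuntimeError):
    """Raised when a terraform subprocess exits with a non-zero code."""


def _run(cmd: List[str], cwd: str) -> str:
    """Default runner: execute the command in `cwd` and return its stdout."""
    proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise TerraformError(f"{' '.join(cmd)} failed: {proc.stderr.strip()}")
    return proc.stdout


def show_with_init(
    directory: str,
    runner: Callable[[List[str], str], str] = _run,
    terraform_path: str = "terraform",
) -> Dict[str, Any]:
    """Run `terraform init` before `terraform show -json` in `directory`."""
    # init installs the providers the stage requires, so the plugin cache
    # is consistent with the dependency lock file before show reads state.
    runner([terraform_path, "init", "-input=false"], directory)
    # show -json prints the current state as machine-readable JSON.
    return json.loads(runner([terraform_path, "show", "-json"], directory))
```

Passing the runner in as a parameter is purely a convenience for this sketch: it lets the call order be checked without a Terraform binary or cloud credentials.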
