Unify usage and rendering of HintError
#2095
Merged
mr0re1 (Collaborator) commented Jan 5, 2024
cdunbar13 approved these changes Jan 10, 2024
mr0re1 added a commit that referenced this pull request Feb 14, 2024
* Bump github.com/hashicorp/terraform-exec from 0.19.0 to 0.20.0
  Bumps [github.com/hashicorp/terraform-exec](https://github.com/hashicorp/terraform-exec) from 0.19.0 to 0.20.0.
  - [Release notes](https://github.com/hashicorp/terraform-exec/releases)
  - [Changelog](https://github.com/hashicorp/terraform-exec/blob/main/CHANGELOG.md)
  - [Commits](hashicorp/terraform-exec@v0.19.0...v0.20.0)
  ---
  updated-dependencies:
  - dependency-name: github.com/hashicorp/terraform-exec
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Make `subnetwork_self_link` required, don't pass `subnetwork_project` around (#2067)
* Slurm6. Automagically set `nodeset.name` from module id. (#2068)
* VPC. Replace `s/_/-/` in `deployment_name` to avoid deploy-time error (#2083)
* Add `hpc-slurm6.yaml` to `examples/README` (#2084)
* Slurm6. QuickFix broken TPU nodeset usage (#2086)
* Slurm6. Reference TPU example in `examples/README` (#2087)
* Use `cty.Type` instead of `string` to represent type of vars. (#2088)
  NOTE: after this change, `list(any)` is used instead of `list`.
* fix: header was over-indented
* Check if supplied value matches module variable type (#2089)
* Add spelling hints for global vars and outputs (#2082)
* Point ref errors to a location within nested object (#2081)
* Refactor `Blueprint.WalkModules` (#2094)
  * Add safe-version to avoid useless `return nil`;
  * Supply `ModulePath` to the "dangerous" version;
  * Use `WalkModules` instead of nested for-loops in a few cases.
* Bump golang.org/x/sys from 0.15.0 to 0.16.0
  Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.15.0 to 0.16.0.
  - [Commits](golang/sys@v0.15.0...v0.16.0)
  ---
  updated-dependencies:
  - dependency-name: golang.org/x/sys
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Bump google.golang.org/api from 0.154.0 to 0.155.0
  Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.154.0 to 0.155.0.
  - [Release notes](https://github.com/googleapis/google-api-go-client/releases)
  - [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
  - [Commits](googleapis/google-api-go-client@v0.154.0...v0.155.0)
  ---
  updated-dependencies:
  - dependency-name: google.golang.org/api
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Update spack openfoam example to Slurm V6
* Add test that checks that modules don't output forbidden names (#2091)
* Remove hpc-slurm-legacy example and references
* Remove pre existing fs example and references
* Remove slurm-two-partitions-workstation example and references
* Remove use-resources example and references
* Remove lustre-new-vpc example and references
* Remove test-gcs-fuse example and references
* Remove hpc-cluster-service-acct example and references
* Remove hpc-cluste-slurm-with-startup example and references
* Remove hpc-cluster-project example and references
* Remove hpc-cluster-high-io-remote-state example and references
* Update test outputs example and remove slurm partition and controller
* Slurm6. Support `additional_networks`, `reservation_name` & `access_config` (#2062)
  * Add support for `additional_networks` & `reservation_name`;
  * Nodeset. Pass `access_config`, do not use `enable_public_ips`
* Add zone finding for cpu partitions in hpc-enterprise-slurm test
* Rename GKE subnet with build id to avoid conflicts
* Fix slurm v6 links in example README
* Add startup script option to install stackdriver agent
* Update tests to focus on stackdriver while still testing ops agent
* Unify usage and rendering of `HintError` (#2095)
* Rename slurm tpu test to be consistent with blueprint name
* Add example script to uninstall Ops Agent and install Stackdriver Agent
* Silence make error message for old versions of git
  Older versions of git do not have a '--show-current' flag on the git branch command. This command allows fallback to the ancient approach to determining the active branch and also redirects stderr to /dev/null. If neither command succeeds, then ghpc --version reports detached HEAD for the branch.
* yamllint. Don't show warnings (#2122)
  Motivation: warnings don't cause lint to fail (only errors do), but they are output along with the errors (many lines), which makes it hard to see the actual error message.
* Move `examples/hpc-slurm` to V6 (#2097)
  pick f88a30f Unify usage and rendering of `HintError`
  * Move `examples/hpc-slurm` to V6;
  * Updated `examples/README`;
  * Remove `slurm-v5-hpc-centos7` test.
* Add `has_to_be_used` behaviour to some modules (#2092)
* Update README.md
* Reduce default maximum number of HTCondor execute points
  Especially for initial deployments, a maximum of 100 could result in significant spend beyond what was anticipated. Reducing to 5 addresses this while still allowing the user to deliberately scale up.
* Hint spelling for inputs (#2124)
* Simplify rendering of errors with Position but without Path (#2096)
  * Remove `internalPath`;
  * Add `PosError` wrapper, render it specifically.
* Bump jinja2 from 3.1.2 to 3.1.3 in /community/front-end/ofe
  Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
  - [Release notes](https://github.com/pallets/jinja/releases)
  - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
  - [Commits](pallets/jinja@3.1.2...3.1.3)
  ---
  updated-dependencies:
  - dependency-name: jinja2
    dependency-type: direct:production
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Address #2120: fail on bad state and succeed on reinstall
* Fix rendering of "cobra" errors (#2130)
* Move `OFE venv` PR validation into separate trigger. (#2128)
  **Motivation**:
  * Reduce time it takes to run `PR-validation`;
  * Reduce noise in output of `PR-validation`.
* Make `cleanup_compute_nodes` `depends_on` on network (#2126)
* Update spack wrf example and references to use Slurm V6
* Update spack openfoam example to use /opt/apps directory
* Improve readability of "required setting is missing" error (#2133)
* GKE controller node pool extra features
* Improve MIG replacement policies for HTCondor Central Managers
  Set the MIG replacement policy to PROACTIVE by default for Central Managers. This ensures that configuration changes are propagated by a terraform apply which updates the HTCondor configuration. This is safe for Central Managers because they recover state dynamically through periodic API calls to the rest of the cluster. Document the alternative of OPPORTUNISTIC updates and how to manually trigger a MIG replacement.
* Improve MIG replacement policies for HTCondor Access Points
  Allow configuration of the MIG replacement policy for Access Points. Document the behavior of OPPORTUNISTIC updates and how to manually trigger a MIG replacement or set to the alternative of PROACTIVE replacements.
* Improve MIG replacement policies for HTCondor Execute Points
  Continue using the default of OPPORTUNISTIC replacement of Execute Point VMs so that they are (typically) replaced when a job becomes idle. Strongly recommend this setting in the documentation but discuss the alternative of PROACTIVE or manually issuing updates via gcloud.
* Fix HTCondor Windows URI for latest 23.0 LTS release
* Address feedback from #2140 for README formatting
* Fix broken link in HTCondor MIG documentation
* Remove intel-select blueprints and references
* Add support for string interpolation (#2076)
  * Add support for string interpolation
  * Support proper escaping
  * Address comments
* UX. Enable output colorization by default (#2145)
* Update DAOS blueprints to use google-cloud-daos v0.5.0, slurm v6
  [DAOSGCP-182](https://daosio.atlassian.net/browse/DAOSGCP-182)
  - Bump version of DAOS modules to v0.5.0 which installs DAOS v2.4
  - Modify community/examples/intel/hpc-slurm-daos.yaml to use Slurm v6 modules
  - Add temporary fix to community/examples/intel/hpc-slurm-daos.yaml to work around issue with missing lustre-client 8.8 repo
  - Update community/examples/intel/README.md to account for changes in DAOS v2.4
  Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
* Added validation and error message to login_startup_scripts_timeout because it is broken
* Update spack gromacs example tutorial and reference to use Slurm V6
* Slurm6. Advance to 6.3.1 (#2146)
* Add commands to verify monitoring agents are active
* Copies python binaries instead of symlink for more isolated venv
* Increase dynamic node count to a more reasonable default value
* Bump google.golang.org/api from 0.155.0 to 0.157.0
  Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.155.0 to 0.157.0.
  - [Release notes](https://github.com/googleapis/google-api-go-client/releases)
  - [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
  - [Commits](googleapis/google-api-go-client@v0.155.0...v0.157.0)
  ---
  updated-dependencies:
  - dependency-name: google.golang.org/api
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Update README.md with fixes from review.
  Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
* Add example of building Slurm on top of Rocky 8
  Tom provided an example blueprint that demonstrated this methodology.
  Co-authored-by: Tom Downes <tpdownes@google.com>
* Update hpc slurm gromacs example and references to use Slurm V6
* Clarify that zone-finding isn't available for TPUs (#2156)
* Fixed PyMarkdown issue in community/examples/intel/README.md
  Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
* Fixed typos in community/examples/intel/README.md
  Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
* Bring `$(...)` functionality on par with `((...))` (#2053)
  * Use token-replacement instead of string-replacement for expression updates;
  * Translate any BP-expression to TF-expressions by transforming used traversals;
  * Remove notion of `((...))` from documentation.
* Address feedback from #2150
* Start legacy monitoring agent after installing
* Address feedback: be explicit about Ansible install
* Add documentation for Slurm building example
* Adding test for building slurm image
* Create variable to pass the Packer group name
* Fix: old ansible was not compatible with selinux package, pin to latest
* Fix false-positive `test_deployment_variable_not_used` (#2164)
  Preserve original state of `Vars`
* Bump test coverage for `pkg/modulewriter` (#2163)
* Add `--force` flag to `ghpc create` (#2162)
* Improve error logging for expressions parsing (#2078)
  * Show snippet with pointer to a column;
    ```sh
    Error: :0,21-22: Invalid character; This character is not used within the language., and 1 other diagnostic(s)
    34: content: |
    Error: Invalid character; This character is not used within the language.
    echo "Hello $(vars.project_id from $(vars.region)"
    ^
    33: content: |
    ```
  * Prevent line-breaks within expressions. This constraint existed before, but was accidentally relaxed by a recent PR.
* Remove quantum circuit simulator example
* Update hpc-slurm-legacy-sharedvpc example and references to use Slurm V6
* Bump `cmd` test coverage (#2165)
* Update Slurm image 6.1 -> 6.3 (#2169)
* Add login node in the spack openfoam tutorial example
* Update Toolkit docs to point to GCP Slurm fork
* Fix: added new variables to ml-slurm integration test
* Update slurm references
* Update CloudSQL blueprint to v6
* Ensure Windows VMs start HTCondor only after successful secret download
  - this enables Managed Instance Group health checks to mark the node unhealthy for deletion
* Updated legacy-sharedvpc reference naming to sharedvpc
* Bump github.com/zclconf/go-cty from 1.14.1 to 1.14.2
  Bumps [github.com/zclconf/go-cty](https://github.com/zclconf/go-cty) from 1.14.1 to 1.14.2.
  - [Release notes](https://github.com/zclconf/go-cty/releases)
  - [Changelog](https://github.com/zclconf/go-cty/blob/main/CHANGELOG.md)
  - [Commits](zclconf/go-cty@v1.14.1...v1.14.2)
  ---
  updated-dependencies:
  - dependency-name: github.com/zclconf/go-cty
    dependency-type: direct:production
    update-type: version-update:semver-patch
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Bump google.golang.org/api from 0.157.0 to 0.159.0
  Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.157.0 to 0.159.0.
  - [Release notes](https://github.com/googleapis/google-api-go-client/releases)
  - [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
  - [Commits](googleapis/google-api-go-client@v0.157.0...v0.159.0)
  ---
  updated-dependencies:
  - dependency-name: google.golang.org/api
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Modified validation message to be more clear
* Remove project_id from image building example
* Reduce size of image builder to be compatible with initial projects
* Remove Slurm V4 modules and add note to use and reference V4 modules and examples
* Patch Slurm integration test
  Retry initial node count until sinfo command is successful
* Update Chrome Remote Desktop to Debian 12 by default
* Make TPU non-preemptible in blueprint and add retries in JAX verification integration test
* Update startup-script module to latest release
* Bump `pkg/modulereader` test coverage 80% -> 87% (#2161)
* Fix test broken by module removal. (#2186)
* Update TPU v6 blueprint to use new VPC module
* Improve test coverage of `pkg/modulewriter` (#2188)
* Updating spack and ramble buckets to use 6 digits of hex
* Improve output of `tools/enforce_coverage.pl` (#2191)
  * Output package that failed;
  * Set thresholds `pkg/logging: 0; pkg/inspect: 60`
* Remove v4 reference from network storage document
* Add function `Dict.Keys` to differentiate places that don't care about values. (#2194)
* Add shorthand `Reference.AsValue()` (#2195)
* Deprecate Dell Omnia module and example blueprint
* Change mode of maintenance.py so that it can be executed as description suggests
* Change batch-job-base template from json to YAML
  Batch supports YAML job configurations now so we can use YAML everywhere instead of JSON, which will hopefully make some of the conditional syntax in the templates easier to manage.
* Move topological ordering of vars into separate function. (#2190)
  **Motivation:** To be reused in other places
* Bump google.golang.org/api from 0.159.0 to 0.161.0
  Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.159.0 to 0.161.0.
  - [Release notes](https://github.com/googleapis/google-api-go-client/releases)
  - [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
  - [Commits](googleapis/google-api-go-client@v0.159.0...v0.161.0)
  ---
  updated-dependencies:
  - dependency-name: google.golang.org/api
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Update Slurm-GCP release to 5.10.2
* Add expression variables to the test configs (#2197)
* HTCondor: expire ClassAds more rapidly
  Decrease the default CLASSAD_LIFETIME from 15 minutes to 3 minutes. In an on-premises system, allowing for long intervals between ClassAd updates can be good to allow more machines to reboot. In the cloud, the absence of ClassAd updates more likely indicates the intentional (or automated) deletion of a VM. So it should be removed from the HTCondor pool.
* HTCondor: ensure Windows nodes are detected as unhealthy
  Ensure that the script for Windows exits with an error before starting HTCondor when it cannot download the condor_config file.
* Align formatting choices with recent commits
* HTCondor autoscaler
  Adopt a more conservative approach that the autoscaler should treat nodes in any state that reflects automated MIG modification as an "idle" node for the purposes of autoscaling. This helps prevent autoscaling runaway when VMs are unable to enter the healthy state (which reflects as "NONE" for currentAction in the MIG).
* Fix tests: look for yaml file, use image with yaml compat gcloud
* Use multiline yaml block scalar for Batch runnable
* Address feedback from #2204
* Bump cryptography from 41.0.6 to 42.0.0 in /community/front-end/ofe
  Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.6 to 42.0.0.
  - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
  - [Commits](pyca/cryptography@41.0.6...42.0.0)
  ---
  updated-dependencies:
  - dependency-name: cryptography
    dependency-type: direct:production
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
* Take "first deploy only" dependency on `slurm_files` (#2181)
* Refactor `eval` functions (#2201)
  Make `Blueprint.Eval` the only place where blueprint context is created.
* Remove `setGlobalLabels` as it's not needed (#2193)
  The `combineLabels` will do it.
* Show hint message if unsupported function is used (#2211)
  Currently we have a few contexts that are evaluated by ghpc:
  * `vars`;
  * `module.settings` in packer groups;
  * `validators.input`
  Our evaluation context only supports 2 functions: `merge` and `flatten`. The rest of the expressions (`module.settings` in TF groups) are not evaluated by ghpc => can use any valid HCL-syntax.
* Update pre-commit hooks
* Restrict GitHub actions to operate on upstream
  - the dependency license and PR label actions only need to run on the GoogleCloudPlatform copy of the HPC Toolkit
* Remove pre-commit from Cloud Build PR validation
* Create GitHub Action to run pre-commit
  - pre-commit verification will run on every Pull Request
  - if the user opts in with the label "pre-commit-autofix" the user can request that pre-commit add a commit that fixes formatting, where it is capable of automatically fixing formatting
* Bump django from 4.2.7 to 4.2.10 in /community/front-end/ofe (#2213)
  Bumps [django](https://github.com/django/django) from 4.2.7 to 4.2.10.
  - [Commits](django/django@4.2.7...4.2.10)
  ---
  updated-dependencies:
  - dependency-name: django
    dependency-type: direct:production
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Don't run `destroy_resource_policies` before `destroy_nodes` is done (#2217)
  ```sh
  module.slurm_controller.module.cleanup_compute_nodes[0].null_resource.destroy_nodes_on_destroy[0]: Destruction complete after 2m52s
  module.slurm_controller.module.cleanup_resource_policies[0].null_resource.destroy_resource_policies_on_destroy[0]: Destroying... [id=89627024760583747 65]
  ```
* Rename HTC Slurm configuration templates with explicit purpose
* Add Slurm configuration template for long Prolog/Epilog scripts
* Adopt empty string as default value for maintenance_interval
  The default value of null cannot be set as a deployment variable; this will allow the value to be set at the top of a blueprint.
* Remove enable_devel from slurm-gcp v5 examples
* Add login node to spack gromacs tutorial example
* Version bump to 1.28.0 (#2232)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Stroud <nickstroud@google.com>
Co-authored-by: Harsh Thakkar <harshthakkar@google.com>
Co-authored-by: Tom Downes <tpdownes@google.com>
Co-authored-by: Tom Downes <tpdownes@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Eimantas Kazakevicius <eimantas.kazakevicius@nag.com>
Co-authored-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
Co-authored-by: Carson Dunbar <carsondunbar@google.com>
Co-authored-by: Carlos Boneti <cboneti@users.noreply.github.com>
Co-authored-by: Rohit Ramu <roramu@google.com>
Co-authored-by: Alyssa <alyssasm@google.com>
Co-authored-by: alyssa-sm <146790241+alyssa-sm@users.noreply.github.com>
Co-authored-by: cdunbar13 <139253655+cdunbar13@users.noreply.github.com>
Co-authored-by: Aaron Golden <aegolden@google.com>