1ES Hosted Pools #3054

Merged: @StephanTLavavej merged 29 commits into microsoft:main from 1es-hosted-pools on Aug 26, 2022


@StephanTLavavej (Member) commented Aug 25, 2022

We're being required to migrate from Azure virtual machine scale sets (VMSS) to One Engineering System (1ES) Hosted Pools. Fortunately, we can preserve the important properties of our current scheme (namely, that we have full control over how we prepare the VM image, capturing a specific VS Preview version). I also figured out how to automate more steps than were previously possible, making the process simpler and more reproducible. Thanks to @BillyONeal for insights here. 💡 😻

For contributors, there should be no significant changes. Runs might take a little longer (we have to change VM SKUs, at least initially), and I'm not sure what kind of sporadic issues we'll see. (Perhaps the sporadic stalls we've been dealing with will be reduced in frequency!) There should be no changes to how the results of PR checks are presented.

I've structured this as a series of commits for easier review. (Because we work on this script so rarely, I also performed several refactorings and cleanups; apologies in advance.)

  • `provision-image.ps1`: PowerShell 7.2.6.
  • Rename `create-vmss.ps1` to `create-1es-hosted-pool.ps1`.
  • Extract function `Display-Progress-Bar`.
    • This significantly reduces clutter and makes it easier to add progress bar steps. The necessary magic is `$script:CurrentProgress++`, allowing us to modify a variable in the outer script scope (otherwise, a function can only read it).
  • `$TotalProgress` should be the number of calls to `Display-Progress-Bar -Status`.
    • If we display 1/N as we begin the 1st task, we should display N/N as we begin the Nth task.
  • Simplify status messages: 'provisioning script' isn't useful.
  • Extract `$CurrentDate`, ensuring that the value won't change during the script.
  • Simplify `Wait-Shutdown` with the `-notin` operator.
  • Drop `Find-ResourceGroupName`; `'HHmm'` provides sufficient uniqueness.
    • Running the creation script twice in a single minute is not a scenario worth worrying about. (This was necessary when only the date was used in the resource group name.)
  • Consistently wrap Azure PowerShell cmdlet invocations.
  • Split progress messages.
  • Use single quotes when we don't need string expansion.
  • Consistently use spaces and semicolons in single-line hashtables.
    • This is similar to trailing commas.
  • Drop VMSS machinery.
    • `$LiveVMPrefix` was only needed for `Set-AzVmssOsProfile`.
    • Azure CLI (`az`) was only needed for VMSS diagnostic logs.
    • We'll be replacing `New-AzImageConfig` and `New-AzImage` with an Azure Compute Gallery.
  • Update comments/messages that mentioned scale sets.
  • Move to `'eastus'`, `'Standard_D32ds_v5'`.
    • We were able to successfully obtain 1ES quota here.
  • Add new machinery to create a 1ES Hosted Pool (explained below).
  • All we need at the end is the pool name, so print that.
  • Cleanup network resources.
  • `.gitignore`: Drop `vmss-config.json`, `vmss-protected.json`.
  • Quote the strings `'CPP_STL_GitHub'` and `'latest'`.
    • PowerShell will implicitly treat things as strings, but I find this confusing; we're reasonably consistent about using quotes and this makes us more so.
  • When calculating progress, multiply before dividing for clarity and precision.
  • Quote another string.
  • New pool!
  • All shall love ISO 8601 and despair! (Omit hyphen/underscore between the date and THHmm time part.)
  • Follow PowerShell naming conventions: `Display-ProgressBar`.
  • Forbid positional binding for all functions.
  • Make parameters mandatory when we don't provide defaults.
  • Style: Consistent `Param` spacing.
  • Newer pool!
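
For readers who haven't opened the script, here is a minimal sketch of how the progress-bar helper described in the list could look. It is not the code from this PR: the activity text, the status messages, and the value of `$TotalProgress` are made up for illustration. It combines four of the items above: the `$script:` scope modifier, multiplying before dividing, forbidding positional binding, and making a parameter without a default mandatory.

```powershell
# Hypothetical values; the real script has its own steps and messages.
$TotalProgress = 3    # Must equal the number of Display-ProgressBar -Status calls below.
$CurrentProgress = 1  # Script-scope counter, so 1/N is shown as the 1st task begins.

function Display-ProgressBar {
  [CmdletBinding(PositionalBinding=$false)]
  Param(
    [Parameter(Mandatory)][string]$Status
  )

  # Multiply before dividing: the intermediate product stays an exact integer,
  # and only the single final division can produce a fraction.
  $Percent = [int](100 * $script:CurrentProgress / $TotalProgress)

  Write-Progress -Activity 'Creating example pool' -Status $Status -PercentComplete $Percent

  # The script: modifier is what lets the function modify the outer variable.
  # A plain $CurrentProgress++ here would only change a function-local copy.
  $script:CurrentProgress++
}

Display-ProgressBar -Status 'Creating resource group'  # 1/3
Display-ProgressBar -Status 'Creating virtual machine' # 2/3
Display-ProgressBar -Status 'Creating pool'            # 3/3
```

Because positional binding is forbidden, `Display-ProgressBar 'Creating pool'` (without `-Status`) fails with a parameter binding error instead of silently guessing which parameter was meant.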

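The `$CurrentDate` and `'HHmm'` items in the list can be illustrated with a short sketch. The function name `Get-TimestampedName` and the `'Example-'` prefix are hypothetical, and the exact format string is an assumption based on the description above (ISO 8601-style date, then `T`, then hours and minutes with no separator).

```powershell
function Get-TimestampedName {
  [CmdletBinding(PositionalBinding=$false)]
  Param(
    [Parameter(Mandatory)][string]$Prefix,
    [Parameter(Mandatory)][datetime]$Date
  )

  # 'T' is not a .NET custom format specifier, so it is copied literally.
  # InvariantCulture keeps the digits and calendar independent of the machine's locale.
  return $Prefix + $Date.ToString('yyyy-MM-ddTHHmm', [cultureinfo]::InvariantCulture)
}

# Capture the time once; every name derived from it then agrees,
# even if the script keeps running past a minute boundary.
$CurrentDate = Get-Date
$ResourceGroupName = Get-TimestampedName -Prefix 'Example-' -Date $CurrentDate

# With a fixed date, the result is predictable:
$FixedName = Get-TimestampedName -Prefix 'Example-' -Date ([datetime]::new(2022, 8, 25, 9, 25, 0))
# $FixedName is 'Example-2022-08-25T0925'
```

Two runs collide only if they start within the same minute, which is the scenario the list dismisses as not worth worrying about.
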
Brief explanation of how the 1ES Hosted Pool machinery works (this is my understanding; I hope it's accurate):

An "Azure Compute Gallery" (previously named "Shared Image Gallery"; knowing this is helpful to understand documentation) is a container (in the STL sense) of image definitions, and those are containers of image versions, which are VM snapshots. This happens on the Azure side. Then there are 1ES Images (which refer to those snapshots), and finally 1ES Hosted Pools, which are the new analogues of our old Virtual Machine Scale Sets (the difference is that 1ES is now responsible for obtaining quota, spinning up VMs, etc.). Lots and lots of complexity is possible here, but for the moment, we are using everything in a 1:1 ratio. That is, we create one resource group with one compute gallery, one image definition, one image version, one 1ES image, and one 1ES hosted pool. The pool directly connects to Azure Pipelines (via magic that I can't quite believe worked on the first try). We don't need to worry about images being updated without the pool changing, because we have only one version.
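
The containment described above can be modeled as plain data. This is only an illustrative sketch with hypothetical names and property layout; it makes no Azure or 1ES calls and does not reflect the actual resource schemas.

```powershell
# Azure side: gallery -> image definition -> image version (a VM snapshot).
$ImageVersion = [pscustomobject]@{ Name = '1.0.0' }
$ImageDefinition = [pscustomobject]@{ Name = 'ExampleImageDefinition'; Versions = @($ImageVersion) }
$Gallery = [pscustomobject]@{ Name = 'ExampleGallery'; Definitions = @($ImageDefinition) }

# 1ES side: the 1ES Image refers to the snapshot, and the Hosted Pool uses the 1ES Image.
$OneESImage = [pscustomobject]@{ Name = 'ExampleImage'; Source = $ImageVersion }
$HostedPool = [pscustomobject]@{ Name = 'ExamplePool'; Image = $OneESImage }

# Everything lives in one resource group, in a 1:1 ratio.
$ResourceGroup = [pscustomobject]@{
  Name = 'ExampleGroup'
  Gallery = $Gallery
  Image = $OneESImage
  Pool = $HostedPool
}

# Following the references from the pool leads to exactly one snapshot.
$ResourceGroup.Pool.Image.Source.Name
```

With one definition and one version, there is no way for the pool's image to change underneath it, which is the property the paragraph above relies on.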

(In the future, we may want/need to explore other schemes, if we run into quota difficulties or whatever, but this will be a starting point.)

The script takes care of several things that would otherwise need to be done through the web UI (granting permission to 1ES Resource Management, creating the 1ES Image and Hosted Pool). I verified through JSON inspection that the scripted results are comparable to the web UI results in the ways that matter (some empty fields were simply omitted, which didn't seem to affect the checks passing).

I've updated the Checklist for Toolset Updates wiki page to explain the new world order. It is generally simpler (no Azure CLI, no web UI for creating a VMSS pool), with the one new part being a manual permission step that involves clicking buttons but not making choices.

@StephanTLavavej StephanTLavavej added the infrastructure Related to repository automation label Aug 25, 2022
@StephanTLavavej StephanTLavavej marked this pull request as ready for review August 25, 2022 09:25
@StephanTLavavej StephanTLavavej requested a review from a team as a code owner August 25, 2022 09:25
@StephanTLavavej StephanTLavavej added the high priority Important! label Aug 25, 2022
@strega-nil-ms strega-nil-ms self-assigned this Aug 25, 2022
@strega-nil-ms (Contributor) left a comment:

The function name is the important bit to me (otherwise, this looks good)

Four review threads on azure-devops/create-1es-hosted-pool.ps1 (all outdated and resolved).
@StephanTLavavej (Member, Author) commented:

I'm speculatively mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit fdb9c99 into microsoft:main Aug 26, 2022
@StephanTLavavej StephanTLavavej deleted the 1es-hosted-pools branch August 26, 2022 23:58
Labels: high priority (Important!), infrastructure (Related to repository automation)
3 participants