This repository has been archived by the owner on Oct 5, 2023. It is now read-only.

Introduction to deploying ALZ Monitor

github-actions edited this page Aug 2, 2023 · 14 revisions

Background

This guide describes how to get started with implementing alert policies and initiatives in your environment for testing and validation. It assumes that you will use GitHub Actions or manual deployment to implement policies, initiatives and policy assignments in your environment.

Note that this is a preview solution intended to solicit feedback for further development. Test it in a safe environment before deploying to production, to protect against possible failures and unnecessary cost. Also note that this private repo is shared with select Microsoft customers and partners, so you should never upload or otherwise divulge sensitive information to this repo. If you have any concerns, please contact your Microsoft counterparts for detailed advice.

The repo at present contains code and details for the following:

  • Policies to automatically create alerts, action groups and alert processing rules for different Azure resource types, centered around a recommended Azure Monitor Baseline for Alerting in a customer's newly created or existing brownfield ALZ deployment.
  • Initiatives grouping said policies into appropriate buckets for ease of policy assignment in alignment with ALZ Platform structure (Networking, Identity and Management).

Alerts, action groups and alert processing rules are created as follows:

  1. All metric alerts are created in the resource group where the monitored resource exists. For example, creating an ExpressRoute (ER) circuit in a resource group covered by the policies will create the corresponding alerts in that same resource group.
  2. Activity log alerts are created in a specific resource group (created specifically by and used for this solution) in each subscription, when the subscription is deployed. The resource group name is parameterized, with a default value of AlzMonitoring-rg.
  3. Resource health alerts are created in a specific resource group (created specifically by and used for this solution) in each subscription, when the subscription is deployed. The resource group name is parameterized, with a default value of AlzMonitoring-rg.
  4. Action groups and alert processing rules are created in a specific resource group (created specifically by and used for this solution) in each subscription, when the subscription is deployed. The resource group name is parameterized, with a default value of AlzMonitoring-rg.
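
The placement rules above can be summarized in a short sketch. This is an illustration only, not code from the repo: the function name target_resource_group and the kind labels are hypothetical, and the only value taken from this guide is the default resource group name AlzMonitoring-rg.

```python
# Illustrative sketch only: names and kind labels are hypothetical, not part of the repo.
DEFAULT_MONITORING_RG = "AlzMonitoring-rg"  # default from this guide; parameterized in the policies

# Kinds that this guide says are created in the dedicated per-subscription resource group.
SUBSCRIPTION_SCOPED_KINDS = {
    "activity_log_alert",
    "resource_health_alert",
    "action_group",
    "alert_processing_rule",
}


def target_resource_group(kind, resource_rg=None, monitoring_rg=DEFAULT_MONITORING_RG):
    """Return the resource group in which an item of the given kind is created."""
    if kind == "metric_alert":
        # Metric alerts live next to the monitored resource.
        if resource_rg is None:
            raise ValueError("metric alerts need the monitored resource's resource group")
        return resource_rg
    if kind in SUBSCRIPTION_SCOPED_KINDS:
        return monitoring_rg
    raise ValueError(f"unknown kind: {kind}")
```

For example, target_resource_group("metric_alert", resource_rg="rg-expressroute") returns "rg-expressroute", while target_resource_group("action_group") returns the default "AlzMonitoring-rg".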

Prerequisites

  1. Azure Active Directory Tenant.
  2. ALZ Management group hierarchy deployed as described here.*
  3. At least one subscription to which alerts will be deployed through policies.
  4. Deployment Identity with Owner permission to the pseudo root management group. Owner permission is required to allow the Service Principal Account to create role-based access control assignments.
  5. If deploying manually, i.e. via Azure CLI or PowerShell, ensure that Bicep is installed and working before you attempt installation. See here for how to configure Bicep for Azure CLI and here for PowerShell.
  6. For the policies to work, the following Azure resource providers, normally registered by default, must be registered on all subscriptions in scope:
    • Microsoft.AlertsManagement
    • Microsoft.Insights

Please see here for details on how to register a resource provider should you need to do so.

  7. To use the log alerts for Virtual Machines, ensure that VM Insights is enabled for the Virtual Machines to be monitored. For more details on VM Insights deployment see here. Please note that only the performance collection of the VM Insights solution is required for the current alerts to deploy.

*While it's recommended to implement the alert policies and initiatives in an ALZ Management Group hierarchy, it is not a technical requirement. These policies and initiatives can be implemented in existing brownfield scenarios that don't adhere to the ALZ Management Group hierarchy, for example in hierarchies where there is a single management group, or where the structure does not align to ALZ. At least one management group is required. If you haven't implemented management groups, we have included guidance on how to get started.
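
The resource provider prerequisite above can be checked with a small helper. The sketch below is a hedged example, not part of the repo: it assumes the input is the JSON produced by running az provider list --output json against one subscription, and that each entry carries namespace and registrationState fields. The function name missing_providers is hypothetical.

```python
import json

# Namespaces listed in the prerequisites of this guide.
REQUIRED_PROVIDERS = ("Microsoft.AlertsManagement", "Microsoft.Insights")


def missing_providers(provider_list_json, required=REQUIRED_PROVIDERS):
    """Return the required namespaces that are not in the 'Registered' state.

    Assumption: provider_list_json is the JSON text of `az provider list --output json`
    for a single subscription.
    """
    providers = json.loads(provider_list_json)
    # Compare namespaces case-insensitively, since casing can differ between sources.
    registered = {
        p["namespace"].lower()
        for p in providers
        if p.get("registrationState") == "Registered"
    }
    return [ns for ns in required if ns.lower() not in registered]


if __name__ == "__main__":
    # Hand-written sample data, not real CLI output.
    sample = json.dumps([
        {"namespace": "microsoft.insights", "registrationState": "Registered"},
        {"namespace": "Microsoft.AlertsManagement", "registrationState": "NotRegistered"},
    ])
    print(missing_providers(sample))
```

Run the check once per subscription in scope; an empty list means both providers are registered for that subscription.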

Getting started

  • Fork this repo to your own GitHub organization. Do not create a direct clone of the repo; pull requests based on direct clones will not be allowed.
  • Clone the repo from your own GitHub organization to your developer workstation.
  • Review your current configuration to determine which scenario applies to you. We have guidance that will help you deploy these policies and initiatives whether you are aligned with Azure Landing Zones, use a different management group hierarchy, or do not use management groups at all. If you know your type of management group hierarchy, you can skip forward to your preferred deployment method:

Determining your management group hierarchy

Azure Landing Zones is a concept that provides a set of best practices, patterns, and tools for creating a cloud environment that is secure, Well-Architected, and easy to manage. Management groups are a key component of Azure Landing Zones, as they allow you to organize and manage your subscriptions and resources in a hierarchical structure. By using management groups, you can apply policies and access controls across multiple subscriptions and resources, making it easier to manage and govern your Azure environment.

The initiatives provided in this repository align with the management group hierarchy guidelines of Azure Landing Zones. This results in the following assignment mapping between initiatives and management groups:

  • Identity Initiative is assigned to the Identity management group.
  • Management Initiative is assigned to the Management management group.
  • Connectivity Initiative is assigned to the Connectivity management group.
  • Landing Zone Initiative is assigned to the Landing Zone management group.
  • Service Health Initiative is assigned to the intermediate (ALZ) root management group.
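
Because policy assignments made at a management group are inherited by everything below it, the mapping above determines which initiatives reach a given subscription. The sketch below illustrates this with a hypothetical hierarchy: the group names (including Contoso as the intermediate root) and the function effective_initiatives are assumptions made for the example, not definitions from the repo.

```python
# Hypothetical hierarchy for illustration: child management group -> parent.
PARENT = {
    "Contoso": None,  # intermediate (ALZ) root in this example
    "Platform": "Contoso",
    "Identity": "Platform",
    "Management": "Platform",
    "Connectivity": "Platform",
    "Landing Zones": "Contoso",
}

# Initiative assignments following the mapping in this guide.
ASSIGNMENTS = {
    "Contoso": ["Service Health"],
    "Identity": ["Identity"],
    "Management": ["Management"],
    "Connectivity": ["Connectivity"],
    "Landing Zones": ["Landing Zone"],
}


def effective_initiatives(management_group):
    """Collect initiatives assigned at the group and at each of its ancestors."""
    found = []
    current = management_group
    while current is not None:
        found.extend(ASSIGNMENTS.get(current, []))
        current = PARENT[current]  # walk one level up the hierarchy
    return found
```

In this example, a subscription placed under Connectivity is covered by the Connectivity and Service Health initiatives, while one placed directly under Platform only inherits Service Health.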

The image below is an example of what a management group hierarchy looks like when you follow Azure Landing Zone guidance. It also illustrates the default recommended assignments of the initiatives.

ALZ Management group structure

The diagram below uses orange dashed lines to show the flow of the policy initiatives and their associated policy definitions. Notice how the Service Health Initiative is assigned at the pseudo root of the management group structure, in this case the Contoso management group. This initiative contains the policy that deploys the alert processing rules and action group to each subscription.

The other monitoring initiatives are each assigned at specific platform landing zone management groups and workload landing zones. The flows for these are shown as blue dashed lines.

Azure Monitor Baseline Alerts policy initiative flows

Click here if you'd like to download this Visio diagram.

If you have this management group hierarchy, you can skip forward to your preferred deployment method:

It's important to understand why we assign initiatives to certain management groups. In the previous example, the assignment mapping was done this way because the resources within a subscription below a given management group have a specific purpose. For example, below the Connectivity management group you will find a subscription that contains networking components such as Firewalls, Virtual WAN and Hub Networks. Consequently, this is where we assign the Connectivity initiative to get relevant alerting on those services. It wouldn't make sense to assign the Connectivity initiative to other management groups where no relevant networking services are deployed.

We recognize that Azure allows for flexibility and choice, and you may not be aligned with ALZ. For example, you may have:

  • A management group structure that is not aligned to ALZ, for example one with only a Platform management group and no sub management groups such as Identity/Management/Connectivity.
  • No management group structure.

NOTE: If you are looking to align your Azure environment to Azure landing zone, please see Transition existing Azure environments to the Azure landing zone conceptual architecture.

If Identity/Management/Connectivity are combined in one Platform management group, one approach is to assign the three corresponding initiatives to the Platform management group instead. You may also have a hierarchy organized by geography and/or business unit rather than by specific landing zones. In that case the assignment mapping could be:

  • Identity Initiative is assigned to the Platform management group.
  • Management Initiative is assigned to the Platform management group.
  • Connectivity Initiative is assigned to the Platform management group.
  • Landing Zone Initiative is assigned to the Geography management group.
  • Service Health Initiative is assigned to the top-most level(s) in your management group hierarchy.
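
One way to reason about a non-aligned hierarchy is to give each initiative an ordered list of candidate management groups and pick the first one that exists. The sketch below is only an illustration of that idea: the candidate lists, the group names and the function resolve_targets are hypothetical, and the Service Health initiative is left out because it goes to the top-most level(s) of whatever hierarchy you have.

```python
# Preferred target first, then fallbacks. All names are illustrative assumptions.
PREFERENCES = {
    "Identity": ["Identity", "Platform"],
    "Management": ["Management", "Platform"],
    "Connectivity": ["Connectivity", "Platform"],
    "Landing Zone": ["Landing Zones", "Geography"],
}


def resolve_targets(existing_groups, preferences=PREFERENCES):
    """Map each initiative to the first candidate management group that exists.

    An initiative maps to None when none of its candidates exist, which signals
    that the assignment needs to be decided manually.
    """
    targets = {}
    for initiative, candidates in preferences.items():
        targets[initiative] = next(
            (mg for mg in candidates if mg in existing_groups), None
        )
    return targets
```

With existing groups {"Platform", "Geography"}, the three platform initiatives resolve to Platform and the Landing Zone initiative resolves to Geography, matching the mapping above.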

The image below is an example of what the assignments could look like when the management group hierarchy isn't aligned with ALZ.

Management group structure - unaligned

We recommend that you review the initiative definitions to determine where best to apply the initiatives in your management group hierarchy.

If you have this management group hierarchy, you can skip forward to your preferred deployment method:

If management groups were never configured in your environment, some additional steps are needed. To deploy the policies and initiatives through the guidance and code we provide, you need to create at least one management group; doing so automatically creates the tenant root management group. We strongly recommend following the Azure Landing Zones guidance on management group design.

Please refer to our documentation on how to create management groups.

If you implemented the recommended management group design, you can skip forward to your preferred deployment method, following the ALZ aligned guidance.

If you implemented a single management group, we recommend moving your production subscriptions into that management group. Consult the steps in the documentation for guidance on adding the subscriptions.

To prevent unnecessary alerts, we recommend keeping development, sandbox, and other non-production subscriptions either in a different management group or below the tenant root group.

The image below is an example of what the assignments look like when you are using a single management group.

Management group structure - single

Customizing policy assignments

The guidance above deploys policies, alerts and action groups with default settings. For details on how to customize policy assignments, and initiative assignments in particular, please refer to Customize Policy Assignment.

Customizing the ALZ-Monitor policies

However you choose to consume the policies, we expect and encourage customers and partners to customize their local copies of the policies to suit the needs and requirements of their design.

For example, if you want to include more thresholds, metrics, activity log alerts or similar, outside of what the parameters allow you to change and customize, then by opening the individual policy or initiative definitions you should be able to read, understand and customize the required lines to meet your requirements easily.

This customized policy can then be deployed into your environment to deliver the desired functionality.

Disabling Monitoring

If you wish to disable monitoring for a resource, or for alerts targeted at subscription level such as Activity Log, Service Health, and Resource Health, create a "MonitorDisable" tag with a value of "true" at the scope where you wish to disable monitoring. This effectively filters the resource or subscription out of the compliance check for the policy.
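
The effect of the tag can be mimicked in a few lines when you want to predict which resources or subscriptions will be skipped. This sketch is not the policy logic itself: the function name is_monitoring_disabled is hypothetical, and treating both the tag name and the value "true" case-insensitively is an assumption of this example rather than something this guide states.

```python
def is_monitoring_disabled(tags):
    """Return True if the tags carry MonitorDisable with the value 'true'.

    Assumption: tag name and value are both compared case-insensitively here;
    check the policy definitions for the exact condition they use.
    """
    for name, value in (tags or {}).items():
        if name.lower() == "monitordisable":
            return str(value).lower() == "true"
    return False
```

For example, is_monitoring_disabled({"MonitorDisable": "true"}) is True, while a resource with no tags, or with the tag set to any other value, is still evaluated.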

IMPORTANT: If you believe the changes you have made should be more easily available to be customized by a parameter etc. in the policies, then please raise an issue for a 'Feature Request' on the repository.

If you wish to, also feel free to submit a pull request relating to the issue which we can review and work with you to potentially implement the suggestion/feature request.

Cleaning up an ALZ Monitor Deployment

In some scenarios, it may be necessary to remove everything deployed by the ALZ Monitor solution. To clean up all deployed resources, please follow the instructions in Cleaning up an ALZ Monitor Deployment.

Next steps