Azure Compute  ·  AZ-104 Exam Prep

Azure Virtual Machine Scale Sets (VMSS):
Complete Guide for IT Admins

Auto-scale your workloads intelligently, eliminate manual VM management, and achieve high availability — all from the Azure Portal. Everything you need from concept to deployment.

⏱ 12 min read 📊 Difficulty: Intermediate 🗓 Last updated: June 2025

What Are Azure Virtual Machine Scale Sets?

Azure Virtual Machine Scale Sets (VMSS) is a compute service that lets you deploy and manage a group of load-balanced, auto-scaling virtual machines. Rather than manually provisioning individual VMs and configuring each one, VMSS treats your entire fleet as a single managed resource — automatically adding or removing instances based on demand, health probes, or a schedule you define.

Think of it as Azure’s answer to the question: “What happens when one VM is no longer enough?” Whether you are running a web tier behind Application Gateway, a processing fleet for batch jobs, or a stateless microservice, VMSS gives you elastic capacity without the operational overhead of per-VM management.

ℹ️

AZ-104 Exam Relevance

VMSS is a testable topic under the Deploy and Manage Azure Compute Resources objective domain. Expect scenario questions on scaling policies, orchestration modes, and integration with Azure Load Balancer or Application Gateway.

Core Concepts You Must Know

Before deploying your first scale set, you need to understand the six pillars that define how VMSS works:

🖥️

Instance Model

All VMs share the same OS image, size, and configuration. Changes to the model roll out via the upgrade policy.

📈

Autoscale

Scales in/out based on CPU, memory, custom metrics, or a schedule. Rules are defined in Azure Monitor autoscale settings.

🔄

Upgrade Policy

Defines how model changes roll out: Manual, Automatic, or Rolling — each with different availability impact.

🛡️

Health Monitoring

Application Health extension or Load Balancer probes report instance health. Unhealthy instances can be auto-repaired.

🏗️

Fault Domains

Azure distributes instances across fault and update domains to maximise availability during hardware failures and maintenance.

💰

Spot Instances

VMSS supports Azure Spot VMs for fault-tolerant workloads, cutting compute costs by up to 90% when capacity is available.

Orchestration Modes: Uniform vs. Flexible

This is one of the most important distinctions to understand — both for real-world deployments and the AZ-104 exam. Azure VMSS supports two orchestration modes, and choosing the wrong one can severely limit your options.

Feature | Uniform Orchestration | Flexible Orchestration
VM Management | Managed by VMSS profile only | Full VM API access per instance
VM Size | Single SKU across all instances | Multiple VM SKUs supported
Max Instances | Up to 1,000 (platform image) | Up to 1,000 per scale set
Autoscale | ✓ Supported | ✓ Supported
Availability Zones | Supported | Supported (better zone balance)
Use Case | Stateless web tiers, identical VMs | Mixed workloads, phased migration
Recommended For | Classic workloads | ✓ New deployments
💡

Best Practice

Microsoft recommends Flexible orchestration for all new VMSS deployments. It gives you per-instance control while still supporting the full autoscale and load balancing feature set.

VMSS Architecture Overview

A production VMSS deployment involves several Azure services working together. Here is how the layers stack from top to bottom:

VMSS Reference Architecture

Traffic Layer: Azure Load Balancer (Standard)  ·  Application Gateway / WAF  ·  Traffic Manager
Scale Set: VMSS (Flexible)  ·  VM Instance 001  ·  VM Instance 002  ·  VM Instance 00N…
Autoscale: Azure Monitor Autoscale  ·  CPU / Memory Metric  ·  Custom Metric  ·  Schedule Rule
Availability: Availability Zones 1 / 2 / 3  ·  Fault Domain Spread  ·  Auto-Repair Policy
Storage: Managed OS Disks  ·  Azure Shared Disk  ·  Azure Files Mount

Autoscaling Deep Dive

Autoscale is the feature that makes VMSS genuinely powerful. Azure Monitor Autoscale evaluates your rules on a repeating cycle (typically every 1 minute) and scales in or out accordingly.

Scale-Out Rule — Adding Instances

A typical scale-out rule triggers when the average CPU across all instances exceeds 75% for 10 minutes. Azure then adds the configured number of instances and enters a cooldown period to prevent rapid oscillation.

Scale-In Rule — Removing Instances

The matching scale-in rule removes instances when the average CPU drops below 30% for 10 minutes. Always pair every scale-out rule with a matching scale-in rule. Without the scale-in rule, your fleet only ever grows and your costs climb with it.
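To make the paired rules concrete, here is a minimal Python sketch of one evaluation pass. It is a simplified model, not Azure Monitor's actual algorithm: the function name `autoscale_decision` and the one-sample-per-minute input are assumptions for illustration, while the 75% and 30% thresholds come from the rules described above.

```python
from statistics import mean


def autoscale_decision(cpu_samples, scale_out_above=75.0, scale_in_below=30.0):
    """Return 'scale_out', 'scale_in', or 'none' for one evaluation window.

    cpu_samples: fleet-wide average CPU percentages, one per minute,
    covering the rule window (10 samples for a 10-minute window).
    This is a hypothetical helper, not an Azure API.
    """
    if not cpu_samples:
        return "none"  # no metric data in the window: take no action
    window_avg = mean(cpu_samples)
    if window_avg > scale_out_above:
        return "scale_out"
    if window_avg < scale_in_below:
        return "scale_in"
    return "none"  # between the thresholds: hold the current count


busy = [80, 82, 79, 85, 90, 88, 81, 77, 83, 86]   # window average 83.1
quiet = [20, 25, 22, 18, 27, 24, 21, 19, 23, 26]  # window average 22.5
print(autoscale_decision(busy))
print(autoscale_decision(quiet))
print(autoscale_decision([45] * 10))
```

The gap between the two thresholds is deliberate: a load that hovers around a single threshold would otherwise trigger alternating scale-out and scale-in actions.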

⚠️

Common Gotcha

The cooldown period (default 5 minutes) applies after every scale action. If your workload spikes unpredictably, consider lowering it — but too short a cooldown risks thrashing. A minimum of 3 minutes is recommended for most web tiers.
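The cooldown check itself is simple time arithmetic. The sketch below assumes a hypothetical helper named `in_cooldown` and uses the 5-minute default mentioned above; it only illustrates the idea and does not call any Azure service.

```python
from datetime import datetime, timedelta


def in_cooldown(last_scale_action, now, cooldown_minutes=5):
    """Return True while a new scale action should still be suppressed.

    last_scale_action: datetime of the previous scale action, or None
    if the scale set has not scaled yet. Hypothetical helper.
    """
    if last_scale_action is None:
        return False
    return now - last_scale_action < timedelta(minutes=cooldown_minutes)


scaled_at = datetime(2025, 6, 1, 12, 0)
print(in_cooldown(scaled_at, scaled_at + timedelta(minutes=3)))  # still cooling down
print(in_cooldown(scaled_at, scaled_at + timedelta(minutes=6)))  # free to scale again
```

A shorter `cooldown_minutes` lets the fleet react faster to spikes, at the cost of more frequent scale actions.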

Instance Limits

  • Minimum instances: The floor — VMSS never scales below this count, even at zero load.
  • Maximum instances: The ceiling — caps spending and prevents runaway scale events.
  • Default instance count: Used when metric data is unavailable, such as during initial deployment.
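The three limits above can be read as a clamp around whatever the rules request. The following sketch assumes a hypothetical function `target_instance_count` and example limits of 2 and 8; it simply falls back to the default count when metrics are missing, as the list describes, and leaves out the finer details of how Azure handles that case.

```python
def target_instance_count(current, change, minimum, maximum, default,
                          metrics_available=True):
    """Apply a requested change, then clamp to the configured limits.

    change: +N for scale-out, -N for scale-in, 0 for no action.
    Hypothetical helper for illustration only.
    """
    if not metrics_available:
        return default  # no metric data: use the default instance count
    return max(minimum, min(maximum, current + change))


print(target_instance_count(2, -1, minimum=2, maximum=8, default=2))  # floor holds at 2
print(target_instance_count(8, +2, minimum=2, maximum=8, default=2))  # ceiling holds at 8
print(target_instance_count(4, +1, minimum=2, maximum=8, default=2))  # 5
```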

Upgrade Policies Explained

When you update the VMSS model — for example, changing the OS image version, adding an extension, or modifying the VM size — the upgrade policy controls how that change rolls out to existing instances.

Policy | How It Works | Availability Impact | Best For
Manual | Existing instances keep the old model until you manually trigger an upgrade. | None | Dev / test environments
Automatic | Azure upgrades all instances immediately when the model changes. | High risk | Non-production workloads
Rolling | Upgrades instances in configurable batches with health checks between each batch. | Controlled | Production deployments
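The Rolling policy can be pictured as a loop over batches with a health gate after each one. This is a minimal sketch under stated assumptions: `rolling_upgrade` and the `is_healthy` callback are hypothetical names, and appending to a list stands in for actually applying the new model to an instance.

```python
def rolling_upgrade(instances, batch_size, is_healthy):
    """Upgrade instances batch by batch; stop if a batch fails its health check.

    Returns (upgraded, remaining). Hypothetical helper, not an Azure API.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    upgraded = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        upgraded.extend(batch)  # stand-in for applying the new model
        if not all(is_healthy(vm) for vm in batch):
            # Halt the rollout: untouched instances keep the old model.
            return upgraded, instances[start + batch_size:]
    return upgraded, []


fleet = [f"vm-{i}" for i in range(6)]
done, pending = rolling_upgrade(fleet, 2, lambda vm: vm != "vm-3")
print(done)     # the first two batches were attempted
print(pending)  # the last batch was never touched
```

Because the rollout halts at the first unhealthy batch, a bad image affects at most one batch of instances rather than the whole fleet, which is the reason this policy suits production.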

VMSS vs. Availability Sets vs. Availability Zones

Understanding where VMSS fits relative to other high-availability constructs is a frequently tested AZ-104 concept:

  • Availability Sets — Protect against rack-level hardware failures within a single datacenter. Support up to 3 fault domains and 20 update domains. No auto-scaling capability.
  • Availability Zones — Protect against full datacenter failure by distributing VMs across physically separate zones within a region. Deliver the highest SLA at 99.99%.
  • VMSS with Zones — Combines elastic auto-scaling with zone distribution. This is the recommended pattern for highly available, elastic production workloads on Azure.
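An SLA percentage is easier to compare once it is converted into allowed downtime. The sketch below is plain arithmetic with a hypothetical function name, assuming a 30-day month.

```python
def max_downtime_minutes(sla_percent, days=30):
    """Downtime budget implied by an SLA over a period of `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# 99.99% over a 30-day month leaves a budget of roughly 4.3 minutes.
print(round(max_downtime_minutes(99.99), 2))
```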
🎯

AZ-104 Exam Tip

If an exam scenario asks how to build a web tier that handles traffic spikes and survives a datacenter outage, the answer is VMSS deployed across Availability Zones, combined with a Standard Load Balancer. Neither Availability Sets alone nor a single-zone VMSS satisfies both requirements simultaneously.
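To see why zone distribution matters for the outage half of that scenario, the sketch below spreads instances round-robin across three zones and counts what survives when one zone fails. The function names and the round-robin placement are assumptions for illustration; Azure's actual zone balancing is more involved.

```python
def spread_across_zones(instance_count, zones=(1, 2, 3)):
    """Place instances round-robin and return {zone: instance count}."""
    placement = {zone: 0 for zone in zones}
    for i in range(instance_count):
        placement[zones[i % len(zones)]] += 1
    return placement


def surviving_capacity(placement, failed_zone):
    """Instances still running after one zone goes offline."""
    return sum(count for zone, count in placement.items() if zone != failed_zone)


placement = spread_across_zones(8)
print(placement)                         # {1: 3, 2: 3, 3: 2}
print(surviving_capacity(placement, 1))  # 5
```

With a single-zone deployment the same failure would leave zero instances, which is why autoscale alone does not satisfy the outage requirement.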

Integration with Azure Load Balancer

VMSS instances are automatically registered with an Azure Load Balancer backend pool when you configure one during deployment. Key points to know:

  • Always use the Standard SKU load balancer. The Basic SKU does not support Availability Zones and has been retired for new deployments.
  • Health probes (HTTP, HTTPS, or TCP) determine which instances receive traffic. Failing instances are removed from rotation — not deleted.
  • When VMSS scales out, new instances are automatically registered with the backend pool. No manual NIC association is required.
  • Inbound NAT rules in the load balancer can map unique external ports to RDP or SSH ports on individual scale set instances for direct management access.
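The inbound NAT idea in the last bullet amounts to a lookup table from unique frontend ports to the same management port on each instance. The sketch below only models that mapping; the function name, the starting port 50000, and the instance names are illustrative assumptions, not values taken from Azure.

```python
def nat_port_map(instance_names, frontend_port_start=50000, backend_port=22):
    """Map one unique frontend port per instance to its SSH (or RDP) port.

    Returns {frontend_port: (instance_name, backend_port)}.
    """
    return {
        frontend_port_start + offset: (name, backend_port)
        for offset, name in enumerate(instance_names)
    }


for port, target in nat_port_map(["vm-000", "vm-001", "vm-002"]).items():
    print(port, "->", target)
```

An administrator would then connect to the load balancer's public IP on, say, port 50001 to reach SSH on the second instance. Pass `backend_port=3389` to model RDP instead.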

Hands-On Tutorial

Deploy a VMSS from the Azure Portal (Step-by-Step)

📋
Prerequisites: An active Azure subscription with Contributor access  ·  An existing Resource Group (e.g., AZ104-Compute-RG)  ·  Basic familiarity with the Azure Portal.
1. Navigate to Virtual Machine Scale Sets

Sign in to the Azure Portal at portal.azure.com. In the top search bar, type Virtual Machine Scale Sets and select the service from the search results.

Azure Portal  ›  Search Bar  ›  “Virtual Machine Scale Sets”  ›  + Create

Click + Create to begin the deployment wizard.

2. Configure the Basics Tab

Fill in the required fields on the Basics tab:

  • Subscription: Select your Azure subscription.
  • Resource Group: Choose AZ104-Compute-RG or your existing RG.
  • Virtual machine scale set name: e.g., vmss-webfront-001
  • Region: e.g., East US or a region close to your users.
  • Availability zone: Select Zones 1, 2, 3 for maximum resilience.
  • Orchestration mode: Select Flexible — recommended for all new deployments.
  • Security type: Trusted launch virtual machines is the default and recommended.
3. Choose Image and VM Size

4. Set Administrator Account

5. Configure Scaling (Autoscale)

6. Configure Networking and Load Balancer

7. Set the Upgrade Policy

8. Review + Create

9. Verify Autoscale in Azure Monitor

Key Takeaways

VMSS is Azure’s managed fleet service — identical instances, one control plane, automatic load distribution.

Flexible orchestration is the modern default — choose it for all new deployments unless you have a specific legacy reason to use Uniform.

Autoscale requires paired rules — a scale-out rule without a matching scale-in rule means your fleet only ever grows, and so does your Azure bill.

Rolling upgrade policy is production-safe — batch updates with health checks prevent full outages during model updates.

Combine VMSS with Availability Zones for the highest SLA (99.99%) and zone-resilient auto-scaling.

Standard Load Balancer is mandatory — the Basic SKU is retired for new deployments and does not support Availability Zones.

For the AZ-104 exam — know the difference between Uniform and Flexible orchestration, the three upgrade policies, and when to use VMSS over Availability Sets.

🧠 Knowledge Check

AZ-104 VMSS Quick Quiz

Test your understanding before moving on. Select an answer, then click Check Answers.

1. Which orchestration mode does Microsoft recommend for all new VMSS deployments?

A. Uniform Orchestration
B. Flexible Orchestration
C. Stateful Orchestration
D. Classic Orchestration
Answer: B. Flexible Orchestration is the recommended mode for all new VMSS deployments. It provides full VM API access per instance while still supporting autoscale.

2. What SLA does VMSS achieve when deployed across all three Availability Zones?

A. 99.5%
B. 99.9%
C. 99.99%
D. 100%
Answer: C. VMSS deployed across Availability Zones achieves a 99.99% SLA, the highest available for Azure compute.

3. Which upgrade policy should you choose for production VMSS deployments?

A. Automatic — upgrades all instances immediately
B. Rolling — upgrades in batches with health checks
C. Manual — requires an admin to trigger upgrades
D. Scheduled — upgrades only during maintenance windows
Answer: B. Rolling upgrade policy is the production-safe choice. It upgrades instances in configurable batches and performs health checks between each batch, preventing full outages.

4. What happens if you configure a scale-out rule but forget to add a matching scale-in rule?

A. Your fleet only ever grows — and so does your Azure bill
B. Azure automatically creates a default scale-in rule
C. The scale-out rule is disabled until a scale-in rule is added
D. VMSS uses the minimum instance count as a scale-in trigger
Answer: A. Without a scale-in rule, VMSS will scale out when load increases but never scale back in. Your instance count only grows, leading to runaway costs.

Frequently Asked Questions

Can VMSS instances use different VM sizes?

Yes — but only with Flexible orchestration. Uniform orchestration requires all instances to share the same VM SKU. Flexible mode lets you specify different sizes per instance, which is useful for mixed workloads or gradual hardware migrations.

Does VMSS support stateful workloads?

VMSS is designed primarily for stateless workloads. For stateful applications, use Azure Service Fabric or persist state externally via Azure SQL Database, Cosmos DB, or Azure Files. For containerised stateful apps, consider Kubernetes StatefulSets with persistent volumes on AKS.

What happens to an instance during a scale-in event?

By default, Azure first tries to keep the scale set balanced across availability zones and fault domains, then selects the instance with the highest instance ID for deletion. You can customise this using the Scale-in policy setting, choosing Default, OldestVM, or NewestVM. When the terminate notification feature is enabled, instances receive a notification before deletion so they can shut down gracefully.
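The three scale-in policy names can be sketched as a selection function. This is a simplified model: the `Instance` dataclass and `pick_for_scale_in` are hypothetical, and the sketch ignores any zone or fault-domain balancing that Azure applies before picking a candidate.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Instance:
    instance_id: int
    created: datetime


def pick_for_scale_in(instances, policy="Default"):
    """Choose which instance a scale-in event would remove (simplified)."""
    if policy == "Default":
        return max(instances, key=lambda vm: vm.instance_id)  # highest ID
    if policy == "OldestVM":
        return min(instances, key=lambda vm: vm.created)
    if policy == "NewestVM":
        return max(instances, key=lambda vm: vm.created)
    raise ValueError(f"unknown scale-in policy: {policy}")


fleet = [
    Instance(0, datetime(2025, 6, 1)),
    Instance(5, datetime(2025, 5, 1)),   # highest ID and also the oldest
    Instance(3, datetime(2025, 6, 10)),  # the newest
]
print(pick_for_scale_in(fleet).instance_id)
print(pick_for_scale_in(fleet, "OldestVM").instance_id)
print(pick_for_scale_in(fleet, "NewestVM").instance_id)
```

OldestVM is a common choice when you want scale-in events to gradually retire instances that are still running an older model.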

How is Azure VMSS priced?

You pay for the individual VM instances within the scale set — VMSS itself does not carry an additional service charge. Each instance is billed at the standard VM compute rate for its size and OS. Using Azure Spot instances within VMSS can reduce compute costs by up to 90% for fault-tolerant, interruptible workloads.
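As a rough illustration of that billing model, the sketch below multiplies a per-instance hourly rate by instance count and hours. The $0.10 hourly rate, the 730-hour month, and the function name are made-up assumptions; real prices vary by size, region, and OS, and the 90% figure is the upper bound quoted above, not a guaranteed Spot discount.

```python
def monthly_compute_cost(hourly_rate, instance_count, hours=730, spot_discount=0.0):
    """Estimate monthly compute cost for a scale set.

    spot_discount: fraction saved versus the pay-as-you-go rate (0.0 to 1.0).
    There is no separate charge for the scale set itself.
    """
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0.0 and 1.0")
    return hourly_rate * instance_count * hours * (1 - spot_discount)


regular = monthly_compute_cost(0.10, 4)                  # hypothetical rate
spot = monthly_compute_cost(0.10, 4, spot_discount=0.9)  # best-case discount
print(round(regular, 2), round(spot, 2))
```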

Can I use VMSS with Azure Kubernetes Service (AKS)?

Yes. AKS node pools are backed by VMSS. When you enable the Cluster Autoscaler in AKS, it scales the underlying VMSS node pools based on pod scheduling pressure. Understanding VMSS VM SKUs and Availability Zone coverage is therefore important when designing production AKS node pools.

