Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Azure Compute · AZ-104 Exam Prep
Auto-scale your workloads intelligently, eliminate manual VM management, and achieve high availability — all from the Azure Portal. Everything you need from concept to deployment.
Azure Virtual Machine Scale Sets (VMSS) is a compute service that lets you deploy and manage a group of load-balanced, auto-scaling virtual machines. Rather than manually provisioning individual VMs and configuring each one, VMSS treats your entire fleet as a single managed resource — automatically adding or removing instances based on demand, health probes, or a schedule you define.
Think of it as Azure’s answer to the question: “What happens when one VM is no longer enough?” Whether you are running a web tier behind Application Gateway, a processing fleet for batch jobs, or a stateless microservice, VMSS gives you elastic capacity without the operational overhead of per-VM management.
AZ-104 Exam Relevance
VMSS is a testable topic under the Deploy and Manage Azure Compute Resources objective domain. Expect scenario questions on scaling policies, orchestration modes, and integration with Azure Load Balancer or Application Gateway.
Before deploying your first scale set, you need to understand the six pillars that define how VMSS works:
Instance Model
All VMs share the same OS image, size, and configuration. Changes to the model roll out via the upgrade policy.
Autoscale
Scales in/out based on CPU, memory, custom metrics, or a schedule. Rules are defined in Azure Monitor autoscale settings.
Upgrade Policy
Defines how model changes roll out: Manual, Automatic, or Rolling — each with different availability impact.
Health Monitoring
Application Health extension or Load Balancer probes report instance health. Unhealthy instances can be auto-repaired.
Fault Domains
Azure distributes instances across fault and update domains to maximise availability during hardware failures and maintenance.
Spot Instances
VMSS supports Azure Spot VMs for fault-tolerant workloads, cutting compute costs by up to 90% when capacity is available.
This is one of the most important distinctions to understand — both for real-world deployments and the AZ-104 exam. Azure VMSS supports two orchestration modes, and choosing the wrong one can severely limit your options.
| Feature | Uniform Orchestration | Flexible Orchestration |
|---|---|---|
| VM Management | Managed by VMSS profile only | Full VM API access per instance |
| VM Size | Single SKU across all instances | Multiple VM SKUs supported |
| Max Instances | Up to 1,000 (platform image) | Up to 1,000 per scale set |
| Autoscale | ✓ Supported | ✓ Supported |
| Availability Zones | Supported | Supported (better zone balance) |
| Use Case | Stateless web tiers, identical VMs | Mixed workloads, phased migration |
| Recommended For | Classic workloads | ✓ New deployments |
Best Practice
Microsoft recommends Flexible orchestration for all new VMSS deployments. It gives you per-instance control while still supporting the full autoscale and load balancing feature set.
A production VMSS deployment involves several Azure services working together. Here is how the layers stack from top to bottom:
[Figure: VMSS Reference Architecture]
Autoscale is the feature that makes VMSS genuinely powerful. Azure Monitor Autoscale evaluates your rules on a repeating cycle (typically every 1 minute) and scales in or out accordingly.
A typical scale-out rule triggers when the average CPU across all instances exceeds 75% for 10 minutes. Azure then adds the configured number of instances and enters a cooldown period to prevent rapid oscillation.
The matching scale-in rule removes instances when the average CPU drops below 30% for 10 minutes. Always pair every scale-out rule with a scale-in rule; without one, your fleet only ever grows and your costs climb with it.
Common Gotcha
The cooldown period (default 5 minutes) applies after every scale action. If your workload spikes unpredictably, consider lowering it — but too short a cooldown risks thrashing. A minimum of 3 minutes is recommended for most web tiers.
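The rule pair and cooldown described above can be sketched as a small decision function. This is a minimal illustration, not Azure Monitor's actual algorithm: the `AutoscaleSettings` class, the `decide` function, the instance limits, and the one-sample-per-minute assumption are all hypothetical, while the 75%/30% thresholds, the 10-minute window, and the 5-minute cooldown come from the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AutoscaleSettings:
    """Hypothetical container for the thresholds used in the text."""
    scale_out_cpu: float = 75.0   # scale out when average CPU exceeds this
    scale_in_cpu: float = 30.0    # scale in when average CPU drops below this
    window_minutes: int = 10      # how long the condition must hold
    cooldown_minutes: int = 5     # wait after any scale action
    min_instances: int = 2        # assumed lower bound
    max_instances: int = 10       # assumed upper bound
    step: int = 1                 # instances added or removed per action


def decide(cpu_samples, current, minutes_since_last_action, settings=None):
    """Return the new instance count, given one CPU sample per minute."""
    s = settings or AutoscaleSettings()
    if minutes_since_last_action < s.cooldown_minutes:
        return current  # still cooling down, so take no action
    window = cpu_samples[-s.window_minutes:]
    if len(window) < s.window_minutes:
        return current  # not enough data to evaluate the rule yet
    avg = mean(window)
    if avg > s.scale_out_cpu:
        return min(current + s.step, s.max_instances)
    if avg < s.scale_in_cpu:
        return max(current - s.step, s.min_instances)
    return current


print(decide([80] * 10, current=3, minutes_since_last_action=10))  # 4
```

Calling `decide` with the same samples but `minutes_since_last_action=2` returns the unchanged count, which is the cooldown preventing a second action too soon after the first.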
When you update the VMSS model — for example, changing the OS image version, adding an extension, or modifying the VM size — the upgrade policy controls how that change rolls out to existing instances.
| Policy | How It Works | Availability Impact | Best For |
|---|---|---|---|
| Manual | Existing instances keep the old model until you manually trigger an upgrade. | None | Dev / test environments |
| Automatic | Azure upgrades all instances immediately when the model changes. | High risk | Non-production workloads |
| Rolling | Upgrades instances in configurable batches with health checks between each batch. | Controlled | Production deployments |
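To make the Rolling policy concrete, here is a rough sketch of batch-by-batch upgrading with a health check between batches. It is an assumption-laden model rather than Azure's implementation: `rolling_upgrade`, `batch_percent`, and the `upgrade` and `is_healthy` callbacks are hypothetical names, and the sketch simply halts when a batch reports an unhealthy instance.

```python
import math


def rolling_upgrade(instances, batch_percent, upgrade, is_healthy):
    """Upgrade instances in batches, stopping if a batch fails its health check.

    Returns the instances whose batch was upgraded and passed the check.
    """
    batch_size = max(1, math.ceil(len(instances) * batch_percent / 100))
    upgraded = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for vm in batch:
            upgrade(vm)
        if not all(is_healthy(vm) for vm in batch):
            break  # halt the rollout; remaining instances keep the old model
        upgraded.extend(batch)
    return upgraded


# Hypothetical fleet of five instances where "vm-3" fails after upgrading.
vms = [f"vm-{i}" for i in range(5)]
done = rolling_upgrade(
    vms,
    batch_percent=40,
    upgrade=lambda vm: None,
    is_healthy=lambda vm: vm != "vm-3",
)
print(done)  # ['vm-0', 'vm-1']
```

Because the second batch contains the failing instance, the rollout stops there and the last instance is never touched. That containment is the "Controlled" availability impact listed in the table.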
Understanding where VMSS fits relative to other high-availability constructs is a frequently tested AZ-104 concept:
AZ-104 Exam Tip
If an exam scenario asks how to build a web tier that handles traffic spikes and survives a datacenter outage, the answer is VMSS deployed across Availability Zones, combined with a Standard Load Balancer. Neither Availability Sets alone nor a single-zone VMSS satisfies both requirements simultaneously.
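The reasoning behind this tip can be checked with a little arithmetic. The sketch below assumes a simple round-robin placement across zones, whereas real placement is handled by the platform. The function names and the zone labels "1", "2", and "3" are hypothetical.

```python
from itertools import cycle


def spread_across_zones(instance_count, zones=("1", "2", "3")):
    """Assign instances to zones round-robin and return a per-zone count."""
    assignment = {zone: 0 for zone in zones}
    for _, zone in zip(range(instance_count), cycle(zones)):
        assignment[zone] += 1
    return assignment


def capacity_after_zone_loss(per_zone):
    """Worst-case number of instances still running if one zone fails."""
    return sum(per_zone.values()) - max(per_zone.values())


zonal = spread_across_zones(7)
print(zonal)                            # {'1': 3, '2': 2, '3': 2}
print(capacity_after_zone_loss(zonal))  # 4

single = spread_across_zones(7, zones=("1",))
print(capacity_after_zone_loss(single))  # 0
```

With three zones, losing the busiest zone still leaves four of seven instances serving traffic. A single-zone scale set drops to zero, which is why it cannot satisfy the outage requirement.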
VMSS instances are automatically registered with an Azure Load Balancer backend pool when you configure one during deployment. Key points to know:
AZ104-Compute-RG) · Basic familiarity with the Azure Portal.
Navigate to Virtual Machine Scale Sets
Sign in to the Azure Portal at portal.azure.com. In the top search bar, type Virtual Machine Scale Sets and select the service from the search results.
Click + Create to begin the deployment wizard.
Configure the Basics Tab
Fill in the required fields on the Basics tab:
Resource group: AZ104-Compute-RG or your existing RG.
Virtual machine scale set name: vmss-webfront-001
Choose Image and VM Size
Set Administrator Account
Configure Scaling (Autoscale)
Configure Networking and Load Balancer
Set the Upgrade Policy
Review + Create
Verify Autoscale in Azure Monitor
VMSS is Azure’s managed fleet service: identical instances, one control plane, automatic load distribution.
Flexible orchestration is the modern default — choose it for all new deployments unless you have a specific legacy reason to use Uniform.
Autoscale requires paired rules — a scale-out rule without a matching scale-in rule means your fleet only ever grows, and so does your Azure bill.
Rolling upgrade policy is production-safe — batch updates with health checks prevent full outages during model updates.
Combine VMSS with Availability Zones for the highest SLA (99.99%) and zone-resilient auto-scaling.
Standard Load Balancer is mandatory — the Basic SKU is retired for new deployments and does not support Availability Zones.
For the AZ-104 exam — know the difference between Uniform and Flexible orchestration, the three upgrade policies, and when to use VMSS over Availability Sets.
Test your understanding before moving on.
1. Which orchestration mode does Microsoft recommend for all new VMSS deployments?
2. What SLA does VMSS achieve when deployed across all three Availability Zones?
3. Which upgrade policy should you choose for production VMSS deployments?
4. What happens if you configure a scale-out rule but forget to add a matching scale-in rule?
Can VMSS instances use different VM sizes?
Yes, but only with Flexible orchestration. Uniform orchestration requires all instances to share the same VM SKU. Flexible mode lets you specify different sizes per instance, which is useful for mixed workloads or gradual hardware migrations.
Does VMSS support stateful workloads?
VMSS is designed primarily for stateless workloads. For stateful applications, use Azure Service Fabric or persist state externally via Azure SQL Database, Cosmos DB, or Azure Files. For containerised stateful apps, consider Kubernetes StatefulSets with persistent volumes on AKS.
What happens to an instance during a scale-in event?
By default, Azure balances instances across availability zones and fault domains, then selects the instance with the highest instance ID for deletion. You can customise this with the Scale-in policy setting: Default, OldestVM, or NewestVM. When the terminate notification feature is enabled, instances receive a notification before deletion so they can shut down gracefully.
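The three scale-in policies can be illustrated with a short selection function. This is a simplified sketch: it ignores any zone or fault domain balancing the platform may apply, and the `Instance` class, the `pick_for_scale_in` function, and the sample dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Instance:
    instance_id: int
    created: datetime


def pick_for_scale_in(instances, policy="Default"):
    """Return the instance a given scale-in policy would remove first."""
    if policy == "Default":
        return max(instances, key=lambda vm: vm.instance_id)  # highest ID
    if policy == "OldestVM":
        return min(instances, key=lambda vm: vm.created)
    if policy == "NewestVM":
        return max(instances, key=lambda vm: vm.created)
    raise ValueError(f"unknown scale-in policy: {policy}")


# Hypothetical fleet where ID order and creation order differ.
fleet = [
    Instance(7, datetime(2024, 1, 2)),
    Instance(3, datetime(2024, 1, 1)),
    Instance(5, datetime(2024, 1, 3)),
]
print(pick_for_scale_in(fleet).instance_id)              # 7
print(pick_for_scale_in(fleet, "OldestVM").instance_id)  # 3
print(pick_for_scale_in(fleet, "NewestVM").instance_id)  # 5
```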
How is Azure VMSS priced?
You pay for the individual VM instances within the scale set; VMSS itself does not carry an additional service charge. Each instance is billed at the standard VM compute rate for its size and OS. Using Azure Spot instances within VMSS can reduce compute costs by up to 90% for fault-tolerant, interruptible workloads.
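As a back-of-the-envelope illustration of per-instance billing, the sketch below multiplies instance count, hourly rate, and hours. The `monthly_cost` function, the rate of 0.10 per hour, and the 730-hour month are placeholder assumptions, not Azure prices. The 90% discount is the upper bound quoted above, and actual Spot savings vary.

```python
def monthly_cost(instance_count, hourly_rate, hours=730, spot_discount=0.0):
    """Estimate monthly compute cost for a scale set.

    hourly_rate and spot_discount are placeholder inputs, not Azure prices.
    """
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    return instance_count * hourly_rate * hours * (1.0 - spot_discount)


# Four instances at a hypothetical rate, regular versus best-case Spot pricing.
regular = monthly_cost(4, hourly_rate=0.10)
spot = monthly_cost(4, hourly_rate=0.10, spot_discount=0.90)
print(f"regular: {regular:.2f}, spot: {spot:.2f}")
```

The scale set itself adds no term to the formula, so the cost depends only on how many instances are running and for how long.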
Can I use VMSS with Azure Kubernetes Service (AKS)?
Yes. AKS node pools are backed by VMSS. When you enable the Cluster Autoscaler in AKS, it scales the underlying VMSS node pools based on pod scheduling pressure. Understanding VMSS VM SKUs and Availability Zone coverage is therefore important when designing production AKS node pools.