Understanding vCenter Server Default Utilization Alarms

Managing a VMware vSphere environment requires constant vigilance over resource consumption and system health. vCenter Server ships with pre-configured alarms designed to alert administrators when resources reach critical thresholds. Understanding these default utilization alarms and knowing how to respond effectively can mean the difference between proactive management and reactive firefighting.

What Are vCenter Server Default Utilization Alarms?

vCenter Server includes dozens of pre-configured alarms that monitor various aspects of your virtual infrastructure. Utilization alarms specifically track resource consumption across your environment, including CPU, memory, storage, and network usage. These alarms trigger when resources exceed predefined thresholds, giving administrators early warning of potential performance issues or capacity constraints.

The default alarms are configured with industry-standard thresholds, but they can be customized to match your organization’s specific requirements and operational standards.

Common Default Utilization Alarms

Host CPU Usage: This alarm monitors the percentage of CPU resources consumed on ESXi hosts. By default, it triggers a warning when CPU usage exceeds 75% for a sustained period and enters a critical state at 90%. High CPU utilization can lead to performance degradation, increased latency, and poor user experience.

Host Memory Usage: Memory utilization alarms track how much RAM is being consumed on your hosts. The default warning threshold is typically set at 75%, with critical alerts at 90%. ESXi can overcommit memory, but once a host runs short it falls back on ballooning, compression, and swapping, all of which can hurt VM performance, making this alarm particularly important.

Datastore Usage on Disk: Storage capacity alarms monitor the percentage of space consumed on your datastores. Default thresholds usually warn at 75% capacity and alert critically at 85%. Running out of datastore space can cause virtual machines to pause or fail, making this one of the most critical alarms to address.

Virtual Machine CPU Usage: This alarm tracks individual VM CPU consumption. When a VM consistently maxes out its allocated CPU resources, it may indicate the need for additional vCPUs or potential application issues.

Virtual Machine Memory Usage: Similar to host memory monitoring, this alarm tracks memory consumption at the VM level. Sustained high memory usage may indicate memory pressure and the need for additional resources.
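
To make the warning/alert split concrete, here is a minimal shell sketch that maps a utilization percentage to the state it would fall into. The function name classify_usage is hypothetical, the thresholds are the host CPU values quoted above, and the sketch deliberately ignores the "sustained period" part of the trigger.

```shell
#!/bin/sh
# classify_usage USAGE WARN ALERT
# Hypothetical helper (not part of vSphere): maps an integer utilization
# percentage to the alarm state implied by "is above" thresholds.
classify_usage() {
    usage=$1; warn=$2; alert=$3
    if [ "$usage" -gt "$alert" ]; then
        echo "alert"
    elif [ "$usage" -gt "$warn" ]; then
        echo "warning"
    else
        echo "normal"
    fi
}

# Host CPU thresholds quoted above: warning above 75%, alert above 90%
classify_usage 60 75 90
classify_usage 82 75 90
classify_usage 95 75 90
```

With these inputs the three calls print normal, warning, and alert in turn. A value exactly on a threshold stays in the lower state because the comparison is strictly "above".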

Possible Actions for Utilization Alarms

When utilization alarms trigger, administrators have several remediation options depending on the specific resource being consumed.

CPU Utilization Actions

For host CPU alarms, you can migrate VMs to less utilized hosts using vMotion, which provides live migration with zero downtime. If your cluster is consistently running hot, adding additional hosts distributes the workload more evenly. Adjusting DRS (Distributed Resource Scheduler) settings to be more aggressive can help automatically balance loads across your cluster. In some cases, the issue lies within a specific VM consuming excessive CPU cycles. Investigating the application or processes running inside the VM may reveal inefficiencies or runaway processes.

At the VM level, you might increase the number of vCPUs allocated to the virtual machine, though this should be done judiciously to avoid unnecessary CPU scheduler overhead. Setting CPU reservations or limits can help control resource consumption for specific workloads.

Memory Utilization Actions

For host memory pressure, you can enable or adjust memory compression and ballooning settings, which are VMware’s mechanisms for managing memory overcommitment. Migrating VMs to hosts with available memory capacity provides immediate relief. Adding physical memory to hosts is a hardware solution that permanently increases capacity. Memory shares and reservations allow you to prioritize critical workloads during contention.

At the VM level, increasing allocated memory is the most straightforward solution when a VM genuinely needs more RAM. However, investigating memory leaks or inefficient application behavior should be your first step, as adding memory to a poorly behaving application only delays the inevitable.

Storage Utilization Actions

When datastore capacity alarms trigger, several approaches can resolve the issue. Storage vMotion allows you to migrate VMs to datastores with more available space without downtime. Deleting old snapshots is often the quickest win, as snapshots can consume substantial space over time. Removing unnecessary VM files, including old templates, ISOs, and abandoned VMs, can free significant capacity.

Converting thick-provisioned disks to thin-provisioned ones (for example, by choosing the thin format during a Storage vMotion) reclaims allocated but unused space. Enabling deduplication and compression at the storage array level can dramatically reduce space consumption for certain workloads. Expanding existing datastores or adding new ones provides additional capacity. Implementing Storage DRS automates the placement and migration of VMs based on space and performance metrics.

Network Utilization Actions

While less common, network utilization alarms indicate bandwidth saturation or connectivity issues. Solutions include adding additional network adapters to hosts, configuring NIC teaming for increased bandwidth and redundancy, implementing traffic shaping to prioritize critical workloads, and upgrading to higher-speed network interfaces such as 25GbE or 100GbE.

Connectivity Alarms and Remediation Actions

Beyond utilization, vCenter monitors connectivity between components in your infrastructure. These alarms detect when communication fails between critical systems.

Host Connection and Power State: This alarm triggers when vCenter loses contact with an ESXi host. Possible actions include verifying network connectivity between vCenter and the host, checking that the management network is properly configured, restarting management agents on the host, verifying that the host hasn’t entered lockdown mode, and in extreme cases, restarting the host itself.

vCenter Server Service Health: These alarms monitor the health of vCenter services. Resolution steps include restarting failed services through the vCenter Server Appliance Management Interface (VAMI), checking for certificate expiration issues, verifying database connectivity if using an external database, reviewing logs for service-specific errors, and ensuring adequate resources are available to the vCenter appliance.

Virtual Machine Connection State: When a VM becomes disconnected or inaccessible, actions include removing the VM from inventory and re-adding it, verifying the VM’s files exist on the datastore, checking for storage connectivity issues, and rescanning storage adapters if necessary.

Datastore Connectivity: When hosts lose connectivity to datastores, investigate the storage network for issues, verify HBA and switch configurations, check for failed storage paths if using multipathing, restart storage services on affected hosts, and verify credentials if using network-attached storage.

vCenter Server Alarms Lab

Prerequisites:

Access to a vCenter Server environment (version 7.0 or later recommended)
At least one ESXi host managed by vCenter
At least two running virtual machines
Administrative credentials for vCenter Server
Basic understanding of vSphere navigation

Lab Environment Setup

Before beginning this lab, ensure you have:

vCenter Server Appliance (VCSA) or vCenter Server running
Minimum one ESXi host with at least 50% free resources
Two test VMs that can be used for generating load
Network connectivity to vCenter web interface
SMTP server details (optional, for email notifications)

Part 1: Exploring Default Alarms

Step 1.1: Access the Alarms Interface

Open a web browser and navigate to your vCenter Server URL (https://vcenter.vmorecloud.com)
Log in with your administrator credentials
From the vSphere Client home screen, click on Menu (the hamburger icon in the top-left corner) and select Hosts and Clusters
Select your vCenter Server object at the top of the inventory tree
Click the Configure tab and, under More, select Alarm Definitions

You should now see a comprehensive list of all pre-configured alarms in your vCenter environment.

Step 1.2: Filter and Review Utilization Alarms

In the alarm definitions list, locate the Filter box at the top
Type “usage” in the filter box to display utilization-related alarms
Examine the following key alarms by clicking on each one:
- Host CPU usage
- Host memory usage
- Datastore usage on disk
- Virtual machine CPU usage
- Virtual machine memory usage

Lab Note: Take a screenshot of the alarm definitions page for your lab documentation.

Step 1.3: Analyze Alarm Configuration

Click on Host CPU usage alarm to open its details
Review the Triggers tab to see:
- Warning threshold (typically 75% for 5 minutes)
- Alert threshold (typically 90% for 5 minutes)
- Trigger conditions and length
Click on the Actions tab to see configured responses
Note any existing actions such as “Send a notification email”

Expected Result: You should see the default trigger conditions showing percentage thresholds and duration requirements.
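
The duration in a trigger means a brief spike is ignored: the condition has to hold for the whole period. The idea can be sketched as follows, assuming one usage sample per minute (a simplification for illustration only; the function name sustained_above is hypothetical and this is not a description of vCenter's internal evaluation).

```shell
#!/bin/sh
# sustained_above THRESHOLD MINUTES SAMPLE...
# Hypothetical helper: succeeds (exit 0) if at least MINUTES consecutive
# samples are above THRESHOLD, assuming one sample per minute.
sustained_above() {
    threshold=$1; needed=$2; shift 2
    run=0
    for sample in "$@"; do
        if [ "$sample" -gt "$threshold" ]; then
            run=$((run + 1))
            if [ "$run" -ge "$needed" ]; then
                return 0
            fi
        else
            run=0   # the streak is broken, start counting again
        fi
    done
    return 1
}

# A dip to 70% breaks the streak, so "above 75% for 5 minutes" is not met
sustained_above 75 5 80 85 70 90 92 && echo "triggered" || echo "not triggered"
# Five consecutive samples above 75% satisfy the condition
sustained_above 75 5 80 85 78 90 92 && echo "triggered" || echo "not triggered"
```

The first call prints "not triggered" and the second prints "triggered", which is why a short burst of load during the lab may not raise an alarm until the duration has elapsed.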

Part 2: Configuring Alarm Email Notifications

Step 2.1: Configure vCenter SMTP Settings

In the vSphere Client, go to Menu > Hosts and Clusters
Select your vCenter Server object at the top of the inventory tree
Click the Configure tab
Under Settings, select General
Click Edit and open the Mail section
Enter your SMTP configuration:
- SMTP Server: smtp.yourdomain.com
- Port: 25 (or 587 for TLS)
- Sender Account: vcenter@yourdomain.com
Click OK to save

Step 2.2: Test Email Configuration

In the Mail settings section, click Test
Enter a test recipient email address
Click OK and verify the test email is received
Check your email inbox for the vCenter test message

Troubleshooting: If the email doesn’t arrive, verify SMTP server connectivity, firewall rules, and authentication requirements.

Step 2.3: Add Email Action to an Alarm

Navigate back to the alarm definitions list (vCenter Server object > Configure > Alarm Definitions)
Right-click on Host memory usage alarm
Select Edit Settings
Click on the Actions tab
Click Add to create a new action
Configure the action:
- Action: Send a notification email
- Trigger: When alarm status changes from Warning to Alert
- Frequency: Once
In the Configuration section:
- To: Enter your email address
- CC: (optional) Add additional recipients
- Subject: Leave default or customize: “ALERT: {targetName} – {alarmName}”
- Body: Leave default or customize the message
Click OK to add the action
Click OK to save the alarm configuration

Lab Note: Document the email notification settings you configured.

Part 3: Monitoring Existing Alarms

Step 3.1: View Triggered Alarms

Click on Menu > Hosts and Clusters
Select your datacenter object from the inventory
Click on the Monitor tab
Select Issues > Triggered Alarms
Review any currently triggered alarms in your environment

Step 3.2: Navigate to Specific Object Alarms

In the inventory, select one of your ESXi hosts
Click the Monitor tab
Select Issues > Triggered Alarms
Review host-specific alarms
Repeat for a virtual machine object

Expected Result: You’ll see that different alarms apply to different object types (hosts vs. VMs vs. datastores).

Step 3.3: Review Alarm Details

If any alarms are triggered, click on one to expand its details
Review the information provided:
- Triggered time
- Current status (Warning or Alert)
- Triggering value
- Object affected
Note how long the condition has persisted

Part 4: Triggering CPU Utilization Alarms

Step 4.1: Prepare a Test Virtual Machine

Navigate to Menu > Hosts and Clusters
Select a test VM from your inventory
Verify the VM is powered on
Note the VM’s current CPU allocation
Right-click the VM and select Edit Settings
Note the number of vCPUs (we’ll use this information shortly)
Click Cancel

Step 4.2: Modify CPU Alarm Thresholds (Optional – for faster testing)

Go to the alarm definitions list (vCenter Server object > Configure > Alarm Definitions)
Filter for “Virtual machine CPU usage”
Right-click the alarm and select Edit Settings
In the Triggers tab, modify:
- Warning: 60% for 1 minute
- Alert: 80% for 1 minute
Click OK

Lab Note: We’re lowering thresholds and duration to trigger alarms faster for lab purposes. In production, you’d use the default values.

Step 4.3: Generate CPU Load on the VM

For Linux VMs:

Open a console or SSH session to your test VM
Run the following command to generate CPU load:

```shell
# Install the stress tool if it is not already available
sudo apt-get install stress -y  # For Ubuntu/Debian
# OR
sudo yum install stress -y      # For RHEL/CentOS (stress is packaged in EPEL, so enable that repository first)

# Generate CPU load on all cores for 5 minutes
stress --cpu "$(nproc)" --timeout 300s
```

For Windows VMs:

Open a console or RDP session to your test VM
Download and run a CPU stress-testing tool such as Prime95 or CPUSTRES
Alternatively, use PowerShell:

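```powershell
# Run a CPU-intensive busy loop for 5 minutes (press Ctrl+C to stop early).
# One PowerShell window keeps only one vCPU busy.
$end = (Get-Date).AddMinutes(5)
$result = 1
while ((Get-Date) -lt $end) {
    $result = ($result * 31 + 7) % 1000003
}
```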

Alternative Method – Using Multiple PowerShell Windows: Open multiple PowerShell windows (one per vCPU) and run the above command in each.

Step 4.4: Monitor the Alarm Trigger

Return to the vSphere Client
Navigate to your test VM
Click the Monitor tab
Select Performance > Overview
Watch the CPU usage climb
After 1-2 minutes, click Issues > Triggered Alarms
Observe the alarm progression:
- First: Yellow warning icon appears
- Then: Red alert icon if usage continues

Expected Result: You should see “Virtual machine CPU usage” alarm trigger with warning status, then alert status.

Step 4.5: Verify Email Notification

Check your email inbox
Look for the alarm notification email from vCenter
Review the email content including:
- Alarm name
- Affected VM
- Current CPU usage percentage
- Timestamp

Lab Note: Take a screenshot of the triggered alarm and the email notification.

Step 4.6: Stop the CPU Load

Return to your VM console/session
Stop the stress test (Ctrl+C for Linux stress command)
Return to vSphere Client and monitor the alarm
Watch as CPU usage drops
The alarm should automatically clear to green after usage falls below thresholds

Part 5: Triggering Memory Utilization Alarms

Step 5.1: Prepare for Memory Testing

Select your test VM in the vSphere Client
Note the current memory allocation
Check current memory usage in Monitor > Performance
Verify the VM has sufficient memory to safely test (at least 2GB recommended)

Step 5.2: Adjust Memory Alarm Thresholds

Navigate to the alarm definitions list (vCenter Server object > Configure > Alarm Definitions)
Find “Virtual machine memory usage”
Edit the alarm settings:
- Warning: 65% for 1 minute
- Alert: 85% for 1 minute
Save the configuration

Step 5.3: Generate Memory Load

For Linux VMs:

Connect to your test VM
Run memory stress test:

```shell
# Install if needed
sudo apt-get install stress -y

# Allocate memory (adjust --vm-bytes based on your VM's RAM)
# This example uses 1.5GB in total: 2 workers x 768MB each
stress --vm 2 --vm-bytes 768M --timeout 300s
```
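
If your test VM has a different amount of RAM, the --vm-bytes value can be derived from the size of the VM and the percentage you want the workers to allocate. The helper below is a hypothetical sketch using integer arithmetic only. It ignores memory the guest OS already uses, so leave some headroom to avoid pushing the guest into swapping or out-of-memory kills.

```shell
#!/bin/sh
# vm_bytes_per_worker TOTAL_MB TARGET_PCT WORKERS
# Hypothetical helper: megabytes each stress worker should allocate so that
# all workers together allocate TARGET_PCT percent of TOTAL_MB.
vm_bytes_per_worker() {
    total_mb=$1; target_pct=$2; workers=$3
    echo $(( total_mb * target_pct / 100 / workers ))
}

# A 2048 MB VM, 75% target, 2 workers: reproduces the 768M used above
mb=$(vm_bytes_per_worker 2048 75 2)
echo "stress --vm 2 --vm-bytes ${mb}M --timeout 300s"
```

For a 4096 MB VM with the same target and worker count, the same call yields 1536 MB per worker.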

For Windows VMs:

Open PowerShell as Administrator
Run the following memory consumption script:

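```powershell
# Allocate memory in 1 MB chunks (adjust size based on VM RAM)
$size = 1.5GB
# A typed list keeps each 1 MB buffer as one item. Using "+=" on a plain
# array would append the buffer's bytes one by one and copy the array each time.
$chunks = New-Object 'System.Collections.Generic.List[byte[]]'

Write-Host "Allocating memory..."
for ($i = 0; $i -lt ($size / 1MB); $i++) {
    $chunk = New-Object byte[] 1MB
    # Touch one byte per 4 KB page so the allocation is actually backed by RAM
    for ($p = 0; $p -lt $chunk.Length; $p += 4096) { $chunk[$p] = 1 }
    $chunks.Add($chunk)
    if ($i % 100 -eq 0) {
        Write-Host "Allocated: $($i)MB"
    }
}

Write-Host "Memory allocated. Press Enter to release..."
Read-Host | Out-Null
$chunks.Clear()
$chunks = $null
[System.GC]::Collect()
```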

Step 5.4: Monitor Memory Alarm

In vSphere Client, navigate to the VM’s Monitor tab
Select Performance and observe memory usage climbing
Switch to Issues > Triggered Alarms
Wait for the alarm to trigger (typically 1-2 minutes)
Note the warning and alert progression

Expected Result: Memory usage alarm should trigger showing the percentage of consumed memory.

Step 5.5: Practice Memory Remediation

Now that the alarm has triggered, practice remediation steps:

Right-click the test VM and select Edit Settings
Increase the memory allocation by 512MB or 1GB (on a running VM this requires Memory Hot Plug to be enabled; otherwise power off the VM first)
Click OK
Observe memory usage percentage decrease
Watch the alarm status return to normal

Lab Note: Document the memory before and after values and how long it took for the alarm to clear.

Step 5.6: Clean Up Memory Test

Stop the memory stress test in your VM
Release allocated memory
Verify the VM returns to normal memory consumption
Check that the alarm has cleared

Part 6: Testing Storage Utilization Alarms

Step 6.1: Check Current Datastore Usage

Navigate to Menu > Storage
Select a test datastore from your inventory
Click the Monitor tab
Note the current capacity and usage percentage
Verify you have enough space to safely create test files

Step 6.2: Configure Datastore Alarm

Right-click your test datastore
Select Alarms > New Alarm Definition
Configure:
- Name: Test Datastore Usage
- Target Type: Datastore
- Monitor: Specific conditions and events
In Triggers tab, click Add
Configure trigger:
- Trigger Type: Condition
- Condition: Datastore > Disk Usage (%)
- Operator: Is above
- Warning: 70%
- Alert: 80%
- Condition Length: 1 minute
Click OK to add trigger
In Actions tab, add email notification (optional)
Click OK to create the alarm
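
Before generating load, it helps to know roughly how much data has to land on the datastore to cross each threshold. The sketch below is a hypothetical helper (gb_to_threshold is not a vSphere command), and the capacity and usage figures are made-up examples. Substitute the numbers you noted in Step 6.1.

```shell
#!/bin/sh
# gb_to_threshold CAPACITY_GB USED_GB THRESHOLD_PCT
# Hypothetical helper: how many more GB must be consumed before usage
# reaches THRESHOLD_PCT percent of the datastore capacity.
gb_to_threshold() {
    awk -v cap="$1" -v used="$2" -v pct="$3" 'BEGIN {
        need = cap * pct / 100 - used
        if (need < 0) need = 0      # already at or past the threshold
        printf "%.1f\n", need
    }'
}

# Example figures: a 500 GB datastore with 300 GB in use
echo "GB until 70% warning: $(gb_to_threshold 500 300 70)"
echo "GB until 80% alert:   $(gb_to_threshold 500 300 80)"
```

With these example figures, about 50 GB of new data reaches the warning threshold and about 100 GB reaches the alert threshold.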

Step 6.3: Generate Storage Load (Caution – Test Environment Only)

Method 1: Create Large VM Snapshots

Select a VM on your test datastore
Right-click and select Snapshots > Take Snapshot
Name it “Test Storage Load 1”
Click OK
Power on the VM and use it briefly to generate delta disk activity
Create 2-3 more snapshots
Monitor datastore usage increasing

Method 2: Upload ISO Files

Right-click your datastore
Select Browse Files
Click Upload Files
Upload large ISO files (Windows, Linux distributions)
Monitor capacity usage

Method 3: Expand a VM Disk (Safest)

Power off a test VM
Edit VM settings
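Increase a virtual disk size by 10-20GB (datastore usage rises immediately only for thick-provisioned disks; thin disks grow as data is written)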
Click OK
Monitor datastore usage

Step 6.4: Monitor Storage Alarm

Navigate to your datastore’s Monitor tab
Click Issues > Triggered Alarms
Watch for the alarm to trigger as usage crosses thresholds
Observe the capacity percentage in real-time

Step 6.5: Practice Storage Remediation

Practice these storage remediation techniques:

Option 1: Delete Snapshots

Navigate to your test VM
Right-click and select Snapshots > Manage Snapshots
Select a snapshot and click Delete
Alternatively, click Delete All to remove every snapshot and commit the changes to the base disks
Monitor space reclamation on the datastore

Option 2: Storage vMotion

Right-click a VM on the full datastore
Select Migrate
Choose Change storage only
Select a different datastore with more space
Click Finish
Monitor the migration progress

Option 3: Delete Unused Files

Browse the datastore files
Identify and delete uploaded ISO files
Remove any orphaned VMDK files
Check for old log files or VM templates

Expected Result: Datastore usage should decrease and the alarm should clear.

Part 7: Testing Connectivity Alarms

Step 7.1: Review Connectivity Alarms

Navigate to the alarm definitions list (vCenter Server object > Configure > Alarm Definitions)
Filter for “connection” or “connectivity”
Review these key alarms:
- Host connection and power state
- Cannot connect to storage
- Lost connection to virtual machine

Step 7.2: Simulate Host Connectivity Issue (Optional – Advanced)

Warning: Only perform this if you have a non-production host and understand the implications.

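SSH to your ESXi host (make sure you also have DCUI or remote console access, because the SSH session will drop as soon as the management interface is disabled)
Temporarily disable the management network:

```shell
# View current vmkernel adapters
esxcli network ip interface list

# Disable management interface (typically vmk0)
esxcli network ip interface set -e false -i vmk0
```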

In vSphere Client, monitor the host status
Watch for “Host connection and power state” alarm to trigger
The host will show as disconnected

To Restore:

Access the ESXi host directly via DCUI (console)
Press F2 and login
Navigate to Configure Management Network > IPv4 Configuration
Re-enable the management network
Or re-enable it from the ESXi Shell on the local console (SSH is unreachable while vmk0 is disabled):

```shell
esxcli network ip interface set -e true -i vmk0
```

Step 7.3: Simulate VM Connectivity Issue

Select a test VM
Right-click and select Edit Settings
Expand Network Adapter 1
Uncheck Connected and Connect at power on
Click OK
Monitor for network connectivity alarms (if configured)

To Restore:

Edit VM settings again
Check both network connection boxes
Click OK

Step 7.4: Practice Connectivity Remediation

Document remediation steps for common connectivity issues:

For Host Disconnection:
- Verify network connectivity from vCenter to host
- Check ESXi management network configuration
- Restart the management agents from the ESXi Shell (/etc/init.d/hostd restart, then /etc/init.d/vpxa restart)
- Verify firewall rules
- Check if host is in lockdown mode

For VM Connection Issues:
- Verify VM network adapter is connected
- Check port group assignment
- Verify VLAN configuration
- Review virtual switch configuration

Part 8: Creating Custom Alarms (15 minutes)

Step 8.1: Create a Custom VM CPU Ready Time Alarm

Navigate to Hosts and Clusters
Select a VM from your inventory
Click the Configure tab and, under More, select Alarm Definitions
Click Add (the + icon)
Configure the new alarm:
- Name: High CPU Ready Time
- Description: Alerts when VM experiences CPU scheduling delays
- Target: Virtual Machines
- Monitor: Specific conditions
Click Add in the Triggers section:
- Trigger Type: Condition
- Metric: CPU > Ready (ms)
- Operator: Is above
- Warning: 1000 ms
- Alert: 2000 ms
- Condition Length: 5 minutes
Add email action in the Actions tab
Click OK
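
CPU Ready is reported as a summation in milliseconds, which is easier to judge once converted to a percentage of the sampling interval. The sketch below assumes the 20-second interval used by real-time performance charts. The helper name ready_ms_to_pct is hypothetical, and the optional vCPU argument assumes the value is summed across all vCPUs of the VM.

```shell
#!/bin/sh
# ready_ms_to_pct READY_MS [INTERVAL_SECONDS] [VCPUS]
# Hypothetical helper: converts a CPU Ready summation (ms) to a percentage
# of the sampling interval, averaged per vCPU.
# Assumption: 20-second real-time interval unless another value is given.
ready_ms_to_pct() {
    awk -v ms="$1" -v interval="${2:-20}" -v vcpus="${3:-1}" 'BEGIN {
        printf "%.1f\n", ms * 100 / (interval * 1000 * vcpus)
    }'
}

# The thresholds used in Step 8.1
echo "Warning, 1000 ms: $(ready_ms_to_pct 1000)%"
echo "Alert,   2000 ms: $(ready_ms_to_pct 2000)%"
# The same 2000 ms spread across a 4-vCPU VM
echo "Alert per vCPU on 4 vCPUs: $(ready_ms_to_pct 2000 20 4)%"
```

Under that assumption, the 1000 ms and 2000 ms thresholds correspond to 5% and 10% ready time on a single vCPU.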

Step 8.2: Create a Custom Snapshot Age Alarm

Create a new alarm at the VM level
Configure:
- Name: Old VM Snapshot
- Target: Virtual Machines
- Monitor: Specific events occurring on this object
Add trigger:
- Event: VM snapshot created
- Status: Warning
- Action: Send notification email
Set up a daily check or manual review process
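
Because the alarm above fires when a snapshot is created rather than when it becomes old, the daily check has to work out the age itself. A minimal sketch of that calculation, assuming GNU date (the -d option is not available in every date implementation) and a creation date read manually from the snapshot manager. The helper name snapshot_age_days and the 7-day limit are hypothetical.

```shell
#!/bin/sh
# snapshot_age_days CREATED_DATE [TODAY]
# Hypothetical helper: whole days between CREATED_DATE and TODAY
# (both YYYY-MM-DD, evaluated in UTC). Requires GNU date for "-d".
snapshot_age_days() {
    created=$(date -u -d "$1" +%s)
    today=$(date -u -d "${2:-now}" +%s)
    echo $(( (today - created) / 86400 ))
}

# Example dates only: a snapshot taken on 2024-03-01, checked on 2024-03-11
age=$(snapshot_age_days 2024-03-01 2024-03-11)
if [ "$age" -gt 7 ]; then
    echo "Snapshot is $age days old: review or delete it"
fi
```

Omitting the second argument compares against the current date, which is the form a scheduled daily check would use.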

Step 8.3: Create a Custom Datastore Latency Alarm

Right-click a datastore
Create new alarm:
- Name: High Datastore Latency
- Target: Datastore
Add trigger:
- Metric: Datastore > Read latency OR Write latency
- Warning: 15 ms
- Alert: 25 ms
- Length: 5 minutes
Configure email notifications
Save the alarm

Expected Result: You now have custom alarms monitoring advanced metrics specific to your environment's needs.

Part 9: Alarm Best Practices Implementation (10 minutes)

Step 9.1: Disable Unnecessary Alarms

Review all alarm definitions
Identify alarms not relevant to your environment
Right-click irrelevant alarms and select Disable
Document which alarms were disabled and why

Step 9.2: Create Alarm Categories

Use naming conventions for your custom alarms:
- “CRITICAL - [Alarm Name]” for business-critical alerts
- “WARNING - [Alarm Name]” for informational alerts
- “CAPACITY - [Alarm Name]” for capacity planning
Update your custom alarm names to follow this convention
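
If you keep a list of your custom alarm names, a quick check can flag the ones that still need renaming. This is a hypothetical sketch: follows_convention is an illustrative helper and the alarm names are examples taken from this lab.

```shell
#!/bin/sh
# follows_convention NAME
# Hypothetical helper: succeeds if NAME starts with one of the three
# category prefixes from Step 9.2 and has some text after the prefix.
follows_convention() {
    case "$1" in
        "CRITICAL - "?*|"WARNING - "?*|"CAPACITY - "?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example names from this lab
for name in "CRITICAL - Host Memory Usage" "High CPU Ready Time" "CAPACITY - Test Datastore Usage"; do
    if follows_convention "$name"; then
        echo "ok:     $name"
    else
        echo "rename: $name"
    fi
done
```

Here only "High CPU Ready Time" is reported for renaming.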

Step 9.3: Document Alarm Responses

Create a response playbook for each alarm type:
```
Alarm: Host Memory Usage
Severity: Alert (Red)
Response Time: 15 minutes

Immediate Actions:
1. Check which VMs are consuming most memory
2. Migrate non-critical VMs to other hosts
3. Check for memory ballooning

Secondary Actions:
1. Review memory reservations
2. Consider adding physical RAM
3. Review VM right-sizing
```

Part 10: Reset and Cleanup

Step 10.1: Reset Alarm Thresholds

Return all modified alarms to their original settings
Prefer the values you noted before editing each alarm; the values assumed in this lab are:
- Host CPU usage: 75% warning, 90% alert, 5 minutes
- Host memory usage: 75% warning, 90% alert, 5 minutes
- Virtual machine CPU usage: 75% warning, 90% alert, 5 minutes
- Virtual machine memory usage: 80% warning, 95% alert, 5 minutes

Step 10.2: Clean Up Test Modifications

Delete all VM snapshots created during testing
Remove uploaded ISO files from datastores
Restore any VM CPU/memory changes to original values
Re-enable any disabled network adapters
Delete custom test alarms if not needed

Step 10.3: Verify System State

Check that all alarms have cleared
Verify all VMs are running normally
Confirm datastore usage is back to pre-lab levels
Ensure no connectivity alarms are triggered

Step 10.4: Document Lab Results

Create a lab report including:

Screenshots of triggered alarms
Email notifications received
Remediation actions performed
Time taken to resolve each alarm
Lessons learned
Any issues encountered

Conclusion

vCenter Server’s default utilization and connectivity alarms provide a solid foundation for infrastructure monitoring, but their effectiveness depends on proper configuration and timely response. Understanding what each alarm monitors, the thresholds that trigger notifications, and the available remediation actions empowers administrators to maintain healthy, performant virtual environments.

The key to successful alarm management lies in finding the balance between comprehensive monitoring and actionable alerts. Too few alarms leave you blind to developing issues. Too many create noise that obscures genuine problems. Start with the defaults, tune based on your environment’s behavior, and develop clear response procedures for your team.

By treating alarms as the valuable diagnostic tools they are rather than mere annoyances, you transform reactive troubleshooting into proactive management, ultimately delivering better performance and reliability to your organization.