Table of Contents
- Introduction to Nutanix Metro
- Architecture Overview and Core Concepts
- Prerequisites and Environmental Planning
- Step-by-Step Configuration (GUI, CLI, API)
- Advanced Workflows and Automation
- Best Practices for Production Deployments
- Troubleshooting Common and Complex Issues
- Real-World Use Cases
- Frequently Asked Questions (FAQ)
- Conclusion
1. Introduction to Nutanix Metro
Nutanix Metro, also called Nutanix Metro Availability, is a business continuity and disaster recovery solution built into the Nutanix platform. It provides synchronous data replication between two geographically separated Nutanix clusters, ensuring zero data loss and rapid application failover in the event of site outages. Metro is crucial for mission-critical workloads that demand maximum uptime and compliance with stringent recovery point objectives (RPOs).
2. Architecture Overview and Core Concepts
At its core, Nutanix Metro extends the data protection and high availability features of Nutanix AOS by synchronously mirroring data between two separate sites.
Key architectural concepts:
- Metro Clusters: Two Nutanix clusters, each at a different site, interconnected for synchronous replication.
- Stretched Volume Groups: Application volumes mirrored in real-time between both sites.
- Witness VM: An out-of-band component for split-brain avoidance and quorum.
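The Witness's tie-breaker role can be sketched as a small decision table. This is a simplified conceptual model, not Nutanix's internal arbitration logic: the function name, the state names (granted, denied, unreachable), and the returned actions are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Conceptual tie-breaker model for one site in a two-site + witness setup.
# Illustrative only: names and states are hypothetical, not a Nutanix API.

# decide_site_action PEER_REACHABLE WITNESS_LOCK
#   PEER_REACHABLE: "yes" or "no"
#   WITNESS_LOCK:   "granted", "denied", or "unreachable"
decide_site_action() {
  local peer="$1" lock="$2"
  # While the peer is reachable, synchronous replication simply continues.
  if [[ "$peer" == "yes" ]]; then
    echo "replicate"
    return 0
  fi
  # Peer is gone: only the site that wins the witness lock may carry on.
  case "$lock" in
    granted)     echo "serve-alone" ;;
    denied)      echo "pause" ;;
    unreachable) echo "wait-for-operator" ;;  # no arbiter, so no automatic failover
    *)           echo "invalid witness state: $lock" >&2; return 2 ;;
  esac
}
```

For example, `decide_site_action no granted` prints `serve-alone`, while `decide_site_action no unreachable` prints `wait-for-operator`. Because the Witness grants the lock to one site only, the two sites cannot both continue serving writes in isolation.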
[Figure: Architecture Diagram]
3. Prerequisites and Environmental Planning
Hardware and Software Requirements
- Nutanix clusters running AOS 6.x or later
- At least one Prism Central instance managing both clusters
- Supported hypervisor (AHV or ESXi), with identical hypervisor type and version required on both clusters
- Dedicated, low-latency, high-bandwidth network between sites
- Witness VM deployed at a third location (preferably cloud or a separate site)
Licensing
- Metro Availability is included in the Nutanix Ultimate Edition license.
- Both clusters must be licensed appropriately with Ultimate Edition or equivalent to enable Metro features.
- Always verify current licensing status and feature entitlements via Nutanix Support or your account representative.
Hypervisor Uniformity
- Both clusters must run the same hypervisor type and version (either AHV or ESXi).
- Mixed-hypervisor Metro configurations are not supported and will prevent proper Metro Availability operation.
Network & Latency
- Recommended latency: Less than 5 ms round-trip time (RTT) between clusters.
- Bandwidth: Sufficient to handle synchronous replication of all active workloads.
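The two network criteria above can be turned into quick planning checks. The sketch below assumes you already have a measured RTT in milliseconds and a peak write rate in MB/s. The function names and the default 30% headroom are assumptions for illustration, not Nutanix sizing guidance.

```bash
#!/usr/bin/env bash
# Planning helpers for the inter-site link (hypothetical names and defaults).

# rtt_within_limit RTT_MS [LIMIT_MS]
# Succeeds (exit 0) when the measured RTT is strictly below the limit.
# awk does the floating-point comparison that plain bash cannot.
rtt_within_limit() {
  local rtt_ms="$1" limit_ms="${2:-5}"
  awk -v rtt="$rtt_ms" -v limit="$limit_ms" \
    'BEGIN { exit !(rtt + 0 < limit + 0) }'
}

# required_mbps WRITE_MB_PER_S [HEADROOM_PERCENT]
# Converts a peak write rate in MB/s (10^6 bytes) into Mbit/s and adds
# headroom for bursts. The 30% default is an assumption.
required_mbps() {
  local write_mbs="$1" headroom="${2:-30}"
  awk -v w="$write_mbs" -v h="$headroom" \
    'BEGIN { printf "%.0f\n", w * 8 * (1 + h / 100) }'
}
```

For example, `rtt_within_limit 3.2 && echo ok` prints `ok`, and `required_mbps 200 30` prints `2080`, meaning a 200 MB/s peak write rate needs roughly a 2 Gbit/s link once headroom is included.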
Security and Connectivity
- Ensure secure, firewalled network paths between clusters and witness VM.
- Consistent VLAN/subnet planning for stretched networks.
4. Step-by-Step Configuration (GUI, CLI, API)
4.1 Initial Setup via Prism (GUI)
- Log into Prism Central.
- Navigate to Protection Domains & Metro Availability.
- Select Create Metro Availability.
- Add both clusters to the Metro configuration.
- Select the volumes or VMs to protect.
- Configure stretched network and witness details.
4.2 Witness VM Deployment
- Download and deploy the Witness OVA (for VMware) or QCOW2 (for AHV) at a third site.
- Power on and configure IP/networking.
- Register the Witness in Prism Central.
[Figure: Witness Placement]
4.3 Advanced Configuration (CLI)
A. Check Metro Readiness:
ncli metro-cluster ls
B. Enable Metro on a Protection Domain:
ncli pd metro-availability-enable \
name="prod-db-protect" \
remote-cluster-name="Cluster-B"
C. Add VMs to the Metro Domain:
ncli pd add-entity \
name="prod-db-protect" \
entity-type=vm \
entity-names="AppServer01,DB01"
D. API Example: Create Metro Protection
curl -u admin:password -X POST \
-H "Content-Type: application/json" \
-d '{
"remote_cluster": "Cluster-B",
"entities": ["AppServer01", "DB01"]
}' \
https://prism-central-ip:9440/api/nutanix/v3/metro_availability
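When scripting the request above, it helps to build the JSON body from arguments and keep credentials off the command line. The sketch below reuses the endpoint and field names exactly as shown in the curl example; they have not been checked against the Prism API reference. The function names and the PRISM_USER and PRISM_PASS variables are hypothetical.

```bash
#!/usr/bin/env bash
# build_metro_payload REMOTE_CLUSTER ENTITY...
# Prints the JSON body used in the curl example for the given names.
# Names are assumed to contain no quotes or backslashes (no JSON escaping).
build_metro_payload() {
  local remote="$1"; shift
  local entities="" name
  for name in "$@"; do
    # Prefix a comma separator for every element after the first.
    entities+="${entities:+, }\"${name}\""
  done
  printf '{"remote_cluster": "%s", "entities": [%s]}\n' "$remote" "$entities"
}

# post_metro_payload PRISM_HOST REMOTE_CLUSTER ENTITY...
# Reads credentials from PRISM_USER / PRISM_PASS (hypothetical names) and
# passes them to curl on stdin so they do not appear in the process list.
post_metro_payload() {
  local host="$1"; shift
  : "${PRISM_USER:?set PRISM_USER}" "${PRISM_PASS:?set PRISM_PASS}"
  printf 'user = "%s:%s"\n' "$PRISM_USER" "$PRISM_PASS" |
    curl --silent --show-error --fail --config - \
      -X POST -H "Content-Type: application/json" \
      -d "$(build_metro_payload "$@")" \
      "https://${host}:9440/api/nutanix/v3/metro_availability"
}
```

Calling `build_metro_payload Cluster-B AppServer01 DB01` prints the same body as the example, which makes it easy to review the payload before sending it with `post_metro_payload`.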
5. Advanced Workflows and Automation
Automated Failover (CLI Example)
ncli metro-cluster failover \
name="prod-db-protect" \
force=true
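Because a forced failover is disruptive, automation usually benefits from a guard that shows the command before running it. The wrapper below is a minimal sketch: it reuses the ncli command exactly as written above, and the CONFIRM_FAILOVER switch and function name are hypothetical.

```bash
#!/usr/bin/env bash
# metro_failover PD_NAME
# Prints the failover command by default; runs it only when the
# (hypothetical) CONFIRM_FAILOVER=yes switch is set in the environment.
metro_failover() {
  local pd_name="$1"
  # Build the command as an array so arguments stay intact when executed.
  local cmd=(ncli metro-cluster failover "name=${pd_name}" force=true)
  if [[ "${CONFIRM_FAILOVER:-no}" == "yes" ]]; then
    "${cmd[@]}"
  else
    echo "DRY RUN: ${cmd[*]}"
  fi
}
```

Running `metro_failover prod-db-protect` only prints the command; `CONFIRM_FAILOVER=yes metro_failover prod-db-protect` executes it. Defaulting to a dry run keeps an accidental invocation from triggering a failover.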
Automated Monitoring (Script Example)
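#!/bin/bash
# Nutanix Metro health check: query Metro status on each cluster in turn.
CLUSTERS=("Cluster-A" "Cluster-B")
for cluster in "${CLUSTERS[@]}"
do
  # Quote the variable so cluster names containing spaces are passed intact.
  ncli --cluster="${cluster}" metro-cluster get-status
done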
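The loop above only prints status. For unattended monitoring it is more useful to turn the output into an exit code. The sketch below assumes the status text contains the words "Degraded" or "Disconnected" when something is wrong (the states named in the troubleshooting section); the exact output format is an assumption, and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# metro_output_healthy
# Reads status text on stdin. Returns 0 if healthy, 1 if a problem state is
# mentioned, 2 if there was no output at all (treated as a failure).
metro_output_healthy() {
  local text
  text="$(cat)"
  [[ -n "$text" ]] || return 2
  if grep -qiE 'degraded|disconnected' <<<"$text"; then
    return 1
  fi
  return 0
}

# check_clusters CLUSTER...
# Runs the article's status command per cluster and reports unhealthy ones.
check_clusters() {
  local failed=0 cluster
  for cluster in "$@"; do
    if ! ncli --cluster="${cluster}" metro-cluster get-status | metro_output_healthy; then
      echo "ALERT: ${cluster} is not healthy" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A cron job or monitoring agent can then call `check_clusters Cluster-A Cluster-B` and alert on a non-zero exit code.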
Scheduled Metro Health Checks
- Use Nutanix Prism Central Scheduled Reports to send daily Metro health status to administrators.
- API endpoint:
/api/nutanix/v3/metro_availability/status
6. Best Practices for Production Deployments
- Network Health: Regularly monitor latency and bandwidth between sites.
- Witness Isolation: Place Witness VM in a neutral third site or cloud, not within either primary cluster’s data center.
- Test Failover: Conduct quarterly planned failover and failback drills to validate business continuity.
- Protection Domain Design: Group related workloads (app and database) in a single domain for consistent failover.
- Alerting: Enable proactive alerting for Metro status changes or witness failures.
- Version Alignment: Keep both clusters at the same AOS and hypervisor patch level.
- Hypervisor Consistency: Both Metro clusters must be kept at identical hypervisor versions and patch levels. Plan for simultaneous upgrades to avoid configuration drift.
- Licensing Compliance: Ensure both clusters are always covered by Nutanix Ultimate Edition licensing for uninterrupted Metro protection.
- Runbooks: Maintain clear runbooks for manual failover, failback, and troubleshooting.
7. Troubleshooting Common and Complex Issues
Witness VM Unavailability and Failover Automation
- Critical Note: If the Witness VM is unavailable, automated failover is disabled. Manual intervention is required to ensure data integrity and prevent split-brain scenarios.
- Operational Planning: Always monitor the status of the Witness VM and ensure high availability for its underlying infrastructure.
Witness Connectivity Problems
- Symptom: Metro state shows “Degraded” or “Disconnected”
- Check:
- Witness VM network interface up?
- Firewall ports open between witness and both clusters?
- Witness service running?
- CLI:
ncli metro-cluster get-status
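The firewall question in the checklist above can be answered with a quick TCP probe run from the Witness toward each cluster (or the reverse). This is a minimal sketch using bash's /dev/tcp feature. Port 9440 is the Prism port used elsewhere in this article; the full list of required ports should come from the official Nutanix documentation, and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Succeeds when a TCP connection can be opened. Relies on bash's /dev/tcp
# pseudo-device and the coreutils `timeout` command.
port_open() {
  local host="$1" port="$2" wait_s="${3:-3}"
  timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# report_paths PORT HOST...
# Prints one line per host stating whether the port answered.
report_paths() {
  local port="$1"; shift
  local host
  for host in "$@"; do
    if port_open "$host" "$port"; then
      echo "${host}:${port} reachable"
    else
      echo "${host}:${port} BLOCKED or down"
    fi
  done
}
```

For example, `report_paths 9440 cluster-a-vip cluster-b-vip` (placeholder hostnames) prints one line per cluster. A "BLOCKED or down" result does not distinguish a firewall drop from a stopped service, so follow up with the service check.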
Split-Brain Condition
- Cause: A cluster loses communication with both the Witness and its peer cluster
- Action:
- Identify which cluster is active
- Restore connectivity or perform controlled failover as per runbook
Resync Failures
- Symptom: Protection domain fails to resync after network outage
- Check:
- Sufficient bandwidth?
- Disk space on both clusters?
- Review logs via Prism or CLI
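To judge whether bandwidth is sufficient, a rough estimate of resync duration helps. The sketch below divides the amount of changed data by the usable link rate. It ignores compression, protocol overhead, and competing traffic, so treat the result as a lower bound. The function name and parameters are assumptions for illustration.

```bash
#!/usr/bin/env bash
# resync_seconds DELTA_GIB LINK_MBPS [USABLE_PERCENT]
# Estimates how long it takes to ship DELTA_GIB (GiB, 2^30 bytes) of changed
# data over a link of LINK_MBPS (Mbit/s, 10^6 bits), optionally assuming that
# only USABLE_PERCENT of the link is available for replication.
resync_seconds() {
  local delta_gib="$1" link_mbps="$2" usable="${3:-100}"
  awk -v d="$delta_gib" -v l="$link_mbps" -v u="$usable" 'BEGIN {
    bits = d * 1024 * 1024 * 1024 * 8      # data to transfer, in bits
    rate = l * 1000000 * (u / 100)         # usable link rate, in bits/s
    if (rate <= 0) exit 1                  # reject a zero or negative rate
    printf "%.0f\n", bits / rate
  }'
}
```

For example, `resync_seconds 500 10000` prints `429`: about seven minutes to resync 500 GiB over a dedicated 10 Gbit/s link. With only half the link available, `resync_seconds 500 10000 50` prints `859`.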
Performance Impact
- Monitor:
- Storage latency metrics in Prism Central
- Impacted VMs with high IOPS
8. Real-World Use Cases
Financial Services
- Zero RPO database failover for core banking systems between two metropolitan data centers
Healthcare
- Synchronous EMR application protection across two hospitals for HIPAA compliance
Retail
- 24/7 e-commerce workload protection, instant recovery from datacenter outage
Public Sector
- Metro clusters for critical infrastructure with automated disaster drills
9. Frequently Asked Questions (FAQ)
Q: Is Nutanix Metro included in my existing license?
A: Metro Availability requires Nutanix Ultimate Edition licensing. Both participating clusters must have the correct license level to enable Metro.
Q: Can I mix hypervisors between Metro clusters?
A: No. Metro clusters require the same supported hypervisor type and version on both sites.
Q: What happens if the Witness VM is unavailable?
A: Automated failover is disabled, and manual intervention is necessary. Operational continuity planning must account for this scenario.
Q: How often should I test failover?
A: At least quarterly, or after any major infrastructure changes.
Q: Is Metro suitable for asynchronous replication?
A: No. Metro is for synchronous replication only. For asynchronous or near-synchronous replication, use Nutanix Async DR or NearSync.
10. Conclusion
Nutanix Metro is a powerful tool for ensuring data resilience and business continuity across mission-critical environments. By following advanced configuration steps, enforcing best practices, and regularly testing your setup, you can achieve near-zero downtime and seamless recovery. Stay proactive with monitoring, licensing, and up-to-date runbooks to maximize your Metro deployment’s effectiveness.
Disclaimer: The views expressed in this article are those of the author and do not represent the opinions of Nutanix, my employer or any affiliated organization. Always refer to the official Nutanix documentation before production deployment.