Table of Contents
- Introduction to Nutanix Asynchronous Replication
- How Nutanix Replication Works
- Prerequisites and Planning
- Configuring Asynchronous Replication (Prism Central & Prism Element)
- Policy Creation, Schedules, and SLAs
- Failover, Failback, and Test Workflows
- Real-World Use Cases
- Monitoring and Troubleshooting
- CLI/API Examples
- Best Practices
- Frequently Asked Questions
- References and Published Links
1. Introduction to Nutanix Asynchronous Replication
Disaster recovery (DR) is a top priority in enterprise datacenters. Nutanix Asynchronous Replication (supported on AOS 6.6 and newer) protects workloads with policy-driven snapshots that are replicated to a secondary site on a schedule. Because writes are not held until the remote site acknowledges them, it avoids the latency cost of synchronous replication. On Nutanix AHV, administrators can replicate virtual machines (VMs) and volumes at regular intervals and tune the schedule to balance RPO (Recovery Point Objective) against network utilization.
2. How Nutanix Replication Works
Nutanix Asynchronous Replication uses Protection Domains (PDs) and Remote Sites to automate and manage data replication. Snapshots are taken on the source cluster and transferred to the target cluster over a secure channel. After the initial full copy, only data blocks that changed since the previous snapshot are sent (changed block tracking, or CBT), which keeps the process bandwidth-efficient.
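To make the bandwidth argument concrete, here is a minimal sketch that models two snapshots as maps of block index to checksum and lists the blocks that would need to be sent. The helpers `block_checksums` and `changed_blocks` and the 4-byte block size are hypothetical and chosen for readability; this is not how AOS tracks changes internally, and deleted blocks are not handled.

```python
import hashlib


def block_checksums(data: bytes, block_size: int = 4) -> dict[int, str]:
    """Split data into fixed-size blocks and checksum each one."""
    return {
        offset // block_size: hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    }


def changed_blocks(previous: dict[int, str], current: dict[int, str]) -> list[int]:
    """Return indices of blocks that are new or differ from the previous snapshot."""
    return sorted(
        index for index, checksum in current.items()
        if previous.get(index) != checksum
    )


if __name__ == "__main__":
    snap1 = block_checksums(b"AAAABBBBCCCCDDDD")
    snap2 = block_checksums(b"AAAAXXXXCCCCDDDDEEEE")
    delta = changed_blocks(snap1, snap2)
    # Only the rewritten block and the appended block are replicated.
    print(f"blocks to replicate: {delta} of {len(snap2)}")
```

In this toy case two of five blocks are transferred instead of all five; the saving grows as the unchanged portion of a VM disk grows.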
Replication Types:
- Asynchronous: Periodic (scheduled) replication.
- NearSync: Lower RPO than async, but requires higher bandwidth.
- Synchronous: Not covered in this article.
3. Prerequisites and Planning
Key requirements:
- Nutanix clusters running AOS 6.6 or newer
- Prism Central/Element access
- Sufficient network bandwidth (ideally with QoS)
- Remote Site configured and accessible
- Time synchronization (NTP)
- Proper firewall rules
Firewall Ports:
- 2009/tcp: Stargate (replication data transfer between clusters)
- 2020/tcp: Cerebro (replication control and coordination)
- 9440/tcp: Prism Central/API management
- (Confirm these for your environment; additional ports may be needed for remote support, licensing, or Prism Central integration.)
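Before pairing sites, it can help to confirm that the listed ports are reachable from a host on the source network. The sketch below uses only the Python standard library; `check_ports` is a hypothetical helper, and a successful TCP connect only shows that the port is open, not that the Nutanix service behind it is healthy.

```python
import socket


def check_ports(host: str, ports: list[int], timeout: float = 3.0) -> dict[int, bool]:
    """Attempt a TCP connection to each port and record whether it succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or routing failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

For example, `check_ports("10.0.2.50", [2009, 2020, 9440])` returns a mapping of each port to `True` or `False`; any `False` entry is a candidate for a firewall rule review.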
Licensing:
- Nutanix Advanced or Ultimate (for full DR features including replication, failover/failback automation, and protection policies)
Other:
- Compatible AHV versions (both clusters)
- Measured network latency between sites (less than 100 ms round-trip preferred for reliable DR)
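A quick way to sanity-check the bandwidth requirement is to estimate how long one snapshot delta takes to cross the link and compare that with the RPO. The sketch below is back-of-the-envelope arithmetic only; the function names and the 70% link efficiency figure are assumptions, and compression or deduplication on the wire is ignored.

```python
def transfer_seconds(changed_gib: float, bandwidth_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate seconds needed to ship one snapshot delta over the WAN link.

    efficiency is an assumed fraction of nominal bandwidth that is usable.
    """
    # GiB -> MiB -> megabits (treated as approximately equal to mebibits).
    changed_megabits = changed_gib * 1024 * 8
    return changed_megabits / (bandwidth_mbps * efficiency)


def rpo_is_feasible(changed_gib: float, bandwidth_mbps: float, rpo_minutes: float) -> bool:
    """A delta must finish transferring before the next snapshot is due."""
    return transfer_seconds(changed_gib, bandwidth_mbps) <= rpo_minutes * 60


if __name__ == "__main__":
    # Hypothetical workload: 20 GiB of changed data per interval over a 100 Mbps link.
    minutes = transfer_seconds(20, 100) / 60
    print(f"estimated transfer time: {minutes:.0f} minutes")
    print("1-hour RPO feasible:", rpo_is_feasible(20, 100, 60))
    print("15-minute RPO feasible:", rpo_is_feasible(20, 100, 15))
```

If the estimate lands close to the RPO, either the link or the schedule needs headroom, since real change rates are bursty.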
4. Configuring Asynchronous Replication
4.1. Pairing Remote Sites
- Log in to Prism Element
- Navigate to Data Protection → Remote Sites
- Click Create Remote Site
- Enter cluster IP or FQDN, credentials, network settings
- Verify connectivity
4.2. Creating Protection Domains
- Go to Data Protection → Protection Domains
- Click Create Protection Domain
- Add VMs or Volume Groups to be protected
4.3. Scheduling Replication
- Inside the PD, configure Local Snapshot and Remote Schedule
- Define the RPO (e.g., every 15 minutes or every hour)
- Set retention policies for local and remote copies
Retention Policy:
State retention with explicit units (a snapshot count or a time span). Example with an hourly schedule:
- Keep 24 snapshots locally (1 hour RPO, 24 hours total)
- Retain 48 remote snapshots (2 days)
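The arithmetic behind the example is simply snapshot count times schedule interval. The sketch below works it through and also shows which snapshots fall outside a keep-N policy; `retention_window` and `snapshots_to_expire` are hypothetical helpers for illustration, not Nutanix functions.

```python
from datetime import datetime, timedelta


def retention_window(snapshot_count: int, interval: timedelta) -> timedelta:
    """Time span covered by keeping snapshot_count snapshots taken every interval."""
    return snapshot_count * interval


def snapshots_to_expire(timestamps: list[datetime], keep: int) -> list[datetime]:
    """Return the snapshots that fall outside the newest `keep` entries."""
    newest_first = sorted(timestamps, reverse=True)
    return newest_first[keep:]


if __name__ == "__main__":
    hourly = timedelta(hours=1)
    print("local window :", retention_window(24, hourly))
    print("remote window:", retention_window(48, hourly))
```

With an hourly schedule, 24 local snapshots cover one day and 48 remote snapshots cover two days, matching the figures above. Changing the interval without changing the count changes the window, which is why the units matter.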
[Sample Prism Workflow]
1. Data Protection > Protection Domains > Create
2. Add VM(s) > Set Schedules > Save
4.4. CLI Example
# Subcommand and parameter names below follow common ncli usage; confirm them with
# "ncli remote-site help" and "ncli pd help" on your AOS version.
# Create a remote site
ncli remote-site create name=DR-Site address-list=10.0.2.50
# Create a protection domain
ncli pd create name=prod-dr
# Add VMs to the protection domain
ncli pd protect name=prod-dr vm-names=WebApp01,SQL01
# Set an hourly snapshot schedule (keep 24 local, 48 at DR-Site)
ncli pd add-hourly-schedule name=prod-dr every-nth-hour=1 local-retention=24 remote-retention=DR-Site:48
4.5. API Example (Python/REST)
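import requests

# NOTE: the endpoint path and payload schema below are illustrative, not a verified
# API contract. Confirm both against the Prism REST API reference for your AOS version.
url = "https://prism.example.com:9440/api/nutanix/v3/protection_domains"
payload = {
    "name": "prod-dr",
    "vms": ["WebApp01", "SQL01"],
    "remote_sites": ["DR-Site"],
    "schedule": {
        "type": "remote",
        "interval": "1h",
        "retention": 48,  # 48 hours, i.e. 48 hourly snapshots at the remote site
        "retention_type": "hours",
    },
}

# Placeholder credentials; prefer a service account and a secrets store.
auth = ("admin", "changeme")

# verify=False disables TLS certificate checks; use it only in a lab, and
# point verify at your CA bundle in production.
response = requests.post(url, json=payload, auth=auth, verify=False, timeout=30)
response.raise_for_status()  # Surface HTTP errors instead of printing an error body
print(response.json())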
5. Policy Creation, Schedules, and SLAs
You can customize policies for RPO, retention, and failover priorities.
- RPO: Maximum age of the newest replica, and therefore the most data you can lose (e.g., 15 minutes, 1 hour)
- Retention: Duration or number of snapshots to retain (e.g., 48 hours, 24 snapshots)
- Priority: Tier apps for failover order
| Policy Type | RPO | Retention | Use Case |
|---|---|---|---|
| Gold | 15 minutes | 72 hours | Critical apps |
| Silver | 1 hour | 48 hours | Standard workloads |
| Bronze | 4 hours | 24 hours | Non-prod/test |
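The tiers in the table translate directly into the number of recovery points each policy keeps, which is useful when sizing DR storage. The sketch below encodes the three tiers and assumes one snapshot per RPO interval; the `PolicyTier` class is a hypothetical illustration, not a Nutanix object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyTier:
    name: str
    rpo_minutes: int
    retention_hours: int

    @property
    def snapshots_retained(self) -> int:
        """Recovery points kept if one snapshot is taken per RPO interval."""
        return (self.retention_hours * 60) // self.rpo_minutes


# Values taken from the policy table above.
TIERS = [
    PolicyTier("Gold", rpo_minutes=15, retention_hours=72),
    PolicyTier("Silver", rpo_minutes=60, retention_hours=48),
    PolicyTier("Bronze", rpo_minutes=240, retention_hours=24),
]

if __name__ == "__main__":
    for tier in TIERS:
        print(f"{tier.name}: {tier.snapshots_retained} recovery points")
```

Gold keeps far more recovery points than Bronze (288 versus 6 under these assumptions), so tiering every workload as Gold has a real capacity cost at the DR site.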
6. Failover, Failback, and Test Workflows
Failover Steps:
- Ensure replica is up to date.
- In Prism Element, navigate to Data Protection → Protection Domains.
- Select PD, click Activate at Remote Site.
- Map network segments:
  - In failover wizard, map original VLANs/IP subnets to DR site equivalents.
  - Example: Production VLAN 100 mapped to DR VLAN 2100, or use network profile mapping features.
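A network mapping can be validated before a failover rather than discovered missing during one. The sketch below keeps the mapping as a plain dictionary and refuses to produce a plan if any VM sits on an unmapped VLAN. The VLAN 100 to 2100 pair comes from the example above; VLAN 200, VLAN 2200, and both helper functions are hypothetical.

```python
# Source VLAN -> DR VLAN. Only 100 -> 2100 comes from the article; 200 -> 2200 is made up.
NETWORK_MAP = {100: 2100, 200: 2200}


def unmapped_vlans(vm_vlans: dict[str, int], network_map: dict[int, int]) -> dict[str, int]:
    """Return VMs whose production VLAN has no DR equivalent defined."""
    return {vm: vlan for vm, vlan in vm_vlans.items() if vlan not in network_map}


def plan_failover_networks(vm_vlans: dict[str, int], network_map: dict[int, int]) -> dict[str, int]:
    """Map each VM to its DR VLAN, failing loudly if any mapping is missing."""
    missing = unmapped_vlans(vm_vlans, network_map)
    if missing:
        raise ValueError(f"no DR VLAN defined for: {missing}")
    return {vm: network_map[vlan] for vm, vlan in vm_vlans.items()}


if __name__ == "__main__":
    plan = plan_failover_networks({"WebApp01": 100, "SQL01": 200}, NETWORK_MAP)
    print(plan)
```

Running a check like this as part of each test failover keeps the mapping in step with new production VLANs.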
Failback:
Reverse the protection direction after recovery. After restoring production, re-map networks and synchronize incremental changes.
Test Failover:
Use the Test Failover button to start VMs at the DR site on an isolated network, with no impact on production. Always verify application health and network reachability during each test cycle.
7. Real-World Use Cases
- Datacenter Migration: Move workloads with minimal downtime.
- Ransomware Recovery: Roll back to a clean snapshot after an incident.
- Compliance: Retain offsite backups for regulatory requirements.
- Multi-Site Operations: Run dev/test from DR site when needed.
Reference: Nutanix DR Real-World Guide
8. Monitoring and Troubleshooting
- Prism Central Dashboards: View replication status, lag, failures
- Alarms & Events: Receive alerts for failed jobs or connectivity issues
- Health Checks: Regularly test failover/failback
Common Issues:
- Network connectivity loss
- Outdated credentials between sites
- Insufficient storage at DR site
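Any of these issues eventually shows up as replication lag, so a simple lag-versus-RPO check is a useful first alert. The sketch below assumes you can obtain the timestamp of the last successful replication from your monitoring source; how that timestamp is fetched is left out, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def replication_lag(last_success: datetime, now: datetime) -> timedelta:
    """Time elapsed since the last snapshot was successfully replicated."""
    return now - last_success


def rpo_violated(last_success: datetime, rpo: timedelta, now: Optional[datetime] = None) -> bool:
    """True when the newest remote recovery point is older than the RPO allows."""
    if now is None:
        now = datetime.now(timezone.utc)
    return replication_lag(last_success, now) > rpo


if __name__ == "__main__":
    # Hypothetical timestamps for a protection domain with a 1-hour RPO.
    last = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    check_time = last + timedelta(minutes=90)
    print("lag:", replication_lag(last, check_time))
    print("RPO violated:", rpo_violated(last, timedelta(hours=1), now=check_time))
```

Passing `now` explicitly keeps the check deterministic in tests; in production it defaults to the current UTC time.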
9. CLI/API Examples
List Protection Domains
ncli pd list
Trigger Manual Snapshot (confirm the subcommand with "ncli pd help" on your AOS version)
ncli pd add-one-time-snapshot name=prod-dr
Check Replication Status
ncli pd list-replication-status name=prod-dr
10. Best Practices
- Test failover and failback at least quarterly
- Monitor network latency and bandwidth
- Use dedicated DR VLANs and encrypted links
- Document DR runbooks with clear steps
- Automate reporting for compliance (integrate with SIEM if needed)
11. Frequently Asked Questions
Q: Can I replicate between different AHV or AOS versions?
A: It is supported within the same major release, but always check the official compatibility matrix for supported version pairs.
Q: How do I automate replication workflows?
A: Use Nutanix API, ncli, or Prism Central Playbooks.
Q: What happens if replication is interrupted?
A: Replication resumes from the last successfully replicated snapshot once connectivity is restored.