
Nutanix Asynchronous Replication: Architecture, Configuration, and Real-World Operations (AOS 6.6+)

Table of Contents

  1. Introduction to Nutanix Asynchronous Replication
  2. How Nutanix Replication Works
  3. Prerequisites and Planning
  4. Configuring Asynchronous Replication (Prism Central & Prism Element)
  5. Policy Creation, Schedules, and SLAs
  6. Failover, Failback, and Test Workflows
  7. Real-World Use Cases
  8. Monitoring and Troubleshooting
  9. CLI/API Examples
  10. Best Practices
  11. Frequently Asked Questions
  12. References and Published Links

1. Introduction to Nutanix Asynchronous Replication

In enterprise datacenters, disaster recovery (DR) is a top priority. Nutanix Asynchronous Replication (supported on AOS 6.6 and newer) protects workloads with policy-driven snapshots that are replicated to a second cluster. Because writes are acknowledged locally and replicated afterwards, it avoids the write-latency cost of synchronous replication. Using Nutanix AHV, administrators can replicate virtual machines (VMs) and volumes to a secondary site at regular intervals, balancing RPO (Recovery Point Objective) against network utilization.


2. How Nutanix Replication Works

Nutanix Asynchronous Replication leverages Protection Domains (PDs) and Remote Sites to automate and manage data replication. Snapshots are taken at the source cluster and transferred to the target cluster over a secure channel. Only changed data blocks (using changed block tracking, or CBT) are sent, which makes the process bandwidth-efficient.
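
To make the "only changed blocks are sent" idea concrete, here is a minimal Python sketch. It is not how AOS implements change tracking: it simply compares fixed-size blocks of two in-memory snapshots by hash, and the 4-byte block size, function names, and sample data are all assumptions chosen for readability.

```python
import hashlib

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is readable


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(previous: bytes, current: bytes) -> dict[int, bytes]:
    """Return {block_index: block_data} for blocks that differ from the previous snapshot."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(previous)]
    delta = {}
    for index, block in enumerate(split_blocks(current)):
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


# Hypothetical snapshot contents: block 1 is modified, block 4 is appended.
snapshot_1 = b"AAAABBBBCCCCDDDD"
snapshot_2 = b"AAAAXXXXCCCCDDDDEEEE"
delta = changed_blocks(snapshot_1, snapshot_2)
print(sorted(delta))  # indexes of the blocks that would be transferred
```

In this toy case only two of the five blocks need to cross the WAN, which is the bandwidth saving the paragraph above describes.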

[Diagram]

Replication Types:


3. Prerequisites and Planning

Key requirements:

Firewall Ports:

Licensing:

Other:

[Checklist]


4. Configuring Asynchronous Replication

4.1. Pairing Remote Sites

4.2. Creating Protection Domains

4.3. Scheduling Replication

Retention Policy:
Specify units. Example:

[Sample Prism Workflow]

1. Data Protection > Protection Domains > Create
2. Add VM(s) > Set Schedules > Save

4.4. CLI Example

# Parameter names can differ between AOS releases; confirm each command
# against the ncli command reference for your version before running it.

# Create a remote site
ncli remote-site create name=DR-Site address-list=10.0.2.50

# Create a protection domain
ncli pd create name=prod-dr

# Add VMs to protection domain
ncli pd protect name=prod-dr vm-names=WebApp01,SQL01

# Set snapshot schedule (every 1 hour, keep 24 local, 48 remote)
ncli pd add-hourly-schedule name=prod-dr every-nth-hour=1 local-retention=24 remote-retention=DR-Site:48

4.5. API Example (Python/REST)

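import os

import requests

# Endpoint and payload shape are illustrative; confirm both in the
# REST API Explorer for your Prism version before use.
url = "https://prism.example.com:9440/api/nutanix/v3/protection_domains"
payload = {
    "name": "prod-dr",
    "vms": ["WebApp01", "SQL01"],
    "remote_sites": ["DR-Site"],
    "schedule": {
        "type": "remote",
        "interval": "1h",
        "retention": 48,  # 48 hourly snapshots, i.e. 48 hours
        "retention_type": "hours",
    },
}

# HTTP basic auth; credentials come from the environment (variable names are arbitrary)
auth = (os.environ["PRISM_USER"], os.environ["PRISM_PASSWORD"])

# verify=False skips TLS certificate checks; supply a CA bundle outside the lab
response = requests.post(url, json=payload, auth=auth, verify=False, timeout=30)
response.raise_for_status()
print(response.json())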

5. Policy Creation, Schedules, and SLAs

You can customize policies for RPO, retention, and failover priorities.

Policy Type   RPO          Retention   Use Case
Gold          15 minutes   72 hours    Critical apps
Silver        1 hour       48 hours    Standard workloads
Bronze        4 hours      24 hours    Non-prod/test
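
The tiers above translate directly into how many recovery points each policy keeps. The following Python sketch works through that arithmetic. It assumes one snapshot is taken per RPO interval and that every snapshot is kept for the full retention window; the `Policy` class is a hypothetical helper, not a Nutanix object.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Policy:
    name: str
    rpo: timedelta
    retention: timedelta

    def snapshots_retained(self) -> int:
        """Recovery points kept if one snapshot is taken per RPO interval."""
        return self.retention // self.rpo


# Values taken from the policy table above.
POLICIES = [
    Policy("Gold", timedelta(minutes=15), timedelta(hours=72)),
    Policy("Silver", timedelta(hours=1), timedelta(hours=48)),
    Policy("Bronze", timedelta(hours=4), timedelta(hours=24)),
]

for policy in POLICIES:
    print(f"{policy.name}: {policy.snapshots_retained()} recovery points")
```

Under these assumptions Gold keeps 288 recovery points, Silver 48, and Bronze 6, which is worth checking against available storage at the DR site before committing to a tier.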

6. Failover, Failback, and Test Workflows

Failover Steps:

  1. Ensure replica is up to date.
  2. In Prism Element, navigate to Data Protection → Protection Domains.
  3. Select PD, click Activate at Remote Site.
  4. Map network segments:
    • In failover wizard, map original VLANs/IP subnets to DR site equivalents.
    • Example: Production VLAN 100 mapped to DR VLAN 2100, or use network profile mapping features.
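
Before a failover it can help to confirm that every protected VM has a DR network to land on. The sketch below models the VLAN 100 to VLAN 2100 mapping from the example as a plain Python dictionary and reports any VM without a mapping. The VM-to-VLAN assignments (including VLAN 200 for SQL01) and the function name are assumptions for illustration; this does not call any Nutanix API.

```python
# Hypothetical mapping table: production VLAN ID -> DR VLAN ID.
VLAN_MAP = {100: 2100}


def map_vm_networks(
    vm_vlans: dict[str, int], vlan_map: dict[int, int]
) -> tuple[dict[str, int], list[str]]:
    """Translate each VM's production VLAN to its DR VLAN; collect VMs with no mapping."""
    mapped: dict[str, int] = {}
    unmapped: list[str] = []
    for vm, vlan in vm_vlans.items():
        if vlan in vlan_map:
            mapped[vm] = vlan_map[vlan]
        else:
            unmapped.append(vm)
    return mapped, unmapped


# Assumed inventory: WebApp01 sits on VLAN 100, SQL01 on an unmapped VLAN 200.
mapped, unmapped = map_vm_networks({"WebApp01": 100, "SQL01": 200}, VLAN_MAP)
print(mapped)    # VMs with a DR VLAN assigned
print(unmapped)  # VMs that still need a mapping before failover
```

Any name that shows up in the unmapped list points to a gap in the network mapping that should be closed before activating the protection domain.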

Failback:
Reverse the protection direction after recovery. After restoring production, re-map networks and synchronize incremental changes.

Test Failover:
Use the Test Failover button to spin up VMs on the DR site in an isolated network. This has no impact on production. Always verify application health and network reachability during test cycles.
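
One way to check network reachability during a test cycle is a simple TCP connect probe against each recovered service. The following Python sketch uses only the standard library. The function names are hypothetical, and which hosts and ports to probe depends entirely on your applications.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_endpoints(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Probe each named (host, port) pair and return {name: reachable}."""
    return {name: is_reachable(host, port) for name, (host, port) in endpoints.items()}
```

For instance, calling `check_endpoints({"WebApp01": ("10.0.2.60", 443)})` from a jump host inside the isolated test network would report whether the web tier accepts connections; the address here is a placeholder. A successful connect only shows that the port is open, so application-level health checks are still needed.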


7. Real-World Use Cases

Reference: Nutanix DR Real-World Guide


8. Monitoring and Troubleshooting

Common Issues:


9. CLI/API Examples

List Protection Domains

ncli pd list

Trigger Manual Snapshot

ncli pd add-one-time-snapshot name=prod-dr

Check Replication Status

ncli pd list-replication-status name=prod-dr

10. Best Practices


11. Frequently Asked Questions

Q: Can I replicate between different AHV or AOS versions?
A: Replication is supported within the same major release, but always check the official compatibility matrix for supported version pairs.

Q: How do I automate replication workflows?
A: Use the Nutanix REST API, ncli, or Prism Central Playbooks.

Q: What happens if replication is interrupted?
A: Replication resumes from the last successful snapshot once connectivity is restored.
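
The resume behaviour in the last answer can be pictured as choosing a base snapshot and a list of snapshots still to send. This is a conceptual Python sketch under stated assumptions, not the AOS implementation: the `Snapshot` class, its `replicated` flag, and the integer IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    snapshot_id: int
    replicated: bool  # True once the remote site holds a complete copy


def resume_plan(snapshots: list[Snapshot]) -> tuple[Optional[int], list[int]]:
    """Return (base snapshot id, ids still to send), oldest first.

    The base is the newest snapshot the remote site already holds in full.
    A base of None means nothing has been replicated yet, so a full
    initial transfer is needed.
    """
    ordered = sorted(snapshots, key=lambda s: s.snapshot_id)
    base = None
    for snap in ordered:
        if snap.replicated:
            base = snap.snapshot_id
    pending = [s.snapshot_id for s in ordered if base is None or s.snapshot_id > base]
    return base, pending


# Assumed history: the link dropped while snapshot 3 was in flight.
history = [Snapshot(1, True), Snapshot(2, True), Snapshot(3, False), Snapshot(4, False)]
base, pending = resume_plan(history)
print(base, pending)
```

Here the plan restarts from snapshot 2 and sends 3 and 4, rather than re-sending data the remote site already holds.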

