Digital Thought Disruption

VCF 9.0 GA Mental Model Part 2: Fleet Services vs Instance Management Planes (and Who Owns What)

TL;DR

Standardize on the official hierarchy: VCF private cloud -> VCF fleet -> VCF instance -> VCF domain -> vSphere clusters. A VCF fleet is managed by one set of fleet-level management components (notably VCF Operations and VCF Automation), while each VCF instance keeps its own management domain and domain-level control planes.

Your fastest path to org alignment is separating two things people constantly mix up: fleet services and instance management planes.

Scope and code levels referenced (VCF 9.0 GA core):

Architecture Diagram

Scenario

You need architects, operators, and leadership to agree on:

Assumptions

Core vocabulary recap

Use these terms consistently in meetings, designs, and runbooks:

Core concept: separate fleet services from instance management planes

You get clean operations when you stop trying to force everything into a single “management plane” blob.

Instead, run this mental separation:

Fleet services

These are the things you deploy once per fleet to provide centralized capabilities:

Practical implication: if fleet services are impaired, governance and workflows degrade, but the instance-level control planes do not magically disappear.

Instance management planes

Every instance retains its own control plane boundaries:

This is where most “core infrastructure lifecycle” actually executes.

Domain-level control planes

Each workload domain is its own lifecycle and isolation boundary, typically with:

What runs where in VCF 9.0 GA

A clean greenfield deployment is intentionally opinionated:

Two other details matter for design reviews:

Who owns what

This table is meant to stop “that’s not my job” loops during incidents and upgrades.

| Component or capability | Platform team (VCF) | VI admin (domains and clusters) | App and platform teams |
| --- | --- | --- | --- |
| Fleet bring-up (VCF Installer, fleet creation) | Own | Consult | Inform |
| Fleet-level management components (VCF Operations, fleet management appliance, VCF Automation) | Own | Consult | Inform |
| VCF Identity Broker and VCF Single Sign-On configuration | Own | Consult | Inform |
| SDDC Manager (per instance) | Own (platform governance) | Own day-2 execution | Inform |
| Management domain vCenter and NSX | Shared | Own | Inform |
| Workload domain lifecycle (create domain, add clusters, remediate hosts) | Shared | Own | Inform |
| Workload consumption (Org structure, projects, templates, quotas, policies) | Shared (guardrails) | Consult | Own |
| Backup and restore for fleet management components | Own | Consult | Inform |
| Backup and restore for instance components (SDDC Manager, vCenter, NSX) | Shared (standards) | Own | Inform |
| Day-2 password lifecycle (rotation, remediation) | Own (policy + tooling) | Shared | Inform |
| Certificates and trust (CA integration, renewal cadence) | Own | Shared | Inform |
| DR plans for management components and identity | Own | Consult | Inform |
| DR plans for workload domains and applications | Shared (platform) | Shared (infra) | Own |
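
During an incident it can help to have the ownership table available as a quick lookup. The following is a minimal Bash sketch, not a VCF tool: the function name `ownership`, the capability keys (`fleet-bringup`, `sddc-manager`, and so on), and the team keys (`platform`, `vi-admin`, `app`) are all hypothetical, only four rows of the table are encoded, and the parenthetical qualifiers are dropped for brevity.

```bash
# Hypothetical lookup over a few rows of the ownership table.
# Each row lists the role for: platform team, VI admin, app teams.
ownership() {
  local capability=$1 team=$2 row
  case "$capability" in
    fleet-bringup)        row="Own Consult Inform" ;;
    sddc-manager)         row="Own Own Inform" ;;      # qualifiers omitted
    workload-domain)      row="Shared Own Inform" ;;
    workload-consumption) row="Shared Consult Own" ;;
    *) echo "unknown capability: $capability" >&2; return 1 ;;
  esac
  # Split the row into positional parameters $1..$3.
  set -- $row
  case "$team" in
    platform) echo "$1" ;;
    vi-admin) echo "$2" ;;
    app)      echo "$3" ;;
    *) echo "unknown team: $team" >&2; return 1 ;;
  esac
}

ownership workload-domain vi-admin   # prints Own
```

Extending the `case` statement with the remaining rows keeps the lookup in step with the table, so the table stays the single source of truth.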

Ownership rule of thumb:

Day-0, day-1, day-2 map

This matters because VCF 9.0 pushes more workflows into a centralized console, but it does not eliminate domain-level responsibilities.

Day-0

Design-time decisions that are expensive to change later:

Day-1

Bring-up and initial enablement:

Day-2

Ongoing operations:

Identity and SSO boundaries that actually matter

What VCF Single Sign-On does (and does not)

VCF Single Sign-On is designed to streamline access across multiple VCF components with one authentication source configured from the VCF Operations console.

Key operational detail:

Identity pillars in VCF

Your identity design is built on three pillars:

Important constraint:

VCF Identity Broker deployment modes

Here’s the practical decision point.

| Decision point | Embedded (vCenter service) | Appliance (3-node cluster) |
| --- | --- | --- |
| Where it runs | Inside management domain vCenter | Stand-alone appliances deployed via VCF Operations fleet management |
| Multi-instance recommendation | One per instance | Up to five instances per Identity Broker appliance |
| Availability characteristics | Risk of being tied to mgmt vCenter availability | Designed for higher availability; handles node failure |
| Typical fit | Single instance, simpler environments | Multi-instance, larger environments, stronger availability targets |
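
The multi-instance guidance reduces to simple arithmetic when you plan appliance mode. Here is a minimal Bash sketch, assuming only the "up to five instances per Identity Broker appliance" figure from the table; the helper name `brokers_needed` is hypothetical and this is a planning aid, not a sizing tool from VMware.

```bash
# Hypothetical helper: estimate how many Identity Broker appliance
# deployments (each a 3-node cluster) are needed for a given number
# of VCF instances, assuming at most five instances per appliance.
brokers_needed() {
  local instances=$1
  local per_broker=5
  # Ceiling division: round up so a partly filled appliance still counts.
  echo $(( (instances + per_broker - 1) / per_broker ))
}

brokers_needed 7   # prints 2
```

Treat the result as a lower bound: isolation or availability requirements may justify more Identity Broker deployments than the instance count alone suggests.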

Change management warning: moving from appliance to embedded mode requires resetting the VCF Single Sign-On configuration and re-adding users and groups. Treat the deployment mode decision as day-0.

Challenge: You need shared identity for convenience, but regulated isolation for some tenants

Solutions:

A) Shared enterprise IdP with fleet-wide SSO

B) Cross-instance SSO with multiple Identity Brokers in one fleet

C) Separate fleets for regulated isolation

Topology patterns for single site, two sites, and multi-region

Use the design blueprints as your baseline mental model. Then tune.

Challenge: Your topology is not one-size-fits-all

Solutions:

A) Single site with minimal footprint

B) Two sites in one region

C) Multi-region

Quick comparison:

| Topology | Fleet count | Instance count | Typical SSO scope | Primary operational risk |
| --- | --- | --- | --- | --- |
| Single site | 1 | 1 | Single instance or fleet-wide | Small fault domain, tight coupling |
| Two sites, one region | 1 | 1 | Fleet-wide (common) | Stretched dependencies for management availability |
| Multi-region | 1+ | 2+ | Cross-instance or fleet-wide | Governance dependency on where fleet services run |

Failure domain analysis

This is the conversation leadership actually needs.

Fleet services failure

If VCF Operations, fleet management, or VCF Automation are impaired:

If VCF Identity Broker is down:

Instance management domain failure

If an instance management domain is down:

Workload domain failure

If a workload domain’s vCenter or NSX is degraded:

Example RTO/RPO targets you can start with

These are practical starting points to drive a discussion. Adjust to your business requirements.

Operational runbook snapshot

Shutdown order matters

In multi-instance environments:

Within the management domain that hosts fleet services, a typical shutdown sequence starts with:

Operational gotcha:

Backups: get the SFTP target right early

You should treat SFTP backup targets as day-1 prerequisites, not an afterthought.

Password lifecycle: know which system is authoritative

To run the lookup_passwords command on SDDC Manager, use this Bash example:

```bash
# On the SDDC Manager appliance
sudo lookup_passwords
```

Fast validation: confirm build levels in your environment

Before you start production onboarding, validate you are actually running the expected code level.

Use this PowerShell example with PowerCLI to validate vCenter and ESXi versions:

```powershell
# Connect to vCenter (replace <vcenter_fqdn> with your vCenter FQDN)
Connect-VIServer -Server <vcenter_fqdn>

# vCenter build and version
$about = (Get-View ServiceInstance).Content.About
[PSCustomObject]@{
  Product = $about.FullName
  Version = $about.Version
  Build   = $about.Build
}

# ESXi hosts build and version
Get-VMHost | Sort-Object Name | Select-Object Name, Version, Build
```

Anti-patterns

Avoid these early, or you will pay in incident response time later:

Summary and takeaways

Conclusion

VCF 9.0 becomes dramatically easier to operate when everyone can point to the same boundaries: fleet services, instance management planes, and domain-level control planes.

That shared mental model is what lets you scale without scaling confusion.

Sources

VMware Cloud Foundation 9.0 Documentation (VCF 9.0 and later): https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-9-0-and-later/9-0.html
