@olliewalsh

Create a scenario to adopt DCN deployments, based on the HCI scenario.

olliewalsh and others added 6 commits December 17, 2025 15:24
Create a scenario to adopt DCN deployments, based on the HCI scenario.

Signed-off-by: Oliver Walsh <owalsh@redhat.com>
The DCN deployment templates are missing route definitions and
the DCN roles were using central subnets instead of their own.
This patch fixes that problem by making the following changes.

1. Network Routes Added (network_data.yaml.j2)

  a. dcn1/network_data.yaml.j2: Added routes to InternalApi,
     Storage, and Tenant subnets pointing to central
     (172.17.0.0/24, 172.18.0.0/24, 172.19.0.0/24) and dcn2
     (172.17.20.0/24, 172.18.20.0/24, 172.19.20.0/24)

  b. dcn2/network_data.yaml.j2: Added routes to InternalApi,
     Storage, and Tenant subnets pointing to central
     (172.17.0.0/24, 172.18.0.0/24, 172.19.0.0/24) and dcn1
     (172.17.10.0/24, 172.18.10.0/24, 172.19.10.0/24)

  c. central/network_data.yaml.j2: Added routes to InternalApi,
     Storage, and Tenant subnets pointing to dcn1
     (172.17.10.0/24, 172.18.10.0/24, 172.19.10.0/24) and dcn2
     (172.17.20.0/24, 172.18.20.0/24, 172.19.20.0/24)

2. Control Plane Routes Added (config_download.yaml)

  a. dcn1/config_download.yaml: Added host_routes to
     leaf1 subnet for central (192.168.122.0/24) and
     dcn2 (192.168.144.0/24)

  b. dcn2/config_download.yaml: Added host_routes to
     leaf2 subnet for central (192.168.122.0/24) and
     dcn1 (192.168.133.0/24)

  c. central/config_download.yaml: Added host_routes to
     ctlplane-subnet for dcn1 (192.168.133.0/24) and dcn2
     (192.168.144.0/24)

3. Subnet References Fixed (roles.yaml)

  a. dcn1/roles.yaml: Changed ComputeDcn1 networks to use
     internal_api_leaf1, tenant_leaf_1, storage_leaf1

  b. dcn2/roles.yaml: Changed ComputeDcn2 networks to use
     internal_api_leaf2, tenant_leaf_2, storage_leaf2

Signed-off-by: John Fulton <fulton@redhat.com>
Co-authored-by: Claude <claude@anthropic.com>
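
The route lists in the commit message above follow one rule: on each network, a site needs a route to the matching subnet of every other site. A minimal Python sketch of that rule, using only the CIDRs quoted above. The `SITE_SUBNETS` table and the `remote_destinations` helper are hypothetical names for illustration and are not part of the patch:

```python
# Subnets per site, copied from the commit message above.
SITE_SUBNETS = {
    "central": {
        "ctlplane": "192.168.122.0/24",
        "internal_api": "172.17.0.0/24",
        "storage": "172.18.0.0/24",
        "tenant": "172.19.0.0/24",
    },
    "dcn1": {
        "ctlplane": "192.168.133.0/24",
        "internal_api": "172.17.10.0/24",
        "storage": "172.18.10.0/24",
        "tenant": "172.19.10.0/24",
    },
    "dcn2": {
        "ctlplane": "192.168.144.0/24",
        "internal_api": "172.17.20.0/24",
        "storage": "172.18.20.0/24",
        "tenant": "172.19.20.0/24",
    },
}


def remote_destinations(site: str) -> dict[str, list[str]]:
    """Return, per network, the subnets of every site other than `site`."""
    return {
        network: [
            subnets[network]
            for other, subnets in SITE_SUBNETS.items()
            if other != site
        ]
        for network in SITE_SUBNETS[site]
    }
```

For example, `remote_destinations("dcn1")["internal_api"]` lists the central and dcn2 InternalApi subnets, matching item 1a. The sketch only derives destinations; next hops are not stated in the commit message and are left out.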
The new control plane defined in the architecture repo
(examples/dt/dcn_nostorage/control-plane/nncp/values.yaml)
uses unique VLAN IDs per site: central 20-23, dcn1 30-33,
and dcn2 40-43.

The old control plane defined in the data-plane-adoption repo
(tests/vars.dcn_nostorage.yaml) uses the same VLAN IDs at every
site: central 20-23, dcn1 20-23, and dcn2 20-23.

This mismatch causes problems during adoption testing that require
manual renumbering to fix. This patch updates the original 17
test control plane to use the same unique VLAN IDs per site.

Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
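
The mismatch described above can be checked mechanically. A minimal Python sketch, assuming the VLAN ranges quoted in the commit message; `OLD_VLANS`, `NEW_VLANS`, and `sites_needing_renumbering` are hypothetical names, and the real values live in the two YAML files named above:

```python
# VLAN ID ranges per site, as quoted in the commit message above.
OLD_VLANS = {  # tests/vars.dcn_nostorage.yaml before this patch
    "central": range(20, 24),
    "dcn1": range(20, 24),
    "dcn2": range(20, 24),
}
NEW_VLANS = {  # examples/dt/dcn_nostorage/control-plane/nncp/values.yaml
    "central": range(20, 24),
    "dcn1": range(30, 34),
    "dcn2": range(40, 44),
}


def sites_needing_renumbering(source, target):
    """Return the sites whose VLAN IDs differ between source and target."""
    return sorted(
        site
        for site in target
        if list(source.get(site, [])) != list(target[site])
    )
```

With the values above, `sites_needing_renumbering(OLD_VLANS, NEW_VLANS)` reports dcn1 and dcn2, the two sites that previously had to be renumbered by hand.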
DCN deployments on RHEL 9.4 hypervisors require loose reverse path
filtering to allow asymmetric routing between DCN compute nodes and
the central controller.

Set KernelIpv4ConfAllRpFilter=2 to set net.ipv4.conf.all.rp_filter=2
on all overcloud nodes.

Without this setting, DCN compute nodes cannot communicate with the
central controller's Keystone service during deployment, causing the
nova_wait_for_compute_service task to fail.

Note: This issue does not occur on CentOS Stream 9 hypervisors.

Note: This assumes the same setting has already been applied on
the hypervisor hosting the VMs.

Also, remove redundant ControllerExtraConfig and set
nova::availability_zone::default_schedule_zone: az-central
using the single ControllerExtraConfig.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
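
As background on why setting only the `all` value is sufficient: per the Linux kernel's ip-sysctl documentation, source validation on an interface uses the maximum of `conf/all/rp_filter` and that interface's own `rp_filter`, where 0 is no validation, 1 is strict mode, and 2 is loose mode. A minimal Python sketch of that rule; the function and table names are hypothetical and this models the documented behaviour rather than reading any live system:

```python
# rp_filter values as described in the kernel's ip-sysctl documentation.
RP_FILTER_MODES = {0: "off", 1: "strict", 2: "loose"}


def effective_rp_filter(all_value: int, iface_value: int) -> str:
    """Return the mode applied on an interface.

    The kernel uses the larger of conf/all/rp_filter and the
    per-interface rp_filter when validating source addresses.
    """
    return RP_FILTER_MODES[max(all_value, iface_value)]
```

So with `net.ipv4.conf.all.rp_filter=2`, an interface left at strict mode (1) is still validated in loose mode, which is what permits the asymmetric paths described above.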
Add network-specific routes to dcn_nostorage.yaml stack configurations
to enable ci-framework's os-net-config template to render cross-site
routes for DCN compute nodes.

Problem:
- ci-framework pre-generates /etc/os-net-config/tripleo_config.yaml
  via ansible template before TripleO Heat deployment runs
- This bypasses TripleO Heat NIC templates completely
- Routes defined in network_data.yaml.j2 are never rendered to
  compute nodes

Solution:
- Add network_routes configuration to dcn1 and dcn2 stacks in
  dcn_nostorage.yaml
- ci-framework's os_net_config_overcloud.yml.j2 template will consume
  these routes and render them to /etc/os-net-config/tripleo_config.yaml

Routes Added:
- DCN1: Routes to central (172.17.0.0/24, 172.18.0.0/24, 172.19.0.0/24)
  and dcn2 (172.17.20.0/24, 172.18.20.0/24, 172.19.20.0/24) via
  appropriate gateways
- DCN2: Routes to central (172.17.0.0/24, 172.18.0.0/24, 172.19.0.0/24)
  and dcn1 (172.17.10.0/24, 172.18.10.0/24, 172.19.10.0/24) via
  appropriate gateways

This enables DCN compute nodes to reach OVN southbound DB and other
services on central controllers using the correct source IP addresses.

Related: Commit bfc6d4d added routes to network_data.yaml.j2, but
those routes were never being used due to ci-framework bypassing
Heat templates.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
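
To illustrate what the rendered result might contain, here is a minimal Python sketch that builds route entries using os-net-config's `ip_netmask` and `next_hop` keys. It does not reproduce ci-framework's template or the `network_routes` schema. The helper name is hypothetical, and the gateway choice (first address of the local subnet) is an assumption, since the commit message says only "appropriate gateways":

```python
import ipaddress


def os_net_config_routes(local_subnet, remote_subnets, gateway=None):
    """Build os-net-config style route entries for one network.

    ASSUMPTION: when no gateway is given, the first address of the
    local subnet is used as the next hop. The real gateways are not
    stated in the commit message.
    """
    net = ipaddress.ip_network(local_subnet)
    next_hop = gateway or str(net.network_address + 1)
    return [
        {"ip_netmask": dest, "next_hop": next_hop}
        for dest in remote_subnets
    ]
```

For the dcn1 InternalApi network this could be called as `os_net_config_routes("172.17.10.0/24", ["172.17.0.0/24", "172.17.20.0/24"])`, yielding one entry per remote site.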
Add routes field to DCN subnet definitions in netconfig_networks to enable
proper inter-site connectivity. Routes are templated from edpm_dcn1_routes
and edpm_dcn2_routes variables.

The NetConfig controller propagates these routes to IPSet status, which the
openstack-operator inventory generator reads to create {network}_host_routes
ansible variables for the EDPM network configuration template.

Changes:
- Add routes to internalapidcn1/dcn2 subnets for RabbitMQ/API connectivity
- Add routes to storagedcn1/dcn2 subnets for storage traffic
- Add routes to tenantdcn1/dcn2 subnets for tenant network traffic

This fixes the issue where DCN compute nodes couldn't connect to RabbitMQ
and other control plane services because they lacked routes to the central site.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
@openshift-ci

openshift-ci bot commented Dec 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci bot commented Dec 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign archana203 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@olliewalsh olliewalsh requested a review from fultonj December 17, 2025 15:28
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0f929a6fc0034828bbfa64369db4e79e

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 2h 06m 20s
adoption-standalone-to-crc-no-ceph FAILURE in 2h 13m 10s
