Add support for DCN adoption #1184
Create a scenario to adopt DCN deployments, based on the HCI scenario.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
The DCN deployment templates are missing route definitions and
the DCN roles were using central subnets instead of their own.
This patch fixes that problem by making the following changes.
1. Network Routes Added (network_data.yaml.j2)
a. dcn1/network_data.yaml.j2: Added routes to InternalApi,
Storage, and Tenant subnets pointing to central
(172.17.0.0/24, 172.18.0.0/24, 172.19.0.0/24) and dcn2
(172.17.20.0/24, 172.18.20.0/24, 172.19.20.0/24)
b. dcn2/network_data.yaml.j2: Added routes to InternalApi,
Storage, and Tenant subnets pointing to central
(172.17.0.0/24, 172.18.0.0/24, 172.19.0.0/24) and dcn1
(172.17.10.0/24, 172.18.10.0/24, 172.19.10.0/24)
c. central/network_data.yaml.j2: Added routes to InternalApi,
Storage, and Tenant subnets pointing to dcn1
(172.17.10.0/24, 172.18.10.0/24, 172.19.10.0/24) and dcn2
(172.17.20.0/24, 172.18.20.0/24, 172.19.20.0/24)
2. Control Plane Routes Added (config_download.yaml)
a. dcn1/config_download.yaml: Added host_routes to
leaf1 subnet for central (192.168.122.0/24) and
dcn2 (192.168.144.0/24)
b. dcn2/config_download.yaml: Added host_routes to
leaf2 subnet for central (192.168.122.0/24) and
dcn1 (192.168.133.0/24)
c. central/config_download.yaml: Added host_routes to
ctlplane-subnet for dcn1 (192.168.133.0/24) and dcn2
(192.168.144.0/24)
3. Subnet References Fixed (roles.yaml)
a. dcn1/roles.yaml: Changed ComputeDcn1 networks to use
internal_api_leaf1, tenant_leaf_1, storage_leaf1
b. dcn2/roles.yaml: Changed ComputeDcn2 networks to use
internal_api_leaf2, tenant_leaf_2, storage_leaf2
Signed-off-by: John Fulton <fulton@redhat.com>
Co-authored-by: Claude <claude@anthropic.com>
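The route layout described in the commit above can be sketched in a few lines of Python. This is only an illustration, not the template code from the patch: the subnets come from the commit message (read in the order InternalApi, Storage, Tenant), while the `assumed_gateway` helper and its "first address of the local subnet" rule are assumptions, since the commit does not state the next-hop addresses.

```python
import ipaddress

# Per-site subnets as listed in the commit message above.
SITE_SUBNETS = {
    "central": {"InternalApi": "172.17.0.0/24", "Storage": "172.18.0.0/24", "Tenant": "172.19.0.0/24"},
    "dcn1": {"InternalApi": "172.17.10.0/24", "Storage": "172.18.10.0/24", "Tenant": "172.19.10.0/24"},
    "dcn2": {"InternalApi": "172.17.20.0/24", "Storage": "172.18.20.0/24", "Tenant": "172.19.20.0/24"},
}


def assumed_gateway(cidr):
    """ASSUMPTION: the next hop is the first address of the local subnet."""
    return str(ipaddress.ip_network(cidr).network_address + 1)


def cross_site_routes(site):
    """For each network at `site`, list routes to the same network at every other site."""
    routes = {}
    for network, local_cidr in SITE_SUBNETS[site].items():
        nexthop = assumed_gateway(local_cidr)
        routes[network] = [
            {"destination": subnets[network], "nexthop": nexthop}
            for other, subnets in SITE_SUBNETS.items()
            if other != site
        ]
    return routes


if __name__ == "__main__":
    for network, entries in cross_site_routes("dcn1").items():
        print(network, entries)
```

Each site ends up with two routes per network, one per remote site, which matches the lists in items 1a to 1c.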
The new control plane defined in the architecture repo
(examples/dt/dcn_nostorage/control-plane/nncp/values.yaml) uses unique
VLAN IDs per site: central 20-23, dcn1 30-33 and dcn2 40-43. The old
control plane defined in the data-plane-adoption repo
(tests/vars.dcn_nostorage.yaml) uses the same VLAN IDs at every site:
central 20-23, dcn1 20-23 and dcn2 20-23. This mismatch causes problems
during adoption testing that require manual renumbering to fix.
This patch updates the original 17 test control plane to use the same
unique VLAN IDs per site.
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
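The difference between the two VLAN layouts can be checked mechanically. The sketch below is illustrative only: the ranges are the ones quoted in the commit message, and the `overlapping_sites` helper is a hypothetical name, not part of either repo.

```python
from itertools import combinations

# VLAN ID ranges per site as quoted in the commit message (range end is exclusive).
NEW_VLANS = {"central": range(20, 24), "dcn1": range(30, 34), "dcn2": range(40, 44)}
OLD_VLANS = {"central": range(20, 24), "dcn1": range(20, 24), "dcn2": range(20, 24)}


def overlapping_sites(vlans_by_site):
    """Return (site_a, site_b, shared_ids) for every pair of sites that shares VLAN IDs."""
    clashes = []
    for (site_a, ids_a), (site_b, ids_b) in combinations(vlans_by_site.items(), 2):
        shared = sorted(set(ids_a) & set(ids_b))
        if shared:
            clashes.append((site_a, site_b, shared))
    return clashes


if __name__ == "__main__":
    print("new layout clashes:", overlapping_sites(NEW_VLANS))
    print("old layout clashes:", overlapping_sites(OLD_VLANS))
```

With the new layout no pair of sites shares a VLAN ID, whereas the old layout clashes for every pair of sites.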
DCN deployments on RHEL 9.4 hypervisors require loose reverse path
filtering to allow asymmetric routing between DCN compute nodes and the
central controller. Set KernelIpv4ConfAllRpFilter=2 to set
net.ipv4.conf.all.rp_filter=2 on all overcloud nodes.
Without this setting, DCN compute nodes cannot communicate with the
central controller's Keystone service during deployment, causing the
nova_wait_for_compute_service task to fail.
Note: This issue does not occur on CentOS Stream 9 hypervisors.
Note: It is assumed that the same setting has already been applied on
the hypervisor hosting the VMs.
Also, remove the redundant ControllerExtraConfig and set
nova::availability_zone::default_schedule_zone: az-central using the
single ControllerExtraConfig.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
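To make the rp_filter setting concrete: the Linux kernel uses the larger of the `conf/all` and per-interface `rp_filter` values when validating a packet's source on an interface, so setting `all` to 2 yields loose mode on every interface. The sketch below models that rule; the helper names are hypothetical, and `read_rp_filter` assumes a Linux host with the usual `/proc/sys` layout.

```python
from pathlib import Path

# Meaning of the rp_filter sysctl values in the Linux kernel.
RP_FILTER_MODES = {0: "off", 1: "strict", 2: "loose"}


def read_rp_filter(iface="all", root="/proc/sys/net/ipv4/conf"):
    """Read the rp_filter value for an interface (Linux only)."""
    return int(Path(root, iface, "rp_filter").read_text().strip())


def effective_rp_filter(all_value, iface_value):
    """The kernel applies max(conf/all, conf/<iface>) for source validation."""
    return max(all_value, iface_value)


def describe(all_value, iface_value):
    """Name the mode that actually applies on the interface."""
    return RP_FILTER_MODES[effective_rp_filter(all_value, iface_value)]


if __name__ == "__main__":
    # With all=2, an interface left at strict (1) is still validated in loose mode.
    print(describe(2, 1))
```

Loose mode accepts a packet if its source address is reachable through any interface, which is what asymmetric routing between sites needs.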
Add network-specific routes to dcn_nostorage.yaml stack configurations
to enable ci-framework's os-net-config template to render cross-site
routes for DCN compute nodes.
Problem:
- ci-framework pre-generates /etc/os-net-config/tripleo_config.yaml via
  an ansible template before the TripleO Heat deployment runs
- This bypasses the TripleO Heat NIC templates completely
- Routes defined in network_data.yaml.j2 are never rendered to compute
  nodes
Solution:
- Add network_routes configuration to the dcn1 and dcn2 stacks in
  dcn_nostorage.yaml
- ci-framework's os_net_config_overcloud.yml.j2 template will consume
  these routes and render them to
  /etc/os-net-config/tripleo_config.yaml
Routes Added:
- DCN1: Routes to central (172.17.0.0/24, 172.18.0.0/24,
  172.19.0.0/24) and dcn2 (172.17.20.0/24, 172.18.20.0/24,
  172.19.20.0/24) via appropriate gateways
- DCN2: Routes to central (172.17.0.0/24, 172.18.0.0/24,
  172.19.0.0/24) and dcn1 (172.17.10.0/24, 172.18.10.0/24,
  172.19.10.0/24) via appropriate gateways
This enables DCN compute nodes to reach the OVN southbound DB and other
services on central controllers using the correct source IP addresses.
Related: Commit bfc6d4d added routes to network_data.yaml.j2, but those
routes were never being used due to ci-framework bypassing Heat
templates.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
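As a rough picture of what the rendered result looks like, the sketch below turns a list of destination/next-hop pairs into os-net-config style route entries, which use the `ip_netmask` and `next_hop` keys. It does not reproduce the ci-framework template or the `network_routes` schema, neither of which is shown in this commit; the input shape, the function name, and the 172.17.10.1 next hop are all assumptions made for illustration.

```python
import json


def to_os_net_config_routes(routes):
    """Map destination/nexthop pairs to os-net-config route entries."""
    return [
        {"ip_netmask": route["destination"], "next_hop": route["nexthop"]}
        for route in routes
    ]


# dcn1 InternalApi routes to central and dcn2 (destinations from the commit message).
# ASSUMPTION: 172.17.10.1 is the local gateway; the commit does not state it.
dcn1_internal_api = [
    {"destination": "172.17.0.0/24", "nexthop": "172.17.10.1"},
    {"destination": "172.17.20.0/24", "nexthop": "172.17.10.1"},
]

rendered = to_os_net_config_routes(dcn1_internal_api)

if __name__ == "__main__":
    # Printed as JSON only to avoid a third-party YAML dependency.
    print(json.dumps({"routes": rendered}, indent=2))
```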
Add routes field to DCN subnet definitions in netconfig_networks to enable
proper inter-site connectivity. Routes are templated from edpm_dcn1_routes
and edpm_dcn2_routes variables.
The NetConfig controller propagates these routes to IPSet status, which the
openstack-operator inventory generator reads to create {network}_host_routes
ansible variables for the EDPM network configuration template.
Changes:
- Add routes to internalapidcn1/dcn2 subnets for RabbitMQ/API connectivity
- Add routes to storagedcn1/dcn2 subnets for storage traffic
- Add routes to tenantdcn1/dcn2 subnets for tenant network traffic
This fixes the issue where DCN compute nodes couldn't connect to RabbitMQ
and other control plane services because they lacked routes to central site.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
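The failure mode described in the commit above (no route from a DCN subnet to the central site) can be illustrated with a small reachability check. This is a simplified model, not the operator's logic: it ignores default routes, the function name is hypothetical, and the RabbitMQ address 172.17.0.80 is a made-up example inside the central InternalApi subnet.

```python
import ipaddress


def is_reachable(dest_ip, local_cidr, routes):
    """True if dest_ip is on the local subnet or covered by one of the routes.

    Simplification: default routes are deliberately not considered.
    """
    dest = ipaddress.ip_address(dest_ip)
    if dest in ipaddress.ip_network(local_cidr):
        return True
    return any(dest in ipaddress.ip_network(route["destination"]) for route in routes)


DCN1_INTERNALAPI = "172.17.10.0/24"
RABBITMQ_IP = "172.17.0.80"  # hypothetical address on the central InternalApi subnet

# Route to the central InternalApi subnet; the next hop is an assumed value.
routes_after_patch = [{"destination": "172.17.0.0/24", "nexthop": "172.17.10.1"}]

if __name__ == "__main__":
    print("before:", is_reachable(RABBITMQ_IP, DCN1_INTERNALAPI, []))
    print("after: ", is_reachable(RABBITMQ_IP, DCN1_INTERNALAPI, routes_after_patch))
```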
Build failed (check pipeline). https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0f929a6fc0034828bbfa64369db4e79e ✔️ noop SUCCESS in 0s