@gursewak1997
Collaborator

Configure QEMU user-mode networking to use host DNS servers from /etc/resolv.conf instead of the default 10.0.2.3, which doesn't work when QEMU runs inside containers.

@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch 2 times, most recently from d5e8558 to f693004 Compare December 2, 2025 06:26
Collaborator

@cgwalters cgwalters left a comment

I'm fine with this overall if you are; a few nits.

That said, an integration test (run as part of just test-integration ephemeral) would be both easy and IMO mandatory for changes like this.

@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch 8 times, most recently from 542253f to 8b859ad Compare December 2, 2025 23:18
@gursewak1997 gursewak1997 marked this pull request as ready for review December 3, 2025 01:25
@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch 3 times, most recently from dcc3ad4 to 9996db4 Compare December 5, 2025 19:52
@gursewak1997
Collaborator Author

gursewak1997 commented Dec 5, 2025

Network connectivity test (podman pull) failed: stdout: Error: configure storage: 'overlay' is not supported over overlayfs, a mount_program is required: backing file system is unsupported for this graph driver
I can switch to something simpler that doesn't require podman.
Edit: Switching the test to use HTTP instead.

@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch from 9996db4 to 5e28fd5 Compare December 5, 2025 21:31
@cgwalters
Collaborator

cgwalters commented Dec 5, 2025

Error: configure storage: 'overlay' is not supported over overlayfs, a mount_program is required: backing file system is unsupported for this graph driver

Ah yes that relates to #22 - basically all of /var needs to be equivalent to a VOLUME in docker/podman terms - it needs to be copied up to a tmpfs.

Collaborator

@cgwalters cgwalters left a comment

One nit; otherwise looks sane (though I'd reiterate we probably really do want a bigger fix of --net=host per the issue, but this helps for now).

@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch from 5e28fd5 to 57af585 Compare December 5, 2025 21:52
@gursewak1997 gursewak1997 linked an issue Dec 5, 2025 that may be closed by this pull request
@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch from 57af585 to eb89500 Compare December 6, 2025 08:00
@gursewak1997
Collaborator Author

Reworking this PR as I think it will require a few more changes.

@cgwalters cgwalters marked this pull request as draft December 10, 2025 17:35
@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch 2 times, most recently from 5cf80aa to 3bf0686 Compare December 10, 2025 22:25
@gursewak1997 gursewak1997 marked this pull request as ready for review December 10, 2025 23:04
@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch from 3bf0686 to 0cd76fc Compare December 11, 2025 22:33
@gursewak1997
Collaborator Author

This one's just timing out on the test_to_disk_for_image_quay_io_centos_bootc_centos_bootc_stream9 test, but it should be good to go.

@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch from 0cd76fc to 998adc2 Compare December 13, 2025 00:00
QEMU's slirp reads /etc/resolv.conf from the container namespace,
which contains unreachable bridge DNS servers. On systemd-resolved
hosts, it only has 127.0.0.53 (stub resolver).

Read upstream DNS servers from host's /run/systemd/resolve/resolv.conf,
pass them to the container, and write /etc/resolv.conf before starting QEMU.

Signed-off-by: gursewak1997 <gursmangat@gmail.com>
@gursewak1997 gursewak1997 force-pushed the fix/dns-resolution-ephemeral-guests branch from 998adc2 to b91195d Compare December 15, 2025 20:39
@gursewak1997 gursewak1997 merged commit 800e600 into main Dec 15, 2025
7 checks passed
@gursewak1997 gursewak1997 deleted the fix/dns-resolution-ephemeral-guests branch December 15, 2025 22:03
@cgwalters
Collaborator

So I get this from every run now:

WARN No usable DNS servers found in system configuration, falling back to public DNS (8.8.8.8, 1.1.1.1). This may not work in air-gapped environments.

Looks like what's happening is when I'm on my work VPN the DNS is switched over to 10.x which matches is_private. It's not totally clear to me why we'd need to filter out private IP addresses.

In general I would say we probably want to get out of the business of parsing DNS at all, sooner rather than later, and instead ensure that libvirt is handling networking consistently.

gursewak1997 added a commit that referenced this pull request Dec 18, 2025
PR #167 introduced DNS filtering that excluded all private IP addresses
(10.x, 172.16-31.x, 192.168.x, fc00::/7) assuming they would be
unreachable from QEMU's slirp networking. However, this breaks VPN
scenarios where private DNS servers are actually reachable.

This change removes the overly aggressive private IP filtering, now
only filtering out localhost and link-local addresses. Private network
DNS servers are allowed through since they may be reachable (e.g., via
VPN or air-gapped networks). If they're actually unreachable, DNS will
fail naturally, which is better than prematurely filtering them out.

Also downgraded the fallback warning from WARN to debug level since
falling back to public DNS is a normal case, not an error condition.

Assisted-by: Claude Code (Sonnet 4.5)
Signed-off-by: gursewak1997 <gursmangat@gmail.com>
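
The filtering rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names `is_usable_dns_server` and `filter_dns_servers` are hypothetical, and the IPv6 link-local check is written by hand against fe80::/10 so the sketch does not depend on a recent standard library.

```rust
use std::net::IpAddr;

/// Hypothetical helper (name not taken from the PR): a server is kept
/// unless it is a loopback or link-local address. Private ranges such as
/// 10.x or fc00::/7 are deliberately allowed through.
fn is_usable_dns_server(ip: &IpAddr) -> bool {
    match ip {
        // 127.0.0.0/8 (e.g. the 127.0.0.53 stub resolver) and 169.254.0.0/16
        IpAddr::V4(v4) => !v4.is_loopback() && !v4.is_link_local(),
        IpAddr::V6(v6) => {
            // fe80::/10 is the IPv6 link-local unicast range.
            let is_link_local = (v6.segments()[0] & 0xffc0) == 0xfe80;
            !v6.is_loopback() && !is_link_local
        }
    }
}

/// Hypothetical helper: keep only the usable servers, preserving order.
fn filter_dns_servers(servers: &[IpAddr]) -> Vec<IpAddr> {
    servers
        .iter()
        .copied()
        .filter(is_usable_dns_server)
        .collect()
}
```

With this rule a VPN-provided resolver such as 10.0.0.2 survives the filter, while 127.0.0.53 and fe80::1 are dropped.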
@gursewak1997
Collaborator Author

So I get this from every run now:

WARN No usable DNS servers found in system configuration, falling back to public DNS (8.8.8.8, 1.1.1.1). This may not work in air-gapped environments.

Looks like what's happening is when I'm on my work VPN the DNS is switched over to 10.x which matches is_private. It's not totally clear to me why we'd need to filter out private IP addresses.

A fix for now: #182

cgwalters pushed a commit that referenced this pull request Dec 19, 2025
Collaborator

@cgwalters cgwalters left a comment

Assisted-by: OpenCode (Claude Sonnet 4)

AI-generated review based on REVIEW.md guidelines.
Comments prefixed with "AI:" are unedited AI output.

Post-merge cross-check review

The key issue (filtering private network DNS servers) was already identified and fixed in follow-up commit 8f8d301. Overall this is a well-structured fix with good documentation and testing.


// Filter out localhost, link-local, and private network addresses
// QEMU runs in user networking mode (slirp) inside a container, which cannot
// reach private network addresses (10.x.x.x, 172.16-31.x.x, 192.168.x.x for IPv4,
Collaborator

AI: Important: The filtering of private network addresses (!ip.is_private() for IPv4, ULA check for IPv6) will break DNS for many enterprise/VPN users whose DNS servers are on private networks. Consider only filtering loopback and link-local. (Note: This was fixed in follow-up commit 8f8d301)

"-c",
r#"
if command -v curl >/dev/null 2>&1; then
curl -sS --max-time 10 https://quay.io/v2/ >/dev/null
Collaborator

AI: (low) This test requires external network access to quay.io, which may fail in air-gapped CI environments or during quay.io outages. Consider documenting this as an integration test requirement, or adding a skip mechanism for offline testing.
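
One possible shape for the skip mechanism suggested here, as a sketch only: an environment variable that the test consults before touching the network. The variable name `BCVK_TEST_OFFLINE` and both function names are assumptions made for illustration; nothing in the PR defines them.

```rust
use std::env;

/// Hypothetical variable name, chosen for this sketch; not defined by the PR.
const OFFLINE_ENV_VAR: &str = "BCVK_TEST_OFFLINE";

/// Interpret the raw value of the variable. Unset, empty, "0", and
/// "false" (any case) mean the network may be used; anything else
/// requests offline mode.
fn offline_requested(raw: Option<&str>) -> bool {
    match raw.map(str::trim) {
        None | Some("") | Some("0") => false,
        Some(v) => !v.eq_ignore_ascii_case("false"),
    }
}

/// Hypothetical entry point a network-dependent test could call first
/// and return early when it yields true.
fn should_skip_network_test() -> bool {
    offline_requested(env::var(OFFLINE_ENV_VAR).ok().as_deref())
}
```

Keeping the value parsing in a pure function (`offline_requested`) means the decision logic can be tested without mutating the process environment.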

pub host_dns_servers: Option<Vec<String>>,
}

/// Parse DNS servers from resolv.conf format content
Collaborator

AI: Good structure - having parse_resolv_conf take &str makes it unit-testable without filesystem dependencies. Consider adding unit tests for edge cases (empty file, malformed lines, mixed IPv4/IPv6).
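
To make the suggested edge cases concrete, here is a minimal sketch of a parser with that shape. Only the function name and the `&str` parameter come from the review comment; the return type, the handling of IPv6 zone suffixes, and the decision to silently skip malformed lines are assumptions, and the real implementation may differ.

```rust
use std::net::IpAddr;

/// Sketch of a resolv.conf parser that takes the file content as a string.
/// Assumed behavior: return every `nameserver` entry that parses as an IP
/// address, in file order, and ignore everything else.
fn parse_resolv_conf(content: &str) -> Vec<IpAddr> {
    content
        .lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            // Comments ("# ..." or "; ...") and other directives such as
            // "search" fail this check and are skipped.
            if parts.next()? != "nameserver" {
                return None;
            }
            // Drop an IPv6 zone suffix such as "%eth0" before parsing.
            let addr = parts.next()?.split('%').next()?;
            // Malformed addresses are skipped rather than treated as errors.
            addr.parse::<IpAddr>().ok()
        })
        .collect()
}
```

Because the function never touches the filesystem, the empty-file, malformed-line, and mixed IPv4/IPv6 cases can each be covered with a string literal.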

@cgwalters
Collaborator

⬆️ was a test case for bootc-dev/infra#64

@gursewak1997
Collaborator Author

I like how it pointed out an issue and also mentioned (Note: This was fixed in follow-up commit https://github.com/bootc-dev/bcvk/commit/8f8d301a453d5cfeff5e329c8f3da94e18a0cbad)

Development

Successfully merging this pull request may close these issues.

DNS resolving not working in ephemeral guests
