@rsmarples
Member

DHCPv6 RELEASE requires the addresses to be dropped before a RELEASE message is sent.
We now wait for an acknowledgement or a timeout before notifying that DHCPv6 has stopped for the interface.

DHCPv4 RELEASE is the other way around: there is no acknowledgement, so we wait 1 second after sending the message before removing the address and notifying that DHCP has stopped for the interface.

If we are not releasing then we notify dhcpcd that the protocol has stopped right away when we drop the lease.

dhcpcd will exit once there are no running protocols for the interfaces.
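
The exit rule described above can be sketched as a small model. This is only an illustration, not dhcpcd's code: the `enum proto` bits, `struct iface`, `iface_running()` and `proto_dropped()` are hypothetical names, and the real daemon inspects per-protocol state rather than a bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical protocol bits; dhcpcd tracks real per-protocol state. */
enum proto {
	PROTO_DHCP   = 1 << 0,
	PROTO_IPV4LL = 1 << 1,
	PROTO_DHCP6  = 1 << 2
};

struct iface {
	const char *name;
	unsigned int running;	/* bitmask of protocols still active */
};

/* True while at least one protocol is still running on the interface. */
static bool
iface_running(const struct iface *ifp)
{
	return ifp->running != 0;
}

/*
 * Called by a protocol once it has fully stopped on ifp.
 * Returns true when no interface has a running protocol left,
 * which is the point at which the daemon could exit.
 */
static bool
proto_dropped(struct iface *ifaces, size_t n, struct iface *ifp, enum proto p)
{
	size_t i;

	assert(ifp != NULL);
	ifp->running &= ~(unsigned int)p;
	for (i = 0; i < n; i++) {
		if (iface_running(&ifaces[i]))
			return false;
	}
	return true;
}
```

In this model each protocol reports in when it is done, whether that is after an acknowledgement, a timeout, or immediately, and the caller exits only when `proto_dropped()` returns true.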

Fixes #513.
Hopefully #535, #519 and #509 as well.

@coderabbitai

coderabbitai bot commented Oct 6, 2025

Walkthrough

Centralizes DHCPv4 teardown into dhcp_deconfigure with delayed deconfigure on release, tightens STOPPING/EXITING guards across protocol handlers, adjusts DHCPv6 release sequencing and callbacks, signals interface teardown via dhcpcd_dropped, and coordinates daemon exit with interface-stop tracking and an exit timeout.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| DHCPv4 cleanup & drop / src/dhcp.c | Add static dhcp_deconfigure() centralizing DHCPv4 teardown; dhcp_drop() now calls dhcpcd_dropped(ifp) on early exits, schedules a 1s delayed deconfigure when releasing, and funnels cleanup through dhcp_deconfigure(). dhcp_free() clears ifp->if_data[IF_DATA_DHCP]. |
| DHCPv6 release sequencing / src/dhcp6.c | dhcp6_startrelease() enforces MRC = REL_MAX_RC and MRCcallback = dhcp6_finishrelease, calls dhcp6_finishrelease() on makemessage failure, and dhcp6_recvif() handles DH6S_RELEASE ack by finishing release. dhcp6_freedrop() invokes dhcpcd_dropped(ifp). |
| Core stop/exit orchestration / src/dhcpcd.c, src/dhcpcd.h | Add dhcpcd_ifrunning() and public dhcpcd_dropped(); change stop_all_interfaces() to return bool; add dhcpcd_exit_timeout() and revise exit sequencing to wait for interface drop notifications or enforce timeout; callers updated to use new semantics. |
| IPv4LL drop cleanup / src/ipv4ll.c | ipv4ll_drop() cancels per-interface eloop timeout early, consolidates to a common free path that calls ipv4ll_free() and invokes dhcpcd_dropped(ifp) even when DHCPCD_NODROP is set. |
| Handler guards (STOPPING/EXITING/user-active) / src/dhcp.c, src/dhcp6.c, src/ipv6nd.c, src/dhcpcd.c, src/route.c | Protocol handlers now check interface user-active/STOPPING flags and protocol-enabled state before processing; route.c skips if_missfilter_apply() when context is exiting to avoid logs during shutdown. |
| Privsep stop/hangup unification / src/privsep.c | Treat ELE_HANGUP as a stop event: set DHCPCD_EXITING, jump to unified stop/cleanup path, and call eloop_exitall on stop. |
| Eloop / epoll adjustments / src/eloop.c, src/eloop.h, src/privsep-linux.c | Detect and prefer epoll_pwait2() with ENOSYS fallback; add epoll_pwait2_nosys flag and eloop_exitallinners(int); allow __NR_epoll_pwait2 in seccomp filter. |
| Route miss-filter guard (BSD) / src/route.c | Skip if_missfilter_apply() when context is exiting to avoid failure logs during shutdown in BSD-specific paths. |
| Docs / hooks / hooks/dhcpcd-run-hooks.8.in | Update manpage date and add RELEASE / RELEASE6 hook entry: "dhcpcd has released the lease." |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as CLI/Signal
  participant D as dhcpcd
  participant IF as Interface
  participant V4 as DHCPv4
  participant V6 as DHCPv6
  participant LLA as IPv4LL
  participant K as Kernel

  CLI->>D: stop_all_interfaces(opts)
  D->>IF: stop_interface() (mark STOPPING)
  par per-protocol
    IF->>V4: dhcp_drop(RELEASE?)
    alt RELEASE
      V4->>K: send DHCPv4 RELEASE
      V4-->>V4: schedule dhcp_deconfigure(+1s)
    else other
      V4->>V4: dhcp_deconfigure()
    end

    IF->>V6: dhcp6_startrelease()
    V6->>V6: set MRC=REL_MAX_RC & callback=dhcp6_finishrelease
    V6->>K: send DHCPv6 RELEASE
    K-->>V6: RELEASE ACK (DH6S_RELEASE)
    V6->>V6: dhcp6_finishrelease()
    V6->>D: dhcpcd_dropped(ifp)

    IF->>LLA: ipv4ll_drop()
    LLA->>D: dhcpcd_dropped(ifp)
  end
  D->>D: dhcpcd_ifrunning() check
  alt none running
    D->>D: exit
  else waiting
    D->>D: schedule dhcpcd_exit_timeout()
  end
sequenceDiagram
  autonumber
  participant RX as Packet/Event
  participant H as Handler
  participant IF as Interface
  participant C as Context

  RX->>H: invoke handler
  Note over H: guard checks — IF user-active, not STOPPING, proto enabled, ctx not EXITING
  alt guard tripped
    H-->>RX: return (skip processing)
  else
    H->>H: normal processing
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas needing extra attention:

  • src/dhcp.c — dhcp_deconfigure correctness, timing/race around 1s delayed deconfigure, and proper invocation of dhcpcd_dropped.
  • src/dhcp6.c — MRC/MRCcallback changes, makemessage failure path, and DH6S_RELEASE handling.
  • src/dhcpcd.c / src/dhcpcd.h — stop_all_interfaces() semantics, dhcpcd_dropped() coordination, and exit-timeout integration across callers.
  • src/eloop.c & src/privsep-linux.c — epoll_pwait2 detection/fallback behavior and seccomp syscall inclusion.
  • Cross-module teardown ordering where dhcpcd_dropped() is invoked (ipv4ll, dhcp, dhcp6) and potential double-free or ordering hazards.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Out of Scope Changes check | ❓ Inconclusive | Changes to eloop (epoll_pwait2 support), privsep-linux (seccomp), and route.c are supporting infrastructure enabling coordinated shutdown, while dhcp/dhcp6/ipv4ll/ipv6nd/dhcpcd changes implement the core protocol-notification mechanism; however, epoll_pwait2 addition appears tangential to the core issue resolution. | Clarify whether epoll_pwait2 kernel feature support is essential to fixing #513 or if it is an unrelated infrastructure improvement that could be addressed separately. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Protocols will notify when dhcpcd can exit' accurately captures the main objective: implementing a notification mechanism so protocols signal when they stop, allowing dhcpcd to exit only when all protocols are done. |
| Description check | ✅ Passed | The description details the release behavior changes for DHCPv4/v6 and exit coordination, directly relating to the changeset which implements protocol notifications and coordinated shutdown. |
| Linked Issues check | ✅ Passed | The PR addresses issue #513 by implementing coordinated shutdown: DHCPv6 waits for RELEASE acknowledgement before notifying stop [#513], DHCPv4 waits 1 second after RELEASE [#513], and dhcpcd exits only when protocols signal completion, eliminating need for external delays [#513]. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c47ee7 and ae22cf0.

📒 Files selected for processing (1)
  • src/dhcpcd.c (11 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-03T19:24:39.851Z
Learnt from: rsmarples
Repo: NetworkConfiguration/dhcpcd PR: 536
File: src/eloop.c:53-59
Timestamp: 2025-11-03T19:24:39.851Z
Learning: In the dhcpcd codebase (NetworkConfiguration/dhcpcd), compile-time feature detection based on header versions (e.g., LINUX_VERSION_CODE for epoll_pwait2) does not require runtime fallback handling for syscall unavailability (ENOSYS). Users are responsible for ensuring headers match their target runtime kernel.

Applied to files:

  • src/dhcpcd.c
🧬 Code graph analysis (1)
src/dhcpcd.c (5)
src/script.c (1)
  • script_runreason (745-802)
src/eloop.c (2)
  • eloop_exit (610-615)
  • eloop_exitallinners (628-639)
src/if.c (1)
  • if_free (81-104)
compat/pidfile.c (1)
  • pidfile_read (135-167)
src/privsep-root.c (1)
  • ps_root_stop (973-1020)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: freebsd
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd
  • GitHub Check: openbsd


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (8)
src/dhcpcd.h (1)

276-276: Add documentation for the new public API.

The function signature and placement look good. However, consider adding a documentation comment to clarify the purpose and usage of this new API, especially since it's part of the interface lifecycle coordination mechanism described in the PR.

For example:

/**
 * dhcpcd_dropped - Notify dhcpcd that a protocol has finished dropping an interface
 * @ifp: Interface that was dropped
 *
 * Called by protocol handlers (DHCP, DHCPv6, IPv4LL) after they have completed
 * releasing resources and dropping an interface. This allows dhcpcd to coordinate
 * clean shutdown when all protocols have finished.
 */
void dhcpcd_dropped(struct interface *);
src/ipv6nd.c (1)

1928-1932: Update the comment for the new guard.

The early-return now also covers the STOPPING state, so the comment that follows no longer reflects the full condition. Please tweak the comment to capture the broader guard.

src/dhcp.c (4)

2869-2904: dhcp_deconfigure: solid teardown; add a defensive NULL-state guard

Looks correct and matches deconfig semantics. To harden against unexpected races (timer fires after state freed), add a quick state NULL check before deref.

Apply this minimal guard:

 static void
 dhcp_deconfigure(void *arg)
 {
   struct interface *ifp = arg;
-  struct dhcp_state *state = D_STATE(ifp);
+  struct dhcp_state *state = D_STATE(ifp);
+  if (state == NULL) {
+    dhcpcd_dropped(ifp);
+    return;
+  }
   struct if_options *ifo = ifp->options;

Also confirm ipv4_applyaddr only reads state->old/new during the call (they’re freed immediately after).


2927-2927: Reason pointer: const-correctness and lifetime

Ensure state->reason is declared as const char * (not char *) and that reason has static/long‑lived storage; otherwise copy/duplicate it.


2972-2972: Direct deconfigure call is fine; optional cast

dhcp_deconfigure takes void *. Direct call with ifp is fine in C; add (void *)ifp cast if you want to silence strict warnings in some builds.


3959-3963: Nulling if_data after free: good; consider pre‑null for safety

Setting ifp->if_data[IF_DATA_DHCP] = NULL is correct. If reentrancy is a concern, consider nulling before free(state) to avoid transient readers seeing a stale pointer. Low priority.

src/dhcpcd.c (2)

460-486: Refactor: Use distinct variable name in loop to avoid shadowing.

The loop variable ifp on line 478 shadows the function parameter ifp. While technically valid, this is confusing and makes the code harder to follow, especially since the parameter represents a specific dropped interface while the loop variable iterates all interfaces.

Apply this diff to improve clarity:

-	TAILQ_FOREACH(ifp, ctx->ifaces, next) {
-		if (dhcpcd_ifrunning(ifp))
+	struct interface *iface;
+	TAILQ_FOREACH(iface, ctx->ifaces, next) {
+		if (dhcpcd_ifrunning(iface))
 			break;
 	}
 
 	/* All interfaces have stopped, we can exit */
-	if (ifp == NULL)
+	if (iface == NULL)
 		eloop_exit(ctx->eloop, EXIT_SUCCESS);
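
The `== NULL` test after the loop relies on a property of `TAILQ_FOREACH`: the iteration variable is NULL when the list is walked to the end without a `break`. A minimal, self-contained sketch of that idiom follows; `struct iface`, its `running` flag and `first_running()` are hypothetical stand-ins, not dhcpcd's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct iface {
	bool running;
	TAILQ_ENTRY(iface) next;
};
TAILQ_HEAD(if_head, iface);

/* Returns the first interface still running, or NULL if all have stopped. */
static struct iface *
first_running(struct if_head *ifaces)
{
	struct iface *iface;

	assert(ifaces != NULL);
	TAILQ_FOREACH(iface, ifaces, next) {
		if (iface->running)
			break;
	}
	/* TAILQ_FOREACH leaves iface == NULL once the list is exhausted. */
	return iface;
}
```

Using a dedicated iterator, as the diff suggests, keeps the function parameter intact for any later use and makes the NULL check read as "nothing is still running".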

2326-2327: Consider documenting the 50-second exit timeout rationale.

The 50-second timeout serves as a safeguard against hung protocols, but the specific value appears arbitrary. Consider adding a comment explaining why 50 seconds was chosen (e.g., "Allow up to 50 seconds for DHCP RELEASE acknowledgements and network transmission delays") or defining it as a named constant.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e3b0e5 and 148e1fd.

📒 Files selected for processing (7)
  • src/dhcp.c (7 hunks)
  • src/dhcp6.c (3 hunks)
  • src/dhcpcd.c (11 hunks)
  • src/dhcpcd.h (1 hunks)
  • src/ipv4ll.c (2 hunks)
  • src/ipv6nd.c (1 hunks)
  • src/route.c (1 hunks)
🔇 Additional comments (16)
src/route.c (1)

819-820: LGTM! Clean shutdown optimization.

The guard prevents if_missfilter_apply from being called during shutdown, which aligns with the PR's objective of ensuring clean DHCP RELEASE packet transmission before exit. The short-circuit evaluation ensures the function is never called when DHCPCD_EXITING is set, avoiding unnecessary operations during teardown.

src/ipv4ll.c (2)

451-451: Verify timeout deletion with NULL callback and early placement.

The eloop_timeout_delete call uses NULL as the callback parameter, which may delete all pending timeouts for this interface. Additionally, this deletion occurs before the DHCPCD_NODROP check at line 454.

Verify that:

  1. Using NULL as the callback parameter is the intended API usage and doesn't inadvertently cancel important timeouts
  2. Deleting timeouts before the NODROP check doesn't cause issues if the function returns early
  3. This approach is consistent with similar cleanup in dhcp.c and dhcp6.c

486-487: Verify cleanup is reached in all code paths.

The addition of ipv4ll_free and dhcpcd_dropped correctly implements the PR objective of notifying dhcpcd when the protocol can exit. The cleanup sequence (free resources → notify dropped) is logical and consistent with the PR's design.

However, ensure that:

  1. These cleanup calls are reached in all necessary code paths (i.e., not skipped by early returns from the NODROP check at line 454 or other conditions)
  2. ipv4ll_free and dhcpcd_dropped are idempotent or protected against being called multiple times if ipv4ll_drop can be invoked repeatedly
src/dhcp6.c (2)

3609-3613: LGTM!

The DH6S_RELEASE case properly handles the REPLY acknowledgement and triggers cleanup via dhcp6_finishrelease. The state check ensures this only processes REPLYs when expecting them.
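
As a rough model of the two ways a DHCPv6 release can finish (a REPLY arrives, or the retransmission count runs out), consider the sketch below. All names are hypothetical, and the `REL_MAX_RC` value here is illustrative only; the dhcpcd source defines the real one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REL_MAX_RC 4	/* illustrative value, not taken from dhcpcd */

struct release {
	unsigned int rc;	/* RELEASE transmissions made so far */
	bool finished;		/* cleanup has run */
};

static void
finish_release(struct release *r)
{
	r->finished = true;
}

/* Retransmission timer fired: send again, or give up after REL_MAX_RC. */
static void
on_timeout(struct release *r)
{
	assert(r != NULL);
	if (r->finished)
		return;
	if (r->rc >= REL_MAX_RC) {
		finish_release(r);
		return;
	}
	r->rc++;	/* a real client would transmit RELEASE here */
}

/* A REPLY acknowledging the RELEASE arrived. */
static void
on_reply(struct release *r)
{
	assert(r != NULL);
	if (!r->finished)
		finish_release(r);
}
```

Either path ends in the same cleanup function, which is why a single finish callback can serve both the acknowledgement and the retry limit.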


4297-4297: Verify no duplicate notifications.

The placement of dhcpcd_dropped in dhcp6_freedrop is correct for notifying dhcpcd when the protocol stops. However, ensure that dhcp6_freedrop_addrs (called at line 2118 before RELEASE) does not also call dhcpcd_dropped, as this would result in duplicate notifications and potentially incorrect exit logic in dhcpcd.

src/dhcp.c (3)

3132-3137: Early guard in handler: LGTM

Avoids processing when not user‑active, stopping, or DHCP disabled. This should reduce spurious handling during shutdown.

Double‑check that no necessary “final” DHCP messages are dropped under DHCPCD_STOPPING (e.g., if packets arrive during shutdown).


2916-2916: Potential overbroad timer deletion (arg-wide)

eloop_timeout_delete(…, NULL, ifp) may cancel timeouts from other protocols that also use ifp as the callback arg. Prefer deleting only DHCP timeouts or use a DHCP‑scoped wrapper.

Run to confirm cross‑protocol use of ifp as arg in timeouts:


2941-2956: DHCPv4 RELEASE delay is good; refine error path and timer deletion scope

  • Good: delay removal 1s after RELEASE to allow send with configured IP.
  • Consider handling send_message failure: on error, skip the delay and deconfigure immediately.
  • Same timer scope concern: eloop_timeout_delete(…, NULL, ifp) may cancel non‑DHCP timeouts.

Apply:

- send_message(ifp, DHCP_RELEASE, NULL);
- eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp);
+ if (send_message(ifp, DHCP_RELEASE, NULL) == -1) {
+   dhcp_deconfigure(ifp);
+   return;
+ }
+ /* Delete only DHCP timers here if possible. */
+ eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp);

If you keep arg‑wide deletion, please verify other protocols (e.g., DHCPv6, IPv4LL, IPv6ND) don’t pass ifp as the timeout arg to avoid unintended cancellations (see script above).
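
The suggested error path can be modelled in isolation. The sketch below assumes the send step reports failure with -1, which may not match the real `send_message()` signature; `struct iface`, `deconfigure()` and `release()` are hypothetical names, and the timer is reduced to a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct iface {
	bool configured;	/* address still on the interface */
	bool deferred;		/* a delayed deconfigure is pending */
	int (*send_release)(struct iface *);	/* -1 on failure (assumed) */
};

static void
deconfigure(struct iface *ifp)
{
	ifp->deferred = false;
	ifp->configured = false;
}

static void
release(struct iface *ifp)
{
	assert(ifp != NULL && ifp->send_release != NULL);
	if (ifp->send_release(ifp) == -1) {
		/* Nothing was sent, so there is nothing to wait for. */
		deconfigure(ifp);
		return;
	}
	/* A real event loop would arm a timer that later calls deconfigure(). */
	ifp->deferred = true;
}

/* Stub senders used to exercise both paths. */
static int send_ok(struct iface *ifp) { (void)ifp; return 0; }
static int send_fail(struct iface *ifp) { (void)ifp; return -1; }
```

The design point is that the delay only buys time for a packet that was actually queued; on failure the address can be removed at once.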

src/dhcpcd.c (8)

441-458: LGTM! Helper function correctly checks protocol states.

The dhcpcd_ifrunning function properly checks if any DHCP protocol (DHCPv4, IPv4LL, or DHCPv6) is still active on the interface. The conditional compilation guards ensure only available protocols are checked.


488-495: LGTM! Function signature change aligns with new notification pattern.

Converting stop_interface to static void is appropriate since the exit decision now depends on protocol notifications via dhcpcd_dropped rather than immediate return values.


780-781: LGTM! Early return prevents interference during shutdown.

Skipping carrier change processing when the interface is stopping prevents race conditions and ensures clean shutdown.


1109-1109: LGTM! Proper shutdown flow for departed interfaces.

Using stop_interface for departed interfaces and the early return when exiting ensures consistent shutdown behavior and prevents processing new interface events during teardown.

Also applies to: 1116-1117


1180-1181: LGTM! Prevents link event processing during shutdown.

The early return when exiting is correct and consistent with similar guards in other event handlers.


1423-1431: LGTM! Return value enables proper wait-vs-exit decision.

The boolean return value correctly indicates whether any interface was actually stopped, allowing callers to determine if they should wait for protocol notifications or exit immediately. This is a key part of the graceful shutdown mechanism described in the PR objectives.

Also applies to: 1444-1447
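
A minimal sketch of that wait-versus-exit decision, under the assumption that "stopped" means "at least one active interface was newly marked as stopping". The types and the names `stop_all()` and `on_terminate()` are hypothetical, not dhcpcd's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct iface {
	bool active;
	bool stopping;
};

enum next_step { EXIT_NOW, WAIT_FOR_DROPS };

/* Marks every active interface as stopping; true if any was stopped. */
static bool
stop_all(struct iface *ifaces, size_t n)
{
	bool stopped = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!ifaces[i].active || ifaces[i].stopping)
			continue;
		ifaces[i].stopping = true;
		stopped = true;
	}
	return stopped;
}

/* Decide whether the caller may exit now or must wait for notifications. */
static enum next_step
on_terminate(struct iface *ifaces, size_t n, bool test_mode)
{
	assert(ifaces != NULL || n == 0);
	if (stop_all(ifaces, n) && !test_mode)
		return WAIT_FOR_DROPS;
	return EXIT_NOW;
}
```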


1563-1572: LGTM! Signal handler correctly implements notification-based exit.

The changes properly:

  • Prevent re-entrant signal handling during teardown
  • Defer exit when interfaces are actively stopping (waits for dhcpcd_dropped notifications)
  • Exit immediately in test mode or when no interfaces were stopped

This aligns with the PR objective of ensuring RELEASE packets are fully transmitted before exit.


1714-1718: LGTM! Consistent exit handling in command-line args path.

The logic matches the signal handler approach, ensuring that explicit exit/release commands also wait for protocol notifications when interfaces are actively stopping.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/dhcp6.c (1)

2128-2133: Inconsistent brace style in if-else statement.

The if block uses braces but the else branch does not. For consistency and readability, consider using braces on both branches.

Apply this diff to add braces:

 	if (dhcp6_makemessage(ifp) == -1) {
 		logerr("%s: %s", __func__, ifp->name);
 		/* not much we can do apart from finish now */
 		dhcp6_finishrelease(ifp);
-	} else
+	} else {
 		dhcp6_sendrelease(ifp);
+	}
src/dhcpcd.c (2)

478-485: Variable shadowing in loop.

The loop at line 478 reuses the parameter variable name ifp, which shadows the function parameter. While functionally correct (since the parameter is no longer needed after line 473), this could be confusing for maintainability.

Consider using a different variable name for the loop:

-	TAILQ_FOREACH(ifp, ctx->ifaces, next) {
-		if (dhcpcd_ifrunning(ifp))
+	struct interface *ifp_iter;
+	TAILQ_FOREACH(ifp_iter, ctx->ifaces, next) {
+		if (dhcpcd_ifrunning(ifp_iter))
 			break;
 	}
 
 	/* All interfaces have stopped, we can exit */
-	if (ifp == NULL)
+	if (ifp_iter == NULL)
 		eloop_exit(ctx->eloop, EXIT_SUCCESS);

2330-2331: Consider documenting the 50-second timeout rationale.

The timeout value of 50 seconds appears to be chosen to allow sufficient time for the signaled process to exit. However, the choice of this specific value is not documented.

Consider adding a comment explaining the timeout duration, or defining it as a named constant:

+			/* Allow up to 50 seconds for process to exit gracefully */
 			eloop_timeout_add_sec(ctx->eloop, 50,
 				dhcpcd_exit_timeout, &ctx);

Or:

#define EXIT_WAIT_TIMEOUT 50  /* seconds to wait for process exit */
...
			eloop_timeout_add_sec(ctx->eloop, EXIT_WAIT_TIMEOUT,
				dhcpcd_exit_timeout, &ctx);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 148e1fd and cd317bf.

📒 Files selected for processing (2)
  • src/dhcp6.c (3 hunks)
  • src/dhcpcd.c (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/dhcp6.c (1)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-486)
src/dhcpcd.c (4)
src/script.c (1)
  • script_runreason (745-802)
src/eloop.c (1)
  • eloop_exit (652-660)
src/if.c (1)
  • if_free (81-104)
compat/pidfile.c (1)
  • pidfile_read (135-167)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: freebsd
  • GitHub Check: openbsd
🔇 Additional comments (11)
src/dhcp6.c (2)

3611-3615: LGTM! Proper RELEASE acknowledgment handling.

The code correctly handles DHCPv6 RELEASE acknowledgment by logging and calling dhcp6_finishrelease, which aligns with the PR objective to wait for acknowledgment before completing the release sequence.


4299-4299: LGTM! Proper integration with exit coordination.

The call to dhcpcd_dropped correctly notifies the core daemon that DHCPv6 has finished cleanup, enabling coordinated shutdown as described in the PR objectives.

src/dhcpcd.c (9)

441-458: LGTM! Clear interface state check.

The function correctly checks whether any protocol (DHCP, IPv4LL, DHCPv6) is still running on an interface, which is essential for coordinating shutdown.


489-495: LGTM! Simplified stop interface logic.

The refactored function has a clearer single responsibility: mark the interface as stopping and drop protocols. The exit coordination is now properly handled by dhcpcd_dropped.


780-781: LGTM! Proper shutdown guard.

The early return when DHCPCD_STOPPING is set prevents unnecessary carrier state changes during shutdown, avoiding potential re-entry issues.


1116-1117: LGTM! Proper shutdown guard.

The early return when DHCPCD_EXITING is set prevents interface discovery during shutdown, which could interfere with the orderly exit process.


1180-1181: LGTM! Proper shutdown guard.

The early return when DHCPCD_EXITING is set prevents link event processing during shutdown, avoiding unnecessary work and potential state inconsistencies.


1423-1448: LGTM! Improved exit coordination.

The boolean return value correctly indicates whether any protocols were stopped, which is used by callers to determine whether to wait for completion or exit immediately. This is a key part of the coordinated shutdown mechanism.


1563-1574: LGTM! Proper signal handler coordination.

The modified logic correctly coordinates exit by:

  1. Preventing re-entry with the DHCPCD_EXITING guard
  2. Deferring exit if protocols were stopped (waiting for dhcpcd_dropped to be called)
  3. Exiting immediately if nothing was running

This aligns with the PR objective to ensure protocols complete their shutdown sequences.


1714-1718: LGTM! Consistent control command handling.

The exit coordination for control commands correctly uses the same pattern as the signal handler, ensuring consistent behavior whether shutdown is triggered by signal or control command.


1949-1969: LGTM! Improved pidfile handling addresses past review.

The code now properly handles pidfile_read returning -1 in both timeout functions:

  • dhcpcd_pidfile_timeout: Exits with SUCCESS if process gone, continues polling otherwise
  • dhcpcd_exit_timeout: Exits with SUCCESS if process gone, logs warning and fails if still running after timeout

This resolves the past review concern about the confusing "pid -1 failed to exit" log message.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/dhcp.c (1)

2949-2963: Delayed address removal for DHCPv4 RELEASE aligns with PR objectives.

The 1-second delay before calling dhcp_deconfigure allows the RELEASE packet to be transmitted before the address is removed. This addresses issue #513 by ensuring the packet is fully formed and sent before the interface is torn down.

However, consider these refinements:

  1. Replace magic number with named constant:
+#define DHCP_RELEASE_DELAY_SEC 1
+
 static void
 dhcp_drop(struct interface *ifp, const char *reason)
 {
 	...
-			const struct timespec delay = {
-				.tv_sec = 1,
-			};
+			const struct timespec delay = {
+				.tv_sec = DHCP_RELEASE_DELAY_SEC,
+			};
  2. Expand the comment to explain the rationale:
-			/* We need to delay removal of the IP address so the
-			 * message can be sent.
-			 * Unlike DHCPv6, there is no acknowledgement. */
+			/* DHCPv4 RELEASE has no acknowledgement from the server.
+			 * Delay address removal to ensure the RELEASE packet is
+			 * transmitted before the interface/address is torn down.
+			 * See issue #513. */
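
To make the timing concrete, here is a small sketch of the deadline arithmetic behind a one-second delay. It is independent of dhcpcd's eloop: `RELEASE_DELAY_SEC`, `release_deadline()` and `deadline_reached()` are hypothetical names, and timestamps are passed in rather than read from a clock so the logic can be checked deterministically.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define RELEASE_DELAY_SEC 1	/* hypothetical named constant */

/* Deadline for removing the address, given when RELEASE was sent. */
static struct timespec
release_deadline(struct timespec sent)
{
	const struct timespec delay = { .tv_sec = RELEASE_DELAY_SEC };
	struct timespec d;

	assert(sent.tv_nsec >= 0 && sent.tv_nsec < 1000000000L);
	d.tv_sec = sent.tv_sec + delay.tv_sec;
	d.tv_nsec = sent.tv_nsec + delay.tv_nsec;
	if (d.tv_nsec >= 1000000000L) {
		/* Carry overflowing nanoseconds into seconds. */
		d.tv_sec++;
		d.tv_nsec -= 1000000000L;
	}
	return d;
}

/* True once now is at or past the deadline. */
static bool
deadline_reached(struct timespec now, struct timespec deadline)
{
	if (now.tv_sec != deadline.tv_sec)
		return now.tv_sec > deadline.tv_sec;
	return now.tv_nsec >= deadline.tv_nsec;
}
```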
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 942195b and 57fed36.

📒 Files selected for processing (2)
  • hooks/dhcpcd-run-hooks.8.in (2 hunks)
  • src/dhcp.c (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/dhcp.c (4)
src/auth.c (1)
  • dhcp_auth_reset (78-95)
src/ipv4.c (1)
  • ipv4_applyaddr (746-833)
src/script.c (1)
  • script_runreason (745-802)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-488)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd
  • GitHub Check: freebsd
  • GitHub Check: openbsd
🔇 Additional comments (8)
hooks/dhcpcd-run-hooks.8.in (2)

25-25: LGTM!

The man page date update is appropriate for documenting the changes in this PR.


108-109: Well-documented new hook reason.

The RELEASE | RELEASE6 entry accurately documents the new hook reason for when dhcpcd releases a lease. The format, placement, and description are consistent with existing entries and align perfectly with the PR objectives to properly handle DHCP RELEASE packets.

src/dhcp.c (6)

2869-2912: Well-structured centralization of deconfiguration logic.

This new function consolidates cleanup sequences that were previously scattered, improving maintainability. The logic correctly handles:

  • Authentication state reset
  • State machine transition to DHS_NONE
  • Resource lifecycle (offer/old/new pointers)
  • Address application/removal via ipv4_applyaddr
  • Script execution with appropriate reason
  • Coordination with core via dhcpcd_dropped when stopping

The conditional flow based on DHCPCD_STOPPING ensures proper shutdown sequencing.


2924-2924: Good addition of notification for early-exit path.

Calling dhcpcd_dropped(ifp) when the state is absent or inactive ensures the core is notified that the interface has stopped, maintaining proper coordination with the shutdown sequence described in the PR objectives.


2960-2960: Verify the scope of timeout deletion is intentional.

The call eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp) with a NULL callback pattern deletes all pending timeouts for this interface. This is aggressive and will cancel any scheduled operations (discover, renew, rebind, etc.).

While this is likely correct for a clean RELEASE shutdown, please confirm:

  1. This behavior is intentional and necessary to prevent interference with the release sequence
  2. No other operations need to complete before the delayed dhcp_deconfigure executes

If this is intended, consider adding a comment explaining why all timeouts are cleared:

/* Clear all pending DHCP operations (discover, renew, etc.) to ensure
 * clean release shutdown without interference. */
eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp);

2980-2980: LGTM: Centralized cleanup call.

Replacing inline cleanup with the new dhcp_deconfigure function improves maintainability and consistency.


3140-3144: Essential guard prevents processing during shutdown.

This early-return check prevents DHCP message handling when:

  • The interface is not user-active
  • The STOPPING flag is set (shutdown in progress)
  • DHCP is not enabled for the interface

This is critical for clean shutdown coordination and prevents race conditions where DHCP responses arrive after the stop sequence has begun, addressing the core PR objective that "protocols notify when dhcpcd can exit."
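
The three conditions can be read as a single predicate. The sketch below is a simplified model: the option bits `OPT_STOPPING` and `OPT_DHCP`, `struct iface` and `should_handle()` are hypothetical and do not mirror dhcpcd's actual flag layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OPT_STOPPING (1u << 0)	/* hypothetical: shutdown in progress */
#define OPT_DHCP     (1u << 1)	/* hypothetical: DHCP enabled */

struct iface {
	bool user_active;
	unsigned int options;
};

/* True only when an incoming DHCP message should be processed. */
static bool
should_handle(const struct iface *ifp)
{
	assert(ifp != NULL);
	if (!ifp->user_active)
		return false;
	if (ifp->options & OPT_STOPPING)
		return false;
	if (!(ifp->options & OPT_DHCP))
		return false;
	return true;
}
```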


3970-3970: Good practice: Clear pointer after free.

Setting ifp->if_data[IF_DATA_DHCP] = NULL after freeing the DHCP state prevents use-after-free bugs and makes debugging easier by ensuring the pointer doesn't dangle. This follows defensive programming best practices.
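
A short sketch of the free-then-clear pattern, using a hypothetical `struct iface` with a single data slot. Because `free(NULL)` is a no-op, clearing the slot also makes a second call harmless.

```c
#include <assert.h>
#include <stdlib.h>

enum { SLOT_DHCP, SLOT_MAX };	/* hypothetical slot index */

struct iface {
	void *if_data[SLOT_MAX];
};

/* Free the per-protocol state and clear the slot so it cannot dangle. */
static void
state_free(struct iface *ifp)
{
	assert(ifp != NULL);
	free(ifp->if_data[SLOT_DHCP]);	/* free(NULL) does nothing */
	ifp->if_data[SLOT_DHCP] = NULL;
}
```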

@ColinMcInnes
Contributor

I will download this patch and give it a try.

@ColinMcInnes
Contributor

I think it still garbles the outgoing DHCPv6 RELEASE6 message, which appears to be getting cut off.

Oct 07 14:17:11 dhcpcd[3606]: control command: /sbin/dhcpcd --debug --config /etc/dhcp/dhcpcd.conf --release eth0
Oct 07 14:17:11 dhcpcd[3606]: eth0: removing interface
Oct 07 14:17:11 dhcpcd[3606]: eth0: multicasting RELEASE6 (xid 0x37da53), next in 1.0 seconds
Oct 07 14:17:11 dhcpcd[3606]: eth0: executing: /usr/libexec/dhcpcd-run-hooks SEND6_RELEASE6
Oct 07 14:17:11 dhcpcd[3606]: received SIGTERM, stopping
Oct 07 14:17:11 dhcpcd[3606]: eth0: removing interface
Oct 07 14:17:11 dhcpcd[3606]: eth0: executing: /usr/libexec/dhcpcd-run-hooks RELEASE6
Oct 07 14:17:11 dhcpcd[3606]: eth0: deleting address 2001:192:168:226::119/128
Oct 07 14:17:11 dhcpcd[3606]: eth0: executing: /usr/libexec/dhcpcd-run-hooks STOP
Oct 07 14:17:11 dhcpcd[3606]: script_runreason: No such file or directory
Oct 07 14:17:11 dhcpcd[3606]: eth0: executing: /usr/libexec/dhcpcd-run-hooks RELEASE
Oct 07 14:17:11 dhcpcd[3606]: eth0: executing: /usr/libexec/dhcpcd-run-hooks STOPPED
Oct 07 14:17:12 dhcpcd[3606]: eth0: BPF BOOTP exited from PID 4114
Oct 07 14:17:12 dhcpcd[3606]: control proxy exited from PID 3608
Oct 07 14:17:12 dhcpcd[3606]: network proxy exited from PID 3607
Oct 07 14:17:12 dhcpcd[3606]: dhcpcd exited
Oct 07 14:17:12 dhcpcd[3606]: privileged proxy will exit from PID 3606

@ColinMcInnes
Contributor

image

@ColinMcInnes
Contributor

That packet was captured from the server side.

@Sime-Zupanovic
Contributor

I downloaded your patch and tried it on the dhcpcd 10.2.2 baseline,
but I got a segmentation fault at startup.

Thread 1 (LWP 8014):
#0 0x0000005579b13ef0 in dhcpcd_handlecarrier (ifp=ifp@entry=0x5583bae2b0, carrier=-1, flags=4098) at dhcpcd.c:778
#1 0x0000005579b28970 in link_netlink (arg=, nlm=, ctx=) at if-linux.c:1158
#2 link_netlink (ctx=0x7fd2d4da88, arg=, nlm=0x7fd2d499a8) at if-linux.c:1051
#3 0x0000005579b27374 in if_getnetlink (ctx=ctx@entry=0x7fd2d4da88, iov=0x7fd2d49998, iov@entry=0x7fd2d4d9b8, fd=9, flags=flags@entry=64, cb=cb@entry=0x5579b284b0 <link_netlink>, cbarg=cbarg@entry=0x0) at if-linux.c:666
#4 0x0000005579b27600 in if_handlelink (ctx=ctx@entry=0x7fd2d4da88) at if-linux.c:1173
#5 0x0000005579b150a0 in dhcpcd_handlelink (arg=0x7fd2d4da88, events=) at dhcpcd.c:1184
#6 0x0000005579b16cc4 in eloop_run_ppoll (signals=0x7fd2d4dcc8, ts=, eloop=0x5583b9d830) at eloop.c:1106
#7 eloop_start (eloop=0x5583b9d830, signals=signals@entry=0x7fd2d4dcc8) at eloop.c:1228
#8 0x0000005579b0fab4 in main (argc=, argv=, envp=) at dhcpcd.c:2707

(gdb)
(gdb) frame 0
#0 0x0000005579b13ef0 in dhcpcd_handlecarrier (ifp=ifp@entry=0x5583bae2b0, carrier=-1, flags=4098) at dhcpcd.c:778
778 in dhcpcd.c
(gdb) p *ifp
$1 = {ctx = 0x7fd2d4da88, next = {tqe_next = 0x5583bae3c0, tqe_prev = 0x5583bae1a8}, name = "port1\000\000\000\000\000\000\000\000", index = 45, active = 0, flags = 4098, hwtype = 1,
hwaddr = "Xp\177\220F\222", '\000' <repeats 13 times>, hwlen = 6 '\006', mtu = 2000, vlanid = 0, metric = 1045, carrier = -1, wireless = false, ssid = '\000' <repeats 31 times>, ssid_len = 0,
profile = '\000' <repeats 63 times>, options = 0x0, if_data = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}
(gdb) p was_link_up
$2 = false

Note that was_link_up was false.

It seems we can get network events even for an interface that is not up, which is a bit odd to me.

So, after applying your patch (attached as protocols_will_notify_dhcpcd_can_exit.txt),
I added this small fix:
fix_ifp_option_check_in_handlecarrier.txt

I just moved the DHCPCD_STOPPING check below the carrier-down handling block, after basic interface validity has been verified.
With that, testing looks OK and there are no further crashes. Will you consider this additional small patch?

@rsmarples
Member Author

I think it still garbles the outgoing DHCPv6 RELEASE6 message, which appears to be getting cut off.


It looks like dhcpcd had two commands to stop.
I've only been testing by using SIGTERM, but it looks like you had a stop event over the control socket too?

How are you stopping dhcpcd?

@rsmarples
Member Author

I downloaded your patch and tried it on the dhcpcd 10.2.2 baseline, but I got a segmentation fault at startup.

Note that was_link_up was false.

It seems we can get network events even for an interface that is not up, which is a bit odd to me.

So, after applying your patch (attached as protocols_will_notify_dhcpcd_can_exit.txt), I added this small fix:
fix_ifp_option_check_in_handlecarrier.txt

I just moved the DHCPCD_STOPPING check below the carrier-down handling block, after basic interface validity has been verified.
With that, testing looks OK and there are no further crashes. Will you consider this additional small patch?

Thanks for that. I committed a slightly different patch as we still need to return early I think.
Does it work for you now?


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
src/dhcpcd.c (1)

460-488: Consider using a separate loop variable to avoid shadowing the parameter.

The function correctly coordinates interface teardown, but the TAILQ_FOREACH at line 480 reuses the ifp parameter as the loop iterator. While this pattern is common and works here (the parameter is no longer needed after line 475), it's error-prone and can confuse readers.

Apply this diff to use a separate loop variable:

 void
 dhcpcd_dropped(struct interface *ifp)
 {
 	struct dhcpcd_ctx *ctx = ifp->ctx;
+	struct interface *ifp1;
 
 	if (ifp->options == NULL ||
 	    !(ifp->options->options & DHCPCD_STOPPING) ||
 	    dhcpcd_ifrunning(ifp))
 		return;
 
 	/* De-activate the interface */
 	if (ifp->active) {
 		ifp->active = IF_INACTIVE;
 		ifp->options->options &= ~DHCPCD_STOPPING;
 		script_runreason(ifp, "STOPPED");
 	}
 
 	if (!(ctx->options & DHCPCD_EXITING))
 		return;
 
-	TAILQ_FOREACH(ifp, ctx->ifaces, next) {
-		if (dhcpcd_ifrunning(ifp))
+	TAILQ_FOREACH(ifp1, ctx->ifaces, next) {
+		if (dhcpcd_ifrunning(ifp1))
 			break;
 	}
 
 	/* All interfaces have stopped, we can exit */
-	if (ifp == NULL)
+	if (ifp1 == NULL)
 		eloop_exit(ctx->eloop, EXIT_SUCCESS);
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57fed36 and 13d0ed9.

📒 Files selected for processing (1)
  • src/dhcpcd.c (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/dhcpcd.c (4)
src/script.c (1)
  • script_runreason (745-802)
src/eloop.c (1)
  • eloop_exit (652-660)
src/if.c (1)
  • if_free (81-104)
compat/pidfile.c (1)
  • pidfile_read (135-167)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: freebsd
  • GitHub Check: openbsd
🔇 Additional comments (11)
src/dhcpcd.c (11)

441-458: LGTM!

The dhcpcd_ifrunning helper correctly checks whether any protocol (DHCPv4, IPv4LL, DHCPv6) is still active on the interface. The logic is straightforward and aligns with the PR's goal of tracking running protocols before allowing exit.
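As a rough illustration of such a helper (not the code under review), the check can be modelled as "does any protocol still hold state on this interface". The enum, struct and function names below are hypothetical, and the real dhcpcd_ifrunning may well test more than a pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-protocol slots; dhcpcd's real layout and checks may differ. */
enum run_proto { RUN_DHCP, RUN_IPV4LL, RUN_DHCP6, RUN_PROTO_MAX };

struct run_iface {
	void *proto_state[RUN_PROTO_MAX];  /* NULL once a protocol has been freed */
};

/* True while at least one protocol still holds state on the interface. */
static bool
run_ifrunning(const struct run_iface *ifp)
{
	size_t i;

	for (i = 0; i < RUN_PROTO_MAX; i++)
		if (ifp->proto_state[i] != NULL)
			return true;
	return false;
}
```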


490-497: LGTM!

The refactored stop_interface correctly sets the STOPPING flag before calling dhcpcd_drop, establishing the contract that dhcpcd_dropped later checks to determine when teardown is complete.


782-791: LGTM!

The guard correctly prevents carrier events from interfering during shutdown. The dual check (interface options vs. context) handles both active and inactive interfaces appropriately, and the early return avoids unnecessary protocol churn during teardown.


1126-1127: LGTM!

The guard correctly prevents new interface events from being processed during exit, avoiding races where an interface addition could interfere with the shutdown sequence.


1190-1191: LGTM!

The guard prevents link state changes from triggering protocol actions during exit, maintaining consistency with the shutdown guards in other handlers.


1433-1458: LGTM with observation.

The refactored function correctly:

  • Returns a bool indicating whether any interfaces were stopped (enabling callers to decide whether to wait for completion)
  • Processes interfaces in reverse order for proper teardown sequencing
  • Sets both STOPPING and EXITING flags

Note that line 1453 sets EXITING on interface options in addition to the context-level flag (line 1439). This duplication might be intentional for per-interface tracking, but consider whether both are necessary.


1573-1582: LGTM!

The modified signal handler correctly:

  • Guards against re-processing SIGTERM/SIGINT during teardown (preventing memory issues as noted in the comment)
  • Uses the return value from stop_all_interfaces to determine whether to wait for protocol completion or exit immediately
  • Defers the exit to dhcpcd_dropped when interfaces are still running

This is the core coordination logic that addresses the PR objectives.
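The coordination described above might look roughly like the following simplified model. Everything here is an assumption for illustration: `exit_ctx`, `exit_stop_all()`, `exit_dropped()` and `exit_on_sigterm()` are hypothetical stand-ins, and the `exited` flag replaces the real call to eloop_exit.

```c
#include <stdbool.h>
#include <stddef.h>

struct exit_iface {
	bool running;  /* a protocol is still tearing down on this interface */
};

struct exit_ctx {
	struct exit_iface *ifaces;
	size_t nifaces;
	bool exiting;
	bool exited;   /* stand-in for calling eloop_exit() */
};

/* Ask every interface to stop; report whether any still has work in flight. */
static bool
exit_stop_all(struct exit_ctx *ctx)
{
	bool waiting = false;
	size_t i;

	ctx->exiting = true;
	for (i = ctx->nifaces; i-- > 0; )  /* reverse order, as in the review */
		if (ctx->ifaces[i].running)
			waiting = true;
	return waiting;
}

/* Called by a protocol once it has finished on an interface. */
static void
exit_dropped(struct exit_ctx *ctx, struct exit_iface *ifp)
{
	size_t i;

	ifp->running = false;
	if (!ctx->exiting)
		return;
	for (i = 0; i < ctx->nifaces; i++)
		if (ctx->ifaces[i].running)
			return;        /* someone is still busy */
	ctx->exited = true;            /* all interfaces stopped */
}

/* Signal path: exit at once only if there is nothing to wait for. */
static void
exit_on_sigterm(struct exit_ctx *ctx)
{
	if (ctx->exiting)
		return;                /* ignore repeated signals during teardown */
	if (!exit_stop_all(ctx))
		ctx->exited = true;
}
```

In this model a second SIGTERM during teardown is a no-op, and the final exit is driven by the last protocol reporting that it has dropped.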


1724-1728: LGTM!

The control command handler correctly mirrors the signal handler's coordination pattern, deferring exit until protocols complete when interfaces are actually stopped.


1959-1963: LGTM!

The modification correctly implements polling behavior, rescheduling the timeout check every 100ms until the pidfile indicates the process has exited.


2340-2341: LGTM!

The 50-second backstop timeout prevents indefinite hanging when waiting for another dhcpcd process to exit, providing a reasonable escape hatch.


1966-1979: Past review comment not fully addressed.

A previous review (marked as "✅ Addressed in commit cd317bf") flagged that if pidfile_read returns -1, logging "pid -1 failed to exit" is misleading (lines 1972-1976). However, the current code does not explicitly handle the -1 case before logging. The same check exists in dhcpcd_pidfile_timeout at line 1959, but is missing here.

Apply this diff to handle the error case as previously suggested:

 static void
 dhcpcd_exit_timeout(void *arg)
 {
 	struct dhcpcd_ctx *ctx = arg;
 	pid_t pid;
 
 	pid = pidfile_read(ctx->pidfile);
-	logwarnx("pid %lld failed to exit", (long long)pid);
-	eloop_exit(ctx->eloop, EXIT_FAILURE);
+	if (pid == -1)
+		eloop_exit(ctx->eloop, EXIT_SUCCESS);
+	else {
+		logwarnx("pid %lld failed to exit", (long long)pid);
+		eloop_exit(ctx->eloop, EXIT_FAILURE);
+	}
 }

Likely an incorrect or invalid review comment.

@Sime-Zupanovic
Contributor

With my patch there are no crashes.
But you are correct, we should have this check earlier in dhcpcd_handlecarrier(), right at the top:

if (ifp->options && (ifp->options->options & DHCPCD_STOPPING))
	return;

However, based on the pmd in offline gdb, I think checking ifp->ctx->options can crash again; the value below looks wild:

(gdb) p *ifp->ctx
$9 = {pidfile = "/run/dhcpcd/port0.pid", '\000' <repeats 13 times>, vendor = "dhcpcd-10.2.2:", '\000' <repeats 191 times>, fork_fd = -1,
cffile = 0x55b41f0340 "/etc/pnc-dhcpcd.conf", options = 310326640726562825, logfile = 0x0, argc = 5, argv = 0x7fffd76148, ifac = 0, ifav = 0x0, ifdc = 0, ifdv = 0x0, ifc = 1, ifv = 0x7fffd76168, ifcc = 1,
ifcv = 0x55b4205600, duid_type = 0 '\000', duid = 0x55b420f6f0 "", duid_len = 16, ifaces = 0x55b42115d0, ctl_buf = 0x0, ctl_buflen = 0, ctl_bufpos = 0, ctl_extra = 0, routes = {rbt_root = 0x55b42130d8,
rbt_ops = 0x55940604c0 <rt_compare_os_ops>, rbt_minmax = {0x55b41f22c8, 0x55b4213008}}, froutes = {rbt_root = 0x55b4212238, rbt_ops = 0x55940604e0 <rt_compare_free_ops>, rbt_minmax = {0x55b41f1ad8,
0x55b4213278}}, rt_order = 1, pf_inet_fd = 15, priv = 0x55b4211530, link_fd = 9, link_rcvbuf = 0, seq = 70, sseq = 0, sigset = {__val = {0 <repeats 16 times>}}, eloop = 0x55b4205830,
script = 0x559404a738 "/usr/libexec/dhcpcd-run-hooks", script_fp = 0x55b4209410, script_buf = 0x55b42282c0 "PATH=/usr/bin:/usr/sbin:/bin:/sbin", script_buflen = 719, script_env = 0x55b42132b0, script_envlen = 32,
control_fd = -1, control_unpriv_fd = -1, control_fds = {tqh_first = 0x55b4210020, tqh_last = 0x55b4210020}, control_sock = "/run/dhcpcd/port0.sock", '\000' <repeats 14 times>,
control_sock_unpriv = "/run/dhcpcd/port0.unpriv.sock", '\000' <repeats 14 times>, control_group = 0, vivso = 0x0, vivso_len = 0, randomstate = 0x0, ps_user = 0x7f88514668 , ps_processes = {
tqh_first = 0x55b4210f10, tqh_last = 0x55b4211310}, ps_root = 0x55b4210f10, ps_inet = 0x55b42110f0, ps_ctl = 0x55b4211310, ps_data_fd = 8, ps_log_fd = -1, ps_log_root_fd = -1, ps_eloop = 0x55b4210e70,
ps_control = 0x55b4210020, ps_control_client = 0x0, dhcp_opts = 0x55b4201da0, dhcp_opts_len = 157, udp_rfd = -1, udp_wfd = -1, opt_buffer = 0x0, opt_buffer_len = 0, secret = 0x0, secret_len = 0, nd_fd = -1,
ra_routers = 0x55b42279a0, nd_opts = 0x55b42049d0, nd_opts_len = 7, dhcp6_rfd = -1, dhcp6_wfd = -1, dhcp6_opts = 0x55b420d940, dhcp6_opts_len = 84, dev_load = 0x0, dev_fd = -1, dev = 0x0, dev_handle = 0x0}

Maybe just do:

void
dhcpcd_handlecarrier(struct interface *ifp, int carrier, unsigned int flags)
{
	bool was_link_up = if_is_link_up(ifp);
	bool was_roaming = if_roaming(ifp);

	ifp->carrier = carrier;
	ifp->flags = flags;

	if (ifp->options && (ifp->options->options & DHCPCD_STOPPING))
		return;

For an inactive interface, ifp->options == NULL, so the check above is skipped safely.

We will then have was_link_up == false, and we return here:

	if (!if_is_link_up(ifp)) {
		if (!ifp->active || (!was_link_up && !was_roaming)) /* ifp->active == 0 and was_link_up == false */
			return; /* we return here */

@ColinMcInnes
Contributor

It looks like dhcpcd had two commands to stop. I've only been testing by using SIGTERM, but it looks like you had a stop event over the control socket too?

How are you stopping dhcpcd?

Same as in #513
On startup, via systemd I start dhcpcd in managed mode, and then on interface up (not readable through ifconfig unfortunately) I start up on eth0 via a trigger. That way I can control interface bringup from external stimuli. A corner case for my particular setup.

On system reboot, it stops those services in reverse order, sending systemctl stop to dhcpcd@.service, then stop to dhcpcd.service

Both service files in my testing have --release (@.service includes the interface).

My current workaround is to omit the ExecStop from dhcpcd@.service, relying on ExecReload instead, and only doing a --release in the main dhcpcd.service. And that seems to work ok.

If that's a valid workaround, I'm ok with moving forward marking this as a fix for the other issues.

@Sime-Zupanovic
Contributor

any comment on dhcpcd_handlecarrier() ifp->options->options check?

@Sime-Zupanovic
Contributor

After we deployed your "Protocols will notify when dhcpcd can exit" patch with the additional update in dhcpcd_handlecarrier(),
we unfortunately got a new crash in dhcp_discover(), again in the dhcpcd stop scenario.
Based on this log line:
dhcpcd[7837]: ribridge0: read_lease: truncated lease

we entered through dhcp_start1(), which was probably called from dhcp_start().

This is the pmd backtrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
2006 dhcp.c: No such file or directory.
(gdb) bt
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
#1 0x00000055757cce64 in eloop_start (eloop=0x5589e47830, signals=signals@entry=0x7fdf70e068) at eloop.c:1201
#2 0x00000055757c5ab4 in main (argc=, argv=, envp=) at dhcpcd.c:2707

(gdb) bt full
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
ifp = 0x5589e589a0
state = 0x0
ifo = 0x5589e58bc0
#1 0x00000055757cce64 in eloop_start (eloop=0x5589e47830, signals=signals@entry=0x7fdf70e068) at eloop.c:1201
error =
t = 0x5589e47250
ts = {tv_sec = 0, tv_nsec = 763219470}
tsp =
PRETTY_FUNCTION = "eloop_start"
#2 0x00000055757c5ab4 in main (argc=, argv=, envp=) at dhcpcd.c:2707
ctx = {pidfile = "/run/dhcpcd/ribridge0.pid", '\000' <repeats 13 times>, vendor = "dhcpcd-10.2.2:", '\000' <repeats 191 times>, fork_fd = -1, cffile = 0x5589e32340 "/etc/pnc-dhcpcd.conf", options = 310344232912607245, logfile = 0x0, argc = 5, argv = 0x7fdf70e478, ifac = 0, ifav = 0x0, ifdc = 0, ifdv = 0x0, ifc = 1, ifv = 0x7fdf70e498, ifcc = 1, ifcv = 0x5589e47600, duid_type = 0 '\000', duid = 0x5589e516f0 "", duid_len = 16, ifaces = 0x5589e535d0, ctl_buf = 0x0, ctl_buflen = 0, ctl_bufpos = 0, ctl_extra = 0, routes = {rbt_root = 0x0, rbt_ops = 0x55758194c0 <rt_compare_os_ops>, rbt_minmax = {0x7fdf70e000, 0x7fdf70e000}}, froutes = {rbt_root = 0x5589e54588, rbt_ops = 0x55758194e0 <rt_compare_free_ops>, rbt_minmax = {0x5589e33948, 0x5589e55448}}, rt_order = 0, pf_inet_fd = 15, priv = 0x5589e53530, link_fd = 9, link_rcvbuf = 0, seq = 68, sseq = 0, sigset = {__val = {0 <repeats 16 times>}}, eloop = 0x5589e47830, script = 0x55758036a8 "/usr/libexec/dhcpcd-run-hooks", script_fp = 0x5589e4b410, script_buf = 0x5589e6a4a0 "PATH=/usr/bin:/usr/sbin:/bin:/sbin", script_buflen = 259, script_env = 0x5589e54f80, script_envlen = 32, control_fd = -1, control_unpriv_fd = -1, control_fds = {tqh_first = 0x5589e52020, tqh_last = 0x5589e52020}, control_sock = "/run/dhcpcd/port0.sock", '\000' <repeats 14 times>, control_sock_unpriv = "/run/dhcpcd/port0.unpriv.sock", '\000' <repeats 14 times>, control_group = 0, vivso = 0x0, vivso_len = 0, randomstate = 0x0, ps_user = 0x7fb25c7668 , ps_processes = {tqh_first = 0x5589e52f10, tqh_last = 0x5589e53310}, ps_root = 0x5589e52f10, ps_inet = 0x5589e530f0, ps_ctl = 0x5589e53310, ps_data_fd = 8, ps_log_fd = -1, ps_log_root_fd = -1, ps_eloop = 0x5589e52e70, ps_control = 0x5589e52020, ps_control_client = 0x0, dhcp_opts = 0x5589e43da0, dhcp_opts_len = 157, udp_rfd = -1, udp_wfd = -1, opt_buffer = 0x0, opt_buffer_len = 0, secret = 0x0, secret_len = 0, nd_fd = -1, ra_routers = 0x5589e69a30, nd_opts = 0x5589e469d0, nd_opts_len = 7, dhcp6_rfd = -1, dhcp6_wfd = -1, dhcp6_opts = 0x5589e4f940, dhcp6_opts_len = 84, dev_load = 0x0, dev_fd = -1, dev = 0x0, dev_handle = 0x0}
ifaddrs = 0x0
ifo = 0x0
ifp =
family = 0
opt =
oi = 4
i = 1
logopts =
t =
len =
pid =
fork_fd = {5, 6}
sig =
siga = 0x0
si = 1
func = "main"

(gdb) thread apply all bt

Thread 1 (LWP 7836):
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
#1 0x00000055757cce64 in eloop_start (eloop=0x5589e47830, signals=signals@entry=0x7fdf70e068) at eloop.c:1201
#2 0x00000055757c5ab4 in main (argc=, argv=, envp=) at dhcpcd.c:2707

The problem in dhcp_discover() is surely a NULL pointer access via state->state:

void
dhcp_discover(void *arg)
{
	struct interface *ifp = arg;
	struct dhcp_state *state = D_STATE(ifp);
	struct if_options *ifo = ifp->options;

	state->state = DHS_DISCOVER;
	...

From the offline debugger we can also see:

(gdb) p *ifp
$1 = {ctx = 0x7fdf70de28, next = {tqe_next = 0x5589e58ab0, tqe_prev = 0x5589e58898}, name = "port0\000\000\000\000\000\000", index = 51, active = 0, flags = 69699, hwtype = 1,
hwaddr = "\360\261\a4\253O", '\000' <repeats 13 times>, hwlen = 6 '\006', mtu = 2000, vlanid = 0, metric = 1051, carrier = 1, wireless = false, ssid = '\000' <repeats 31 times>, ssid_len = 0,
profile = '\000' <repeats 63 times>, options = 0x5589e58bc0, if_data = {0x5589e473d0, 0x0, 0x0, 0x0, 0x5589e6a200, 0x5589e476a0, 0x0}}
(gdb) p *ifp->options
value of type `if_options' requires 67616 bytes, which is more than max-value-size

(gdb) p *ifp->options->options
Cannot access memory at address 0x44f902d80d85809
(gdb)
(gdb) p *ifp->ctx->options
Cannot access memory at address 0x44e906f80d8d80d
(gdb)

So we cannot rely on checking DHCPCD_STOPPING at this point.

A quick fix would be a sanity check, like:

void
dhcp_discover(void *arg)
{
	struct interface *ifp = arg;

	if (ifp == NULL || ifp->options == NULL)
		return;

	struct dhcp_state *state = D_STATE(ifp);
	struct if_options *ifo = ifp->options;

	if (ifo && (ifo->options & DHCPCD_STOPPING))
		return;

	if (state == NULL)
		return;
	state->state = DHS_DISCOVER;
	...

But could we fix this somewhere earlier?
For example, by cancelling all timeouts added with eloop_timeout_add*(), so that we do not run into another race condition?
Any thoughts?
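To make the "cancel the timeouts earlier" idea concrete, here is a toy timer queue, assuming callbacks are keyed by their argument in the same spirit as the eloop_timeout_delete(eloop, NULL, ifp) call seen earlier in this review. The `tq_*` names are hypothetical and this is not eloop's implementation; it only shows that deleting by argument before freeing state prevents the stale callback from ever firing.

```c
#include <stddef.h>

#define TQ_MAX 8

typedef void (*tq_cb)(void *);

struct tq_entry {
	tq_cb cb;
	void *arg;
	int used;
};

struct tq {
	struct tq_entry e[TQ_MAX];
};

/* Sample callback for demonstration: bumps the int that arg points at. */
static void
tq_count_cb(void *arg)
{
	++*(int *)arg;
}

/* Queue a callback; returns 0 on success, -1 when the queue is full. */
static int
tq_add(struct tq *q, tq_cb cb, void *arg)
{
	size_t i;

	for (i = 0; i < TQ_MAX; i++) {
		if (!q->e[i].used) {
			q->e[i].cb = cb;
			q->e[i].arg = arg;
			q->e[i].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Remove pending callbacks for arg; cb == NULL matches any callback.
 * Returns the number of entries removed. */
static int
tq_delete(struct tq *q, tq_cb cb, void *arg)
{
	size_t i;
	int n = 0;

	for (i = 0; i < TQ_MAX; i++) {
		if (q->e[i].used && q->e[i].arg == arg &&
		    (cb == NULL || q->e[i].cb == cb)) {
			q->e[i].used = 0;
			n++;
		}
	}
	return n;
}

/* Fire everything still queued; returns the number of callbacks run. */
static int
tq_run(struct tq *q)
{
	size_t i;
	int n = 0;

	for (i = 0; i < TQ_MAX; i++) {
		if (q->e[i].used) {
			q->e[i].used = 0;
			q->e[i].cb(q->e[i].arg);
			n++;
		}
	}
	return n;
}
```

With this shape, teardown would call tq_delete(q, NULL, ifp) before releasing the interface state, so a queued discover-style callback never runs against a NULL state.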

@Sime-Zupanovic
Contributor

As we need to resolve these segmentation faults on dhcpcd stop,
I had in mind adding a defensive guard macro for async callbacks
that checks whether ifp, ifp->options, and D_STATE(ifp) are valid.
Take a look at this potential patch:
dhcp-guard-async-callbacks-against-freed-if-state.txt
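The attached patch is not reproduced in this thread, so the following is only a guess at the general shape of such a guard, using hypothetical `g_*` types and a made-up `G_STOPPING` bit rather than dhcpcd's real definitions.

```c
#include <stddef.h>

#define G_STOPPING (1ULL << 0)  /* hypothetical bit, not dhcpcd's value */

struct g_options {
	unsigned long long options;
};

struct g_state {
	int state;
};

struct g_iface {
	struct g_options *options;
	struct g_state *state;
};

/* Return from a void callback when the interface is no longer usable. */
#define G_CALLBACK_GUARD(ifp)                                        \
	do {                                                         \
		if ((ifp) == NULL || (ifp)->options == NULL ||       \
		    (ifp)->state == NULL ||                          \
		    ((ifp)->options->options & G_STOPPING))          \
			return;                                      \
	} while (0)

/* Example async callback protected by the guard. */
static void
g_discover(void *arg)
{
	struct g_iface *ifp = arg;

	G_CALLBACK_GUARD(ifp);
	ifp->state->state = 1;  /* safe: state was checked above */
}
```

Note that a guard like this can only detect NULL pointers. It cannot detect memory that has been freed but whose pointer was left in place, so it only helps when teardown also clears the pointers it frees.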


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 13d0ed9 and cbcdce4.

📒 Files selected for processing (8)
  • hooks/dhcpcd-run-hooks.8.in (2 hunks)
  • src/dhcp.c (7 hunks)
  • src/dhcp6.c (3 hunks)
  • src/dhcpcd.c (11 hunks)
  • src/dhcpcd.h (1 hunks)
  • src/ipv4ll.c (2 hunks)
  • src/ipv6nd.c (1 hunks)
  • src/route.c (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • hooks/dhcpcd-run-hooks.8.in
  • src/dhcpcd.h
  • src/route.c
  • src/dhcp.c
  • src/ipv6nd.c
🧰 Additional context used
🧬 Code graph analysis (3)
src/dhcpcd.c (4)
src/script.c (1)
  • script_runreason (745-802)
src/eloop.c (1)
  • eloop_exit (556-561)
src/if.c (1)
  • if_free (81-104)
compat/pidfile.c (1)
  • pidfile_read (135-167)
src/dhcp6.c (1)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-488)
src/ipv4ll.c (1)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-488)

rsmarples and others added 6 commits November 3, 2025 09:46
DHCPv6 RELEASE requires the addresses to be dropped before
a RELEASE message is sent.
We now wait for an acknowledgement or a timeout before notifying
that DHCPv6 has stopped for the interface.

DHCPv4 RELEASE is the other way around, there is no acknowledgement.
So we wait for 1 second after sending the message before removing
the address and notifying DHCP has stopped for the interface.

If we are not releasing then we notify dhcpcd that the protocol has
stopped right away when we drop the lease.

dhcpcd will exit once there are no running protocols for the
interfaces.

Fixes #513.
Hopefully #535, #519 and #509 as well.
It allows an interface to stop even if dhcpcd is not.
While here address some hangup issues.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
src/dhcp6.c (1)

2113-2132: DHCPv6 RELEASE should drop addresses before sending (but keep IA list for the message)

To meet the PR objective and RFC 8415 intent, remove the IPv6 addresses from the OS before the RELEASE is transmitted, while still including IA/IAADDR options in the RELEASE. Safe point: after dhcp6_makemessage() succeeds, before dhcp6_sendrelease().

Also ensure we never clear the in‑memory IA list before the OS drop, otherwise kernel addresses remain configured (a previously reported pitfall).

Apply this change:

 static void
 dhcp6_startrelease(struct interface *ifp)
 {
   struct dhcp6_state *state;

   state = D6_STATE(ifp);
   if (state->state != DH6S_BOUND) {
     dhcp6_finishrelease(ifp);
     return;
   }

   state->state = DH6S_RELEASE;
   state->RTC = 0;
   state->IMD = REL_MAX_DELAY;
   state->IRT = REL_TIMEOUT;
   state->MRT = REL_MAX_RT;
   state->MRC = REL_MAX_RC;
   state->MRCcallback = dhcp6_finishrelease;

-  if (dhcp6_makemessage(ifp) == -1) {
+  if (dhcp6_makemessage(ifp) == -1) {
     logerr("%s: %s", __func__, ifp->name);
     /* not much we can do apart from finish now */
     dhcp6_finishrelease(ifp);
-  } else
-    dhcp6_sendrelease(ifp);
+  } else {
+    /* Drop non-delegated addresses from the OS before sending RELEASE. */
+    dhcp6_freedrop_addrs(ifp, 1, IPV6_AF_DELEGATED, NULL);
+    dhcp6_sendrelease(ifp);
+  }
 }

Additionally, double‑check no other paths clear the IA list without dropping kernel addrs. One known spot is the lease‑read failure path; consider flipping the “drop” flag there as well:

- dhcp6_freedrop_addrs(ifp, 0, IPV6_AF_DELEGATED, NULL);
+ dhcp6_freedrop_addrs(ifp, 1, IPV6_AF_DELEGATED, NULL);

(Located in dhcp6_readlease() error path.)

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between cbcdce4 and 93305c2.

📒 Files selected for processing (9)
  • hooks/dhcpcd-run-hooks.8.in (2 hunks)
  • src/dhcp.c (7 hunks)
  • src/dhcp6.c (3 hunks)
  • src/dhcpcd.c (10 hunks)
  • src/dhcpcd.h (1 hunks)
  • src/ipv4ll.c (2 hunks)
  • src/ipv6nd.c (1 hunks)
  • src/privsep.c (2 hunks)
  • src/route.c (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/dhcp.c
🧰 Additional context used
🧬 Code graph analysis (5)
src/ipv4ll.c (1)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-488)
src/dhcp6.c (1)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-488)
src/dhcpcd.h (1)
src/dhcpcd.c (1)
  • dhcpcd_dropped (460-488)
src/route.c (1)
src/if-bsd.c (1)
  • if_missfilter_apply (1638-1660)
src/dhcpcd.c (4)
src/script.c (1)
  • script_runreason (745-802)
src/eloop.c (1)
  • eloop_exit (556-561)
src/if.c (1)
  • if_free (81-104)
compat/pidfile.c (1)
  • pidfile_read (135-167)
🔇 Additional comments (12)
src/dhcpcd.c (2)

782-791: Guarding STOPPING carrier events

The early return for STOPPING interfaces avoids resurrecting link-state work while teardown is in flight, matching the crash reports.


1965-1976: Appreciate the explicit exit timeout

The pidfile-based timeout keeps the CLI wait path bounded instead of hanging forever if the manager never exits.

src/dhcpcd.h (1)

270-270: LGTM: new public API declaration matches implementation

void dhcpcd_dropped(struct interface *); aligns with the implementation and usage across modules. No issues.

src/route.c (1)

827-829: Good guard to avoid route miss-filter churn during exit

Skipping if_missfilter_apply() when DHCPCD_EXITING is set reduces shutdown races. Looks correct.

src/privsep.c (2)

1139-1143: Hangup triggers clean stop path

Treating ELE_HANGUP as zero-length read and reusing the stop path is sound; avoids duplicated cleanup.


1169-1177: Unified stop path sets EXITING and exits eloop

The consolidated stop: path properly sets DHCPCD_EXITING, frees processes, and exits with success on orderly hangup. Looks good.

hooks/dhcpcd-run-hooks.8.in (2)

25-25: Manpage date bump

Date update is fine.


108-110: Documented RELEASE | RELEASE6 hook reasons

Clear and consistent with other reason entries.

src/ipv6nd.c (1)

1929-1932: Avoid RA processing while stopping

Early-returning when DHCPCD_STOPPING is set prevents teardown races. Looks correct.

src/ipv4ll.c (1)

451-451: Good: cancel pending timeouts for this interface

Deleting timeouts keyed by ifp avoids callbacks during teardown.

src/dhcp6.c (2)

3609-3613: REPLY handling for RELEASE ack is correct

Logging and invoking dhcp6_finishrelease(ifp) on RELEASE acknowledgement aligns with the new sequencing.


4297-4297: Notify manager after DHCPv6 teardown

Calling dhcpcd_dropped(ifp) here is the right hook to unblock daemon exit once all protocols stop.

Comment on lines +461 to +488
dhcpcd_dropped(struct interface *ifp)
{
	struct dhcpcd_ctx *ctx = ifp->ctx;

	/* Delete all timeouts for the interfaces */
	eloop_q_timeout_delete(ctx->eloop, ELOOP_QUEUE_ALL, NULL, ifp);
	if (ifp->options == NULL ||
	    !(ifp->options->options & DHCPCD_STOPPING) ||
	    dhcpcd_ifrunning(ifp))
		return;

	/* De-activate the interface */
	ifp->active = IF_INACTIVE;
	ifp->options->options &= ~DHCPCD_STOPPING;
	if (ifp->active) {
		ifp->active = IF_INACTIVE;
		ifp->options->options &= ~DHCPCD_STOPPING;
		script_runreason(ifp, "STOPPED");
	}

	if (!(ctx->options & (DHCPCD_MANAGER | DHCPCD_TEST)))
		eloop_exit(ctx->eloop, EXIT_FAILURE);
	if (!(ctx->options & DHCPCD_EXITING))
		return;

	TAILQ_FOREACH(ifp, ctx->ifaces, next) {
		if (dhcpcd_ifrunning(ifp))
			break;
	}

	/* All interfaces have stopped, we can exit */
	if (ifp == NULL)
		eloop_exit(ctx->eloop, EXIT_SUCCESS);
}

⚠️ Potential issue | 🟠 Major

Preserve deferred exit status

When SIGINT (and any non-success signal) arrives we set exit_code = EXIT_FAILURE, but once stop_all_interfaces finds active work we return from the handler and rely on dhcpcd_dropped to terminate. This helper now unconditionally calls eloop_exit(..., EXIT_SUCCESS), so a SIGINT ends up exiting with status 0 whenever an interface was active, breaking the documented signal semantics for callers and automation. Please stash the intended exit code (e.g., a new ctx->pending_exit_code) before returning from the signal handler and reuse it here instead of hard-coding EXIT_SUCCESS.

🤖 Prompt for AI Agents
In src/dhcpcd.c around lines 461-488, dhcpcd_dropped currently calls
eloop_exit(ctx->eloop, EXIT_SUCCESS) unconditionally which ignores any deferred
non-success signal exit status; change it to use a pending exit code stored on
the context (e.g., ctx->pending_exit_code) so the final exit honors the
signal-set value: ensure the signal handler sets ctx->pending_exit_code to
EXIT_FAILURE (or the appropriate non-success value) before returning, initialize
ctx->pending_exit_code to EXIT_SUCCESS where ctx is created, and replace the
hard-coded EXIT_SUCCESS in the eloop_exit call with ctx->pending_exit_code so
deferred exits return the stashed status.
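
The suggestion above can be sketched in isolation. This is a minimal model and not dhcpcd code: `struct sketch_ctx`, `sketch_exit()`, `sketch_handle_signal()` and `sketch_dropped()` are hypothetical stand-ins for the real context, `eloop_exit()`, the signal handler and `dhcpcd_dropped()`, and `pending_exit_code` is the field name proposed in the review, not an existing member.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down context. The real struct dhcpcd_ctx has many
 * more members and is not known to contain pending_exit_code. */
struct sketch_ctx {
	int pending_exit_code;	/* status to use once teardown completes */
	bool exiting;		/* stand-in for the DHCPCD_EXITING flag */
	size_t running;		/* interfaces with protocols still running */
	bool exited;		/* set when the sketch "leaves" the loop */
	int exit_status;	/* status handed over on exit */
};

static void
sketch_ctx_init(struct sketch_ctx *ctx, size_t running)
{
	ctx->pending_exit_code = EXIT_SUCCESS;
	ctx->exiting = false;
	ctx->running = running;
	ctx->exited = false;
	ctx->exit_status = EXIT_SUCCESS;
}

/* Stand-in for eloop_exit(): records the status instead of stopping a loop. */
static void
sketch_exit(struct sketch_ctx *ctx, int status)
{
	ctx->exited = true;
	ctx->exit_status = status;
}

/* Signal path: stash the status, then exit now or defer to the drop path. */
static void
sketch_handle_signal(struct sketch_ctx *ctx, int exit_code)
{
	ctx->pending_exit_code = exit_code;
	ctx->exiting = true;
	if (ctx->running == 0)
		sketch_exit(ctx, ctx->pending_exit_code);
}

/* Drop notification: the last interface to stop completes the deferred
 * exit with the stashed status rather than a hard-coded EXIT_SUCCESS. */
static void
sketch_dropped(struct sketch_ctx *ctx)
{
	if (ctx->running > 0)
		ctx->running--;
	if (ctx->exiting && ctx->running == 0)
		sketch_exit(ctx, ctx->pending_exit_code);
}
```

With two running interfaces, calling `sketch_handle_signal(&ctx, EXIT_FAILURE)` defers the exit, and the second `sketch_dropped(&ctx)` call finishes it with `EXIT_FAILURE`.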

@Sime-Zupanovic
Contributor

I see you delivered this patch:
5738d9c

as pasted before, I think this will crash as ctx is wild.

I did just:

--------------------------------- src/dhcpcd.c ---------------------------------
index 5045c0f..d5fe2d9 100755
@@ -775,7 +775,7 @@ dhcpcd_handlecarrier(struct interface *ifp, int carrier, unsigned int flags)
 	ifp->carrier = carrier;
 	ifp->flags = flags;

-	if (ifp->options->options & DHCPCD_STOPPING)
+	if (ifp->options && (ifp->options->options & DHCPCD_STOPPING))
 		return;

 	if (!if_is_link_up(ifp)) {

We also need a guard in dhcp_discover(void *arg), as state was NULL there.

But I will try your additional patches anyway.
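
A defensive guard of the kind described here might look like the sketch below. It is an illustration only: `struct sketch_iface`, `struct sketch_state` and `sketch_discover()` are hypothetical names, and the real eloop callbacks return `void` rather than a status. The PR itself takes a different route by deleting an interface's pending timeouts in `dhcpcd_dropped()`, so the callback should not fire after teardown in the first place.

```c
#include <stddef.h>

/* Hypothetical stand-ins for dhcpcd's per-interface DHCP data. */
struct sketch_state {
	int discovers_sent;
};

struct sketch_iface {
	struct sketch_state *state;	/* NULL once the state has been freed */
};

/* Timer callback in the eloop style: it receives an opaque argument.
 * Returns 0 when it did its work and -1 when it bailed out early, purely
 * so the behaviour can be observed in this sketch. */
static int
sketch_discover(void *arg)
{
	struct sketch_iface *ifp = arg;

	/* A stale timeout can fire after the state was freed. */
	if (ifp == NULL || ifp->state == NULL)
		return -1;

	ifp->state->discovers_sent++;
	return 0;
}
```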

@rsmarples
Member Author

as pasted before, I think this will crash as ctx is wild.

That is not possible as ctx is never removed from the ifp and it's set as the first thing after calloc.

@ColinMcInnes
Contributor

I tested the release branch with my test case (release) and it no longer garbles the DHCPv6 Release packet. It is successfully received by the server, as is the DHCPv4 release.

@ColinMcInnes
Contributor

From the logs, there are still some issues with the exit.

sudo systemctl stop dhcpcd@eth0; sudo systemctl stop dhcpcd; sleep 5; sudo systemctl start dhcpcd; sudo systemctl start dhcpcd@eth0

Nov 03 11:25:30 systemd[1]: Stopping A minimalistic network configuration daemon with DHCPv4, rdisc and DHCPv6 support...
Nov 03 11:25:30 dhcpcd[26638]: sending signal ALRM to pid 7851
Nov 03 11:25:30 dhcpcd[26638]: waiting for pid 7851 to exit
Nov 03 11:25:30 dhcpcd[26638]: main: eloop_start: Invalid argument
Nov 03 11:25:30 dhcpcd[26638]: sending signal ALRM to pid 7851
Nov 03 11:25:30 dhcpcd[7852]: received SIGALRM, releasing
Nov 03 11:25:30 dhcpcd[7852]: eth0: removing interface
Nov 03 11:25:30 systemd[1]: dhcpcd.service: Control process exited, code=exited, status=1/FAILURE
Nov 03 11:25:30 dhcpcd[26638]: waiting for pid 7851 to exit
Nov 03 11:25:30 dhcpcd[26638]: main: eloop_start: Invalid argument
Nov 03 11:25:30 dhcpcd[7852]: received SIGTERM, stopping

@rsmarples
Member Author

rsmarples commented Nov 3, 2025

From the logs, there are still some issues with the exit.

That looks like an issue with epoll(7), which we recently moved to. It might only work if we are watching fds; in this case we are not, and are just using it for a timeout, which ppoll(2) supports.
All my testing has been done on BSD using kqueue(2) and I'm now happy with that.
I'll look at testing on devuan and alpine tomorrow.

As it more matches ppoll semantics.
Note that epoll_pwait does NOT work with zero events, so
use ppoll in this case.
@rsmarples
Member Author

@ColinMcInnes your issue is now solved.

But I have a new one I can replicate when dhcpcd exits - ps_stopwait: eloop_start: No such file or directory
Will look into that when I can.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f92561 and 3134f25.

📒 Files selected for processing (2)
  • src/eloop.c (2 hunks)
  • src/privsep-linux.c (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd
  • GitHub Check: freebsd
  • GitHub Check: openbsd
🔇 Additional comments (2)
src/privsep-linux.c (1)

328-330: LGTM!

The addition of __NR_epoll_pwait2 to the seccomp filter properly mirrors the existing __NR_epoll_pwait pattern and aligns with the new epoll_pwait2 usage introduced in src/eloop.c.

src/eloop.c (1)

938-964: Timeout conversion is sound and handles edge cases correctly.

The overflow detection logic properly checks both components: if seconds exceed INT_MAX/1000 or if the combined millisecond value would overflow. The rounding formula (ts->tv_nsec + 999999) / 1000000 correctly rounds nanoseconds up to milliseconds. The code also properly handles the NULL timeout case (returning -1).
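
The conversion described above can be sketched as a standalone function. This is a reconstruction from the review text, not the code in src/eloop.c: the name `sketch_timespec_to_ms()` is hypothetical, and the sketch assumes a non-negative, normalised `timespec`.

```c
#include <limits.h>
#include <stddef.h>
#include <time.h>

/* Convert a timespec to a millisecond timeout suitable for an int-based
 * wait call. NULL means "no timeout" and maps to -1. Values that do not
 * fit are capped at INT_MAX instead of wrapping. */
static int
sketch_timespec_to_ms(const struct timespec *ts)
{
	long long ms;

	if (ts == NULL)
		return -1;

	/* The seconds alone would overflow an int of milliseconds. */
	if (ts->tv_sec > INT_MAX / 1000)
		return INT_MAX;

	/* Round nanoseconds up so the wait never ends early. */
	ms = (long long)ts->tv_sec * 1000 +
	    (ts->tv_nsec + 999999) / 1000000;

	/* The combined value can still exceed INT_MAX. */
	if (ms > INT_MAX)
		return INT_MAX;
	return (int)ms;
}
```

Rounding up matters for short timers: a 1 ns timeout becomes 1 ms rather than 0, which would otherwise turn a wait into a busy poll.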

@rsmarples
Member Author

How are we feeling about merging this to the master branch? Anyone tested it and/or have any objections?

@ColinMcInnes
Contributor

@rsmarples
I wouldn't mind testing it on a 6.x kernel, but my 5.4 testing looked good.

@ColinMcInnes
Contributor

@rsmarples I wouldn't mind testing it on a 6.x kernel, but my 5.4 testing looked good.

And wow, lots of yocto branches where the kernel mismatches the headers...

Building raspberry pi honister, kernel is 5.10.83, but libc headers are 5.14... so the ld breaks at epoll_pwait2

Will keep trying,

Because, IMHO, linux distro maintainers are dumb.
There is no logical reason to do this.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/eloop.c (1)

936-977: LGTM: Robust implementation with three-path strategy.

The refactored eloop_run_epoll correctly handles three scenarios:

  1. Zero events (lines 941-943): Uses ppoll(NULL, 0, ts, sigset) as a signal-aware sleep when no file descriptors need monitoring.

  2. epoll_pwait2 with runtime fallback (lines 944-956): Attempts the modern syscall, detects ENOSYS if unavailable at runtime, caches the result, and falls back to epoll_pwait.

  3. epoll_pwait with timeout conversion (lines 957-977): Converts timespec to milliseconds with proper overflow handling (capping at INT_MAX), ensuring the timeout never wraps.

The timeout conversion formula (tv_nsec + 999999) / 1000000 correctly implements ceiling rounding for nanoseconds-to-milliseconds conversion.

Optional: Consider adding a clarifying comment.

The conditional compilation and goto make the control flow harder to trace. A brief comment above line 941 explaining the three-path strategy would improve maintainability:

/* Three paths: (1) nfds==0 uses ppoll for timed sleep,
 * (2) try epoll_pwait2 if available, fallback on ENOSYS,
 * (3) use epoll_pwait with timeout conversion. */
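
The "try the newer call, remember ENOSYS, fall back" part of that strategy can be modelled without touching epoll at all. In the sketch below the two backends are plain function pointers standing in for `epoll_pwait2(2)` and `epoll_pwait(2)`, so every name (`struct sketch_loop`, `sketch_wait()`, `sketch_modern_missing()`, `sketch_legacy_ok()`) is hypothetical and only the caching pattern is taken from the review.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical wait backend: returns the number of events, 0 on timeout,
 * or -1 with errno set. */
typedef int (*sketch_wait_fn)(int timeout_ms);

struct sketch_loop {
	sketch_wait_fn modern;		/* stand-in for epoll_pwait2(2) */
	sketch_wait_fn legacy;		/* stand-in for epoll_pwait(2) */
	bool modern_nosys;		/* cached: modern call is unavailable */
	unsigned int modern_calls;	/* bookkeeping for the sketch only */
};

static int
sketch_wait(struct sketch_loop *loop, int timeout_ms)
{
	int n;

	if (!loop->modern_nosys) {
		loop->modern_calls++;
		n = loop->modern(timeout_ms);
		if (n != -1 || errno != ENOSYS)
			return n;
		/* The kernel lacks the call: never try it again. */
		loop->modern_nosys = true;
	}
	return loop->legacy(timeout_ms);
}

/* Example backend that behaves like a kernel without the newer syscall. */
static int
sketch_modern_missing(int timeout_ms)
{
	(void)timeout_ms;
	errno = ENOSYS;
	return -1;
}

/* Example backend that simply reports a timeout with no events. */
static int
sketch_legacy_ok(int timeout_ms)
{
	(void)timeout_ms;
	return 0;
}
```

Because the flag is only ever set and never cleared, the failing call is attempted once per loop object and every later wait goes straight to the fallback.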
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 4dce5d4 and 448ab77.

📒 Files selected for processing (1)
  • src/eloop.c (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: rsmarples
Repo: NetworkConfiguration/dhcpcd PR: 536
File: src/eloop.c:53-59
Timestamp: 2025-11-03T19:24:39.833Z
Learning: In the dhcpcd codebase (NetworkConfiguration/dhcpcd), compile-time feature detection based on header versions (e.g., LINUX_VERSION_CODE for epoll_pwait2) does not require runtime fallback handling for syscall unavailability (ENOSYS). Users are responsible for ensuring headers match their target runtime kernel.

Applied to files:

  • src/eloop.c
🔇 Additional comments (2)
src/eloop.c (2)

53-59: LGTM: Feature detection aligns with project design philosophy.

The compile-time detection of epoll_pwait2 based on kernel headers is consistent with the project's design philosophy that users are responsible for matching compile-time headers with runtime kernel versions. The addition of poll.h is necessary for the ppoll() fallback used when nfds == 0.

The runtime ENOSYS fallback implemented later in the code (lines 952-955) provides a safety net for environments where headers don't match the kernel, addressing practical deployment scenarios that surfaced during testing.

Based on learnings.


158-160: LGTM: Correctly caches syscall availability.

The epoll_pwait2_nosys field efficiently caches whether the epoll_pwait2 syscall returned ENOSYS, preventing repeated failed syscall attempts. The field is correctly zero-initialized through the calloc() in eloop_new() (line 826), so no explicit initialization is needed.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 448ab77 and 5483c83.

📒 Files selected for processing (1)
  • src/eloop.c (3 hunks)
🔇 Additional comments (3)
src/eloop.c (3)

53-59: Feature detection with runtime fallback added.

The compile-time detection of epoll_pwait2 is appropriate. Note that the code now includes runtime fallback handling (lines 952-955) that detects ENOSYS and falls back to epoll_pwait, which addresses concerns from previous reviews about compile-time/runtime kernel version mismatches.


158-160: LGTM: Clean runtime syscall availability tracking.

The epoll_pwait2_nosys flag is a clean way to avoid repeated failed syscalls. Since it's only ever set to true (never reset), there are no race conditions or state management concerns.


941-943: LGTM: Correct ppoll fallback for zero events.

Using ppoll(NULL, 0, ts, &sigset) when nfds == 0 is the right approach, as epoll cannot reliably handle zero events. This waits on signals and timeouts without monitoring file descriptors.
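
As a minimal sketch of that zero-descriptor case: `ppoll(2)` accepts an empty descriptor set, so it can act as a sleep that is bounded by a timeout and can be interrupted by signals. The wrapper name `sketch_timed_wait()` is hypothetical, and the sketch assumes Linux with glibc, where `ppoll()` needs `_GNU_SOURCE`.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Wait for a timeout or a signal without watching any file descriptors.
 * Returns 0 when the timeout expired, or -1 with errno set (EINTR when a
 * signal handler ran). A NULL ts waits until a signal arrives. When
 * sigmask is not NULL it is installed atomically for the duration of the
 * call, which is what makes the wait signal-aware. */
static int
sketch_timed_wait(const struct timespec *ts, const sigset_t *sigmask)
{
	return ppoll(NULL, 0, ts, sigmask);
}
```

A caller would typically treat `-1` with `errno == EINTR` as "go and process the pending signal" rather than as a failure.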

@borkra

borkra commented Nov 10, 2025

@rsmarples isn't a new header with an old kernel image a user setup issue? The problem is that the kernel headers and the kernel image are two independent packages. When you change the kernel you are supposed to update them all together; there is even a meta package that ties them together, and that is what you should be changing. But there is no obligation to do that, and no clear documentation of which one to use, so you always start by changing the wrong one. I have created and fixed this problem locally more times than I can count.
Bottom line: you do not need to handle this case in dhcpcd; the user needs to install the proper packages on the local system.

@rsmarples
Member Author

@rsmarples isn't a new header with an old kernel image a user setup issue? The problem is that the kernel headers and the kernel image are two independent packages. When you change the kernel you are supposed to update them all together; there is even a meta package that ties them together, and that is what you should be changing. But there is no obligation to do that, and no clear documentation of which one to use, so you always start by changing the wrong one. I have created and fixed this problem locally more times than I can count. Bottom line: you do not need to handle this case in dhcpcd; the user needs to install the proper packages on the local system.

In an ideal world I would agree, but right now there are two reports of it failing because of this, and this isn't even released yet.

@ColinMcInnes
Contributor

isn't a new header with an old kernel image a user setup issue?

Package maintainers, too. Several of my yocto system builds have a slight header/kernel mismatch. Enough to cause a problem. And you don't notice until you get a situation like this where you're using a feature that's in one minor version but not the other, but it doesn't complain until runtime.

Bottom line: you do not need to handle this case in dhcpcd; the user needs to install the proper packages on the local system.

The problem is there's no good way at compile time to figure out that the package maintainer screwed up. The way Roy is handling it is correct, check the headers, but it should be something that gets noted in release notes so that package maintainers can double check their setup to ensure they're either firmly in the 5.11+ range, or that their header/build match.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/dhcpcd.c (1)

460-488: Exit code from signal is lost.

When a signal triggers shutdown, the handler sets the exit code (EXIT_FAILURE for SIGINT, EXIT_SUCCESS for SIGTERM) and calls eloop_exitallinners(exit_code), but dhcpcd_dropped unconditionally passes EXIT_SUCCESS to eloop_exit on line 487. This means SIGINT (and other non-success signals) incorrectly exit with status 0 when interfaces are active at signal time.

Store the intended exit code in the context (e.g., ctx->pending_exit_code, initialized to EXIT_SUCCESS) when the signal handler returns early, and use it here instead of the hard-coded EXIT_SUCCESS.

Minor: Variable shadowing in loop.

The loop at lines 480–483 reuses the ifp parameter name for iteration, shadowing the function argument. While functionally correct (the parameter is no longer needed), this is confusing. Consider using a different loop variable name (e.g., ifp2 or ifn).
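
The shadowing point can be illustrated with a small helper that walks the interface list using its own iterator. This is a sketch under assumptions: `struct sketch_iface`, its `running` flag and `sketch_all_stopped()` are hypothetical, standing in for `struct interface` and the `dhcpcd_ifrunning()` check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical interface record; "running" stands in for whatever
 * dhcpcd_ifrunning() would report for the real struct interface. */
struct sketch_iface {
	bool running;
	TAILQ_ENTRY(sketch_iface) next;
};

TAILQ_HEAD(sketch_if_head, sketch_iface);

/* Report whether every interface has stopped. A dedicated loop variable
 * (ifn) keeps the caller's own interface pointer untouched. */
static bool
sketch_all_stopped(struct sketch_if_head *ifaces)
{
	struct sketch_iface *ifn;

	TAILQ_FOREACH(ifn, ifaces, next) {
		if (ifn->running)
			return false;
	}
	return true;
}
```

Pulling the scan into a helper also removes the need to test the iterator against `NULL` after the loop to find out whether it ran to completion.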

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 29a129f and 9405a42.

📒 Files selected for processing (2)
  • src/dhcpcd.c (11 hunks)
  • src/privsep.c (3 hunks)
🧬 Code graph analysis (2)
src/privsep.c (1)
src/eloop.c (1)
  • eloop_exitall (617-626)
src/dhcpcd.c (5)
src/script.c (1)
  • script_runreason (745-802)
src/eloop.c (2)
  • eloop_exit (610-615)
  • eloop_exitallinners (628-639)
src/if.c (1)
  • if_free (81-104)
compat/pidfile.c (1)
  • pidfile_read (135-167)
src/privsep-root.c (1)
  • ps_root_stop (973-1020)
🔇 Additional comments (9)
src/privsep.c (2)

763-764: LGTM! Correct error check.

The change from error == EXIT_SUCCESS to error < 0 properly detects eloop_start errors (which return negative values), fixing the incorrect success-check logic.


1139-1176: LGTM! Coordinated stop path.

The ELE_HANGUP handling and switch to eloop_exitall correctly coordinate shutdown across all event loops (including privsep inner loops), aligning with the broader teardown sequencing introduced in this PR.

src/dhcpcd.c (7)

441-458: LGTM! Correct protocol state detection.

The function correctly checks for active DHCP/IPv4LL/DHCPv6 state on an interface, providing the foundation for coordinated shutdown logic.


490-497: LGTM! Simplified stop flow.

The function correctly delegates to protocol-specific teardown via dhcpcd_drop, relying on the new coordination mechanism where protocols notify via dhcpcd_dropped when complete.


1119-1119: LGTM! Shutdown handling improvements.

Line 1119 correctly uses the new stop_interface flow, and lines 1126–1127 properly skip interface discovery when exiting.

Also applies to: 1126-1127


1430-1455: LGTM! Return type change supports coordination.

Changing the return type to bool correctly signals whether any interface work was started, allowing callers to decide whether to defer exit until dhcpcd_dropped notifications arrive.


1571-1583: Coordination flow is correct, but exit code handling issue remains.

The logic correctly prevents duplicate stops, sets the EXITING flag, and defers exit when interfaces are active. However, this interacts with the exit code preservation issue in dhcpcd_dropped (lines 460–488) — the exit_code passed to eloop_exitallinners is lost when dhcpcd_dropped later calls eloop_exit(..., EXIT_SUCCESS).


1721-1727: LGTM! Consistent control command handling.

The exit/release command handling correctly uses the new coordination flow: exit immediately if no interfaces are active, otherwise wait for dhcpcd_dropped notifications.

Also applies to: 1756-1756


1958-1963: LGTM! Exit timeout handling is correct.

The combination of dhcpcd_pidfile_timeout (polling every 100ms) and dhcpcd_exit_timeout (enforcing 50-second limit) correctly implements wait-with-timeout for process exit. The check for pid == -1 on lines 1958 and 1972 properly handles the case when the pidfile is removed/unreadable, addressing the concern from the previous review.

Also applies to: 1965-1978, 2343-2344
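
The wait-with-timeout pattern described in this comment (poll at a short interval, give up after an overall limit) can be sketched generically. The names `sketch_wait_until()` and `sketch_countdown()` are hypothetical, and the predicate is a placeholder for "the pidfile no longer yields a pid"; in the terms of the comment above, the interval would be 100 ms and the limit 50 seconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Predicate polled by the wait loop; returns true once the condition
 * being waited for holds. */
typedef bool (*sketch_done_fn)(void *arg);

/* Poll done() every interval_ms until it reports completion or limit_ms
 * has been spent sleeping. Returns true on completion, false on timeout. */
static bool
sketch_wait_until(sketch_done_fn done, void *arg,
    long interval_ms, long limit_ms)
{
	struct timespec ts = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L
	};
	long waited;

	for (waited = 0; ; waited += interval_ms) {
		if (done(arg))
			return true;
		if (waited >= limit_ms)
			return false;
		nanosleep(&ts, NULL);
	}
}

/* Example predicate: completes once the counter has reached zero. */
static bool
sketch_countdown(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

The loop counts only the time it asked to sleep, so the real elapsed time can be slightly longer than the limit; that is usually acceptable for a shutdown guard.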

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 9405a42 and 6c47ee7.

📒 Files selected for processing (1)
  • src/dhcpcd.c (11 hunks)

@rsmarples rsmarples merged commit 665b573 into master Nov 11, 2025
17 checks passed
@Sime-Zupanovic
Contributor

Can you help me confirm that I understood your d41c505#diff-66945e1e112a14eb5cb5931e2d600af419ed1f4c9a1567669f3de19d0f104c94 correctly?

  1. --exit sets STOPPING.

  2. Each protocol sees STOPPING → calls *_release() → triggers *_deconfigure() → reason="RELEASE".

  3. For example, a DHCPv6 RELEASE should be sent to the network to signal removal of the lease, and dhcpcd should receive an ACK.

  4. dhcp_free() and dhcpcd_dropped() run in correct order.

  5. When all interfaces drop, the main loop exits cleanly.

If we don't receive the RELEASE ACK, will timeout callbacks (which also run from eloop) such as
dhcp_discover, dhcp_start, dhcp_bind, dhcp_renew (dhcp.c)
or the
dhcp6_* callbacks (dhcp6_start, dhcp6_recvif, dhcp6_freedrop, dhcp6_finishrelease)
still be triggered?
I think that's how I hit the dhcp_discover() crash I mentioned:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
2006 dhcp.c: No such file or directory.
(gdb) bt
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
#1 0x00000055757cce64 in eloop_start (eloop=0x5589e47830, signals=signals@entry=0x7fdf70e068) at eloop.c:1201
#2 0x00000055757c5ab4 in main (argc=, argv=, envp=) at dhcpcd.c:2707

(gdb) bt full
#0 dhcp_discover (arg=0x5589e589a0) at dhcp.c:2006
ifp = 0x5589e589a0
state = 0x0
ifo = 0x5589e58bc0
#1 0x00000055757cce64 in eloop_start (eloop=0x5589e47830, signals=signals@entry=0x7fdf70e068) at eloop.c:1201
error =
t = 0x5589e47250
ts = {tv_sec = 0, tv_nsec = 763219470}
tsp =
__PRETTY_FUNCTION__ = "eloop_start"

I captured a tcpdump during dhcp --exit, but I didn't see a DHCPv6 RELEASE.
On which interface would it be sent?

I appended as well:
6c47ee7
7b15c54
ae22cf0
93305c2
7ef3ba1

Please help.

Development

Successfully merging this pull request may close these issues.

DHCPv6 Release packet corrupted if scheduled right before exit

6 participants