Skip to content

fix nanomq issue 2242#1478

Draft
wanghaEMQ wants to merge 3 commits intomainfrom
wangha/fix_nanomq_issue_2242
Draft

fix nanomq issue 2242#1478
wanghaEMQ wants to merge 3 commits intomainfrom
wangha/fix_nanomq_issue_2242

Conversation

@wanghaEMQ
Copy link
Copy Markdown
Member

@wanghaEMQ wanghaEMQ commented Apr 1, 2026

Summary by CodeRabbit

  • New Features

    • Added a new public API to compose MQTT PUBLISH messages (supports retain, QoS, topic, payload, and client id).
  • Bug Fixes

    • More accurate reporting for QUIC handshake failures to reduce misleading network-unavailable messages.
  • Refactor

    • Connection timeout and message resend now use sensible default fallbacks when per-connection settings are absent, improving reliability.

…hen no bridge_conf.

Signed-off-by: wanghaemq <wangwei@emqx.io>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

Added MQTT publish composer API and implementation; updated MQTT QUIC client timer logic to use per-bridge config with defaults; and added explicit QUIC handshake-failure logging on transport-initiated shutdown.

Changes

Cohort / File(s) Summary
MQTT QUIC Client Timing Logic
src/mqtt/protocol/mqtt/mqtt_quic_client.c
Introduce default constants (MQTT_QUIC_TIMER_BACKOFF_MAX, MQTT_QUIC_TIMER_RESEND_WAIT) and compute backoff_max / resend_wait from s->bridge_conf when present; use computed values for disconnect timeout and message resend checks.
QUIC Handshake Failure Handling
src/supplemental/quic/msquic_dial.c
Add explicit case QUIC_STATUS_HANDSHAKE_FAILURE in the QUIC_CONNECTION_EVENT_SHUTDOWN_INITIATED_BY_TRANSPORT switch to log handshake failure and avoid fall-through to generic handling.
MQTT Publish Composer API & Implementation
include/nng/protocol/mqtt/mqtt_parser.h, src/sp/protocol/mqtt/mqtt_parser.c
Add public declaration nano_pubmsg_composer(...) and its implementation that constructs MQTT PUBLISH messages (handles remaining-length encodings, topic, optional clientid, QoS/PID handling, proto version selection, payload pointer setup, and error checks).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • JaylinYu
  • xinyi-xs

Poem

🐇 I nibble bytes and stitch the frame,
Compose a publish, name by name.
Timers softened, defaults keep pace,
Handshakes logged with tidy grace,
Hopping lightly — code in place.

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

Check name Status Explanation Resolution
Description check ⚠️ Warning No pull request description was provided; the required template sections are entirely missing. Add a description following the template format: include the issue number and synopsis, and explain the changes made and their purpose.
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title refers to fixing issue 2242 but lacks specificity about the actual changes made, making it unclear what the fix addresses. Improve the title to summarize the main change, such as 'Add MQTT QUIC timer configurability and handshake failure handling' or similar to clarify the primary fix.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch wangha/fix_nanomq_issue_2242

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/mqtt/protocol/mqtt/mqtt_quic_client.c`:
- Around line 827-830: The reads of s->bridge_conf->backoff_max and
->resend_wait must be done while holding the socket mutex to avoid a data race
with mqtt_quic_sock_set_bridge_config; move the evaluation of backoff_max and
resend_wait so they occur after acquiring s->mtx (the same mutex used around the
later timer logic in this function), using the existing ternaries but reading
s->bridge_conf only within the critical section, and keep the variable names
backoff_max and resend_wait for use later in the timer logic (lines that
reference them around mqtt_quic_client timers).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7db92da6-a658-42e3-8927-b5a39c165df1

📥 Commits

Reviewing files that changed from the base of the PR and between 5ea4e82 and 0a7ecdf.

📒 Files selected for processing (2)
  • src/mqtt/protocol/mqtt/mqtt_quic_client.c
  • src/supplemental/quic/msquic_dial.c

Comment thread src/mqtt/protocol/mqtt/mqtt_quic_client.c Outdated
Signed-off-by: wanghaemq <wangwei@emqx.io>
@wanghaEMQ wanghaEMQ force-pushed the wangha/fix_nanomq_issue_2242 branch from 0a7ecdf to 72da3dd Compare April 1, 2026 08:23
…move it before merging

Signed-off-by: wanghaemq <wangwei@emqx.io>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/mqtt/protocol/mqtt/mqtt_quic_client.c`:
- Line 847: The timeout check using p->pingcnt vs backoff_max in
mqtt_quic_client.c is off-by-one; update the conditional in the keepalive/ping
handling (the `if (p->pingcnt > backoff_max)` check) to use `>=` so the client
disconnects when pingcnt reaches the backoff_max boundary rather than after one
extra retry interval; adjust any related comments to reflect the inclusive
boundary if present.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 18747b82-8489-44a9-b10f-1bf691491831

📥 Commits

Reviewing files that changed from the base of the PR and between 0a7ecdf and 72da3dd.

📒 Files selected for processing (2)
  • src/mqtt/protocol/mqtt/mqtt_quic_client.c
  • src/supplemental/quic/msquic_dial.c
✅ Files skipped from review due to trivial changes (1)
  • src/supplemental/quic/msquic_dial.c

return;
}
if (p->pingcnt > s->bridge_conf->backoff_max) {
if (p->pingcnt > backoff_max) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use >= for keepalive timeout boundary.

Line 847 currently times out only when pingcnt is strictly greater than backoff_max, which delays disconnect by one extra retry interval. This is an off-by-one boundary mismatch for timeout behavior.

🔧 Proposed fix
-	if (p->pingcnt > backoff_max) {
+	if (p->pingcnt >= backoff_max) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mqtt/protocol/mqtt/mqtt_quic_client.c` at line 847, The timeout check
using p->pingcnt vs backoff_max in mqtt_quic_client.c is off-by-one; update the
conditional in the keepalive/ping handling (the `if (p->pingcnt > backoff_max)`
check) to use `>=` so the client disconnects when pingcnt reaches the
backoff_max boundary rather than after one extra retry interval; adjust any
related comments to reflect the inclusive boundary if present.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sp/protocol/mqtt/mqtt_parser.c`:
- Around line 1233-1237: The packet-id generation in the block handling qos>0
uses NNI_PUT16(tlen, nni_random()%8000) which can produce 0 and unnecessarily
limits the range; change the logic to produce a valid MQTT packet identifier in
1..65535 (non-zero) as done in nano_encode_publish_msg: generate a 16-bit random
value in the full 1–65535 range (e.g. map nni_random() into 1..65535 or retry on
zero) and then write it with NNI_PUT16 into tlen before calling
nni_msg_append(msg, tlen, 2).
- Around line 1195-1196: nni_msg_alloc's return value is ignored, so if
allocation fails the code dereferences msgp and crashes; update the call site
around nni_msg_alloc(msgp, 0) / msg = *msgp to capture the return code (e.g.,
int rc = nni_msg_alloc(msgp, 0)), check if rc != 0, and on error propagate or
handle it (return rc or jump to the existing error/cleanup path) instead of
continuing; ensure msg is only read after a successful allocation (assign msg =
*msgp after the rc check) and document via the same function's error flow.
- Around line 1219-1246: The code captures ptr = nni_msg_body(msg) before
performing nni_msg_append calls which can reallocate and invalidate ptr; move
the computation of the payload pointer to after all nni_msg_append calls
(including topic, clientid, pid, and payload appends) so that
nni_msg_set_payload_ptr(msg, ptr) is passed a stable pointer; specifically, stop
using ptr immediately after its initial assignment and instead call ptr =
nni_msg_body(msg) (or the appropriate pointer accessor) once all appends are
finished, then call nni_msg_set_payload_ptr(msg, ptr) while keeping existing
logic in functions like nni_msg_append, nni_msg_set_payload_ptr, and the
CMD_PUBLISH/CMD_PUBLISH_V5 branch intact.
- Around line 1239-1244: The publish-path sets the v5 flag (proto_ver,
MQTT_PROTOCOL_VERSION_v5 and nng_msg_set_cmd_type with CMD_PUBLISH_V5) but does
not emit the MQTT5 properties field; update the PUBLISH assembly code that sets
CMD_PUBLISH_V5 to append the property length as a MQTT variable-length integer
(encode at least one byte for zero properties) immediately after the topic (if
qos == 0) or after the packet identifier (if qos > 0), and before the payload;
implement/ call an existing varint encoder or add one to encode the property
length, and ensure this code path is only used when proto_ver ==
MQTT_PROTOCOL_VERSION_v5 so CMD_PUBLISH remains unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a303bcbb-372f-438f-b11a-1a7b141043f1

📥 Commits

Reviewing files that changed from the base of the PR and between 72da3dd and 5fc6655.

📒 Files selected for processing (2)
  • include/nng/protocol/mqtt/mqtt_parser.h
  • src/sp/protocol/mqtt/mqtt_parser.c

Comment on lines +1195 to +1196
nni_msg_alloc(msgp, 0);
msg = *msgp;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Missing error handling for message allocation.

nni_msg_alloc returns an error code, but it's not checked. If allocation fails, msg will be NULL and subsequent operations will cause a crash.

🐛 Proposed fix
-	nni_msg_alloc(msgp, 0);
-	msg = *msgp;
+	if (nni_msg_alloc(msgp, 0) != 0) {
+		return NULL;
+	}
+	msg = *msgp;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sp/protocol/mqtt/mqtt_parser.c` around lines 1195 - 1196, nni_msg_alloc's
return value is ignored, so if allocation fails the code dereferences msgp and
crashes; update the call site around nni_msg_alloc(msgp, 0) / msg = *msgp to
capture the return code (e.g., int rc = nni_msg_alloc(msgp, 0)), check if rc !=
0, and on error propagate or handle it (return rc or jump to the existing
error/cleanup path) instead of continuing; ensure msg is only read after a
successful allocation (assign msg = *msgp after the rc check) and document via
the same function's error flow.

Comment on lines +1219 to +1246
ptr = nni_msg_body(msg);
if (clientid != NULL) {
NNI_PUT16(tlen, topic->len + clientid->len + 1);
nni_msg_append(msg, tlen, 2);
nni_msg_append(msg, topic->body, topic->len);
nni_msg_append(msg, "/", 1);
nni_msg_append(msg, clientid->body, clientid->len);
} else {
NNI_PUT16(tlen, topic->len);
ptr += 2;
nni_msg_append(msg, tlen, 2);
nni_msg_append(msg, topic->body, topic->len);
}

if (qos > 0) {
// Set a random pid in case collision
NNI_PUT16(tlen, nni_random()%8000);
nni_msg_append(msg, tlen, 2);
}

if (proto_ver == MQTT_PROTOCOL_VERSION_v5) {
// Set V4/V5 flag for msg adaptor in transport layer
nng_msg_set_cmd_type(msg, CMD_PUBLISH_V5);
} else {
nng_msg_set_cmd_type(msg, CMD_PUBLISH);
}
nni_msg_append(msg, payload->body, payload->len);
nni_msg_set_payload_ptr(msg, ptr);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Dangling pointer: ptr is captured before message body exists, then used after multiple appends.

ptr is assigned from nni_msg_body(msg) at line 1219 when the body is empty. Subsequent nni_msg_append calls may reallocate the internal buffer, invalidating ptr. Using it at line 1246 with nni_msg_set_payload_ptr(msg, ptr) results in a dangling pointer.

🐛 Proposed fix: Compute the pointer after all appends are complete
-	ptr = nni_msg_body(msg);
 	if (clientid != NULL) {
 		NNI_PUT16(tlen, topic->len + clientid->len + 1);
 		nni_msg_append(msg, tlen, 2);
 		nni_msg_append(msg, topic->body, topic->len);
 		nni_msg_append(msg, "/", 1);
 		nni_msg_append(msg, clientid->body, clientid->len);
 	} else {
 		NNI_PUT16(tlen, topic->len);
-		ptr += 2;
 		nni_msg_append(msg, tlen, 2);
 		nni_msg_append(msg, topic->body, topic->len);
 	}

 	if (qos > 0) {
-		// Set a random pid in case collision
-		NNI_PUT16(tlen, nni_random()%8000);
+		// Set a random non-zero pid
+		uint16_t pid = nni_random() % 0xFFFF;
+		NNI_PUT16(tlen, pid == 0 ? 1 : pid);
 		nni_msg_append(msg, tlen, 2);
 	}

 	if (proto_ver == MQTT_PROTOCOL_VERSION_v5) {
 		// Set V4/V5 flag for msg adaptor in transport layer
 		nng_msg_set_cmd_type(msg, CMD_PUBLISH_V5);
 	} else {
 		nng_msg_set_cmd_type(msg, CMD_PUBLISH);
 	}
 	nni_msg_append(msg, payload->body, payload->len);
-	nni_msg_set_payload_ptr(msg, ptr);
+	// Set payload_ptr to start of variable header (beginning of message body)
+	nni_msg_set_payload_ptr(msg, nni_msg_body(msg));

 	return msg;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sp/protocol/mqtt/mqtt_parser.c` around lines 1219 - 1246, The code
captures ptr = nni_msg_body(msg) before performing nni_msg_append calls which
can reallocate and invalidate ptr; move the computation of the payload pointer
to after all nni_msg_append calls (including topic, clientid, pid, and payload
appends) so that nni_msg_set_payload_ptr(msg, ptr) is passed a stable pointer;
specifically, stop using ptr immediately after its initial assignment and
instead call ptr = nni_msg_body(msg) (or the appropriate pointer accessor) once
all appends are finished, then call nni_msg_set_payload_ptr(msg, ptr) while
keeping existing logic in functions like nni_msg_append,
nni_msg_set_payload_ptr, and the CMD_PUBLISH/CMD_PUBLISH_V5 branch intact.

Comment on lines +1233 to +1237
if (qos > 0) {
// Set a random pid in case collision
NNI_PUT16(tlen, nni_random()%8000);
nni_msg_append(msg, tlen, 2);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Packet ID generation can produce invalid value 0 and has unnecessarily limited range.

MQTT packet identifiers must be non-zero (1–65535). The current nni_random()%8000 can generate 0 and unnecessarily limits the range to 0–7999.

Compare with nano_encode_publish_msg at line 1314 which correctly handles this case.

🐛 Proposed fix
 	if (qos > 0) {
-		// Set a random pid in case collision
-		NNI_PUT16(tlen, nni_random()%8000);
+		// Set a random non-zero pid
+		uint16_t pid = nni_random() % 0xFFFF;
+		NNI_PUT16(tlen, pid == 0 ? 1 : pid);
 		nni_msg_append(msg, tlen, 2);
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sp/protocol/mqtt/mqtt_parser.c` around lines 1233 - 1237, The packet-id
generation in the block handling qos>0 uses NNI_PUT16(tlen, nni_random()%8000)
which can produce 0 and unnecessarily limits the range; change the logic to
produce a valid MQTT packet identifier in 1..65535 (non-zero) as done in
nano_encode_publish_msg: generate a 16-bit random value in the full 1–65535
range (e.g. map nni_random() into 1..65535 or retry on zero) and then write it
with NNI_PUT16 into tlen before calling nni_msg_append(msg, tlen, 2).

Comment on lines +1239 to +1244
if (proto_ver == MQTT_PROTOCOL_VERSION_v5) {
// Set V4/V5 flag for msg adaptor in transport layer
nng_msg_set_cmd_type(msg, CMD_PUBLISH_V5);
} else {
nng_msg_set_cmd_type(msg, CMD_PUBLISH);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how nano_pubmsg_composer is called and whether v5 properties are expected
rg -n -A 10 'nano_pubmsg_composer' --type c

Repository: nanomq/NanoNNG

Length of output: 1953


🏁 Script executed:

# Read the complete nano_pubmsg_composer function
sed -n '1182,1260p' src/sp/protocol/mqtt/mqtt_parser.c | cat -n

Repository: nanomq/NanoNNG

Length of output: 2609


🏁 Script executed:

# Search for v5 property handling in the codebase
rg -n "property|properties|CMD_PUBLISH_V5" src/sp/protocol/mqtt/mqtt_parser.c | head -30

Repository: nanomq/NanoNNG

Length of output: 1763


🏁 Script executed:

# Check if there's any v5 specific encoding logic
rg -n -B 5 -A 5 "proto_ver.*5|MQTT_PROTOCOL_VERSION_v5" src/sp/protocol/mqtt/mqtt_parser.c | head -50

Repository: nanomq/NanoNNG

Length of output: 1816


Add MQTT v5 properties field to PUBLISH messages.

MQTT 5.0 specification requires PUBLISH messages to include a properties section in the variable header. The current implementation sets the CMD_PUBLISH_V5 flag but does not encode the properties field itself.

For MQTT v5 compliance, add a variable-length integer encoding the property length (minimum 1 byte for zero properties) after the packet ID (if QoS > 0) or after the topic (if QoS == 0). This applies before appending the payload.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sp/protocol/mqtt/mqtt_parser.c` around lines 1239 - 1244, The
publish-path sets the v5 flag (proto_ver, MQTT_PROTOCOL_VERSION_v5 and
nng_msg_set_cmd_type with CMD_PUBLISH_V5) but does not emit the MQTT5 properties
field; update the PUBLISH assembly code that sets CMD_PUBLISH_V5 to append the
property length as a MQTT variable-length integer (encode at least one byte for
zero properties) immediately after the topic (if qos == 0) or after the packet
identifier (if qos > 0), and before the payload; implement/ call an existing
varint encoder or add one to encode the property length, and ensure this code
path is only used when proto_ver == MQTT_PROTOCOL_VERSION_v5 so CMD_PUBLISH
remains unchanged.

@JaylinYu JaylinYu marked this pull request as draft April 16, 2026 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant