Add draft: OTA Firmware Updates for Matter Devices via nRF Cloud #629

Open
franc0is wants to merge 3 commits into master from matter-ota-nrfcloud

Member

franc0is commented Mar 27, 2026

  • Fix trailing comma in YAML front matter description (`OTA backend,` → `OTA backend.`)
  • Fix mixed reference-link/footnote syntax (`[Memfault OTA documentation][^memfault_ota]` → `Memfault OTA documentation[^memfault_ota]`)

franc0is requested a review from a team as a code owner March 27, 2026 20:40

cloudflare-workers-and-pages bot commented Mar 27, 2026

Deploying interrupt with Cloudflare Pages

Latest commit: 47b66f3
Status: ✅  Deploy successful!
Preview URL: https://b6d2e55f.interrupt.pages.dev
Branch Preview URL: https://matter-ota-nrfcloud.interrupt.pages.dev

Contributor

Copilot AI left a comment

Pull request overview

Adds a new draft blog post describing an approach for performing OTA firmware updates on Matter-over-Thread devices using CoAP blockwise transfers over DTLS with nRF Cloud as the backend, and MCUboot for safe image swapping.

Changes:

  • Introduces a new draft post detailing an end-to-end OTA architecture (version check, blockwise download, flash streaming, MCUboot swap/confirm).
  • Includes configuration snippets, illustrative CoAP request/Block2 examples, and a “Trying it out” walkthrough with sample logs.

Member Author

franc0is commented Apr 2, 2026

@copilot apply changes based on the comments in this thread

Copilot AI commented Apr 2, 2026

> @copilot apply changes based on the comments in this thread

Applied both changes in commit 5baa440:

  1. Removed trailing comma in the YAML front matter description (`OTA backend,` → `OTA backend.`)
  2. Fixed the mixed reference-link/footnote syntax (`[Memfault OTA documentation][^memfault_ota]` → `Memfault OTA documentation[^memfault_ota]`)

Contributor

noahp left a comment

we don't currently require it, but a nice rewrap with `npx prettier --write --prose-wrap=always _drafts/matter-ota-nrfcloud.md` would be great, to make it easier to review/read.

great post! ship it 🚢

Comment on lines +7 to +8
with nRF Cloud as the OTA backend.
on the nRF54LM20 DK.
Contributor

dangling fragment here?

Member Author

Fixed


`CONFIG_STREAM_FLASH` provides a buffered flash write API that handles page alignment for us. `CONFIG_IMG_MANAGER` and `CONFIG_MCUBOOT_IMG_MANAGER` give us the MCUboot APIs to request an image swap and confirm the new image after boot.

The nRF54LM20's partition layout places the MCUboot primary slot in internal RRAM (the nRF54LM20 uses RRAM, not traditional flash and the secondary slot on external SPI NOR flash (a MX25R6435F). This means the secondary slot erase is slow (~60 seconds for 2 MB), but the primary slot benefits from RRAM's fast write speeds.
Contributor

Suggested change
The nRF54LM20's partition layout places the MCUboot primary slot in internal RRAM (the nRF54LM20 uses RRAM, not traditional flash and the secondary slot on external SPI NOR flash (a MX25R6435F). This means the secondary slot erase is slow (~60 seconds for 2 MB), but the primary slot benefits from RRAM's fast write speeds.
The nRF54LM20's partition layout places the MCUboot primary slot in internal RRAM (the nRF54LM20 uses RRAM, not traditional flash and the secondary slot on external SPI NOR flash (a MX25R6435F)). This means the secondary slot erase is slow (~60 seconds for 2 MB), but the primary slot benefits from RRAM's fast write speeds.

Member Author

there's a parenthesis bug here, but not the one you caught ;-). Fixed.

Comment on lines +168 to +172
```c
int pos = coap_start_request(coap_buf, sizeof(coap_buf),
                             COAP_CODE_GET, 0);
coap_buf[0] = 0x48; /* Ver=1, Type=CON, TKL=8 */
memcpy(coap_buf + 4, token, 8);
pos = 12;
```
Contributor

nit- pos return value is overwritten here?

Member Author

Fixed it - this code isn't the best, I might take one more swing at it.
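
For reference, the `0x48` byte in the snippet under discussion follows the RFC 7252 fixed header layout: version 1 in the top two bits, type CON (0) in the next two, and a token length of 8 in the low nibble. One way to avoid the overwritten `pos` is to have a single helper write the header and token and return the resulting offset. The following is only a sketch: `coap_write_header` and the constants defined here are hypothetical and are not the post's `coap_start_request`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COAP_VERSION  1u
#define COAP_TYPE_CON 0u
#define COAP_CODE_GET 0x01u /* 0.01 GET */

/* Hypothetical helper: writes the fixed 4-byte CoAP header followed by the
 * token. Returns the offset of the first option byte, or -1 on bad input. */
static int coap_write_header(uint8_t *buf, size_t buf_len, uint8_t type,
                             uint8_t code, uint16_t msg_id,
                             const uint8_t *token, uint8_t tkl)
{
    if (buf == NULL || type > 3 || tkl > 8 ||
        (tkl > 0 && token == NULL) || buf_len < (size_t)4 + tkl) {
        return -1;
    }

    /* Ver (2 bits) | Type (2 bits) | Token Length (4 bits) */
    buf[0] = (uint8_t)((COAP_VERSION << 6) | ((unsigned)type << 4) | tkl);
    buf[1] = code;
    buf[2] = (uint8_t)(msg_id >> 8);   /* message ID, network byte order */
    buf[3] = (uint8_t)(msg_id & 0xFF);
    if (tkl > 0) {
        memcpy(buf + 4, token, tkl);
    }
    return 4 + tkl;
}
```

With a CON GET and an 8-byte token this produces `0x48` in the first byte and returns 12, the same offset the original snippet hard-codes.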

```c
nrfcloud_send(s, coap_buf, pos);

/* Receive response, filtering by token to skip stale packets */
/* ... (token matching loop omitted for brevity) ... */
```
Contributor

maybe also note that error handling is omitted for brevity too: this loop will silently exit if all attempts fail, and I think `stream_flash_buffered_write` will end up being called with junk

Member Author

Added a note about it.
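
To make the concern in this thread concrete, here is a minimal sketch of a guard that could sit between the receive loop and the flash write. It assumes the caller tracks the receive return code, the CoAP response code, and the Block2 number parsed from the response. `block_is_writable` and its parameters are hypothetical names, not code from the post.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define COAP_CODE_CONTENT 0x45u /* 2.05 Content: class 2, detail 05 */

/* Hypothetical guard: returns true only when a received Block2 response is
 * safe to hand to the flash writer. */
static bool block_is_writable(int recv_rc, uint8_t coap_code,
                              uint32_t expected_block, uint32_t got_block,
                              size_t payload_len, size_t block_size)
{
    if (recv_rc < 0) {
        return false; /* receive failed or every retry timed out */
    }
    if (coap_code != COAP_CODE_CONTENT) {
        return false; /* server returned an error or unexpected code */
    }
    if (got_block != expected_block) {
        return false; /* stale or out-of-order block */
    }
    if (payload_len == 0 || payload_len > block_size) {
        return false; /* empty or oversized payload */
    }
    return true;
}
```

If the guard returns false after the last retry, the download should abort with an error rather than fall through to `stream_flash_buffered_write`.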


`BOOT_UPGRADE_TEST` tells MCUboot to swap the images, but treat the new image as a **test**. If the new firmware does not explicitly confirm itself, MCUboot will revert to the previous image on the next reboot. This is the safety net for remote devices: a buggy firmware that crashes before confirming will be rolled back automatically.

The image is confirmed early in `main()`, before the application starts:
Contributor

maybe note that it's unconditionally confirmed here, but could be done after the device has successfully established a connection, for example, in case a rollback is needed

Member Author

👍
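
A deferred confirmation along the lines suggested in this thread could be sketched as a small policy function. This is an illustration under assumptions, not the post's code: `ota_confirm_policy`, the action enum, and the deadline parameter are hypothetical. In a Zephyr build the inputs and outputs would map onto `boot_is_img_confirmed()` and `boot_write_img_confirmed()` from `<zephyr/dfu/mcuboot.h>`, plus a reboot so that MCUboot reverts the unconfirmed test image.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    OTA_ACTION_NONE,    /* nothing to do yet, keep waiting */
    OTA_ACTION_CONFIRM, /* mark the running image as permanent */
    OTA_ACTION_REBOOT   /* give up; reboot so MCUboot reverts the image */
} ota_action_t;

/* Hypothetical policy: confirm the test image only after the device has
 * proven it can reach the cloud, and force a rollback if it cannot do so
 * before the deadline. */
static ota_action_t ota_confirm_policy(bool already_confirmed,
                                       bool cloud_connected,
                                       uint32_t uptime_ms,
                                       uint32_t deadline_ms)
{
    if (already_confirmed) {
        return OTA_ACTION_NONE;
    }
    if (cloud_connected) {
        return OTA_ACTION_CONFIRM;
    }
    if (uptime_ms >= deadline_ms) {
        return OTA_ACTION_REBOOT;
    }
    return OTA_ACTION_NONE;
}
```

The trade-off is that the device must reach the cloud within the deadline on every first boot of a new image, so the deadline needs to cover Thread attach and DTLS handshake time.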


Block2 over Thread is reliable but not fast. At ~4 KB/s, a 250 KB image takes about a minute. For typical firmware sizes this is fine, but very large images would benefit from a faster transport.

nRF Cloud provides the fleet management features that Matter's DCL lacks: staged rollouts, cohort targeting, and version management with a fast development loop. It offers a free tier for up to 10 devices, and scales at $0.10/device/month.
Contributor

maybe- soften it to "check nRF Cloud's pricing page for current tiers." so we don't have to update in the future, if pricing changes

Member

gminn left a comment

Looks great! A few notes

Comment on lines +73 to +77
**An OTA backend that speaks UDP.** Our Matter device cannot use HTTP, so we need a backend that supports CoAP over DTLS. I am using nRF Cloud[^nrfcloud] because it provides this out of the box, along with firmware hosting, version management, staged rollouts, and cohort targeting. You could also roll your own.

**Application firmware** on the device that checks for updates, downloads the new image, and writes it to flash. This is the code we will walk through in this post.

**MCUboot**[^mcuboot] is the bootloader. It manages two firmware slots (primary and secondary) and can swap between them safely. If a new firmware fails to boot, MCUboot automatically reverts to the previous version.
Member

nit for readability: bullets

Suggested change
**An OTA backend that speaks UDP.** Our Matter device cannot use HTTP, so we need a backend that supports CoAP over DTLS. I am using nRF Cloud[^nrfcloud] because it provides this out of the box, along with firmware hosting, version management, staged rollouts, and cohort targeting. You could also roll your own.
**Application firmware** on the device that checks for updates, downloads the new image, and writes it to flash. This is the code we will walk through in this post.
**MCUboot**[^mcuboot] is the bootloader. It manages two firmware slots (primary and secondary) and can swap between them safely. If a new firmware fails to boot, MCUboot automatically reverts to the previous version.
- **An OTA backend that speaks UDP.** Our Matter device cannot use HTTP, so we need a backend that supports CoAP over DTLS. I am using nRF Cloud[^nrfcloud] because it provides this out of the box, along with firmware hosting, version management, staged rollouts, and cohort targeting. You could also roll your own.
- **Application firmware** on the device that checks for updates, downloads the new image, and writes it to flash. This is the code we will walk through in this post.
- **MCUboot**[^mcuboot] is the bootloader. It manages two firmware slots (primary and secondary) and can swap between them safely. If a new firmware fails to boot, MCUboot automatically reverts to the previous version.

Now the full update - check, erase, download, and reboot:

```
uart:~$ nrfcloud ota
```

Member

If you have the timestamps from the Zephyr logs, that would be nice to have here so users can see the progression

Member Author

I'm too lazy to rerun it all :-P

Member

Very fair! A nice to have for sure

```
[00:00:02.145,000] <inf> chip: [SVR]Server initializing...
```

The whole process takes about 5 minutes.
Member

Could add a little celebration here! I also think it might feel better to have the time discussion here instead of the conclusion (I also added the timestamp callout, in case you add it):

Suggested change
The whole process takes about 5 minutes.
And we're done! 🎉
As the log timestamps show, the whole process takes about 5 minutes. Block2 over Thread is reliable but not fast. At ~4 KB/s, a 250 KB image takes about a minute. For typical firmware sizes this is fine, but very large images would benefit from a faster transport, such as HTTPS.

Member Author

👍
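
The "about a minute" figure in the suggested text can be checked with a little arithmetic. The sketch below takes the ~4 KB/s throughput and 250 KB image size quoted above. The 1024-byte block size is an assumption (the largest size Block2 allows), not something the post states, and the function names are hypothetical.

```c
#include <stdint.h>

/* Number of Block2 requests needed for an image, rounding up. */
static uint32_t block2_count(uint32_t image_bytes, uint32_t block_size)
{
    if (block_size == 0) {
        return 0;
    }
    return (image_bytes + block_size - 1) / block_size;
}

/* Download time in whole seconds at a given throughput, rounding up. */
static uint32_t transfer_seconds(uint32_t image_bytes, uint32_t bytes_per_sec)
{
    if (bytes_per_sec == 0) {
        return 0;
    }
    return (image_bytes + bytes_per_sec - 1) / bytes_per_sec;
}
```

For a 250 KiB image (256,000 bytes) this gives 250 blocks and 63 seconds at 4 KiB/s, so the download is only a fraction of the roughly 5 minute end-to-end time.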

```c
pos = coap_append_option(coap_buf, pos, sizeof(coap_buf),
                         &prev_opt, COAP_OPT_MEMFAULT_KEY,
                         project_key, strlen(project_key));
```

Member

This is an optimization, but the proxy endpoint will now honor response format requests, and a plaintext URL would be cleaner to parse than JSON. You should be able to add this here with:

Suggested change
```c
const uint8_t content_format = 0; /* 0 = text/plain */
pos = coap_append_option(coap_buf, pos, sizeof(coap_buf),
                         &prev_opt, COAP_OPT_CONTENT_FORMAT,
                         &content_format, 1);
```
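
For readers wondering why the call above takes `&prev_opt`: CoAP options are encoded as deltas from the previous option number (RFC 7252, section 3.1), so they must be appended in ascending order. The following sketch shows that encoding. `coap_put_option_sketch` is a hypothetical stand-in that mirrors the argument order of the call above. It is not the post's `coap_append_option`, whose implementation is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Split a delta or length into its 4-bit nibble plus 0-2 extended bytes. */
static size_t coap_ext_encode(uint32_t value, uint8_t *nibble, uint8_t ext[2])
{
    if (value < 13) {
        *nibble = (uint8_t)value;
        return 0;
    }
    if (value < 269) {
        *nibble = 13;
        ext[0] = (uint8_t)(value - 13);
        return 1;
    }
    *nibble = 14;
    ext[0] = (uint8_t)((value - 269) >> 8);
    ext[1] = (uint8_t)((value - 269) & 0xFF);
    return 2;
}

/* Hypothetical helper: appends one option at buf[pos] and returns the new
 * offset, or -1 on bad input, out-of-order options, or lack of space. */
static int coap_put_option_sketch(uint8_t *buf, int pos, size_t buf_len,
                                  uint16_t *prev_opt, uint16_t opt_num,
                                  const void *val, uint16_t val_len)
{
    uint8_t dn = 0, ln = 0;
    uint8_t dext[2] = {0}, lext[2] = {0};

    if (buf == NULL || prev_opt == NULL || pos < 0 ||
        opt_num < *prev_opt || (val_len > 0 && val == NULL)) {
        return -1;
    }

    size_t dlen = coap_ext_encode((uint32_t)(opt_num - *prev_opt), &dn, dext);
    size_t llen = coap_ext_encode(val_len, &ln, lext);
    size_t need = 1 + dlen + llen + val_len;

    if ((size_t)pos > buf_len || need > buf_len - (size_t)pos) {
        return -1;
    }

    uint8_t *p = buf + pos;
    *p++ = (uint8_t)((dn << 4) | ln); /* delta nibble | length nibble */
    memcpy(p, dext, dlen);
    p += dlen;
    memcpy(p, lext, llen);
    p += llen;
    if (val_len > 0) {
        memcpy(p, val, val_len);
    }
    *prev_opt = opt_num;
    return pos + (int)need;
}
```

Appending Content-Format (option 12) with a one-byte value of 0 as the first option yields the two bytes `0xC1 0x00`: delta 12, length 1, then the value.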

franc0is and others added 3 commits April 3, 2026 15:47
franc0is force-pushed the matter-ota-nrfcloud branch from 5ea722c to 47b66f3 April 3, 2026 19:47
5 participants