Skip to content

Conversation

@NelsonFerragut
Copy link

@NelsonFerragut NelsonFerragut commented Feb 8, 2025

Initial requirements for Task Watchdog.

Note: All requirements are already implemented in the Zephyr Project Task Watchdog.

Copy link
Member

@martinjaeger martinjaeger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Only got some minor comments.

FYI: This is the original list of requirements we came up with when introducing the Task Watchdog to Zephyr: zephyrproject-rtos/zephyr#28947 (comment)

Copy link

@nicpappler nicpappler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need some more clarification on some of the requirements, but generally the setup and phrasing is good.

Copy link

@nicpappler nicpappler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to add Stephane's change requests, but handle the missing requirements in a new PR. And create an Issues with all open points found by Stephane, so we do not forget about it.

@parphane parphane added the Requirements Requirements work label Jul 29, 2025
COMPONENT: Task Watchdog
TITLE: Enable hardware failsafe option
STATEMENT: >>>
The Task Watchdog shall provide a configuration option to enable or disable the hardware watchdog failsafe.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added action in #97 to rephrase to "The Zephyr RTOS"
No action needed

COMPONENT: Task Watchdog
TITLE: Configure hardware watchdog timeout period
STATEMENT: >>>
Where the hardware watchdog failsafe is enabled, the Task Watchdog shall allow the hardware watchdog timeout period to be configured.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Where the hardware watchdog failsafe is enabled, the Task Watchdog shall allow the hardware watchdog timeout period to be configured.
When the hardware watchdog failsafe is enabled, the Task Watchdog shall allow the hardware watchdog timeout period to be configurable.

COMPONENT: Task Watchdog
TITLE: Configure maximum feed period
STATEMENT: >>>
Where the hardware watchdog failsafe is enabled, the Task Watchdog shall allow the maximum hardware watchdog feed period to be configured.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Where the hardware watchdog failsafe is enabled, the Task Watchdog shall allow the maximum hardware watchdog feed period to be configured.
When the hardware watchdog failsafe is enabled, the Task Watchdog shall allow the maximum hardware watchdog feed period to be configured.

<<<

[REQUIREMENT]
UID: ZEP-SRS-20-13
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between 12 and 13?
Can you relate to task_wdt/Kconfig or the code?
TASK_WDT_HW_FALLBACK_DELAY vs TASK_WDT_MIN_TIMEOUT maybe?

COMPONENT: Task Watchdog
TITLE: Initialize feature
STATEMENT: >>>
Where the hardware watchdog failsafe is enabled, the Task Watchdog shall start the hardware watchdog upon initialization.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon initialization of what?
I am not an expert, but can we refer to a Zephyr state? Or rephrase "upon initialization of the HW task watchdog" or "upon its initialization"?
Could also maybe leave the concept of Zephyr initialization out and talk about when the HW watchdog feeding should start? (abstract ourselves from the implementation/architecture)

Suggested change
Where the hardware watchdog failsafe is enabled, the Task Watchdog shall start the hardware watchdog upon initialization.
When the hardware watchdog failsafe is enabled, the Task Watchdog shall start the hardware watchdog upon initialization of the Task Watchdog component.

COMPONENT: Task Watchdog
TITLE: Feed task watchdog timer
STATEMENT: >>>
The Task Watchdog shall provide a method to feed the task watchdog timer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To stay consistent with ZEP-SRS-20-2

Suggested change
The Task Watchdog shall provide a method to feed the task watchdog timer.
The Task Watchdog shall provide a method to feed a task watchdog timer.

COMPONENT: Task Watchdog
TITLE: Callback on task failure
STATEMENT: >>>
Where a callback function is configured for a monitored task watchdog timer, if the timer expires, then the Task Watchdog shall invoke that function.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Where a callback function is configured for a monitored task watchdog timer, if the timer expires, then the Task Watchdog shall invoke that function.
When callback function is configured for a monitored task watchdog timer, when the timer expires, then the Task Watchdog shall invoke that function.

COMPONENT: Task Watchdog
TITLE: Reset on task failure
STATEMENT: >>>
Where a callback function is not configured for a monitored task watchdog timer, if the timer expires, then the Task Watchdog shall reset the device.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Zephyr is running in a hypervisor, we likely do not have control over the device it is running on.

Suggested change
Where a callback function is not configured for a monitored task watchdog timer, if the timer expires, then the Task Watchdog shall reset the device.
When a callback function is not configured for a monitored task watchdog timer, when the timer expires, then the Task Watchdog shall reset the RTOS.

Not blocking: Also, general comment: Should Zephyr allow for that "default" mechanism to be configurable?
One may want an RTOS restart, another something else?
Added a suggestion in #97

COMPONENT: Task Watchdog
TITLE: Auto feed hardware watchdog
STATEMENT: >>>
Where the hardware watchdog failsafe is enabled, while all monitored tasks are feeding their respective watchdog timers, the Task Watchdog shall periodically feed the hardware watchdog.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Periodically requires refinement, consider using the predefined term (req above) "maximum hardware watchdog feed period"
Also consider removing the "while ..." part which only seems to be there because the implementation mechanism mixes feeding hw wdt and triggering a hw wdt feeding depending on other wdt.

Suggested change
Where the hardware watchdog failsafe is enabled, while all monitored tasks are feeding their respective watchdog timers, the Task Watchdog shall periodically feed the hardware watchdog.
When the hardware watchdog failsafe is enabled, the Task Watchdog shall feed the hardware watchdog at least every maximum hardware watchdog feed period.

<<<
USER_STORY: >>>
As a safety system designer, I want the Task Watchdog to manage feeding the hardware watchdog so that a failure on any monitored task will still cause a corrective reset.
<<<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get this pull going, I will be logging items in #97 for:

  1. task_wdt_resume
  2. task_wdt_suspend
  3. TASK_WDT_HW_FALLBACK_DELAY (depending on answers)
  4. TASK_WDT_HW_FALLBACK_PAUSE_IN_SLEEP
  5. TASK_WDT_SHELL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Requirements Requirements work

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants