BUG: scheduling while atomic in __mt76_worker_fn on PREEMPT_RT kernels #1053

@Spunky84

Bug description

When loading any mt76-based driver on a CONFIG_PREEMPT_RT kernel, the worker threads immediately trigger:

BUG: scheduling while atomic: mt76-usb-rx phy/2852/0x00000002

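The trailing fields are the task name, PID, and preempt_count. As far as I understand the scheduler code, the reported preempt_count of 0x00000002 includes the preempt_disable() that schedule() performs itself before the check, so it corresponds to one outstanding level of disabled preemption held by the worker when schedule() is called.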
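
To make the reading of that value explicit, here is a minimal userspace sketch that splits a raw preempt_count into its fields. The masks are the ones documented in include/linux/preempt.h; the helper names (decode_preempt_count, caller_preempt_depth, print_preempt_count) are hypothetical and not kernel APIs, and the "caller depth" calculation assumes the reported value includes the single preempt_disable() that schedule() takes itself before the check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit-field layout as documented in include/linux/preempt.h.
 * Assumption: these three fields have the same layout on the kernel under test. */
#define PREEMPT_MASK  0x000000ffu  /* preempt_disable() nesting depth */
#define SOFTIRQ_MASK  0x0000ff00u  /* softirq / bottom-half count */
#define HARDIRQ_MASK  0x000f0000u  /* hardirq nesting */
#define SOFTIRQ_SHIFT 8
#define HARDIRQ_SHIFT 16

static_assert((PREEMPT_MASK & SOFTIRQ_MASK) == 0 &&
              (SOFTIRQ_MASK & HARDIRQ_MASK) == 0,
              "preempt_count fields must not overlap");

struct preempt_fields {
    unsigned int preempt_depth;
    unsigned int softirq_count;
    unsigned int hardirq_count;
};

/* Split a raw preempt_count value (e.g. copied from dmesg) into its fields. */
static struct preempt_fields decode_preempt_count(uint32_t pc)
{
    struct preempt_fields f = {
        .preempt_depth = pc & PREEMPT_MASK,
        .softirq_count = (pc & SOFTIRQ_MASK) >> SOFTIRQ_SHIFT,
        .hardirq_count = (pc & HARDIRQ_MASK) >> HARDIRQ_SHIFT,
    };
    return f;
}

/* Depth held by the caller of schedule(), assuming the reported value
 * includes the one preempt_disable() that schedule() takes itself. */
static unsigned int caller_preempt_depth(uint32_t pc)
{
    unsigned int depth = decode_preempt_count(pc).preempt_depth;
    return depth > 0 ? depth - 1 : 0;
}

/* Print a one-line summary of all decoded fields. */
static void print_preempt_count(uint32_t pc)
{
    struct preempt_fields f = decode_preempt_count(pc);
    printf("preempt=%u softirq=%u hardirq=%u caller_depth=%u\n",
           f.preempt_depth, f.softirq_count, f.hardirq_count,
           caller_preempt_depth(pc));
}
```

For the value in this report, print_preempt_count(0x00000002) gives a preempt depth of 2 with no softirq or hardirq bits set, which under the assumption above means a caller depth of 1. So the count comes from a preempt_disable()-style section (or a lock that implies one), not from interrupt context.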

The bug is 100% reproducible: every mt76 worker thread hits it on every RT kernel I tested. The driver fails to initialize and becomes unusable.

Affected code

The issue is in __mt76_worker_fn() in util.c:

int __mt76_worker_fn(void *ptr)
{
    struct mt76_worker *w = ptr;

    while (!kthread_should_stop()) {
        set_current_state(TASK_INTERRUPTIBLE);  // <-- (1)
        ...
        if (!test_and_clear_bit(MT76_WORKER_SCHEDULED, &w->state)) {
            schedule();                          // <-- (2) BUG here
            continue;
        }
        ...
        set_current_state(TASK_RUNNING);         // <-- (3)
        w->fn(w);
        ...
    }
}

On PREEMPT_RT, schedule() at (2) is called with preempt_count != 0, triggering the BUG.

Environment

  • Hardware: Revolution Pi Connect S (BCM2711, aarch64)
  • Kernel: 5.10.152-rt75-v8 (CONFIG_PREEMPT_RT=y)
  • OS: BalenaOS 3.0.8+rev2 (Yocto-based)
  • Adapter: EDUP AX3000M USB WiFi 6E (MediaTek MT7961, USB ID 0e8d:7961)
  • Driver: mt7921u backported from OpenWrt mt76 (2024-09-10 snapshot)

Note: I could not capture a full call stack because the device is an embedded industrial system with limited debug access. The one-liner BUG message is from dmesg. Happy to try to get a full trace if that helps.
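
If a kernel rebuild turns out to be possible on this platform, the following debug options might make the next report more useful. This is an untested suggestion, not something I have run on this device. As far as I know, with CONFIG_DEBUG_PREEMPT the scheduling-while-atomic report also prints a "Preemption disabled at:" line naming the code that first disabled preemption, and CONFIG_DEBUG_ATOMIC_SLEEP flags sleeping calls made from atomic context.

```
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
```

The "Preemption disabled at:" line would point at whatever left preemption disabled before the worker loop reached schedule(), which the BUG one-liner alone does not show.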

Mainline status

The code in __mt76_worker_fn() is identical in:

  • OpenWrt mt76 master (as of 2026-02-12)
  • torvalds/linux master (as of 2026-02-12)

If the worker loop itself is the cause, the bug would affect any PREEMPT_RT kernel running any mt76 driver, not just the backported version I'm using. I have only confirmed it on the setup listed above.

Workaround

As a temporary workaround, I replaced the set_current_state/schedule pattern with usleep_range():

-       set_current_state(TASK_INTERRUPTIBLE);
        ...
-           schedule();
+           usleep_range(50, 500);
        ...
-       set_current_state(TASK_RUNNING);
+       __set_current_state(TASK_RUNNING);

This eliminates the BUG and makes the driver fully functional on RT kernels (tested with mt7921u, WiFi 6E on 2.4 GHz and 5 GHz, stable over days). However, it converts an event-driven wait into a polling loop: an idle worker now wakes up roughly every 50 to 500 µs, and newly scheduled work can wait about that long before it runs. That makes it unsuitable for upstream.

Questions

  1. Is there already work in progress to make mt76 PREEMPT_RT compatible?
  2. Would converting __mt76_worker_fn() to use rcuwait be the preferred approach?
  3. Or should the root cause of the elevated preempt_count be investigated first? The standard set_current_state/schedule pattern should theoretically be safe in a kthread context, so something else may be disabling preemption before the worker loop runs.

The workaround and a full reproducible build system are available at:
https://github.com/Spunky84/mt76-rt-backport
