
Conversation

@greatbridf
Owner

  1. Unified the stackful and stackless coroutine schemes.

Rework the scheduler and rename it to Runtime. Use stackless coroutines as the basic scheduling unit so that we can run a single scheduler with a single stack on each core. This reduces complexity and improves performance A LOT.

In scenarios where we DO need a stackful execution environment, we provide an extra wrapper called stackful(). Running computational (or otherwise time-consuming) tasks on a time-slice-based preemptive basis is as simple as wrapping the future with it and throwing it to the runtime:

RUNTIME.spawn(async {
    // Some computational work
    loop { println!("computational work") }
});

The task above will block the CPU forever if the scheduler chooses it to run, as its poll function will never return. The solution is very simple:

RUNTIME.spawn(stackful(async {
    // Some computational work
    loop { println!("computational work") }
}));

Wrap the future with stackful(), and it will be run on the dedicated stack, get preempted, and yield control back to the runtime. Voilà.
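For reference, the wrapper's shape is roughly as follows (a minimal sketch; the bounds and the pass-through body are illustrative assumptions, not the actual implementation, which polls on a dedicated stack):

use core::future::Future;

/// Illustrative signature only: the real stackful() moves polling onto a
/// dedicated stack via a TrapContext and trap_return(); this placeholder
/// merely forwards the future to show the intended shape of the API.
pub fn stackful<F>(future: F) -> impl Future<Output = F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send,
{
    future
}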

  2. Partial adaptation of all async functions (this part may need further discussion).

The filesystem and block device subsystems are left unchanged, although converting them would be as easy as marking the traits with #[async_trait] and changing every function along the call chain to async as well (they're migrated to the new block_on function as a temporary solution). The main reason is that async fns in dyn traits can only be implemented with boxed dyn Futures, which would severely impact performance on hot paths like these two.
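For context, a sketch of why dyn-compatible async traits force boxing (the BlockDevice trait and read_block method below are illustrative, not the actual interfaces): an async fn returns an opaque future type that a trait object cannot name, so each call must return a heap-allocated, type-erased future.

extern crate alloc;
use alloc::boxed::Box;
use core::{future::Future, pin::Pin};

/// Roughly what #[async_trait] expands to: every call allocates and
/// type-erases its future, which is exactly the per-call cost we want
/// to avoid on filesystem and block device hot paths.
trait BlockDevice {
    fn read_block<'a>(
        &'a self,
        lba: u64,
        buf: &'a mut [u8],
    ) -> Pin<Box<dyn Future<Output = Result<(), ()>> + Send + 'a>>;
}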

The current solution is to wrap Thread::run with stackful() to make threads preemptible so that we can continue to use block_on... This is ugly but works well.

  3. Fixed various bugs in the HAL's trap subsystem.

SMS-Derfflinger and others added 23 commits July 23, 2025 23:40
Fix a page cache bug; add a size check in the read function.

Add the page cache's basic operations for ext4. Cached pages are not dropped until the kernel stops, so we need to call the fsync function manually for now; consider using some eviction strategy such as LRU.

As a temporary measure, write back on a timer: when the write function is called, check whether the time since the last write-back is greater than 10 seconds. If it is, write back.
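A sketch of the timer-gated write-back described above (the 10-second threshold is from this commit; the now_secs parameter and write_back callback are stand-ins):

use core::sync::atomic::{AtomicU64, Ordering};

const WRITE_BACK_INTERVAL_SECS: u64 = 10;
static LAST_WRITE_BACK: AtomicU64 = AtomicU64::new(0);

/// Called from the write path: flush dirty cache pages only if more
/// than 10 seconds have passed since the last write-back.
fn maybe_write_back(now_secs: u64, write_back: impl FnOnce()) {
    let last = LAST_WRITE_BACK.load(Ordering::Relaxed);
    if now_secs.saturating_sub(last) > WRITE_BACK_INTERVAL_SECS {
        LAST_WRITE_BACK.store(now_secs, Ordering::Relaxed);
        write_back();
    }
}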
Remove old Scheduler. Add Runtime as replacement.

Use stackless coroutines as the low-level tasking mechanism and build
stackful tasks on top of them.

Redesign the task state system. Rework the executor.

Remove the Run trait and everything related.

Signed-off-by: greatbridf <greatbridf@icloud.com>
We use RUNNING to indicate that the task is on the CPU, and READY to
indicate that the task can be run again and should therefore be put back
into the ready queue after one poll() call.

When a task is taken from the ready queue and put onto a CPU, it is
marked RUNNING only, so it is left suspended once we get Poll::Pending
from the poll() call. If we (or anyone else) call Waker::wake() during
the run, the READY flag is set as well. When we return from the poll
call, we can detect this with a CAS and put the task back onto the
ready queue.
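A condensed sketch of the transition after poll() returns Pending (the flag values and the finish_poll helper are illustrative; only the READY/RUNNING scheme itself is from this commit):

use core::sync::atomic::{AtomicU8, Ordering};

const READY: u8 = 0b01;
const RUNNING: u8 = 0b10;

/// After poll() returned Pending: if a waker set READY while we were
/// RUNNING, keep READY and re-enqueue the task; otherwise clear both
/// flags and leave the task suspended until some wake() enqueues it.
fn finish_poll(state: &AtomicU8, enqueue: impl FnOnce()) {
    loop {
        let current = state.load(Ordering::Acquire);
        let (next, woken) = if current & READY != 0 {
            (READY, true)
        } else {
            (0, false)
        };
        if state
            .compare_exchange(current, next, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            if woken {
                enqueue();
            }
            return;
        }
        // A concurrent wake() changed the state; retry the CAS.
    }
}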

We've also done some adaptation work on the rest of the kernel, mainly
to remove *SOME* of the Task::block_on calls. Removing them completely
is not possible for now; we should solve that over the next few commits.

Signed-off-by: greatbridf <greatbridf@icloud.com>
Add tracing logs in Runtime::enter and at other critical points.

Pass the trace_scheduler feature down to the eonix_runtime crate, fixing
the problem that the feature was not taking effect.

When the task is blocked, we set CURRENT_TASK to None as well.

In the early initialization stage, the stack is placed at an
identity-mapped physical address. The VirtIO driver might try to convert
the given buffer addresses back to physical ones, which will generate
errors. So the BSP and APs should allocate another stack and switch to
it. We use TaskContext for the fix.

Signed-off-by: greatbridf <greatbridf@icloud.com>
This is used only by Thread when we enter the user execution context,
where we need to save the "interrupt stack" to the local CPU so we can
get the information needed to capture the trap.

We need to support nested captured trap returns. So instead of setting
that information manually, we save it when trap_return() is called
(since at that point we have precisely the trap context needed) and
restore it after the trap is captured.

Signed-off-by: greatbridf <greatbridf@icloud.com>
On riscv64 platforms, we load the kernel tp only if we've come from U
mode, to reduce overhead. But we would restore the tp saved in the
TrapContext even when returning to kernel space, which causes problems
because the default tp is zero.

We should save the kernel tp register into the corresponding field of
the TrapContext when we set the privilege mode to kernel.

Signed-off-by: greatbridf <greatbridf@icloud.com>
We provide a simple block_on to constantly poll the given future and
block the current execution thread as before.
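A minimal sketch of a block_on of this shape (busy-polling with a no-op waker; the real one may spin or park differently):

use core::future::Future;
use core::pin::pin;
use core::task::{Context, Poll, Waker};

/// Illustrative only: poll the future on the current execution thread
/// until it completes, ignoring wakeups entirely.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let mut cx = Context::from_waker(Waker::noop());
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
        core::hint::spin_loop();
    }
}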

We also introduce a new future wrapper named `stackful` to convert any
future into a stackful one. We allocate a stack and keep polling the
future on that stack by constructing a TrapContext and calling
trap_return() to get into the stackful environment. Then we capture the
timer interrupt to make preemption work.

Signed-off-by: greatbridf <greatbridf@icloud.com>
If we don't pass in FEATURES or SMP, no features will be enabled. In
this scenario, the dangling --features argument will cause cargo to fail.

We pass the feature list and the --features flag together to avoid this...

Signed-off-by: greatbridf <greatbridf@icloud.com>
We can pass a function to be called after a successful rcu_sync call.
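A sketch of the shape this could take (the spawn-based body in the comment is an assumption based on how the callback is described, not the confirmed implementation):

/// Illustrative signature: schedule `f` to run once the current RCU
/// grace period has elapsed, i.e. at the point where rcu_sync() would
/// have returned.
pub fn call_rcu<F>(f: F)
where
    F: FnOnce() + Send + 'static,
{
    // Plausibly: RUNTIME.spawn(async move { rcu_sync().await; f() });
    let _ = f;
}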

Signed-off-by: greatbridf <greatbridf@icloud.com>
Simple renamings... Further work is needed to make the system work.

Signed-off-by: greatbridf <greatbridf@icloud.com>
The previous implementation has some bugs that cause kernel-space nested
traps to lose required information:

- In kernel mode, trap contexts are saved above the current stack frame
  without exception, which is not what we want: we expect to read the
  trap data in the CAPTURED context.
- The capturer's task context is not saved either, which completely
  messes up nested traps.
- We read page fault virtual addresses in TrapContext::trap_type, which
  won't work: if the inner trap is captured and an outer trap
  interleaves with the trap_type() call, we lose the stval data of the
  inner trap.

The solution is to separate our "normal" trap handling procedure from
the captured trap handling procedure. We swap the stvec CSR when we set
up captured traps and restore it afterwards, so the two trap entries
don't have to tell the cases apart. Then we can store the TrapContext
pointer in sscratch without having to distinguish between trap handling
types. This way, the procedure stays simple.

The stval register is saved together with the other registers so that it
can be used in page faults.

Signed-off-by: greatbridf <greatbridf@icloud.com>
We've got everything done in order to make the system run.

Add Thread::contexted to load the context needed for the thread to run.
Wrap the Thread::real_run() with contexted(stackful(...)) in
Thread::run().

We'll use this for now. Later, we will make the thread completely
asynchronous; this way, we won't have to change its interface then.

Signed-off-by: greatbridf <greatbridf@icloud.com>
Similar to 661a159:
- Save the previous {trap, task}_ctx and restore them afterwards.
- Set the kernel tp when setting the trap context to user mode.
- Advance the program counter by 4 bytes on breakpoints.

Signed-off-by: greatbridf <greatbridf@icloud.com>
TODO: hide changes to the program counter in the HAL crate.

Signed-off-by: greatbridf <greatbridf@icloud.com>
Copilot AI review requested due to automatic review settings August 9, 2025 20:08
Contributor

Copilot AI left a comment


Pull Request Overview

This PR implements a major rework of the tasking subsystem, unifying stackful and stackless coroutine schemes with a new runtime-based scheduling approach. The changes replace the existing scheduler with a simplified Runtime that uses stackless coroutines as the basic scheduling unit, while providing stackful execution via a wrapper for computational tasks that require preemption.

Key changes include:

  • Complete replacement of the Scheduler with a Runtime system using stackless coroutines
  • Introduction of stackful() wrapper for time-slice based preemptible execution of computational tasks
  • Migration from Task::block_on() to a new block_on() function throughout the codebase
  • Simplification of task state management and execution model

Reviewed Changes

Copilot reviewed 53 out of 53 changed files in this pull request and generated 4 comments.

Summary per file:

src/rcu.rs: Updates RCU synchronization to use the new runtime and adds a call_rcu function
src/lib.rs: Major kernel initialization changes to use the new runtime and task context switching
src/kernel/task.rs: Introduces the new block_on and stackful functions, the core of the new tasking model
crates/eonix_runtime/src/scheduler.rs: Complete rewrite from Scheduler to Runtime with a simplified execution model
crates/eonix_runtime/src/task.rs: Simplified task structure using direct future polling instead of a complex state machine
Multiple filesystem/driver files: Migration from Task::block_on to the new block_on function

src/rcu.rs Outdated
Task::block_on(rcu_sync());
call_rcu(move || {
    let _ = arc;
    todo!();

Copilot AI Aug 9, 2025


The todo!() macro will panic when reached, making this code path unusable. This should be replaced with proper implementation or removed if not needed.

Suggested change
todo!();
// Arc is dropped here after RCU grace period.

use posix_types::open::{FDFlags, OpenFlags};

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]

Copilot AI Aug 9, 2025


[nitpick] The Debug trait was removed from the derive macro but a custom Debug implementation was added later. This could be confusing - consider keeping Debug in the derive macro and removing the custom implementation, or add a comment explaining why the custom implementation is needed.

Suggested change
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]

    state => unreachable!("Waking a {state:?} task"),
}) else {
    return;
};

Copilot AI Aug 9, 2025


[nitpick] Using let Ok(old) = ... pattern and then ignoring the error case with early return could mask unexpected states. Consider handling the error case explicitly or adding a comment explaining when this is expected to fail.

Suggested change
};
match self.state.update(|state| match state {
    TaskState::BLOCKED => Some(TaskState::READY),
    TaskState::RUNNING => Some(TaskState::READY | TaskState::RUNNING),
    TaskState::READY | TaskState::READY_RUNNING => None,
    state => unreachable!("Waking a {state:?} task"),
}) {
    Ok(old) => {
        if old == TaskState::BLOCKED {
            // If the task was blocked, we need to put it back to the ready queue.
            self.rq().put(self.clone());
        }
    }
    Err(e) => {
        // This should not happen; log, assert, or handle as appropriate.
        debug_assert!(false, "Failed to update task state in wake_by_ref: {:?}", e);
        // Optionally, you could log or handle the error here.
    }
}

core::arch::asm!("ebreak");

#[cfg(target_arch = "loongarch64")]
core::arch::asm!("break 1");

Copilot AI Aug 9, 2025


[nitpick] Architecture-specific inline assembly blocks are scattered throughout the function. Consider extracting these into architecture-specific helper functions to improve readability and maintainability.

Suggested change
core::arch::asm!("break 1");
arch_breakpoint();

The current implementation uses the WokenUp object to detect whether the
stackful task has been woken up somewhere. This is WRONG, since we might
lose wakeups: the runtime has no idea what we have done, so if someone
wakes us up, the task won't be enqueued and we'll never get a second
chance to run.

The fix is to use Arc<Task> to create a waker and check whether the task
is ready each time we get back to the stackful poll loop.
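One standard way to build such a waker is through the alloc::task::Wake trait (the empty Task struct and the wake body here are placeholders; the PR's actual types are not shown):

extern crate alloc;
use alloc::{sync::Arc, task::Wake};
use core::task::Waker;

struct Task {
    // state flags, ready queue handle, ...
}

impl Wake for Task {
    fn wake(self: Arc<Self>) {
        // Mark the task READY and enqueue it through the runtime, so
        // the wakeup is recorded even if it races with the poll loop.
    }
}

fn waker_for(task: Arc<Task>) -> Waker {
    // Waker implements From<Arc<W>> for any W: Wake + Send + Sync.
    Waker::from(task)
}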

Signed-off-by: greatbridf <greatbridf@icloud.com>
We introduce a per-thread allocator inside the future object to allocate
space for syscalls. This preserves performance and saves memory. The
allocator takes up 8K for now, which is enough for current use.
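A sketch of what such a per-thread arena might look like (the SyscallArena name and API are illustrative; only the 8K figure is from this commit):

use core::mem::MaybeUninit;

const ARENA_SIZE: usize = 8 * 1024; // 8K, as in this commit

/// Illustrative bump arena living inside the future object: syscall
/// temporaries are carved out of this buffer instead of the global heap.
struct SyscallArena {
    buf: [MaybeUninit<u8>; ARENA_SIZE],
    used: usize,
}

impl SyscallArena {
    /// Bump-allocate `size` bytes with the given alignment; returns
    /// None when the arena is exhausted. `align` must be a nonzero
    /// power of two.
    fn alloc(&mut self, size: usize, align: usize) -> Option<&mut [MaybeUninit<u8>]> {
        let start = self.used.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > ARENA_SIZE {
            return None;
        }
        self.used = end;
        Some(&mut self.buf[start..end])
    }
}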

Signed-off-by: greatbridf <greatbridf@icloud.com>
Signed-off-by: greatbridf <greatbridf@icloud.com>
Signed-off-by: greatbridf <greatbridf@icloud.com>
Use the unwinding crate to unwind the stack and print a stack trace.

Slightly adjust the linker script and move eh_frame into the rodata section.

Due to the limited kernel image size, there might be some problems on x86_64
platforms. Further fixes are needed, but they won't be done for now.

Signed-off-by: greatbridf <greatbridf@icloud.com>
(cherry picked from commit 6bb54d9eae13b76768f011c44222b25b785b83e0)
Signed-off-by: greatbridf <greatbridf@icloud.com>
Stackful tasks might be woken up before actually being put to sleep by
returning Poll::Pending. This results in an infinite sleep, since the
task ends up on neither the wait list nor the ready queue.

The solution is to remember in the stackful wakers that we have been
woken up, and to check this before wait_for_wakeups() puts us to sleep.
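A sketch of that check (the sleep_unless_woken helper is illustrative; wait_for_wakeups() is the function named in this commit):

use core::sync::atomic::{AtomicBool, Ordering};

/// Per-task flag set by the stackful waker. Before actually sleeping,
/// consume it: if a wakeup already arrived, skip the sleep, closing the
/// window where the task sits on neither the wait list nor the ready
/// queue.
fn sleep_unless_woken(woken: &AtomicBool, sleep: impl FnOnce()) {
    if !woken.swap(false, Ordering::AcqRel) {
        sleep();
    }
}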

Also, implement Drop for RCUPointer by using call_rcu to drop the
underlying data. We must require T: Send + Sync + 'static in order to
send the Arc to the runtime...

Signed-off-by: greatbridf <greatbridf@icloud.com>
The current implementation ignores the given argument and uses the
default arch. Fix this wrong behavior...

Signed-off-by: greatbridf <greatbridf@icloud.com>