
Conversation

@greatbridf
Owner

  1. Unified the stackful and stackless coroutine schemes.

Rework the scheduler and rename it to Runtime. Use stackless coroutines as the basic scheduling unit so that we can run a single scheduler with a single stack on each core. This reduces complexity and improves performance A LOT.

In scenarios where we DO need a stackful execution environment, we provide an extra wrapper called stackful(). Running computational (or otherwise time-consuming) tasks on a time-slice-based preemptive basis is as simple as wrapping the future with it and throwing it to the runtime:

RUNTIME.spawn(async {
    // Some computational work
    loop { println!("computational work") }
});

The task above will block the CPU forever if the scheduler chooses it to run, as its poll function will never return. The solution is very simple:

RUNTIME.spawn(stackful(async {
    // Some computational work
    loop { println!("computational work") }
}));

Wrap the future with stackful(), and it will be run on the dedicated stack, get preempted, and yield control back to the runtime. Voilà.
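For reference, the wrapper's shape is roughly as follows (a minimal sketch; the bounds and the pass-through body are illustrative assumptions, not the actual implementation, which polls on a dedicated stack):

use core::future::Future;

/// Illustrative signature only: the real stackful() moves polling onto a
/// dedicated stack via a TrapContext and trap_return(); this placeholder
/// merely forwards the future to show the intended shape of the API.
pub fn stackful<F>(future: F) -> impl Future<Output = F::Output>
where
    F: Future + Send + 'static,
    F::Output: Send,
{
    future
}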

  2. Partial adaptation of all async functions (this part may need further discussion).

The filesystem and block device subsystems are left unchanged, although converting them would be as easy as marking the traits with #[async_trait] and changing every function along the call chain to async as well (they're migrated to the new block_on function as a temporary solution). The main reason is that async fns in dyn traits can only be implemented with boxed dyn Futures, which would severely impact performance on hot paths like these two.
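For context, a sketch of why dyn-compatible async traits force boxing (the BlockDevice trait and read_block method below are illustrative, not the actual interfaces): an async fn returns an opaque future type that a trait object cannot name, so each call must return a heap-allocated, type-erased future.

extern crate alloc;
use alloc::boxed::Box;
use core::{future::Future, pin::Pin};

/// Roughly what #[async_trait] expands to: every call allocates and
/// type-erases its future, which is exactly the per-call cost we want
/// to avoid on filesystem and block device hot paths.
trait BlockDevice {
    fn read_block<'a>(
        &'a self,
        lba: u64,
        buf: &'a mut [u8],
    ) -> Pin<Box<dyn Future<Output = Result<(), ()>> + Send + 'a>>;
}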

The current solution is to wrap Thread::run with stackful() to make threads preemptible so that we can continue to use block_on... This is ugly but works well.

  3. Fixed various bugs in the HAL's trap subsystem.

SMS-Derfflinger and others added 23 commits July 23, 2025 23:40
Fix a page cache bug; add a size check in the read function.

Add the page cache's basic operations for ext4. Cached pages are not dropped until the kernel stops, so we need to call the fsync function manually for now; consider using some eviction strategy such as LRU.

As a temporary measure, write back on a timer: when the write function is called, check whether the time since the last write-back is greater than 10 seconds. If it is, write back.
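A sketch of the timer-gated write-back described above (the 10-second threshold is from this commit; the now_secs parameter and write_back callback are stand-ins):

use core::sync::atomic::{AtomicU64, Ordering};

const WRITE_BACK_INTERVAL_SECS: u64 = 10;
static LAST_WRITE_BACK: AtomicU64 = AtomicU64::new(0);

/// Called from the write path: flush dirty cache pages only if more
/// than 10 seconds have passed since the last write-back.
fn maybe_write_back(now_secs: u64, write_back: impl FnOnce()) {
    let last = LAST_WRITE_BACK.load(Ordering::Relaxed);
    if now_secs.saturating_sub(last) > WRITE_BACK_INTERVAL_SECS {
        LAST_WRITE_BACK.store(now_secs, Ordering::Relaxed);
        write_back();
    }
}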
Remove old Scheduler. Add Runtime as replacement.

Use stackless coroutines as the low-level tasking mechanism and build
stackful tasks on top of them.

Redesign the task state system. Rework the executor.

Remove the Run trait and everything related.

Signed-off-by: greatbridf <greatbridf@icloud.com>
We use RUNNING to indicate that the task is on the CPU, and READY to
indicate that the task can be run again and should therefore be put back
into the ready queue after one poll() call.

When a task is taken from the ready queue and put onto a CPU, it is
marked RUNNING only, so it is left suspended once we get Poll::Pending
from the poll() call. If we (or anyone else) call Waker::wake() during
the run, the READY flag is set as well. When we return from the poll
call, we can detect this with a CAS and put the task back onto the
ready queue.
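A condensed sketch of the transition after poll() returns Pending (the flag values and the finish_poll helper are illustrative; only the READY/RUNNING scheme itself is from this commit):

use core::sync::atomic::{AtomicU8, Ordering};

const READY: u8 = 0b01;
const RUNNING: u8 = 0b10;

/// After poll() returned Pending: if a waker set READY while we were
/// RUNNING, keep READY and re-enqueue the task; otherwise clear both
/// flags and leave the task suspended until some wake() enqueues it.
fn finish_poll(state: &AtomicU8, enqueue: impl FnOnce()) {
    loop {
        let current = state.load(Ordering::Acquire);
        let (next, woken) = if current & READY != 0 {
            (READY, true)
        } else {
            (0, false)
        };
        if state
            .compare_exchange(current, next, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            if woken {
                enqueue();
            }
            return;
        }
        // A concurrent wake() changed the state; retry the CAS.
    }
}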

We've also done some adaptation work on the rest of the kernel, mainly
to remove *SOME* of the Task::block_on calls. Removing them completely
is not possible for now; we should solve that over the next few commits.

Signed-off-by: greatbridf <greatbridf@icloud.com>
Add tracing logs in Runtime::enter and at other critical points.

Pass the trace_scheduler feature down to the eonix_runtime crate, fixing
the problem that the feature was not taking effect.

When the task is blocked, we set CURRENT_TASK to None as well.

In the early initialization stage, the stack is placed at an
identity-mapped physical address. The VirtIO driver might try to convert
the given buffer addresses back to physical ones, which will generate
errors. So the BSP and APs should allocate another stack and switch to
it. We use TaskContext for the fix.

Signed-off-by: greatbridf <greatbridf@icloud.com>
This is used only by Thread when we enter the user execution context,
where we need to save the "interrupt stack" to the local CPU so we can
get the information needed to capture the trap.

We need to support nested captured trap returns. So instead of setting
that information manually, we save it when trap_return() is called
(since at that point we have precisely the trap context needed) and
restore it after the trap is captured.

Signed-off-by: greatbridf <greatbridf@icloud.com>
On riscv64 platforms, we load the kernel tp only if we've come from U
mode, to reduce overhead. But we would restore the tp saved in the
TrapContext even when returning to kernel space, which causes problems
because the default tp is zero.

We should save the kernel tp register into the corresponding field of
the TrapContext when we set the privilege mode to kernel.

Signed-off-by: greatbridf <greatbridf@icloud.com>
We provide a simple block_on to constantly poll the given future and
block the current execution thread as before.
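A minimal sketch of a block_on of this shape (busy-polling with a no-op waker; the real one may spin or park differently):

use core::future::Future;
use core::pin::pin;
use core::task::{Context, Poll, Waker};

/// Illustrative only: poll the future on the current execution thread
/// until it completes, ignoring wakeups entirely.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let mut cx = Context::from_waker(Waker::noop());
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
        core::hint::spin_loop();
    }
}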

We also introduce a new future wrapper named `stackful` to convert any
future into a stackful one. We allocate a stack and keep polling the
future on that stack by constructing a TrapContext and calling
trap_return() to get into the stackful environment. Then we capture the
timer interrupt to make preemption work.

Signed-off-by: greatbridf <greatbridf@icloud.com>
If we don't pass in FEATURES or SMP, no features will be enabled. In
this scenario, the dangling --features argument will cause cargo to fail.

We pass the feature list and the --features flag together to avoid this...

Signed-off-by: greatbridf <greatbridf@icloud.com>
We can pass a function to be called after a successful rcu_sync call.
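A sketch of the shape this could take (the spawn-based body in the comment is an assumption based on how the callback is described, not the confirmed implementation):

/// Illustrative signature: schedule `f` to run once the current RCU
/// grace period has elapsed, i.e. at the point where rcu_sync() would
/// have returned.
pub fn call_rcu<F>(f: F)
where
    F: FnOnce() + Send + 'static,
{
    // Plausibly: RUNTIME.spawn(async move { rcu_sync().await; f() });
    let _ = f;
}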

Signed-off-by: greatbridf <greatbridf@icloud.com>
Simple renamings... Further work is needed to make the system work.

Signed-off-by: greatbridf <greatbridf@icloud.com>
The previous implementation has some bugs that cause kernel-space nested
traps to lose required information:

- In kernel mode, trap contexts are saved above the current stack frame
  without exception, which is not what we want: we expect to read the
  trap data in the CAPTURED context.
- The capturer's task context is not saved either, which completely
  messes up nested traps.
- We read page fault virtual addresses in TrapContext::trap_type, which
  won't work: if the inner trap is captured and an outer trap
  interleaves with the trap_type() call, we lose the stval data of the
  inner trap.

The solution is to separate our "normal" trap handling procedure from
the captured trap handling procedure. We swap the stvec CSR when we set
up captured traps and restore it afterwards, so the two trap entries
don't have to tell the cases apart. Then we can store the TrapContext
pointer in sscratch without having to distinguish between trap handling
types. This way, the procedure stays simple.

The stval register is saved together with the other registers so that it
can be used in page faults.

Signed-off-by: greatbridf <greatbridf@icloud.com>
We've got everything done in order to make the system run.

Add Thread::contexted to load the context needed for the thread to run.
Wrap the Thread::real_run() with contexted(stackful(...)) in
Thread::run().

We'll use this for now. Later, we will make the thread completely
asynchronous; this way, we won't have to change its interface then.

Signed-off-by: greatbridf <greatbridf@icloud.com>
Similar to 661a159:
- Save the previous {trap, task}_ctx and restore them afterwards.
- Set the kernel tp when setting the trap context to user mode.
- Advance the program counter by 4 bytes on breakpoints.

Signed-off-by: greatbridf <greatbridf@icloud.com>
TODO: hide changes to the program counter in the HAL crate.

Signed-off-by: greatbridf <greatbridf@icloud.com>
Copilot AI review requested due to automatic review settings August 9, 2025 20:08
Contributor

Copilot AI left a comment


Pull Request Overview

This PR implements a major rework of the tasking subsystem, unifying stackful and stackless coroutine schemes with a new runtime-based scheduling approach. The changes replace the existing scheduler with a simplified Runtime that uses stackless coroutines as the basic scheduling unit, while providing stackful execution via a wrapper for computational tasks that require preemption.

Key changes include:

  • Complete replacement of the Scheduler with a Runtime system using stackless coroutines
  • Introduction of stackful() wrapper for time-slice based preemptible execution of computational tasks
  • Migration from Task::block_on() to a new block_on() function throughout the codebase
  • Simplification of task state management and execution model

Reviewed Changes

Copilot reviewed 53 out of 53 changed files in this pull request and generated 4 comments.

Summary per file:

src/rcu.rs: Updates RCU synchronization to use the new runtime and adds a call_rcu function
src/lib.rs: Major kernel initialization changes to use the new runtime and task context switching
src/kernel/task.rs: Introduces the new block_on and stackful functions, the core of the new tasking model
crates/eonix_runtime/src/scheduler.rs: Complete rewrite from Scheduler to Runtime with a simplified execution model
crates/eonix_runtime/src/task.rs: Simplified task structure using direct future polling instead of a complex state machine
Multiple filesystem/driver files: Migration from Task::block_on to the new block_on function

src/rcu.rs Outdated
Task::block_on(rcu_sync());
call_rcu(move || {
    let _ = arc;
    todo!();

Copilot AI Aug 9, 2025


The todo!() macro will panic when reached, making this code path unusable. This should be replaced with proper implementation or removed if not needed.

Suggested change
todo!();
// Arc is dropped here after RCU grace period.

use posix_types::open::{FDFlags, OpenFlags};

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]

Copilot AI Aug 9, 2025


[nitpick] The Debug trait was removed from the derive macro but a custom Debug implementation was added later. This could be confusing - consider keeping Debug in the derive macro and removing the custom implementation, or add a comment explaining why the custom implementation is needed.

Suggested change
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]

    state => unreachable!("Waking a {state:?} task"),
}) else {
    return;
};

Copilot AI Aug 9, 2025


[nitpick] Using let Ok(old) = ... pattern and then ignoring the error case with early return could mask unexpected states. Consider handling the error case explicitly or adding a comment explaining when this is expected to fail.

Suggested change
};
match self.state.update(|state| match state {
    TaskState::BLOCKED => Some(TaskState::READY),
    TaskState::RUNNING => Some(TaskState::READY | TaskState::RUNNING),
    TaskState::READY | TaskState::READY_RUNNING => None,
    state => unreachable!("Waking a {state:?} task"),
}) {
    Ok(old) => {
        if old == TaskState::BLOCKED {
            // If the task was blocked, we need to put it back to the ready queue.
            self.rq().put(self.clone());
        }
    }
    Err(e) => {
        // This should not happen; log, assert, or handle as appropriate.
        debug_assert!(false, "Failed to update task state in wake_by_ref: {:?}", e);
        // Optionally, you could log or handle the error here.
    }
}

core::arch::asm!("ebreak");

#[cfg(target_arch = "loongarch64")]
core::arch::asm!("break 1");

Copilot AI Aug 9, 2025


[nitpick] Architecture-specific inline assembly blocks are scattered throughout the function. Consider extracting these into architecture-specific helper functions to improve readability and maintainability.

Suggested change
core::arch::asm!("break 1");
arch_breakpoint();

The current implementation uses the WokenUp object to detect whether the
stackful task has been woken up somewhere. This is WRONG, since we might
lose wakeups: the runtime has no idea what we have done, so if someone
wakes us up, the task won't be enqueued and we'll never get a second
chance to run.

The fix is to use Arc<Task> to create a waker and check whether the task
is ready each time we get back to the stackful poll loop.
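One standard way to build such a waker is through the alloc::task::Wake trait (the empty Task struct and the wake body here are placeholders; the PR's actual types are not shown):

extern crate alloc;
use alloc::{sync::Arc, task::Wake};
use core::task::Waker;

struct Task {
    // state flags, ready queue handle, ...
}

impl Wake for Task {
    fn wake(self: Arc<Self>) {
        // Mark the task READY and enqueue it through the runtime, so
        // the wakeup is recorded even if it races with the poll loop.
    }
}

fn waker_for(task: Arc<Task>) -> Waker {
    // Waker implements From<Arc<W>> for any W: Wake + Send + Sync.
    Waker::from(task)
}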

Signed-off-by: greatbridf <greatbridf@icloud.com>
We introduce a per-thread allocator inside the future object to allocate
space for syscalls. This preserves performance and saves memory. The
allocator takes up 8K for now, which is enough for current use.
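A sketch of what such a per-thread arena might look like (the SyscallArena name and API are illustrative; only the 8K figure is from this commit):

use core::mem::MaybeUninit;

const ARENA_SIZE: usize = 8 * 1024; // 8K, as in this commit

/// Illustrative bump arena living inside the future object: syscall
/// temporaries are carved out of this buffer instead of the global heap.
struct SyscallArena {
    buf: [MaybeUninit<u8>; ARENA_SIZE],
    used: usize,
}

impl SyscallArena {
    /// Bump-allocate `size` bytes with the given alignment; returns
    /// None when the arena is exhausted. `align` must be a nonzero
    /// power of two.
    fn alloc(&mut self, size: usize, align: usize) -> Option<&mut [MaybeUninit<u8>]> {
        let start = self.used.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > ARENA_SIZE {
            return None;
        }
        self.used = end;
        Some(&mut self.buf[start..end])
    }
}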

Signed-off-by: greatbridf <greatbridf@icloud.com>
Signed-off-by: greatbridf <greatbridf@icloud.com>
Signed-off-by: greatbridf <greatbridf@icloud.com>
Use the unwinding crate to unwind the stack and print a stack trace.

Slightly adjust the linker script and move eh_frame into the rodata section.

Due to the limited kernel image size, there might be some problems on x86_64
platforms. Further fixes are needed, but they won't be done for now.

Signed-off-by: greatbridf <greatbridf@icloud.com>
(cherry picked from commit 6bb54d9eae13b76768f011c44222b25b785b83e0)
Signed-off-by: greatbridf <greatbridf@icloud.com>
Stackful tasks might be woken up before actually being put to sleep by
returning Poll::Pending. This results in an infinite sleep, since the
task ends up on neither the wait list nor the ready queue.

The solution is to remember in the stackful wakers that we have been
woken up, and to check this before wait_for_wakeups() puts us to sleep.
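A sketch of that check (the sleep_unless_woken helper is illustrative; wait_for_wakeups() is the function named in this commit):

use core::sync::atomic::{AtomicBool, Ordering};

/// Per-task flag set by the stackful waker. Before actually sleeping,
/// consume it: if a wakeup already arrived, skip the sleep, closing the
/// window where the task sits on neither the wait list nor the ready
/// queue.
fn sleep_unless_woken(woken: &AtomicBool, sleep: impl FnOnce()) {
    if !woken.swap(false, Ordering::AcqRel) {
        sleep();
    }
}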

Also, implement Drop for RCUPointer by using call_rcu to drop the
underlying data. We must require T: Send + Sync + 'static in order to
send the Arc to the runtime...

Signed-off-by: greatbridf <greatbridf@icloud.com>
The current implementation ignores the given argument and uses the
default arch. Fix this wrong behavior...

Signed-off-by: greatbridf <greatbridf@icloud.com>