
NEX - Hardware Accelerator Full-Stack Simulation Framework

NEX is a simulation framework for running hardware-accelerated software stacks full-stack, end-to-end. NEX supports unmodified software stacks, except that accelerator drivers need slight modifications so that NEX can interpose between software and hardware.

NEX runs software on actual CPUs without simulating any CPU architecture. It can, however, simulate more virtual CPUs than the machine physically has.

NEX supports hardware accelerator simulators in the form of RTL simulators or DSim (LPN-based simulator models). Accelerators currently integrated into NEX include the Versatile Tensor Accelerator (VTA), a hardware JPEG decoder, and a Protobuf serialization/deserialization accelerator (Protoacc). NEX can run multiple such accelerators as configured.

NEX also models the host-accelerator interconnect and DMA latency.

NEX supports time-warp features that let users manipulate the timestamps observed by the application.

Features

  • Multi-Accelerator Support: VTA, JPEG decoder, and Protoacc accelerators
  • Dual Simulation Modes: RTL simulation and fast decoupled functional/performance simulation (DSim)
  • Host-Accelerator Interaction: Modeling of interconnect and memory subsystem
  • Scalable Architecture: Support for multiple accelerator instances
  • BPF-based Scheduling: Control of CPU execution based on SCX (sched_ext)

Architecture

(Figure: NEX architecture diagram)

Building

(Using a virtual machine)

⚠️ A virtual machine should only be used for developing NEX, not for evaluating its performance or accuracy.

If you'd like to run NEX inside a virtual machine, follow these steps (note that NEX runs slower inside a virtual machine):

cd virtual-machine
./download_vm.sh
tmux new-session -d -s nex-vm-noble "sudo ./launch.sh"

Note: if your machine has multiple sockets, it is important to stay on a single socket if it has enough cores, and to use taskset to pin QEMU onto those cores. Otherwise NEX may run too slowly inside QEMU, the kernel will kill NEX automatically, and the experiments will hang.
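For example, if cores 0-15 belong to socket 0 on your machine (an assumption for illustration; check the actual topology with lscpu), you could pin QEMU like this:

tmux new-session -d -s nex-vm-noble "sudo taskset -c 0-15 ./launch.sh"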

Use the following to log into the virtual machine.

./login_vm.sh

Finally, clone NEX, then install packages inside the VM:

cd virtual-machine
./install_inside_vm.sh

Note: instead of make autoconfig, you should use make autoconfig_vm, which searches configs within a different range; this is mainly due to the different latency of local timer interrupts when the kernel runs inside a VM. If make autoconfig_vm doesn't find a config, adjust the last two or three parameters of ./test/autoconfig.sh $(CONFIG_PROJECT_PATH) <tolerance of error> <lowerbound> <upperbound>.
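For illustration, a manual invocation might look like the following; the three numeric values are placeholders only, not recommended settings:

./test/autoconfig.sh $(CONFIG_PROJECT_PATH) 5 100 2000   # <tolerance of error> <lowerbound> <upperbound>, illustrative values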

Installing the Kernel

Install a kernel with SCX support; check the SCX repo for an installation guide.

For example, on Ubuntu 24.04:

$ sudo add-apt-repository -y --enable-source ppa:arighi/sched-ext
$ sudo apt install -y linux-generic-wip scx
$ sudo reboot

The above may be outdated; if so, try the following:

$ sudo add-apt-repository -y --enable-source ppa:arighi/sched-ext
$ sudo apt upgrade
$ sudo reboot

(Nix Env)

You can optionally install Nix and use the prepared environments.

Install Nix with a single command (check the Nix website for the latest instructions):

sh <(curl --proto '=https' --tlsv1.2 -L https://nixos.org/nix/install) --daemon

Then run the following command to enter the environment:

nix-shell

Build the NEX repo

  1. Clone the NEX repo, then initialize submodules.
git submodule update --init --recursive
  2. Make scx (if you encounter errors while in nix-shell, exit nix-shell and try again).
sudo make scx -j
  3. Configure NEX.
make menuconfig
  4. Make the project.
make -j
make dsim -j
  5. Install NEX so you can use it in other directories.
sudo make install

Configuration

NEX uses a Kconfig-based configuration system. Key configuration options:

Host Simulation Modes

  • ROUND_BASED_SIM: Epoch-based CPU scheduling with configurable time slices
  • TOTAL_CORES: The total number of cores on this system
  • SIM_CORES: The number of cores you want to use for the NEX simulation
  • SIM_VIRT_CORES: The number of virtual cores you want NEX to simulate; leave it at 0 if you want every thread to run on a virtual core
  • ROUND_SLICE: The epoch duration
  • EXTRA_COST_TIME: Adjustment for the epoch duration. If you don't know what to set, try make autoconfig; NEX runs a script to find a value
  • DEFAULT_ON_OFF: Whether epoch-based scheduling is on by default. Note: you can always turn the scheduling on or off in the application. When the scheduling is on, the application incurs a slowdown of 10-20x
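As an illustration, the corresponding lines in the generated .config for a 32-core host that dedicates 16 cores to simulation might look like this (the CONFIG_ option names and values below are an assumption for illustration; take the authoritative names from make menuconfig):

# hypothetical .config excerpt -- verify option names via make menuconfig
CONFIG_ROUND_BASED_SIM=y
CONFIG_TOTAL_CORES=32
CONFIG_SIM_CORES=16
CONFIG_SIM_VIRT_CORES=0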

Accelerator Interactive Mode

  • USE_FAULT: NEX captures host-to-accelerator communication via segfaults
  • USE_TICK: NEX captures host-to-accelerator communication via illegal instructions used as ticks (all accelerators currently integrated into NEX use this mode)
  • EAGER_SYNC: Turn on eager synchronization
  • EAGER_SYNC_PERIOD: Eager sync period in nanoseconds
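For example, tick-based interposition with eager synchronization every microsecond might be configured like this (hypothetical option names and period, shown only to illustrate how the options combine):

# hypothetical .config excerpt -- verify option names via make menuconfig
CONFIG_USE_TICK=y
CONFIG_EAGER_SYNC=y
CONFIG_EAGER_SYNC_PERIOD=1000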

Accelerator Configuration

  • VTA: Versatile Tensor Accelerator
  • JPEG: JPEG Decoder Accelerator
  • Protoacc: Protobuf serialization/deserialization accelerator

For each accelerator, you can configure:

  • DSIM: Functional/performance decoupled simulator based on LPN
  • LEGACY_DSIM: The same DSim simulator, but compiled together with NEX and integrated more tightly; this is less modular, and in this mode NEX can be configured with only one accelerator at a time. Legacy DSim is faster than DSim, but we recommend using DSim instead
  • RTL: Verilator compiled RTL simulators
  • FREQ: Frequency of the accelerator in MHz
  • LINK_DELAY: Link delay between the accelerator and host in nanoseconds (for example, PCIe link delay is typically a few hundred nanoseconds).
  • NUM: Number of accelerator instances.
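For instance, a single DSim-simulated VTA instance running at 200 MHz behind a PCIe-like link could look like this (the per-accelerator option spelling and the values are an assumption for illustration; check make menuconfig for the real names):

# hypothetical .config excerpt -- verify option names via make menuconfig
CONFIG_VTA=y
CONFIG_VTA_DSIM=y
CONFIG_VTA_FREQ=200
CONFIG_VTA_LINK_DELAY=500
CONFIG_VTA_NUM=1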

Memory Subsystem

  • MEM_LPN: Memory subsystem simulator based on LPN
  • CACHE_HIT_LATENCY: Set the cache hit latency in nanoseconds
  • CACHE_MISS_FETCH_LATENCY: Set the cache miss fetch latency in nanoseconds
  • CACHE_FIRST_HIT: Set whether the first access to an empty cacheline is considered a hit or a miss
  • CACHE_SIZE: Set cache size in KB.
  • CACHE_ASSOC: Set cache associativity.
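For illustration, a 1 MB, 8-way cache with a 2 ns hit latency and a 100 ns miss fetch latency might be expressed as follows (hypothetical option names and values; verify via make menuconfig):

# hypothetical .config excerpt -- verify option names via make menuconfig
CONFIG_MEM_LPN=y
CONFIG_CACHE_HIT_LATENCY=2
CONFIG_CACHE_MISS_FETCH_LATENCY=100
CONFIG_CACHE_FIRST_HIT=y
CONFIG_CACHE_SIZE=1024
CONFIG_CACHE_ASSOC=8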

Usage

You can start a NEX simulation simply by running:

sudo nex <your application>

Note: if you want to run multiple commands, you can put all your commands in a script, then run

sudo nex <your script>

Note: sudo resets your environment variables; use sudo -E to keep them.

Note: if you set extra environment variables when launching NEX, set them before the command, for example SOME_ENV=1 nex.
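Putting these notes together, a minimal sketch might look like this (run.sh and the my_app binary are hypothetical names used only for illustration):

#!/bin/bash
# run.sh -- hypothetical script; each command below runs under NEX
./my_app --input data1.bin
./my_app --input data2.bin

SOME_ENV=1 sudo -E nex ./run.sh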

Running Prepared Experiments

Prepared experiments are available in experiments/:

Configuring NEX

You only need to set the following three configs:

  • TOTAL_CORES
  • SIM_CORES: set aside at least 16 cores
  • EXTRA_COST_TIME

Then run the following to update the experiment settings:

cd experiments/
./update_configs_for_all_exp.sh

This script prints Verification (first few updated files) at the end; please double-check that the updated config matches what you entered. If a mismatch is found, run make menuconfig in NEX and change some config so that the configs are saved again, then repeat ./update_configs_for_all_exp.sh.

VTA

Install the environment first (note: you might need to fix environment issues manually, as noted in build-tvm.sh). If you didn't use nix-shell, manually run export NEX_HOME=<nex_path>; one experiment needs this environment variable.

cd experiments/
./build-tvm.sh
cd vta_exp/
./run_all.sh

Note: when running ./build-tvm.sh, if you have enough memory you can increase the number after make -j to speed up the build.

JPEG

cd jpeg_exp/
./run_all.sh

Protoacc

cd protoacc_exp
./run_all.sh

Results

The results are all in <NEX_Path>/results.

Running the following will compile the results into a Python dictionary stored in results/scripts/compiled_data

cd results/scripts
python extract_jpeg.py
python extract_protoacc.py
python extract_vta.py

Note: to plot the results in comparison with the gem5-based simulator, copy the compiled results to results/scripts/gem5_compiled_data after running gem5, do a similar extraction, then run

python plot_simtime_speedup.py

Note: if your environment can't produce the plots, source the env created for the VTA/TVM experiments by running this command first: source <NEX-path>/experiments/tvm-vta-env/bin/activate

To run gem5-related experiments, please refer to https://github.com/dslab-epfl/SimBricks-LPN/. Note that gem5 is configured to match an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz; if you run NEX on a different CPU and compare results with gem5-based experiments, you may observe large differences because the CPUs don't match.

Contact

If you have any questions or suggestions, feel free to reach out to us at jiacheng.ma@epfl.ch.
