This repository contains the code and dataset for the artifact evaluation of NBReplay, published in the proceedings of the IEEE International Symposium on Cluster, Cloud, and Internet Computing (CCGrid) 2026.
NBReplay is an end-to-end system for efficient, reproducible execution of distributed workflows in notebooks. It includes a checkpoint/restore system for Jupyter notebooks that enables auditing and repeating notebook executions. It consists of two components:
- NBRewind kernel — a custom Jupyter kernel that tracks cell execution and supports checkpoint/restore
- taskvine_rewind — task-level caching for distributed TaskVine/DaskVine workflows

Prerequisites:
- Linux
- Conda (Miniconda or Anaconda)

Create and activate the Conda environment:

    conda env create -f AE.yml --name ccgrid
    conda activate ccgrid

From the root of this repository:

    pip install -e .
    cd NBRewind
    python install_kernels.py

This registers two Jupyter kernels:
- NBRewind — checkpoint/restore kernel
- NBrewind Audit Kernel — provenance-tracking kernel (via sciunit)
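
To confirm that registration succeeded, one option is to inspect the output of `jupyter kernelspec list --json`. The sketch below is not part of this repository. The helper names `missing_kernels` and `list_kernelspecs` are hypothetical, and the check assumes the kernels are registered under the display names listed above, compared case-insensitively because the exact capitalization may differ.

```python
import json
import subprocess


def missing_kernels(listing, wanted):
    """Return the names in `wanted` that are not registered display names.

    `listing` is the parsed JSON printed by `jupyter kernelspec list --json`.
    Names are compared case-insensitively (an assumption of this sketch).
    """
    specs = listing.get("kernelspecs", {})
    found = {
        info.get("spec", {}).get("display_name", "").lower()
        for info in specs.values()
    }
    return [name for name in wanted if name.lower() not in found]


def list_kernelspecs():
    """Run `jupyter kernelspec list --json` and return the parsed result."""
    raw = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(raw)
```

Inside the activated environment, `missing_kernels(list_kernelspecs(), ["NBRewind", "NBrewind Audit Kernel"])` should return an empty list if both kernels were registered under those names.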
Each experiment is in a subdirectory under dataset/. Start Jupyter, open the notebook in the workflow/ directory, select the NBRewind kernel (Kernel → Change Kernel → NBRewind), launch the TaskVine worker in a separate terminal, then run the notebook top to bottom.
Note for distributed clusters: If workers are running on remote machines, ensure that port 9123 on the manager host is reachable from the worker nodes. You can verify connectivity with:

    nc -zv <manager-host> 9123

On AWS EC2, open port 9123 in the instance's security group inbound rules.
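
Where `nc` is not installed on a worker node, a rough Python equivalent of the connectivity check can be run instead. This is a minimal sketch using only the standard library. The function name `port_reachable` is hypothetical, and the default port simply mirrors the 9123 mentioned above.

```python
import socket


def port_reachable(host, port=9123, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection performs the TCP handshake; we close immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures, and timeouts.
        return False
```

Call it from a worker node as `port_reachable("<manager-host>")`, substituting the manager's hostname. A `False` result usually points to a firewall or security group rule rather than to TaskVine itself.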
Worker commands for the individual experiments (-M gives the name of the manager to connect to):

    vine_worker -M ctrend
    vine_worker -M cms-dv5
    vine_worker -M dask-taskvine-mapreduce-manager
    vine_worker -M dconv
    vine_worker -M rag-lite

NBRewind operates in two modes controlled by a magic command at the top of the notebook.
Audit mode: add the following magic command in the first cell:

    %audit on

Run the notebook top to bottom. NBRewind will checkpoint cell outputs and track dependencies.

Repeat mode: change the magic command in the first cell to:

    %audit off

In repeat mode, NBRewind replays previously checkpointed results without re-executing cells.
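
The difference between the two modes can be illustrated with a toy example. The sketch below is not NBRewind's implementation and does not track dependencies. The class `ToyCheckpointStore`, its file naming, and the `audit` flag are all hypothetical and only show the general idea of checkpointing a result once and replaying it later without re-execution.

```python
import pickle
from pathlib import Path


class ToyCheckpointStore:
    """Illustrative only: keeps one pickle file per cell id."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, cell_id):
        return self.directory / f"cell_{cell_id}.pkl"

    def run(self, cell_id, compute, audit):
        """Audit mode: execute `compute` and checkpoint its result.
        Repeat mode: return the checkpointed result without executing."""
        path = self._path(cell_id)
        if audit:
            result = compute()
            path.write_bytes(pickle.dumps(result))
            return result
        if not path.exists():
            raise FileNotFoundError(
                f"no checkpoint for cell {cell_id}; run in audit mode first"
            )
        return pickle.loads(path.read_bytes())
```

In this toy version, calling `run` with `audit=True` executes the function and stores its result, while a later call with `audit=False` returns the stored value and never invokes the function.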
To reset and perform a fresh audit, remove all checkpoint and cache files from the notebook's working directory:

    rm -rf *.pkl metadata.db rewind.txlog vine_outputs/

Then re-run the notebook top to bottom with %audit on.
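
The same cleanup can be scripted, for example when resetting several experiment directories in a loop. The helper below is a sketch rather than part of the repository. The name `reset_audit_state` is hypothetical, and it removes exactly the paths named in the `rm` command above, silently skipping any that do not exist.

```python
import shutil
from pathlib import Path


def reset_audit_state(workdir="."):
    """Delete checkpoint and cache files, mirroring
    `rm -rf *.pkl metadata.db rewind.txlog vine_outputs/`.

    Returns the sorted names of the entries that were removed.
    """
    root = Path(workdir)
    targets = list(root.glob("*.pkl"))
    targets += [root / "metadata.db", root / "rewind.txlog", root / "vine_outputs"]
    removed = []
    for path in targets:
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)  # directories such as vine_outputs/
        elif path.exists() or path.is_symlink():
            path.unlink()  # regular files and symlinks
        else:
            continue  # nothing to remove, like `rm -f`
        removed.append(path.name)
    return sorted(removed)
```

Run it with the notebook's working directory as the argument, then re-run the notebook with `%audit on` as described above.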
For citing our work, use the following:
@InProceedings{Azaz_2026_CCGrid,
author = {Azaz, Talha and Ahmad, Raza and Islam, Md Saiful and Thain, Douglas and Malik, Tanu},
title = {Efficiently Reproducing Distributed Workflows in Notebook-based Systems},
booktitle = {Proceedings of the IEEE International Symposium on Cluster, Cloud, and Internet Computing (CCGrid), 2026},
year = {2026}
}