⚓ Haven

A secure AI evaluation framework for running AI safety benchmarks within AWS Nitro Enclaves, combining Rust and Python components.

Overview

This repository contains all artifacts for this MPhil project, a mix of Rust and Python (plus too many scripts to glue things together). A good starting point is the root Makefile and the README.md files in the subdirectories:

analysis - code and artifacts to reproduce the plots in my thesis from benchmark runs
enclave - first prototype for running the enclave and the host: sending files, running llama in the enclave, and generating attestation documents. Does not include AI safety benchmarks.
llama_runner - running llama et al. using llama-cpp2 through the actor model
bert_runner - running BERT models using rust-bert through the actor model (compiles against a mock implementation by default, or links against libtorch when compiled with -F use_rust_bert)
evaluation - scaffolding (messages, file_transfer, dataset handling) to run AI safety benchmarks in Haven. Configurations and prompts for the audit code for different graphs are included in evaluation/tasks. Recreates Hugging Face's datasets library in dataset.rs. Includes code to run the evaluation without
evaluation_enclave - protocol for the enclave side. Uses the type-state pattern: InitializedState -> LlamaLoadedState -> BertLoadedState -> DatasetLoadedState -> EvaluatedState -> AttestedState
evaluation_host - protocol for the host side. Uses the type-state pattern: Disconnected -> Connected -> LlamaSent -> BertSent -> DatasetSent -> EvaluationComplete -> AttestationReceived
quantization - quantize models in AWS Nitro Enclaves (through llama.cpp and PyTorch)
scripts - the duct tape (basically)
vsock - abstracts away most of the vsock trouble; takes a server/client handle that interacts with the socket
gpu_baseline - essentially evaluation, but in Python, to run on GPUs through vLLM
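
The type-state pattern used by evaluation_host and evaluation_enclave can be illustrated with a minimal sketch. It mirrors the host-side state names listed above, but the `Host` struct, its `log` field, and every method name are hypothetical and chosen for illustration only; they are not taken from the repository. The point is that each step consumes the handle and returns one in the next state, so calling steps out of order fails to compile.

```rust
use std::marker::PhantomData;

// Zero-sized marker types, named after the host-side states listed above.
pub struct Disconnected;
pub struct Connected;
pub struct LlamaSent;
pub struct BertSent;
pub struct DatasetSent;
pub struct EvaluationComplete;
pub struct AttestationReceived;

/// Hypothetical host handle. The `log` only records which steps ran.
pub struct Host<S> {
    log: Vec<&'static str>,
    _state: PhantomData<S>,
}

impl<S> Host<S> {
    // Consume the handle in state `S` and return it in state `T`.
    fn advance<T>(mut self, step: &'static str) -> Host<T> {
        self.log.push(step);
        Host { log: self.log, _state: PhantomData }
    }
}

impl Host<Disconnected> {
    pub fn new() -> Self {
        Host { log: Vec::new(), _state: PhantomData }
    }
    pub fn connect(self) -> Host<Connected> { self.advance("connect") }
}

impl Host<Connected> {
    pub fn send_llama(self) -> Host<LlamaSent> { self.advance("send_llama") }
}

impl Host<LlamaSent> {
    pub fn send_bert(self) -> Host<BertSent> { self.advance("send_bert") }
}

impl Host<BertSent> {
    pub fn send_dataset(self) -> Host<DatasetSent> { self.advance("send_dataset") }
}

impl Host<DatasetSent> {
    pub fn run_evaluation(self) -> Host<EvaluationComplete> { self.advance("run_evaluation") }
}

impl Host<EvaluationComplete> {
    pub fn receive_attestation(self) -> Host<AttestationReceived> {
        self.advance("receive_attestation")
    }
}

impl Host<AttestationReceived> {
    // Only available once the whole protocol has completed.
    pub fn steps(&self) -> &[&'static str] { &self.log }
}

/// Walk the full protocol in the only order the types allow.
pub fn run_host_protocol() -> Vec<&'static str> {
    let host = Host::<Disconnected>::new()
        .connect()
        .send_llama()
        .send_bert()
        .send_dataset()
        .run_evaluation()
        .receive_attestation();
    host.steps().to_vec()
}
```

In this sketch, writing `Host::<Disconnected>::new().send_bert()` is rejected by the compiler because `send_bert` only exists on `Host<LlamaSent>`. That compile-time ordering guarantee is the usual reason to choose this pattern for a multi-step protocol.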

Run Enclave

Make sure you have the AWS Nitro CLI and SDK installed and sufficient memory allocated in the Nitro allocator.

From the project directory:

cd enclave && cargo build 
make build-docker && make build-eif && make run-enclave

Run make terminate-enclave if you wish to stop the enclave.

Run Host

cd enclave && cargo build && cargo run

Analysis

Make sure to have uv installed.

All raw data is in quantization_ablation_model for the local analysis and in remote_experiments for the AWS analysis. Make sure the input datasets contain classification pairs for postprocessing.

cd analysis
uv run local_analysis.py
uv run aws_analysis.py

chrisschnabl/haven
