Intelligence Per Watt


A benchmarking suite for LLM inference systems. Intelligence Per Watt sends workloads to your inference service and collects detailed telemetry—energy consumption, power usage, memory, temperature, and latency—to help you optimize performance and compare hardware configurations.

Installation

Prerequisites

A working uv installation (the setup and commands below use uv for virtual environments and package installs).

Setup

# Clone the repository
git clone https://github.com/HazyResearch/intelligence-per-watt.git

# Create and activate virtual environment
uv venv
source .venv/bin/activate

# Build energy monitoring
uv run scripts/build_energy_monitor.py

# Install Intelligence Per Watt
uv pip install -e intelligence-per-watt

Optional inference clients ship as extras—install each one you need from the package directory, e.g. uv pip install -e 'intelligence-per-watt[ollama]' or uv pip install -e 'intelligence-per-watt[vllm]'.

Quick Start

# 1. List available inference clients
ipw list clients

# 2. Run a benchmark
ipw profile \
  --client ollama \
  --model llama3.2:1b \
  --client-base-url http://localhost:11434

# 3. Analyze the results
ipw analyze ./runs/profile_*

# 4. Generate plots
ipw plot ./runs/profile_*

What gets measured: For each query, Intelligence Per Watt captures energy consumption, power draw, GPU/CPU memory usage, temperature, time-to-first-token, throughput, and token counts.
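For intuition about how such per-query telemetry can be rolled up, here is a minimal sketch that aggregates hypothetical records into IPJ/IPW-style metrics. The field names (`energy_j`, `duration_s`, `correct`) and the exact metric definitions are assumptions for illustration, not the tool's actual schema.

```python
# Sketch: roll up hypothetical per-query telemetry into IPJ/IPW-style
# metrics. Field names and formulas are illustrative assumptions only.
records = [
    {"energy_j": 120.0, "duration_s": 4.0, "correct": True},
    {"energy_j": 90.0,  "duration_s": 3.0, "correct": True},
    {"energy_j": 150.0, "duration_s": 5.0, "correct": False},
]

total_energy = sum(r["energy_j"] for r in records)   # joules
total_time = sum(r["duration_s"] for r in records)   # seconds
accuracy = sum(r["correct"] for r in records) / len(records)

ipj = accuracy / total_energy                  # "intelligence" per joule
avg_power = total_energy / total_time          # watts = joules / second
ipw = accuracy / avg_power                     # "intelligence" per watt

print(f"accuracy={accuracy:.2f} IPJ={ipj:.6f} IPW={ipw:.6f}")
```

The real run artifacts carry much richer per-query detail (see the Output section), but the shape of the aggregation is the same: a quality signal divided by an energy or power denominator.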

Commands

ipw profile

Send prompts to the inference service, profile hardware usage, and calculate intelligence per watt (IPW) and intelligence per joule (IPJ).

ipw profile --client <client> --model <model> [options]

Options:

  • --client - Inference client (e.g., ollama, vllm)
  • --model - Model name
  • --client-base-url - Client base URL
  • --eval-client - Judge client for scoring (default: openai)
  • --eval-base-url - Judge service URL (default: https://api.openai.com/v1)
  • --eval-model - Judge model (default: gpt-5-nano-2025-08-07)
  • --max-queries - Limit queries for testing
  • --dataset - Workload dataset (default: ipw)
  • --output-dir - Where to save results

Example:

ipw profile \
  --client ollama \
  --model llama3.2:1b \
  --client-base-url http://localhost:11434 \
  --max-queries 100

ipw analyze

By default, ipw analyze computes summary statistics for a results directory, including IPW and IPJ. To model energy, power, and latency as functions of input/output length, use --analysis regression.

ipw analyze <results_dir>
# or explicitly choose a different analysis
# ipw analyze <results_dir> --analysis regression

ipw plot

Visualize profiling data (scatter plots, regression lines, distributions).

ipw plot <results_dir> [--output <dir>]

ipw list

Discover available clients, datasets, and analysis types.

ipw list <clients|datasets|analyses|visualizations|all>

Energy monitor test script

Validate that your system can collect energy telemetry before running full workloads.

uv run scripts/test_energy_monitor.py [--interval 2.0]

Output

Profiling runs are saved to ./runs/profile_<hardware>_<model>/:

runs/profile_<hardware>_<model>/
├── data-*.arrow        # Per-query metrics (HuggingFace dataset format)
├── summary.json        # Run metadata and totals
├── analysis/           # Regression coefficients, statistics
└── plots/              # Graphs
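To consume run output programmatically, summary.json can be parsed with the standard library. The keys shown below are hypothetical placeholders; the real schema is defined by the tool.

```python
import json

# Hypothetical summary.json contents -- the actual schema is defined by ipw.
summary_text = """
{
  "model": "llama3.2:1b",
  "total_queries": 100,
  "total_energy_j": 4200.0
}
"""

summary = json.loads(summary_text)
joules_per_query = summary["total_energy_j"] / summary["total_queries"]
print(f"{summary['model']}: {joules_per_query:.1f} J/query")
```

The per-query metrics in data-*.arrow are stored in Hugging Face dataset format, so they can be loaded with the `datasets` library for finer-grained analysis.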
