Batched Coherent Entropy-Guided Optimizer (BCEGO)

A hybrid optimizer that combines information-theoretic entropy minimization, Bayesian variational updates, and hierarchical clustering to achieve ultra-fast, deterministic convergence on black-box objectives.


Table of Contents

  1. Motivation
  2. Key Ideas
  3. Algorithm Overview
  4. Code Components
  5. Installation & Setup
  6. Usage & Examples
  7. Parameters & Tuning
  8. Benchmark Results
  9. Comparison to Baselines
  10. How It Works
  11. Extending & Contributing
  12. License

Motivation

Traditional black-box optimizers such as gradient ascent, evolutionary strategies (ES), and random search each have strengths but also significant drawbacks:

  • Gradient Ascent relies on smoothness and often stalls on plateaus.
  • Evolutionary Strategies adapt distributions but can be slow to collapse uncertainty.
  • Random Search never refines its sampling distribution, wasting evaluations.

I wanted a method that:

  1. Explores globally in an information-efficient way.
  2. Identifies coherent, high-fitness regions via spatial clustering.
  3. Rapidly collapses uncertainty (entropy) once a mode is found.
  4. Locks in and iteratively refines in a near-deterministic fashion.
  5. Provides diagnostic transparency (e.g. “refocus” events).

This is how I came up with Batched Coherent Entropy-Guided Optimization (BCEGO).


Key Ideas

  • Variational Bayesian Update: We treat the search distribution $q(x)$ as a multivariate Gaussian and update its mean $\mu$ and covariance $\Sigma$ via softmax-weighted samples (a NumPy sketch of this update follows the list below):

    $$ w_i \propto \exp\big(\eta\, f(x_i)\big), \quad \mu \leftarrow \sum_i w_i x_i, \quad \Sigma \leftarrow \sum_i w_i (x_i-\mu)(x_i-\mu)^T $$

  • Hierarchical Clustering: We enforce spatial coherence by building two levels of clusters on each batch:

    1. Macro-cluster: largest connected component under threshold $\tau_{\mathrm{macro}}$.
    2. Micro-cluster: within macro, a stricter threshold $\tau_{\mathrm{micro}}$.
  • Entropy Minimization: The covariance determinant $\det\Sigma$ measures the volume of the search distribution, and hence its entropy; clustering and sharp weighting collapse it aggressively.

  • Refocus Mechanism: As soon as at least refocus_min_samples samples in a batch satisfy $\lvert f(x)\rvert \le$ refocus_thresh, we reset $(\mu, \Sigma)$ to that tight cluster, causing an instant entropy collapse.
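
A minimal NumPy sketch of the softmax-weighted update above (illustrative only; the function and variable names here, such as variational_update, X, and fvals, are assumptions rather than the exact code in main.py):

import numpy as np

def variational_update(X, fvals, eta=1.0, eps=1e-9):
    """Softmax-weighted mean/covariance update (sketch)."""
    # w_i ∝ exp(eta * f(x_i)); subtract the max for numerical stability
    logw = eta * fvals
    w = np.exp(logw - logw.max())
    w /= w.sum()
    mu = w @ X                                    # weighted mean
    D = X - mu
    Sigma = (w[:, None] * D).T @ D                # weighted covariance
    return mu, Sigma + eps * np.eye(X.shape[1])   # jitter keeps Sigma positive definite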


Algorithm Overview

  1. Initialize $\mu=\mu_0$, $\Sigma=\Sigma_0$.

  2. Repeat for each iteration:

    • Draw a batch $X\sim\mathcal{N}(\mu,\Sigma)$.
    • Evaluate $f(X)$.
    • Refocus if enough samples have $f(x)\approx0$: recenter on those samples.
    • Otherwise, apply macro/micro clustering to the batch.
    • Compute softmax weights $w\propto e^{\eta f}$ on micro-cluster.
    • Update $\mu,\Sigma$ via weighted sample moments.
  3. Stop after iters iterations or upon convergence (a sketch of one iteration follows).
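
A hedged Python sketch of one such iteration (the real control flow lives in BCEGO_H.step and BCEGO_H.optimize; the helper below and its argument names are illustrative assumptions):

import numpy as np

def bcego_iteration(f, mu, Sigma, batch_size, eta,
                    refocus_thresh, refocus_min_samples, rng):
    """One illustrative BCEGO-style iteration (sketch, not the actual step())."""
    # Draw a batch from the current Gaussian belief and evaluate it
    X = rng.multivariate_normal(mu, Sigma, size=batch_size)
    fX = f(X)

    # Refocus: if enough samples are already near-optimal, keep only those;
    # otherwise the real optimizer applies macro/micro clustering here instead
    near = np.abs(fX) <= refocus_thresh
    if near.sum() >= refocus_min_samples:
        X, fX = X[near], fX[near]

    # Softmax-weighted moment update on the selected samples
    w = np.exp(eta * (fX - fX.max()))
    w /= w.sum()
    mu_new = w @ X
    D = X - mu_new
    Sigma_new = (w[:, None] * D).T @ D + 1e-9 * np.eye(X.shape[1])
    return mu_new, Sigma_new, fX.max()

# Usage sketch: rng = np.random.default_rng(0), then call bcego_iteration in a loop.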


Code Components

  • main.py

    • class BCEGO_H: core optimizer.

    • Methods:

      • _cluster_mask: builds an adjacency structure on the batch and keeps its largest connected component (sketched after this list).
      • _try_refocus: checks refocus criteria, resets belief.
      • step: one optimization iteration.
      • optimize: run N iterations, collect history.
  • example.py

    • Demonstrates usage on a 2D quadratic with known optimum at $(1,2)$.
    • Prints per-iteration: mean, $\det\Sigma$, best $f$, [REFOCUS] flags.
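
A minimal sketch of the largest-connected-component idea behind _cluster_mask, assuming Euclidean pairwise distances with the threshold taken as a fraction of the median pairwise distance (the README describes a fraction of median similarity; the actual adjacency construction in main.py may differ):

import numpy as np

def largest_component_mask(X, tau):
    """Boolean mask of the largest connected component of the batch X (sketch)."""
    # Pairwise Euclidean distances and a relative adjacency threshold
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    thresh = tau * np.median(dist[np.triu_indices(len(X), k=1)])
    adj = dist <= thresh

    # Flood-fill each connected component and keep the largest one
    unvisited = set(range(len(X)))
    best = []
    while unvisited:
        seed = unvisited.pop()
        comp, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            neighbors = set(np.flatnonzero(adj[i])) & unvisited
            unvisited -= neighbors
            comp.extend(neighbors)
            frontier.extend(neighbors)
        if len(comp) > len(best):
            best = comp

    mask = np.zeros(len(X), dtype=bool)
    mask[best] = True
    return mask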

Installation & Setup

  1. Clone the repository

    git clone https://github.com/yourusername/Batched-Coherent-Entropy-Guided-Optimizer.git
    cd Batched-Coherent-Entropy-Guided-Optimizer
  2. Python dependencies

    • Requires Python 3.7+
    • Only dependency: NumPy
  3. (Optional) Create a virtual environment

    python3 -m venv venv
    source venv/bin/activate
    pip install --upgrade pip
  4. Run the example

    python example.py

Usage & Examples

Basic Usage

from main import BCEGO_H
import numpy as np

# Define objective: 2D quadratic
def fobj(X):
    return -((X[:,0]-1)**2 + (X[:,1]-2)**2)

# Initialize optimizer
opt = BCEGO_H(
    f=fobj, dim=2,
    mu0=np.zeros(2), Sigma0=np.eye(2)*5,
    eta=1.0, batch_size=100,
    tau_macro=0.6, tau_micro=0.85,
    refocus_thresh=1e-2, refocus_min_samples=5,
    random_seed=42
)

# Run and collect history
history = opt.optimize(iters=50)
for i, (mu, detS, best, refocused) in enumerate(history,1):
    flag = " [REFOCUS]" if refocused else ""
    print(f"Iter {i:2d}: mean={mu}, detΣ={detS:.2e}, best f={best:.4f}{flag}")

Parameters & Tuning

| Parameter | Description | Default |
| --- | --- | --- |
| eta | Fitness scaling for weight exponentiation | 1.0 |
| batch_size | Number of samples per iteration | 100 |
| tau_macro | Macro-cluster threshold (fraction of median similarity) | 0.5 |
| tau_micro | Micro-cluster threshold within the macro cluster | 0.8 |
| refocus_thresh | Absolute f-value threshold to consider a sample “near-optimal” | 1e-2 |
| refocus_min_samples | Minimum count of near-optimal samples to trigger a refocus | 5 |
| random_seed | Seed for reproducibility | None |
  • Smaller tau_* → larger clusters (more exploration).
  • Larger eta → sharper weight focus (faster exploitation).
  • refocus_thresh & refocus_min_samples control the critical “phase transition” to exploitation.
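
For instance, an exploration-heavy configuration might loosen the clustering thresholds, flatten the weights, and demand more evidence before refocusing, while an exploitation-heavy one does the opposite. The values below are illustrative (reusing fobj and the imports from the Basic Usage snippet above), not recommended defaults:

# More exploration: looser clusters, flatter weights, harder-to-trigger refocus
explore = dict(eta=0.5, tau_macro=0.4, tau_micro=0.7, refocus_min_samples=10)

# More exploitation: tighter clusters, sharper weights, eager refocus
exploit = dict(eta=2.0, tau_macro=0.7, tau_micro=0.9, refocus_min_samples=3)

opt = BCEGO_H(f=fobj, dim=2, mu0=np.zeros(2), Sigma0=np.eye(2) * 5,
              batch_size=100, refocus_thresh=1e-2, random_seed=42, **explore)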

Benchmark Results

Running each algorithm for 50 iterations on the same 2D quadratic:

| Algorithm | Iterations to $f\approx0$ | $\det\Sigma$ at iteration 5 | Final $\det\Sigma$ |
| --- | --- | --- | --- |
| BCEGO_H | 5 | 4.5 × 10⁻⁶ | 1.2 × 10⁻⁷ |
| Evolutionary ES | 14 | 7.6 × 10⁻² | ~4 × 10⁻¹⁶ |
| Gradient Ascent | 26 | fixed (1e-8) | fixed (1e-8) |
| Random Search | | ~7.8 × 10¹ | ~7.2 × 10¹ |
  • BCEGO_H: reaches near-optimal f by iteration 5 and collapses entropy orders of magnitude faster than ES.
  • ES: contracts steadily but more slowly.
  • Gradient Ascent: no covariance adaptation and slow improvement in f.
  • Random Search: never refines its sampling distribution.

Comparison to Baselines

  • Speed: BCEGO_H reaches the optimum in 5 iterations versus 14 for ES and 26 for gradient ascent on the benchmark above.
  • Entropy Reduction: achieves a roughly five-order-of-magnitude drop in search-space volume in a single refocus step.
  • Data Efficiency: about 500 function evaluations (5 iterations × 100 samples per batch) versus thousands for typical baselines.
  • Deterministic Behavior: once refocused, the search stays locked on the mode and contracts smoothly.
  • Transparency: [REFOCUS] flags signal the phase change from exploration to exploitation.

How It Works

  1. Batch Sampling: gather diversity from $\mathcal{N}(\mu,\Sigma)$.
  2. Refocus Check: if enough samples have small $\lvert f\rvert$, recenter the belief for an immediate entropy collapse.
  3. Macro/Micro Clustering: enforce spatial coherence, prune noise.
  4. Variational Update: weighted moment matching → new $\mu,\Sigma$.
  5. Repeat until convergence or max iterations.

This unifies Bayesian filtering, deterministic annealing, and evolutionary sampling.
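
To make the entropy link concrete: for a Gaussian search distribution the differential entropy depends on $\Sigma$ only through $\log\det\Sigma$ (a standard identity, not specific to this repository):

$$ H\big(\mathcal{N}(\mu,\Sigma)\big) = \tfrac{1}{2}\log\big((2\pi e)^d \det\Sigma\big) = \tfrac{d}{2}\log(2\pi e) + \tfrac{1}{2}\log\det\Sigma $$

So a drop in $\det\Sigma$ from, say, $10^{-2}$ to $10^{-7}$ removes $\tfrac{1}{2}\ln 10^{5} \approx 5.8$ nats of search entropy in a single refocus event.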


Extending & Contributing

  • New objectives: drop in any f(X).
  • Higher dimensions: works in arbitrary $\mathbb{R}^d$.
  • Alternative clustering: replace _cluster_mask with your favorite graph-based method.
  • Contribution: PRs, issues, and suggestions welcome!

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute!
