This file explains how to interpret solver diagnostics and how to use the debug trace for unstable or under-converged runs.
OTSolution.diagnostics reports the latest checkpoint values.
Key fields:
- continuity_residual: how well the discrete continuity equation is satisfied
- primal_delta: change in the primal iterate
- dual_delta: change in the dual iterate
- max_constraint_residual: worst current feasibility residual
- ceh_cg_residual: residual of the inner CE_h conjugate-gradient solve
- ceh_cg_iters: inner CG iterations used at the latest checkpoint
Mode note:
- primal_delta and dual_delta use weighted paper-style norms.
- OTConfig.numerics_mode is compatibility-only and must be "paper".
A run returns converged=False when the stopping rules were not all satisfied.
That usually means one of these:
- the solve hit max_iters,
- the inner CE_h solve is underpowered,
- the outer PDHG iterate is still moving,
- the current iterate is numerically singular.
Important:
- a small continuity residual alone does not imply a solved run.
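The checklist above can be turned into a rough triage helper. This is only a sketch under stated assumptions: the diagnostics are passed as a plain dict keyed by the field names listed earlier, and the function name triage, the iters_used argument, and every threshold are hypothetical and not part of jgot. How OTSolution.diagnostics actually exposes its values is not specified here, so adapt the lookups as needed.

```python
import math

def triage(diag, iters_used, max_iters, cg_tol=1e-8, delta_tol=1e-6):
    """Return likely reasons for converged=False (all thresholds are placeholders)."""
    reasons = []
    if iters_used >= max_iters:
        reasons.append("hit max_iters")
    if diag["ceh_cg_residual"] > cg_tol:
        reasons.append("inner CE_h solve underpowered")
    if max(diag["primal_delta"], diag["dual_delta"]) > delta_tol:
        reasons.append("outer PDHG iterate still moving")
    if not all(math.isfinite(v) for v in diag.values()):
        reasons.append("iterate numerically singular")
    return reasons

# Made-up values: continuity is excellent, yet the outer iterate has not settled.
example = {
    "continuity_residual": 1e-12,
    "primal_delta": 1e-3,
    "dual_delta": 1e-4,
    "max_constraint_residual": 1e-9,
    "ceh_cg_residual": 1e-10,
    "ceh_cg_iters": 12,
}
print(triage(example, iters_used=100, max_iters=500))
```

The example is chosen so that the continuity residual is tiny while the run is still flagged, which is the situation the note above warns about.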
Use:
from jgot import OTConfig, solve_ot
sol = solve_ot(problem, OTConfig(record_debug_trace=True))
trace = sol.debug_trace
The trace is recorded:
- at checkpoint iterations only,
- inside the JIT-compiled solve path,
- in fixed-size arrays.
Only the first trace.num_records entries are valid.
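Because the arrays are fixed-size, it is easy to plot or inspect padding by accident. A minimal sketch of trimming to the valid prefix follows. The helper name valid_records is hypothetical, and the stand-in object only mimics the attribute layout described here; it is not a real jgot trace.

```python
from types import SimpleNamespace

def valid_records(trace, fields):
    """Slice each fixed-size trace array down to its first num_records entries."""
    n = int(trace.num_records)
    return {name: getattr(trace, name)[:n] for name in fields}

# Stand-in trace: 5 preallocated slots, only 3 filled (values are made up).
fake_trace = SimpleNamespace(
    num_records=3,
    iterations=[10, 20, 30, 0, 0],
    continuity_residual=[1e-2, 1e-4, 1e-6, 0.0, 0.0],
)
records = valid_records(fake_trace, ["iterations", "continuity_residual"])
print(records["iterations"])
```

The same slicing works on array-valued fields, since it relies only on standard indexing.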
Most useful fields:
- trace.iterations
- trace.action
- trace.continuity_residual
- trace.primal_delta
- trace.dual_delta
- trace.min_vartheta
A rising action does not mean a bug by itself. The solver is PDHG, not a monotone descent method on the action. The action is evaluated on the current raw iterate, and raw objective values are not guaranteed to decrease.
The most important failure signature in the current implementation is a good continuity residual combined with a bad action. If:
- continuity_residual is already small,
- but action becomes inf,
- and min_vartheta approaches 0,
then the issue is usually:
- outer-iterate degeneracy on the K / action side,
- not a failure of the continuity projection.
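The three conditions above can be checked mechanically on the latest valid checkpoint. The following is a sketch only: the function name and both tolerances are hypothetical, and what counts as "small" depends on the problem scale.

```python
import math

def looks_like_action_side_failure(continuity_residual, action, min_vartheta,
                                   cont_tol=1e-8, vartheta_tol=1e-10):
    """True when continuity is fine but the action is non-finite and min_vartheta is near 0."""
    return (
        continuity_residual <= cont_tol
        and not math.isfinite(action)
        and min_vartheta <= vartheta_tol
    )

# Made-up checkpoint values matching the signature described above.
print(looks_like_action_side_failure(1e-12, float("inf"), 1e-14))
```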
min_vartheta is the best early warning for a singular run.
If min_vartheta steadily approaches 0, the iterate is moving toward a state
where the action denominator collapses.
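One way to watch for this trend before the action actually becomes non-finite is to test the last few checkpoints for a strict decline below some floor. This is a sketch under assumptions: the function name, the window length, and the floor are all hypothetical, and the input should already be trimmed to the valid trace.num_records entries.

```python
def vartheta_warning(min_vartheta, window=4, floor=1e-6):
    """Flag a strict decline toward zero over the last `window` checkpoints."""
    tail = [float(v) for v in min_vartheta[-window:]]
    if len(tail) < window:
        return False  # not enough checkpoints to call it a trend
    decreasing = all(later < earlier for earlier, later in zip(tail, tail[1:]))
    return decreasing and tail[-1] < floor

# Made-up series: the first collapses steadily, the second only oscillates.
print(vartheta_warning([1e-1, 1e-3, 1e-5, 1e-7, 1e-9]))
print(vartheta_warning([1e-1, 1e-1, 9e-2, 1e-1, 9e-2]))
```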
The current large-grid diagnostics showed a representative 8x8 failure mode:
- continuity residual became very small,
- inner CG became very accurate,
- but the action still became non-finite,
- because min_vartheta collapsed toward zero.
Interpretation:
- the continuity side was already working,
- the raw outer iterate was still drifting,
- the instability was on the K / action side.
This is why the trace records both:
- continuity,
- and min_vartheta.
When a run is unstable or too slow, tune in this order:
1. cg_max_iters
2. max_iters
3. steps
4. blob_size
5. only then, if needed: tau, sigma, relaxation
Reason:
- first ensure the inner continuity projection is not the bottleneck,
- then reduce outer stiffness,
- only then touch the PDHG step sizes.
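The ordering can be encoded as a small search loop so that later knobs are never touched while an earlier one still has untried values. This is a sketch, not a recommended tuner: tune_in_order and the run callback are hypothetical, the candidate values are placeholders, and it is an assumption that these names can be passed as OTConfig keyword arguments.

```python
ORDER = ["cg_max_iters", "max_iters", "steps", "blob_size", "tau", "sigma", "relaxation"]

def tune_in_order(run, candidates):
    """Try candidate values knob by knob in ORDER; return the first converged overrides."""
    overrides = {}
    for knob in ORDER:
        values = candidates.get(knob, [])
        for value in values:
            trial = {**overrides, knob: value}
            if run(trial):
                return trial
        if values:
            # Design choice of this sketch: keep the last value tried and move on.
            overrides[knob] = values[-1]
    return None

# Fake run() standing in for something like
# solve_ot(problem, OTConfig(**trial)).converged  (assumed usage, not verified).
def fake_run(cfg):
    return cfg.get("cg_max_iters", 0) >= 200 and cfg.get("steps", 0) >= 64

best = tune_in_order(fake_run, {"cg_max_iters": [100, 200], "steps": [32, 64], "tau": [0.5]})
print(best)
```

In the fake example the search stops before tau is ever changed, which mirrors the intent of the ordering.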
The large-grid example supports:
uv run python examples/large_grid_transport/run.py --debug-trace
This writes:
- a trace .npz file
- a trace .png file
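The .npz file can be inspected offline with NumPy. The sketch below makes two assumptions that are not confirmed here: that the archive stores arrays under names mirroring the debug trace fields, and that a num_records entry is present. The helper name load_trace is hypothetical, and the output path is whatever the example script reports.

```python
import numpy as np

def load_trace(path):
    """Load a trace .npz into a dict, trimmed to num_records when that key exists."""
    with np.load(path) as data:
        arrays = {key: data[key] for key in data.files}
    n = arrays.get("num_records")
    if n is None:
        return arrays  # key names are an assumption; fall back to raw arrays
    n = int(n)
    return {k: (v[:n] if v.ndim >= 1 else v) for k, v in arrays.items()}
```

Printing sorted(load_trace(path)) first is a cheap way to see which keys the file really contains before relying on any of them.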
Use that path when:
- converged=False,
- action looks suspicious,
- or you want to see whether the run is failing on the continuity side or the K / action side.