A rate of convergence for gradient descent on smooth and convex functions, proved in Lean 4.
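For reference, the rate in question is presumably the standard $O(1/k)$ guarantee (the exact hypotheses and constants in `Proof.lean` may differ): if $f$ is convex and $L$-smooth, $x^\star$ minimizes $f$, and gradient descent uses step size $1/L$, then

$$
x_{k+1} = x_k - \frac{1}{L}\nabla f(x_k), \qquad f(x_k) - f(x^\star) \le \frac{L \lVert x_0 - x^\star \rVert^2}{2k} \quad \text{for all } k \ge 1.
$$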
From December 2023 through April 2025, I used this proof as a benchmark for the formalization ability of language models. In January 2026, I tried Aristotle with a GPT-5.2 Pro (Extended Thinking) blueprint, and it completed the formalization in about 15 minutes.
Specifically, I gave Aristotle (from Harmonic) a proof sketch generated by GPT-5.2 Pro (Extended Thinking), then asked Aristotle to formalize it in Lean.
- Aristotle's formalized proof: `Gd/Lean/Aristotle/Proof.lean`
- Aristotle tool: https://aristotle.harmonic.fun
- Prompt and full chat transcript used to generate the sketch: `blueprint_prompt.md`
- Raw Aristotle output file in this repo: `e623ec93-eb6e-4db9-a318-9702e060e589-output.lean`
- Codex cleanup diff (raw Aristotle output -> cleaned file): `Gd/Lean/Aristotle/codex_cleanup.diff`
- Toolchain note: the Aristotle proof is pinned to `leanprover/lean4:v4.24.0` in `Gd/Lean/Aristotle/lean-toolchain`. Build it from `Gd/Lean/Aristotle` (the repo root uses `leanprover/lean4:v4.3.0-rc2`).
Generation flow:
- I prompted GPT-5.2 Pro (Extended Thinking) for a detailed proof blueprint.
- I saved that full chat in `blueprint_prompt.md`.
- I gave the GPT output to Aristotle verbatim, without manual edits.
- Aristotle produced the formal Lean proof linked above.
- I ran Codex on Aristotle's output to clean it up.
Codex cleanup note: "file already typechecked; i only cleaned proof-script noise so it runs clean. specifically: replaced a few ring calls with ring_nf, removed unused simp args, and renamed unused h_min to _h_min to silence the linter. no mathematical changes."
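As a minimal Lean sketch of the two kinds of edits that note describes (illustrative examples only, not lines from `Proof.lean`):

```lean
import Mathlib

-- `ring_nf` normalizes both sides of an equation to a canonical polynomial
-- form, so it closes the kinds of goals `ring` closes.
example (a b : ℝ) : (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2 := by
  ring_nf

-- Prefixing an unused hypothesis with `_` (here `h_min` -> `_h_min`)
-- silences the unused-variable linter without changing the proof.
example (x : ℝ) (_h_min : 0 ≤ x) : x = x := rfl
```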
I periodically tried to complete this proof with language models from December 2023 through April 2025, and none could do it. Aristotle was the first that could.
- https://x.com/damekdavis/status/1728120500142940284?s=20
- https://x.com/damekdavis/status/1730983634570510819?s=20
- https://terrytao.wordpress.com/2023/12/05/a-slightly-longer-lean-4-proof-tour/