A rate of convergence for gradient descent on smooth and convex functions, proved in Lean 4.
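For reference, the rate in question is presumably the standard $O(1/k)$ guarantee (the exact hypotheses and constants in `Proof.lean` may differ): if $f$ is convex and $L$-smooth, $x^\star$ minimizes $f$, and gradient descent uses step size $1/L$, then

$$
x_{k+1} = x_k - \frac{1}{L}\nabla f(x_k), \qquad f(x_k) - f(x^\star) \le \frac{L \lVert x_0 - x^\star \rVert^2}{2k} \quad \text{for all } k \ge 1.
$$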
From December 2023 through April 2025, I used this proof as a benchmark for the formalization ability of language models. In January 2026, I tried Aristotle with a GPT-5.2 Pro (Extended Thinking) blueprint, and it completed the formalization in about 15 minutes.
Specifically, I gave Aristotle (from Harmonic) a proof sketch generated by GPT-5.2 Pro (Extended Thinking), then asked Aristotle to formalize it in Lean.
- Aristotle's formalized proof: `Gd/Lean/Aristotle/Proof.lean`
- Aristotle tool: https://aristotle.harmonic.fun
- Prompt and full chat transcript used to generate the sketch: `blueprint_prompt.md`
- Raw Aristotle output file in this repo: `e623ec93-eb6e-4db9-a318-9702e060e589-output.lean`
- Codex cleanup diff (raw Aristotle output -> cleaned file): `Gd/Lean/Aristotle/codex_cleanup.diff`
- Toolchain note: the Aristotle proof is pinned to `leanprover/lean4:v4.24.0` in `Gd/Lean/Aristotle/lean-toolchain`. Build it from `Gd/Lean/Aristotle` (the repo root uses `leanprover/lean4:v4.3.0-rc2`).
Generation flow:
- I prompted GPT-5.2 Pro (Extended Thinking) for a detailed proof blueprint.
- I saved that full chat in `blueprint_prompt.md`.
- I gave the GPT output to Aristotle verbatim, without manual edits.
- Aristotle produced the formal Lean proof linked above.
- I ran Codex on Aristotle's output to clean it up.
Codex cleanup note: "file already typechecked; i only cleaned proof-script noise so it runs clean. specifically: replaced a few ring calls with ring_nf, removed unused simp args, and renamed unused h_min to _h_min to silence the linter. no mathematical changes."
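As a minimal Lean sketch of the two kinds of edits that note describes (illustrative examples only, not lines from `Proof.lean`):

```lean
import Mathlib

-- `ring_nf` normalizes both sides of an equation to a canonical polynomial
-- form, so it closes the kinds of goals `ring` closes.
example (a b : ℝ) : (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2 := by
  ring_nf

-- Prefixing an unused hypothesis with `_` (here `h_min` -> `_h_min`)
-- silences the unused-variable linter without changing the proof.
example (x : ℝ) (_h_min : 0 ≤ x) : x = x := rfl
```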
I periodically tried to complete this proof with language models from December 2023 through April 2025, and none could do it. Aristotle was the first that could.
- https://x.com/damekdavis/status/1728120500142940284?s=20
- https://x.com/damekdavis/status/1730983634570510819?s=20
- https://terrytao.wordpress.com/2023/12/05/a-slightly-longer-lean-4-proof-tour/