
Ablation: remove tandem curriculum entirely #1693

Closed
tcapelle wants to merge 2 commits into noam from noam-r23/remove-tandem-curriculum

Conversation

@tcapelle (Contributor) commented Mar 20, 2026

Hypothesis

The tandem curriculum (lines 712-715) was introduced to help the model learn single-foil patterns before facing tandem complexity. However, the current implementation is bugged: it detects ALL samples as tandem, effectively zeroing the gradients for the first 10 epochs. The fix from PR #1674 was essentially neutral on val_loss (+0.0006), suggesting the curriculum itself may not be beneficial.

This ablation removes the curriculum entirely. If the curriculum is not helping (or is slightly harmful due to the epoch-10 shock when tandem samples are suddenly reintroduced), removing it should improve training efficiency and potentially val_loss by giving the model full data from epoch 0.

Key insight: The current bugged curriculum zeros ALL gradients for 10 epochs. The fact that the model trains well despite this suggests those 10 epochs are largely wasted. Removing the curriculum gives the model 10 more productive training epochs.
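To make the failure mode concrete, here is a minimal, self-contained sketch (hypothetical shapes and data, not taken from train.py) of how the threshold test can flag every sample as tandem when the last 8 feature channels are never exactly zero:

```python
import torch

torch.manual_seed(0)

# Hypothetical batch: 4 samples, 16 points, 32 features. Suppose the last
# 8 channels were meant to be zero for single-foil samples but in practice
# hold small nonzero values (e.g. normalized padding or encodings).
x = torch.randn(4, 16, 32) * 0.1
abs_err = torch.rand(4, 16, 1)

# The curriculum's detection: summing |values| over 16 * 8 = 128 entries
# easily exceeds the 0.01 threshold, so every sample looks "tandem".
is_tandem_curr = x[:, :, -8:].abs().sum(dim=(1, 2)) > 0.01
sample_mask = (~is_tandem_curr).float()[:, None, None]  # all zeros

masked_err = abs_err * sample_mask
print(is_tandem_curr.all().item())    # True: all samples flagged as tandem
print(masked_err.abs().sum().item())  # 0.0: the loss term is zeroed out
```

With every sample masked, the loss contribution (and hence the gradient) is identically zero, which matches the "10 wasted epochs" reading above.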

Instructions

In train.py, delete the tandem curriculum block entirely (lines 712-715):

Delete these 4 lines:

```python
if epoch < 10:
    is_tandem_curr = (x[:, :, -8:].abs().sum(dim=(1, 2)) > 0.01)
    sample_mask = (~is_tandem_curr).float()[:, None, None]
    abs_err = abs_err * sample_mask
```

No other changes. Run with `--wandb_group noam-r23-remove-curriculum`.

Baseline

  • val_loss = 0.8326
  • in_dist surf_p = 17.94
  • ood_cond surf_p = 13.98
  • ood_re surf_p = 27.54
  • tandem surf_p = 36.73

Results

W&B run: q5qmcsj6

| Split | val/loss | surf_Ux | surf_Uy | surf_p | vol_Ux | vol_Uy | vol_p |
|---|---|---|---|---|---|---|---|
| val_in_dist | 0.5672 | 4.992 | 1.794 | 17.98 | 0.949 | 0.323 | 18.58 |
| val_tandem_transfer | 1.5862 | | | 37.9 | | | |
| val_ood_cond | 0.6613 | | | 13.4 | | | |
| val_ood_re | 0.5147 | | | 27.5 | | | |
| combined | 0.8324 | | | | | | |

Baseline: val_loss=0.8326 | in_dist=17.94 | ood_cond=13.98 | ood_re=27.54 | tandem=36.73
Delta: -0.0002 vs baseline (essentially neutral)

Peak memory: 18.2 GB

What happened:
Essentially neutral result. val/loss 0.8324 vs 0.8326 is indistinguishable from noise. Per-split: ood_cond surf_p improved (-0.58 Pa), tandem is slightly worse (+1.17 Pa), in_dist and ood_re are approximately equal.

This confirms the hypothesis that the tandem curriculum (in either its buggy or fixed form) has minimal effect on training outcomes. The model learns equally well from epoch 0 with full tandem data — the "wasted" 10 warm-up epochs don't meaningfully hurt (or help) final performance. The curriculum adds code complexity for no benefit and can be cleanly removed.

Note: the vis-pipeline Fourier PE fix was also applied to prevent a crash during visualization.

Suggested follow-ups:

  • This result frees up the first 10 epochs — could invest them in a different curriculum (e.g., easier cases first sorted by Re number or geometry complexity).
  • Alternatively, the curriculum slot could be used for a multi-stage LR schedule that starts higher and cools more aggressively in the first 10 epochs.
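As a rough illustration of the first follow-up, a sketch of a replacement curriculum that grows a low-Re prefix of the sorted training set over the warm-up epochs (all names here are hypothetical, not part of train.py):

```python
# Hypothetical sketch of the "easier cases first" idea: sort samples by
# Reynolds number and expose a growing low-Re prefix during warm-up.
# `samples` and the "re" key are illustrative, not part of train.py.
def curriculum_subset(samples, epoch, warmup_epochs=10):
    """Samples visible at `epoch`: a growing low-Re prefix, then everything."""
    ordered = sorted(samples, key=lambda s: s["re"])
    if epoch >= warmup_epochs:
        return ordered
    frac = (epoch + 1) / warmup_epochs
    return ordered[: max(1, int(len(ordered) * frac))]

samples = [{"id": i, "re": re} for i, re in enumerate([5e5, 1e5, 3e6, 2e5])]
print([s["re"] for s in curriculum_subset(samples, epoch=0)])  # [100000.0]
print(len(curriculum_subset(samples, epoch=10)))               # 4
```

Unlike the removed block, this selects which samples are batched rather than zeroing gradients after the fact, so no training steps are wasted.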

@tcapelle added labels status:wip (Student is working on it), student:norman (Assigned to norman), noam (Noam advisor branch experiments) Mar 20, 2026
github-actions bot commented Mar 20, 2026


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


0 out of 2 committers have signed the CLA.
❌ @senpai-advisor
❌ @senpai-norman
senpai-advisor and senpai-norman do not appear to be GitHub users. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@tcapelle tcapelle marked this pull request as ready for review March 20, 2026 15:27
@tcapelle added status:review (Ready for advisor review) and removed status:wip (Student is working on it) labels Mar 20, 2026
@morganmcg1 morganmcg1 closed this Mar 22, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 22, 2026
