---
title: Pomodoro v0.7.x Training Plan — Ablation from v0.6.3 Baseline
tags: [journal, pomodoro, training, plan, ablation, v0.7]
created: 2026-05-04
updated: 2026-05-04
status: active
related:
---
Pomodoro v0.7.x Training Plan — Ablation from v0.6.3 Baseline
Baseline: v0.6.3 Config
S=64, L=8, r=0.0, max_size=128
Loss: (la + lb + lnb + lr + lc + lda + ldr + ldc) * lw
v0.6.3 is the best so far on lDDT (local structure) but stagnates after ~3 days. Goal: isolate what helps by changing one thing at a time.
New Versions
| Version | Changed Param(s) | Value | Rationale |
|---|---|---|---|
| v0.7.0 | max_size | 64 (was 128) | Smaller structures = faster iterations, more pretraining signal |
| v0.7.1 | loss | RMSD only (la + lr + lc) | Remove all distance losses (lda, ldr, ldc) and push-pull (lb, lnb) — simpler gradient landscape, test if distance terms cause stagnation |
| v0.7.2 | max_size + loss | 64 + RMSD only | Combine both changes — smallest/fastest regime, best for rapid iteration during pretraining |
| v0.7.3 | soft_normalize | disabled | Test if soft_normalize at end of GT operations constrains gradient flow and contributes to stagnation |
Per-Version Config Changes
v0.7.0 — Smaller Structures
- `ConfigData.max_size`: 128 → 64
- `ConfigRuntime.version`: `"0.7.0"`
- Everything else same as v0.6.3
v0.7.1 — Simplified Loss (RMSD Only)
- `ConfigRuntime.version`: `"0.7.1"`
- In `main.py:209`, change: `loss = (la + lb + lnb + lr + lc + lda + ldr + ldc) * lw` → `loss = (la + lr + lc) * lw`
- Still compute and log all loss components for monitoring, just don’t backprop through distance/push-pull terms
- Everything else same as v0.6.3
v0.7.2 — Smaller Structures + Simplified Loss
- `ConfigData.max_size`: 128 → 64
- `ConfigRuntime.version`: `"0.7.2"`
- Same loss simplification as v0.7.1
v0.7.3 — No soft_normalize
- `ConfigRuntime.version`: `"0.7.3"`
- Remove `soft_normalize` calls at the end of GT operations in `model.py`:
  - `VectorTrack` (L60): `v = soft_normalize(v, dim=1)` → `v` (identity)
  - `ScalarTrack` (L140-141): `Q = soft_normalize(Q, dim=2)`, `K = soft_normalize(K, dim=2)` → remove
  - `VectorTrack` (L214-215, L220): same pattern → remove
  - `BootstrapVectorState` (L340): `pz = soft_normalize(pz, dim=1)` → `pz`
  - `GeometryDecoderModel` (L428-429): `Q`, `K` normalize → remove
  - `GeometryDecoderModel` (L488): `u = soft_normalize(u, dim=1)` → `u`
- Alternative: make `soft_normalize` a no-op via a config flag rather than deleting the calls, so it’s easy to re-enable.
- Everything else same as v0.6.3
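
The config-flag alternative could look roughly like the sketch below. `ConfigModel.use_soft_normalize` and `maybe_soft_normalize` are hypothetical names, and `soft_normalize_stub` is only a stand-in that works on plain Python lists; the project's real `soft_normalize` (with its `dim` argument) is not reproduced here, so treat this as an illustration of the gating pattern only.

```python
import math
from dataclasses import dataclass


@dataclass
class ConfigModel:
    """Hypothetical config holder; the real config classes may differ."""
    use_soft_normalize: bool = True  # v0.7.3 would set this to False


def soft_normalize_stub(v, eps=1e-6):
    """Stand-in for the project's soft_normalize (assumed behavior):
    divide a vector by sqrt(|v|^2 + eps) so its norm approaches 1."""
    scale = math.sqrt(sum(x * x for x in v) + eps)
    return [x / scale for x in v]


def maybe_soft_normalize(v, cfg, normalize=soft_normalize_stub):
    """Apply `normalize` only when the flag is on; otherwise pass v through."""
    return normalize(v) if cfg.use_soft_normalize else v


# Same call site, two regimes: baseline behavior vs. the v0.7.3 ablation.
baseline = maybe_soft_normalize([3.0, 4.0], ConfigModel(use_soft_normalize=True))
ablated = maybe_soft_normalize([3.0, 4.0], ConfigModel(use_soft_normalize=False))
```

With this pattern each call site changes once (to the gated helper), and switching between v0.6.3 behavior and the ablation becomes a config change rather than a code diff.
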
Implementation Steps
- Create v0.7.0 worktree from v0.6.3 branch:

      cd models/pomodoro/pomodoro
      git branch v0.7.0 v0.6.3
      git worktree add ../v0.7.0 v0.7.0

  Edit `config.py`: `max_size=64`, `version="0.7.0"`. Commit + push.

- Create v0.7.1 worktree from v0.6.3 branch:

      git branch v0.7.1 v0.6.3
      git worktree add ../v0.7.1 v0.7.1

  Edit `config.py`: `version="0.7.1"`. Edit `main.py:209`: simplify loss to `(la + lr + lc) * lw`. Still log all components. Commit + push.

- Create v0.7.2 worktree from v0.7.1 branch (inherits loss simplification):

      git branch v0.7.2 v0.7.1
      git worktree add ../v0.7.2 v0.7.2

  Edit `config.py`: `max_size=64`, `version="0.7.2"`. Commit + push.

- Create v0.7.3 worktree from v0.6.3 branch:

      git branch v0.7.3 v0.6.3
      git worktree add ../v0.7.3 v0.7.3

  Edit `config.py`: `version="0.7.3"`. Edit `model.py`: remove or disable `soft_normalize` calls at end of GT operations. Commit + push.
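
Because the four steps repeat the same two git commands with different branch names, the plan can be generated from a small table. The sketch below only prints the commands (it does not run git), and the helper names `worktree_commands` and `render_plan` are made up for illustration. The branch pairs are the ones listed in the steps above.

```python
import shlex

# Version -> branch it is created from, as listed in the steps above.
BASES = {
    "v0.7.0": "v0.6.3",
    "v0.7.1": "v0.6.3",
    "v0.7.2": "v0.7.1",  # inherits the loss simplification
    "v0.7.3": "v0.6.3",
}


def worktree_commands(version, base):
    """Return the two git invocations for one ablation branch."""
    return [
        ["git", "branch", version, base],
        ["git", "worktree", "add", f"../{version}", version],
    ]


def render_plan(bases=BASES):
    """Render every command as a shell-safe string, in step order."""
    lines = []
    for version, base in bases.items():
        lines += [shlex.join(cmd) for cmd in worktree_commands(version, base)]
    return lines


for line in render_plan():
    print(line)
```

The printed commands are meant to be run from `models/pomodoro/pomodoro`, as in the first step; the per-version edits to `config.py`, `main.py`, and `model.py` still have to be made by hand.
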
Loss Simplification Detail
Current loss in `main.py:209`:

    loss = (la + lb + lnb + lr + lc + lda + ldr + ldc) * lw

Simplified (v0.7.1, v0.7.2):

    loss = (la + lr + lc) * lw

Removed:
- `lda`: atom distance matrix loss (O(N²) memory)
- `ldr`: residue distance matrix loss
- `ldc`: chain distance matrix loss
- `lb`: bonded push-pull loss
- `lnb`: non-bonded push-pull loss

All components still computed and logged for observability; only the backward gradient path changes.
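
One way to keep every component logged while only some of them feed the gradient is to split "what is summed" from "what is recorded". The sketch below is an assumption about how that could be organized, not the code in `main.py`: `combine_losses`, `ALL_TERMS`, and `RMSD_TERMS` are hypothetical names, and plain floats stand in for the scalar loss tensors.

```python
ALL_TERMS = ("la", "lb", "lnb", "lr", "lc", "lda", "ldr", "ldc")
RMSD_TERMS = ("la", "lr", "lc")  # terms kept in v0.7.1 and v0.7.2


def combine_losses(components, lw, active=RMSD_TERMS):
    """Sum only the `active` terms into the training loss, but return
    every term as a plain float for logging."""
    missing = [name for name in ALL_TERMS if name not in components]
    if missing:
        raise KeyError(f"missing loss components: {missing}")
    loss = sum(components[name] for name in active) * lw
    log = {name: float(components[name]) for name in ALL_TERMS}
    return loss, log


# Illustrative values only; these are not measured losses.
parts = {"la": 1.0, "lb": 2.0, "lnb": 3.0, "lr": 4.0,
         "lc": 5.0, "lda": 6.0, "ldr": 7.0, "ldc": 8.0}
loss_rmsd, log = combine_losses(parts, lw=0.5)
loss_full, _ = combine_losses(parts, lw=0.5, active=ALL_TERMS)
```

Passing `active=ALL_TERMS` reproduces the v0.6.3 sum, so the same function could serve both the baseline and the ablation. If the components are autograd tensors, terms left out of `active` never enter the summed loss and so receive no gradient from it.
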
What We Learn
| Comparison | Tests |
|---|---|
| v0.7.0 vs v0.6.3 | Does smaller structure size break stagnation? |
| v0.7.1 vs v0.6.3 | Does removing distance/push-pull losses break stagnation? |
| v0.7.2 vs v0.7.0 & v0.7.1 | Are the two effects independent or synergistic? |
| v0.7.3 vs v0.6.3 | Does removing soft_normalize break stagnation? (tests if GT output normalization constrains gradient flow) |
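
For the v0.7.2 row, "independent or synergistic" can be made concrete as a 2x2 interaction term: the gain from applying both changes minus the sum of the two individual gains over the baseline. The sketch below assumes a single scalar metric where higher is better (lDDT, for instance) measured the same way for all four runs; the function name and the numbers are placeholders, not results.

```python
def interaction(base, size_only, loss_only, both):
    """2x2 factorial breakdown on a higher-is-better metric.

    Returns (effect_size, effect_loss, interaction), where interaction is
    the combined gain minus the sum of the individual gains:
    near 0 -> roughly additive, > 0 -> synergistic, < 0 -> overlapping.
    """
    effect_size = size_only - base   # v0.7.0 vs v0.6.3
    effect_loss = loss_only - base   # v0.7.1 vs v0.6.3
    combined = both - base           # v0.7.2 vs v0.6.3
    return effect_size, effect_loss, combined - (effect_size + effect_loss)


# Placeholder scores purely to show the arithmetic.
d_size, d_loss, d_inter = interaction(base=0.50, size_only=0.55,
                                      loss_only=0.53, both=0.60)
```

With one run per version, a small interaction term cannot be separated from run-to-run noise, so this is best read as a rough indicator unless runs are repeated.
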