title: Pomodoro v0.7.x Ablation Results
tags: [journal, pomodoro, training, results, ablation, v0.7]
created: 2026-05-12
updated: 2026-05-12
status: active
related:


Pomodoro v0.7.x Ablation Results

Version Comparison Table

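| Version | S | L | max_size | Loss | soft_normalize | fragment_length | Key Change |
|---|---|---|---|---|---|---|---|
| v0.6.0 | 32 | 8 | 128 | full | yes | - | Baseline (small S) |
| v0.6.1 | 32 | 8 | 512 | full | yes | - | max_size=512 |
| v0.6.2 | 32 | 32 | 128 | full | yes | - | L=32 (4x deeper) |
| v0.6.3 | 64 | 8 | 128 | full | yes | - | S=64 (wider) |
| v0.7.0 | 64 | 8 | 64 | full | yes | - | max_size=64 |
| v0.7.1 | 64 | 8 | 128 | RMSD only | yes | - | Simplified loss (la+lr+lc) |
| v0.7.2 | 64 | 8 | 64 | RMSD only | yes | - | max_size=64 + simplified loss |
| v0.7.3 | 64 | 8 | 128 | full | disabled | - | soft_normalize removed from GT |
| v0.8.0 | 64 | 8 | 8192 | full | yes | 8 | Fragment cropping + full-size data |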

Full loss = (la + lb + lnb + lr + lc + lda + ldr + ldc) * lw; RMSD only = (la + lr + lc) * lw
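To make the two loss variants concrete, here is a minimal Python sketch of how the terms could be combined. The term names (la, lb, lnb, lr, lc, lda, ldr, ldc) and the weight lw come from the formula above. The dict-based interface, the function name `combine_loss`, and the mode strings are assumptions for illustration only, not the project's actual code.

```python
# Sketch only: term names follow the note; the dict-based interface and the
# mode names are assumptions, not the project's actual API.
FULL_TERMS = ("la", "lb", "lnb", "lr", "lc", "lda", "ldr", "ldc")
RMSD_ONLY_TERMS = ("la", "lr", "lc")


def combine_loss(terms, lw, mode="full"):
    """Sum the loss terms selected by `mode` and scale the sum by `lw`."""
    if mode == "full":
        names = FULL_TERMS
    elif mode == "rmsd_only":
        names = RMSD_ONLY_TERMS
    else:
        raise ValueError(f"unknown loss mode: {mode!r}")
    return sum(terms[name] for name in names) * lw


# Placeholder values purely for illustration (not measured results).
example_terms = {name: 1.0 for name in FULL_TERMS}
full_value = combine_loss(example_terms, lw=0.5, mode="full")
rmsd_value = combine_loss(example_terms, lw=0.5, mode="rmsd_only")
```

With every term set to the same placeholder value, the RMSD-only result is 3/8 of the full result, so raw loss values are not directly comparable between the two variants.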

Observations

  • RMSD-only loss is sufficient: v0.7.2 performs as well as, or slightly better than, equivalent models trained with the full loss.
  • Model width (S) has little effect: increasing S from 32 to 64 (v0.6.0 vs. v0.6.3) did not yield dramatic gains.
  • Model depth (L) may improve performance: v0.6.2 (L=32) is the deepest model in the table, and depth deserves further exploration.
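Observations like these are easiest to trust when the two runs being compared differ in exactly one setting. The sketch below transcribes the hyperparameters from the comparison table and lists such single-factor pairs. The `RunConfig` class, the field names, and the helper functions are hypothetical and exist only for this illustration. An empty fragment_length is represented as `None`.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for one row of the comparison table."""
    S: int
    L: int
    max_size: int
    loss: str
    soft_normalize: bool
    fragment_length: Optional[int] = None


# Values transcribed from the version comparison table.
CONFIGS = {
    "v0.6.0": RunConfig(32, 8, 128, "full", True),
    "v0.6.1": RunConfig(32, 8, 512, "full", True),
    "v0.6.2": RunConfig(32, 32, 128, "full", True),
    "v0.6.3": RunConfig(64, 8, 128, "full", True),
    "v0.7.0": RunConfig(64, 8, 64, "full", True),
    "v0.7.1": RunConfig(64, 8, 128, "rmsd_only", True),
    "v0.7.2": RunConfig(64, 8, 64, "rmsd_only", True),
    "v0.7.3": RunConfig(64, 8, 128, "full", False),
    "v0.8.0": RunConfig(64, 8, 8192, "full", True, 8),
}


def diff(a, b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    ca, cb = asdict(CONFIGS[a]), asdict(CONFIGS[b])
    return {k: (ca[k], cb[k]) for k in ca if ca[k] != cb[k]}


def single_factor_pairs():
    """List (version_a, version_b, field) where exactly one field differs."""
    names = sorted(CONFIGS)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = diff(a, b)
            if len(d) == 1:
                pairs.append((a, b, next(iter(d))))
    return pairs


if __name__ == "__main__":
    for a, b, field in single_factor_pairs():
        print(f"{a} vs {b}: only {field} differs")
```

For example, `diff("v0.7.0", "v0.7.2")` isolates the loss change at max_size=64, and `diff("v0.6.0", "v0.6.3")` isolates the width change.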

Next Steps

  • Train v0.7.2 on more GPUs and for a longer duration to confirm that the simplified loss holds at scale.