title: EDM Noise Distribution Fix — Gradient Bias Analysis
tags: [journal, edm, pomodoro, noise, training, critical]
created: 2026-05-15
updated: 2026-05-15
status: active
related:
- "edm-algorithm-review"
- "edm-pomodoro-algorithm-review"
- "edm-noise-exploration"
- "edm-pomodoro-session"
- "diffusionv2-algorithm"
EDM Noise Distribution Fix — Gradient Bias Analysis
Problem: Exponential Decay of Effective Training Signal
The `compute_noise.mo.py` notebook plotted `effective_gradient = pdf(z) * lw(sigma)`, showing a massive exponential bias toward low-σ training:
- Gradient at z = -5 (σ ≈ 0.1) is ~100× stronger than at z = -1.2 (median σ)
- Gradient at z = 2 (σ ≈ 120) is ~10,000× weaker
This means the model gets virtually all its training gradient from fine local geometry (bond lengths, angles) and almost none from global structure (fold topology, residue rearrangements).
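The plotted quantity can be reproduced roughly with the sketch below. It assumes z = ln(σ / σ_data) and σ_data = 16 Å; neither is stated in this note, although 16 Å is consistent with the median σ values in the parameter table (4.8 Å ≈ 16·exp(-1.2)). The function names are illustrative and are not taken from the notebook.

```python
import math

SIGMA_DATA = 16.0          # assumption: inferred from the median sigma values, not stated here
P_MEAN, P_STD = -1.2, 1.5  # v0.9.0 values


def pdf(z, mean=P_MEAN, std=P_STD):
    """Normal density of z = ln(sigma / sigma_data)."""
    u = (z - mean) / std
    return math.exp(-0.5 * u * u) / (std * math.sqrt(2.0 * math.pi))


def lw(sigma, sigma_data=SIGMA_DATA):
    """EDM loss weight lambda(sigma) = (sigma^2 + sigma_data^2) / (sigma * sigma_data)^2."""
    return (sigma**2 + sigma_data**2) / (sigma * sigma_data) ** 2


def effective_gradient(z):
    """The quantity plotted in the notebook: pdf(z) * lw(sigma)."""
    sigma = SIGMA_DATA * math.exp(z)
    return pdf(z) * lw(sigma)


low, median, high = (effective_gradient(z) for z in (-5.0, -1.2, 2.0))
print(f"z=-5 relative to z=-1.2: {low / median:.0f}x")
print(f"z=-5 relative to z=2:    {low / high:.0f}x")
```

Under these assumptions both ratios land in the same order of magnitude as the ~100× and ~10,000× figures above, if the second figure is read as z = 2 measured against z = -5.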
Key Insight: λ·c_out² = 1 Cancellation
The EDM preconditioning ensures perfect gradient scaling cancellation:
λ(σ) · c_out² = [(σ² + σ_data²) / (σ·σ_data)²] · [σ²·σ_data² / (σ² + σ_data²)] = 1
This means:
- λ(σ) is NOT optional: it belongs to the denoising score matching objective and cancels `c_out²` to give uniform gradient norm per sample
- The `pdf * lw` chart was measuring loss-value density, not gradient scale; the real gradient on F_θ is `pdf(z)` only
- All bias comes from the log-normal `p(z)` distribution, NOT from `lw(σ)`
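A quick numerical check of the cancellation, as a minimal sketch. It assumes σ_data = 16 Å and the standard EDM form of `c_skip`, neither of which is quoted in this note, and it uses a scalar toy example in place of the real network output.

```python
import math

SIGMA_DATA = 16.0  # assumption: not stated in this note


def lam(sigma, sd=SIGMA_DATA):
    """Loss weight lambda(sigma) = (sigma^2 + sd^2) / (sigma * sd)^2."""
    return (sigma**2 + sd**2) / (sigma * sd) ** 2


def c_out(sigma, sd=SIGMA_DATA):
    """Output scaling c_out = sigma * sd / sqrt(sigma^2 + sd^2)."""
    return sigma * sd / math.sqrt(sigma**2 + sd**2)


def c_skip(sigma, sd=SIGMA_DATA):
    """Skip scaling in its standard EDM form (assumed; not quoted in this note)."""
    return sd**2 / (sigma**2 + sd**2)


# lambda(sigma) * c_out(sigma)^2 should be 1 at every noise level.
products = [lam(s) * c_out(s) ** 2 for s in (1e-3, 0.1, 4.8, 80.0)]

# Scalar toy case: clean value x, noisy input y, raw network output f.
x, f, sigma = 1.0, 0.3, 0.1
y = x + sigma * 0.5
weighted = lam(sigma) * (c_skip(sigma) * y + c_out(sigma) * f - x) ** 2
target = (x - c_skip(sigma) * y) / c_out(sigma)  # effective regression target for f
unweighted = (f - target) ** 2

print(products)
print(weighted, unweighted)
```

The weighted loss on the denoiser output equals a plain squared error on the raw output `f`, which is why the per-sample gradient scale does not depend on σ in this toy case.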
Fix Applied: Wider Log-Normal (v0.9.1)
| Parameter | v0.9.0 | v0.9.1 |
|---|---|---|
| P_mean | -1.2 | -1.84 |
| P_std | 1.5 | 2.8 |
| Effective median σ | 4.8 Å | 2.5 Å |
| 95% range σ | 0.25–95 Å | 0.06–110 Å |
P_mean formula: `P_mean = ln(sigma_min / sigma_data) + P_std²`. This choice ensures the low-σ CDF tail still reaches sigma_min.
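The formula can be evaluated directly, as in the sketch below. It assumes σ_data = 16 Å (not stated in this note, but consistent with the median σ values in the table) and uses hypothetical helper names. By construction, `sigma_min` sits `P_std` standard deviations below the mean in z = ln(σ / σ_data), so the mass below it is a small normal tail.

```python
import math


def p_mean_for(sigma_min, sigma_data, p_std):
    """P_mean = ln(sigma_min / sigma_data) + P_std^2, as given above."""
    return math.log(sigma_min / sigma_data) + p_std**2


def normal_cdf(u):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


SIGMA_MIN = 1e-3
SIGMA_DATA = 16.0  # assumption: inferred from the table, not stated in this note
P_STD = 2.8

p_mean = p_mean_for(SIGMA_MIN, SIGMA_DATA, P_STD)
z_min = math.log(SIGMA_MIN / SIGMA_DATA)
tail_mass = normal_cdf((z_min - p_mean) / P_STD)  # fraction of samples below sigma_min
median_sigma = SIGMA_DATA * math.exp(p_mean)      # median of the log-normal in sigma

print(f"P_mean       = {p_mean:.2f}")
print(f"median sigma = {median_sigma:.2f}")
print(f"P(sigma < sigma_min) = {tail_mass:.4f}")
```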
Why not log-uniform?
Log-uniform with λ(σ) intact would create a cubic singularity (1/σ³) at low σ, making the bias worse. Log-uniform only works if λ(σ) is dropped or redesigned. Wider log-normal is simpler and preserves the EDM objective.
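The 1/σ³ behaviour can be checked numerically. This sketch assumes the log-uniform density is taken in σ-space (proportional to 1/σ, left unnormalized) and that σ_data = 16 Å; it measures the log-log slope of density × λ(σ) at low σ.

```python
import math

SIGMA_DATA = 16.0  # assumption: not stated in this note


def lam(sigma, sd=SIGMA_DATA):
    """EDM loss weight lambda(sigma)."""
    return (sigma**2 + sd**2) / (sigma * sd) ** 2


def log_uniform_density(sigma):
    """Unnormalized density of a log-uniform sigma: p(sigma) is proportional to 1/sigma."""
    return 1.0 / sigma


def weighted_density(sigma):
    """Loss-value density under log-uniform sampling with lambda(sigma) kept."""
    return log_uniform_density(sigma) * lam(sigma)


# Slope of log(weighted density) against log(sigma) in the low-sigma regime.
s1, s2 = 1e-3, 1e-2
slope = (math.log(weighted_density(s2)) - math.log(weighted_density(s1))) / (
    math.log(s2) - math.log(s1)
)
print(f"log-log slope at low sigma: {slope:.4f}")
```

A slope close to -3 corresponds to the cubic singularity: one power of σ from the density and two from λ(σ) when σ is much smaller than σ_data.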
What’s NOT changed
- `lambda(σ)` stays exactly as-is: it cancels `c_out²` perfectly
- `sigma_min = 1e-3`, `sigma_max = 80.0`, `rho = 7.0` unchanged
- No code changes in `context.py`, `objectives.py`, or any other module
- Only `config.py` values changed
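For orientation, the values involved could be grouped as in the sketch below. The `NoiseConfig` class and `sample_sigma` helper are hypothetical: the actual layout of `config.py` is not shown in this note, σ_data = 16 Å is an assumption, and the sampler draws from the unclamped log-normal because the note does not say whether training noise is clipped to `sigma_min`/`sigma_max`.

```python
import math
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class NoiseConfig:
    """Hypothetical container; the real config.py layout is not shown in this note."""
    p_mean: float = -1.84     # changed in v0.9.1 (was -1.2)
    p_std: float = 2.8        # changed in v0.9.1 (was 1.5)
    sigma_min: float = 1e-3   # unchanged
    sigma_max: float = 80.0   # unchanged
    rho: float = 7.0          # unchanged
    sigma_data: float = 16.0  # assumption: not stated in this note


def sample_sigma(cfg, rng):
    """Draw one training noise level: sigma = sigma_data * exp(z), z ~ N(p_mean, p_std^2)."""
    z = rng.gauss(cfg.p_mean, cfg.p_std)
    return cfg.sigma_data * math.exp(z)


cfg = NoiseConfig()
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sigmas = [sample_sigma(cfg, rng) for _ in range(100_000)]
median_sigma = statistics.median(sigmas)
print(f"empirical median sigma: {median_sigma:.2f}")
```

The empirical median should sit near the 2.5 Å listed for v0.9.1 in the table.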
Implementation
- Branch: `v0.9.1` (from `v0.9.0` at commit `b627bce`)
- Worktree: `models/pomodoro/workspace/v0.9.1/`
- Commit: `v0.9.1: wider noise sampling (P_mean=-1.84, P_std=2.8)`
- Remote: pushed to `origin/v0.9.1`
- Main branch also updated with same config values