title: Platform Spec
tags: [inventory, spec, platform]
created: 2026-05-22
updated: 2026-05-22
status: active
related:
Platform Spec
End-to-end contracts and cross-component workflows. Per-component specs live in inventory/specs/{component}.md.
Mission
Protein-based molecule design for therapeutic applications. Data flow: datasets -> models -> endpoints -> workflows -> website.
Three Pillars
| Pillar | Domain | Models |
|---|---|---|
| Pesto | Interface & Interaction Analysis | pesto, pesto-screen |
| Carbonara | Molecule Design | carbonara, carbonara-binders, carbonara-clip |
| Pomodoro | Structure Generation | pomodoro, oracle, backmap |
Must Do
- Provide deployable Docker endpoints for each production model (handler + runtime + schema)
- Support job lifecycle: QUEUED -> RUNNING -> COMPLETED/FAILED via SDK client
- Separate code (git) from data (R2/S3) — never commit weights, datasets, or large binaries
- Track data artifacts via pantry (.datapattern + artifacts.json + pty add/push/pull/resolve)
- Follow handler pattern: validate -> hydrate -> RUNNING -> process -> dehydrate -> COMPLETED
- Register every endpoint’s job type in job_type_registry
- Include pid field in every job (required by API)
- Support backend strategy pattern: RunpodBackend (prod), LocalBackend (Docker test), NullBackend (unit test)
- Provide SDK client (LemnaJobClient) as the single interface for all job operations
- Validate endpoints via test_endpoint.py with MODE toggle (local/runpod)
- Build Docker images with SSH agent forwarding for private git deps
- Use cache-then-copy pattern for large model downloads in Dockerfiles
- Keep workspace/ directories gitignored for experimentation; migrate mature code out
- Maintain master justfile at repo root; per-repo justfiles for local commands
- Use ConfigRuntime instance (not class) for endpoint model config
- Use pantry.resolve() for reading data paths in endpoints (not kw.find_root())
- Use Path("data")/… for writing data in endpoints
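
The handler pattern and job lifecycle listed above can be sketched as follows. This is a minimal illustration, not the platform's handler: the `Job` dataclass, `set_status`, and `handle` are hypothetical names, and the hydrate and dehydrate steps are reduced to in-memory copies. Only the step order, the status names, and the required `pid` field come from this spec.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


@dataclass
class Job:
    """Hypothetical job record; only `pid` and the status names come from the spec."""

    pid: str
    input_data: dict[str, Any]
    status: JobStatus = JobStatus.QUEUED
    output_data: dict[str, Any] = field(default_factory=dict)
    history: list[JobStatus] = field(default_factory=list)

    def set_status(self, status: JobStatus) -> None:
        # Record every transition so the lifecycle can be inspected afterwards.
        self.status = status
        self.history.append(status)


def handle(job: Job, process: Callable[[dict[str, Any]], dict[str, Any]]) -> Job:
    """validate -> hydrate -> RUNNING -> process -> dehydrate -> COMPLETED."""
    try:
        # validate: reject jobs without the required pid field
        if not job.pid:
            raise ValueError("pid is required")
        # hydrate: a real handler would fetch input files here (assumption)
        inputs = dict(job.input_data)
        job.set_status(JobStatus.RUNNING)
        # process: run the model-specific work
        result = process(inputs)
        # dehydrate: a real handler would upload outputs here (assumption)
        job.output_data = dict(result)
        job.set_status(JobStatus.COMPLETED)
    except Exception as exc:
        # Any failure in any step ends the job in FAILED.
        job.output_data = {"error": str(exc)}
        job.set_status(JobStatus.FAILED)
    return job
```

Calling `handle(Job(pid="p1", input_data={"x": 2}), lambda d: {"y": d["x"] * 2})` walks the job through RUNNING to COMPLETED, while a job with an empty `pid` goes straight to FAILED.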
Must Not Do
- Run git submodule update --init --recursive
- Modify .gitmodules
- Store model weights, datasets, or large binaries in git
- Import from legacy kitchenware paths (core, utils.dev)
- Use .ipynb notebooks (use .mo.py compute-visualize split instead)
- Hardcode data paths (use pantry.resolve or ConfigRuntime)
- Double-wrap RunPod input (RunpodBackend.submit already applies the {"input": …} wrapper, intentionally)
- Use Optional[BaseModel] on server-side FastAPI models (use Optional[Any] for JobBaseAnonymous)
- Build Docker images without ssh-keyscan github.com in known_hosts
- Use --no-cache-dir with uv when cache mounts are active
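
To illustrate the double-wrapping rule together with the backend strategy pattern, here is a small sketch. The classes `SketchRunpodBackend` and `SketchNullBackend`, the `Backend` protocol, and `submit_job` are hypothetical stand-ins, not the real backends. The sketch assumes only what the spec states: the `{"input": …}` envelope is applied inside the RunPod backend's `submit`, so callers pass the raw payload.

```python
from typing import Any, Protocol


class Backend(Protocol):
    """Strategy interface (assumed shape): every backend exposes submit()."""

    def submit(self, payload: dict[str, Any]) -> dict[str, Any]: ...


class SketchRunpodBackend:
    """Hypothetical stand-in: applies the {"input": ...} envelope exactly once."""

    def submit(self, payload: dict[str, Any]) -> dict[str, Any]:
        # The single place where the envelope is added.
        return {"input": payload}


class SketchNullBackend:
    """Hypothetical unit-test stand-in: records payloads and sends nothing."""

    def __init__(self) -> None:
        self.submitted: list[dict[str, Any]] = []

    def submit(self, payload: dict[str, Any]) -> dict[str, Any]:
        self.submitted.append(payload)
        return payload


def submit_job(backend: Backend, payload: dict[str, Any]) -> dict[str, Any]:
    # Callers hand over the raw payload; wrapping is the backend's job.
    return backend.submit(payload)
```

Passing `{"input": payload}` to `submit_job` with the RunPod-style backend would produce `{"input": {"input": …}}`, which is the mistake the rule guards against. Swapping in the null backend leaves the calling code unchanged, which is the point of the strategy pattern.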
End-to-End Workflows
Binder Design Pipeline
pesto -> rfdiffusion -> carbonara-binders -> boltz -> openmm
- pesto: Analyze protein interface, identify binding site
- rfdiffusion: Generate backbone scaffold for target site
- carbonara-binders: Design binder sequences on scaffold
- boltz: Validate structure with structure prediction
- openmm: Molecular dynamics refinement
Workflow Contracts
- Each step receives output from previous step as BinaryFile input
- Each step uses its own SDK bucket namespace
- Steps are chained via job output_data -> next job input_data
- Pipeline status tracked via job status (QUEUED/RUNNING/COMPLETED/FAILED)
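
The chaining contract above can be sketched as a loop that feeds each job's output_data into the next job's input_data and stops on the first non-COMPLETED status. Everything here is illustrative: `run_pipeline`, the `Runner` signature, and `fake_run_step` are hypothetical, job records are plain dictionaries, and BinaryFile inputs and per-step bucket namespaces are not modeled. A real implementation would presumably submit each step through the SDK client.

```python
from typing import Any, Callable

# Step order taken from the Binder Design Pipeline.
PIPELINE = ["pesto", "rfdiffusion", "carbonara-binders", "boltz", "openmm"]

# Hypothetical runner signature: (step name, input_data) -> job record.
Runner = Callable[[str, dict[str, Any]], dict[str, Any]]


def run_pipeline(run_step: Runner, first_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Chain steps: each job's output_data becomes the next job's input_data."""
    records: list[dict[str, Any]] = []
    input_data = first_input
    for step in PIPELINE:
        record = run_step(step, input_data)
        records.append(record)
        if record["status"] != "COMPLETED":
            # Pipeline status follows job status: stop at the first failure.
            break
        input_data = record["output_data"]
    return records


def fake_run_step(step: str, input_data: dict[str, Any]) -> dict[str, Any]:
    """Stand-in runner that only records which steps the data passed through."""
    trail = input_data.get("trail", []) + [step]
    return {"step": step, "status": "COMPLETED", "output_data": {"trail": trail}}
```

With `fake_run_step`, `run_pipeline(fake_run_step, {"pid": "p1"})` returns five records, and the last record's output_data carries the full step trail in pipeline order.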
Data Contracts
| Layer | Storage | Tracking |
|---|---|---|
| Code | git repos | version control |
| Model weights | R2/S3 (outputs/vX.Y.Z/) | pantry (.datapattern + artifacts.json) |
| Training datasets | R2/S3 (datasets/) | pantry |
| Model data (per-repo) | R2/S3 (data/) | pantry |
| Results | R2/S3 | pantry |
Dependency Chain
kitchenware -> models -> endpoints -> workflows -> website
Changes to kitchenware cascade to everything. Test the full chain when modifying it.
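
As a rough sketch of what "test the full chain" implies, the helper below lists every component downstream of a change, assuming the chain is strictly linear as written above. `CHAIN` and `affected_by` are hypothetical names introduced for illustration.

```python
# Dependency order as stated in this spec (upstream first).
CHAIN = ["kitchenware", "models", "endpoints", "workflows", "website"]


def affected_by(component: str) -> list[str]:
    """Return the changed component plus everything downstream of it."""
    idx = CHAIN.index(component)  # raises ValueError for unknown components
    return CHAIN[idx:]
```

A change to kitchenware therefore touches all five layers, while a change to workflows affects only workflows and website.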
Quality Gates (Universal)
- just format && just check passes on every repo
- No unused imports
- Type hints on function signatures
- pathlib.Path for all file paths
- Tests exist for components with side effects