leadforge-dev · shaypal5 · Jun 10, 2026
diff --git a/.agent-plan.md b/.agent-plan.md
@@ -41,10 +41,11 @@ early against the known-good lead-scoring path + physical reorg into
 
 Status: `LTV-M0` landed (#102, #103, #106). `LTV-M1`: `LTV-Pb` merged (#104);
 `LTV-Pc` (pLTV feature/task specs) still outstanding. `LTV-M2`: `LTV-Pd`
-(`GenerationScheme` protocol + registry) opened as **#107** (awaiting review,
-verified byte-identical). Next in M2: `LTV-Pe` (physically move lead-scoring
-pipeline into `schemes/lead_scoring/`), then `LTV-Pf` (scaffold
-`schemes/lifecycle/`).
+(scheme protocol + registry) merged (#107); `LTV-Pe` (scheme owns bundle
+rendering — second half of the seam) opened as **#108** (awaiting review,
+verified byte-identical). M2 reordered so the render seam precedes the physical
+move. Next in M2: `LTV-Pf` (physically move lead-scoring pipeline into
+`schemes/lead_scoring/`), then `LTV-Pg` (scaffold `schemes/lifecycle/`).
 
 ---
 

diff --git a/docs/ltv/roadmap.md b/docs/ltv/roadmap.md
@@ -42,15 +42,15 @@ protocol + registry, with the package physically reorganized into
 |-----------|------------|-----|------------|
 | `LTV-M0` | Planning + design lock | `LTV-Pa` | #102, #103 (+ scheme reframe) |
 | `LTV-M1` | Lifecycle schema foundation | `LTV-Pb`, `LTV-Pc` | #104 (Pb) |
-| `LTV-M2` | Generation-scheme architecture + physical reorg | `LTV-Pd`, `LTV-Pe`, `LTV-Pf` | |
-| `LTV-M3` | Customer population + lifecycle world | `LTV-Pg`, `LTV-Ph` | |
-| `LTV-M4` | Lifecycle simulation engine | `LTV-Pi`, `LTV-Pj` | |
-| `LTV-M5` | Customer snapshots + pLTV targets (both regimes) | `LTV-Pk`, `LTV-Pl` | |
-| `LTV-M6` | Register LifecycleScheme + recipe + manifest/version | `LTV-Pm`, `LTV-Pn` | |
-| `LTV-M7` | Validation + regression-metric calibration | `LTV-Po` | |
-| `LTV-M8` | CLI, notebooks, publish | `LTV-Pp`, `LTV-Pq`, `LTV-Pr` | |
+| `LTV-M2` | Generation-scheme architecture + physical reorg | `LTV-Pd`, `LTV-Pe`, `LTV-Pf`, `LTV-Pg` | #107 (Pd), #108 (Pe) |
+| `LTV-M3` | Customer population + lifecycle world | `LTV-Ph`, `LTV-Pi` | |
+| `LTV-M4` | Lifecycle simulation engine | `LTV-Pj`, `LTV-Pk` | |
+| `LTV-M5` | Customer snapshots + pLTV targets (both regimes) | `LTV-Pl`, `LTV-Pm` | |
+| `LTV-M6` | Register LifecycleScheme + recipe + manifest/version | `LTV-Pn`, `LTV-Po` | |
+| `LTV-M7` | Validation + regression-metric calibration | `LTV-Pp` | |
+| `LTV-M8` | CLI, notebooks, publish | `LTV-Pq`, `LTV-Pr`, `LTV-Ps` | |
 
-Total: ~18 PRs across 9 milestones.
+Total: ~19 PRs across 9 milestones.
 
 ---
 
@@ -96,16 +96,33 @@ Total: ~18 PRs across 9 milestones.
   byte-identical (all 14 files of a pinned-timestamp bundle hash identically,
   main vs branch).
   - Labels: `type: refactor`, `layer: api`, `layer: core`
-- [ ] **`LTV-Pe`** — `refactor: move lead-scoring pipeline to schemes/lead_scoring/`.
-  Physically relocate the lead-scoring population/engine/state/mechanisms/
-  structure/snapshot/relational/task modules + its entity/feature/task specs
-  under `schemes/lead_scoring/`; leave shared primitives in `schema/`,
-  `render/` envelope, etc. Add back-compat import shims where `scripts/` or the
-  sibling datasets repo reference internal paths.
+- [x] **`LTV-Pe`** — `refactor(render): scheme owns bundle rendering` (**PR #108**). Complete
+  the **second half** of the seam against the known-good lead-scoring path:
+  add `write_bundle` to the `GenerationScheme` protocol; move the
+  `api/bundle.py` orchestration body into `LeadScoringScheme.write_bundle`
+  (reusing the already-modular shared helpers — `build_manifest`,
+  `apply_exposure`, `get_filter`); `api/bundle.py::write_bundle` becomes a thin
+  dispatcher on `bundle.spec.scheme`, so `WorldBundle.save()` delegates to the
+  producing scheme. Also harden scheme registration so resolution no longer
+  depends on import order (the side-effect-registration footgun). Verified
+  byte-identical. **Sequenced before the physical move** so the file move
+  relocates a *complete* (both-halves) scheme and `bundle.py`'s call sites
+  change only once. (Reorder rationale: the render path is where schemes
+  diverge most; design it against lead-scoring with byte-identity as the oracle
+  before building lifecycle.)
+  - Tests: render dispatch, determinism through `save()`, unknown-scheme on
+    `save`, base-direct resolution (footgun guard), full suite green.
+  - Labels: `type: refactor`, `layer: render`, `layer: api`
+- [ ] **`LTV-Pf`** — `refactor: move lead-scoring pipeline to schemes/lead_scoring/`.
+  Physically relocate the (now fully scheme-owned) lead-scoring population/
+  engine/state/mechanisms/structure/snapshot/relational/task modules + its
+  entity/feature/task specs under `schemes/lead_scoring/`; leave shared
+  primitives in `schema/`, `render/` envelope, etc. Add back-compat import
+  shims where `scripts/` or the sibling datasets repo reference internal paths.
   - Tests: full suite + hash-determinism green; public API imports unchanged;
     shim coverage.
   - Labels: `type: refactor`, `layer: schema`, `layer: simulation`, `layer: render`
-- [ ] **`LTV-Pf`** — `refactor: scaffold schemes/lifecycle/ + relocate LTV-Pb/Pc specs`.
+- [ ] **`LTV-Pg`** — `refactor: scaffold schemes/lifecycle/ + relocate LTV-Pb/Pc specs`.
   Create `schemes/lifecycle/`; move the lifecycle entity rows (from #104) and
   the `LTV-Pc` feature/task specs into it; register a stub `LifecycleScheme`
   (pipeline methods raise `NotImplementedError` until M3–M6). Split any
@@ -119,13 +136,13 @@ Total: ~18 PRs across 9 milestones.
 
 > Built directly under `schemes/lifecycle/`.
 
-- [ ] **`LTV-Pg`** — `feat(lifecycle): customer population builder`. Customer
+- [ ] **`LTV-Ph`** — `feat(lifecycle): customer population builder`. Customer
   entities, 5 new latent traits, **staggered start dates** ending at the
   absolute `observation_date` (D4); seam for future chained generation (D3).
   - Tests: determinism, latent distributions, staggered-start spread, FK
     integrity, acquisition-window boundary.
   - Labels: `type: feature`, `layer: simulation`
-- [ ] **`LTV-Ph`** — `feat(lifecycle): motif families + mechanism policies`. 5
+- [ ] **`LTV-Pi`** — `feat(lifecycle): motif families + mechanism policies`. 5
   retention motif families; `assign_lifecycle_mechanisms()` mapping motif →
   churn/expansion/payment params.
   - Tests: per-motif param tables, dispatch, determinism.
@@ -135,13 +152,13 @@ Total: ~18 PRs across 9 milestones.
 
 ## `LTV-M4` — Lifecycle simulation engine
 
-- [ ] **`LTV-Pi`** — `feat(lifecycle): churn / expansion / payment hazards`.
+- [ ] **`LTV-Pj`** — `feat(lifecycle): churn / expansion / payment hazards`.
   Weibull churn hazard with renewal-date spike, expansion propensity (the
   heavy-tail generator for pLTV), payment failure + dunning.
   - Tests: hazard shape over tenure, renewal spike, dunning escalation,
     expansion MRR-delta bounds.
   - Labels: `type: feature`, `layer: mechanisms`
-- [ ] **`LTV-Pj`** — `feat(lifecycle): weekly simulation engine`.
+- [ ] **`LTV-Pk`** — `feat(lifecycle): weekly simulation engine`.
   `simulate_lifecycle()`: weekly loop per customer through `observation_date +
   730d (+ early-regime buffer)` so all three windows are fully simulated (D6);
   emits `subscription_events`, `health_signals`, `invoices`; updates terminal
@@ -154,15 +171,15 @@ Total: ~18 PRs across 9 milestones.
 
 ## `LTV-M5` — Customer snapshots + pLTV targets (both regimes)
 
-- [ ] **`LTV-Pk`** — `feat(lifecycle): calendar-anchored customer snapshot`.
+- [ ] **`LTV-Pl`** — `feat(lifecycle): calendar-anchored customer snapshot`.
   `build_customer_snapshot(cutoff=observation_date)`: last-12-week health
   aggregates; `mrr_change_at_snapshot` (valid) + `mrr_change_full_period`
   (trap); the three `ltv_revenue_{90,365,730}d` gross-revenue targets +
   `churned_within_180d`; difficulty distortions.
   - Tests: no post-cutoff data in windowed columns; ZILN target shape; trap
     invariant; target derivation; trap exempt from distortion.
   - Labels: `type: feature`, `layer: render`
-- [ ] **`LTV-Pl`** — `feat(lifecycle): early-pLTV (tenure-anchored) task family`.
+- [ ] **`LTV-Pm`** — `feat(lifecycle): early-pLTV (tenure-anchored) task family`.
   Reuse the snapshot builder with a per-customer relative cutoff
   (`customer_start + early_tenure_weeks`) to emit the cold-start snapshot +
   recomputed targets (D8); separate task directory.
@@ -174,7 +191,7 @@ Total: ~18 PRs across 9 milestones.
 
 ## `LTV-M6` — Register LifecycleScheme + recipe + manifest/version
 
-- [ ] **`LTV-Pm`** — `feat(lifecycle): complete LifecycleScheme + manifest/version`.
+- [ ] **`LTV-Pn`** — `feat(lifecycle): complete LifecycleScheme + manifest/version`.
   Fill in the `LifecycleScheme` pipeline methods (population→sim→render→tasks);
   add `n_customers` + lifecycle config (windows, early-tenure, observation
   anchor) to `GenerationConfig`; record `generation_scheme` + `observation_date`
@@ -184,7 +201,7 @@ Total: ~18 PRs across 9 milestones.
   - Tests: dispatch, lead-scoring path unaffected, manifest fields, regression
     split writer, exposure filtering for new tables.
   - Labels: `type: feature`, `layer: api`, `layer: render`
-- [ ] **`LTV-Pn`** — `feat(recipes): b2b_saas_ltv_v1 recipe assets`. The three
+- [ ] **`LTV-Po`** — `feat(recipes): b2b_saas_ltv_v1 recipe assets`. The three
   recipe YAMLs (`scheme: lifecycle`); register in the recipe registry;
   end-to-end `Generator.from_recipe("b2b_saas_ltv_v1").generate()` smoke test.
   - Tests: recipe loads, full round-trip, determinism, all task splits (3
@@ -195,7 +212,7 @@ Total: ~18 PRs across 9 milestones.
 
 ## `LTV-M7` — Validation + regression-metric calibration
 
-- [ ] **`LTV-Po`** — `feat(validation): lifecycle leakage probes + pLTV metric bands`.
+- [ ] **`LTV-Pp`** — `feat(validation): lifecycle leakage probes + pLTV metric bands`.
   Scheme-aware leakage probes (cutoff window check; banned terminal
   columns/tables; banned forward-window target columns); regression evaluation
   (Spearman, normalized Gini, decile calibration, total-pred-vs-actual, value
@@ -208,15 +225,15 @@ Total: ~18 PRs across 9 milestones.
 
 ## `LTV-M8` — CLI, notebooks, publish
 
-- [ ] **`LTV-Pp`** — `feat(cli): lifecycle generate flags + scheme-aware inspect`.
+- [ ] **`LTV-Pq`** — `feat(cli): lifecycle generate flags + scheme-aware inspect`.
   `--n-customers`, observation/early-tenure flags; `inspect` dispatches on the
   bundle's `generation_scheme`.
   - Labels: `type: feature`, `layer: cli`
-- [ ] **`LTV-Pq`** — `docs(notebooks): pLTV teaching sequence`. ZILN-vs-MSE
+- [ ] **`LTV-Pr`** — `docs(notebooks): pLTV teaching sequence`. ZILN-vs-MSE
   baseline; discrimination/calibration metrics; the `mrr_change_full_period`
   leakage demo; early/cold-start pLTV; value-aware ranking; right-censoring note.
   - Labels: `type: docs`, `layer: render`
-- [ ] **`LTV-Pr`** — `feat(release): package + publish b2b_saas_ltv_v1`. Kaggle
+- [ ] **`LTV-Ps`** — `feat(release): package + publish b2b_saas_ltv_v1`. Kaggle
   + HF packaging (reuse Phase-5 packagers, scheme-aware), LLM critique, dataset
   card, release notes, tag. Publishes under the live `leadforge` Kaggle org.
   - Labels: `type: feature`, `layer: validation`

diff --git a/leadforge/api/bundle.py b/leadforge/api/bundle.py
@@ -1,34 +1,26 @@
-"""Bundle writer — assembles and serialises the full output bundle.
-
-:func:`write_bundle` is called by :meth:`WorldBundle.save` and orchestrates
-all rendering steps:
-
-1. Project the relational dict (snapshot-safe for ``student_public``,
-   full-horizon for ``research_instructor``) and write ``tables/``.
-2. Build the lead snapshot and write task splits (``tasks/``).
-3. Write ``dataset_card.md`` and ``feature_dictionary.csv``.
-4. Apply exposure filtering — write ``metadata/`` for ``research_instructor``
-   mode; skip it for ``student_public``.
-5. Build and write ``manifest.json``.
+"""Bundle writer — dispatches serialisation to the producing generation scheme.
+
+:func:`write_bundle` is called by :meth:`WorldBundle.save`.  It resolves the
+bundle's generation scheme (``bundle.spec.scheme``) and delegates to that
+scheme's :meth:`~leadforge.schemes.base.GenerationScheme.write_bundle`, which
+owns the bundle's on-disk shape end to end (relational tables, task splits,
+dataset card, feature dictionary, exposure metadata, manifest).
+
+Scope note: each scheme currently orchestrates its *own* write sequence; only
+the scheme-agnostic relational-table write is shared today
+(:func:`leadforge.render.relational.write_relational_tables`).  A shared bundle
+orchestrator with scheme render hooks is deferred to ``LTV-M6`` — it depends on
+generalising ``build_manifest`` and ``apply_exposure``, which are still
+lead-scoring-coupled (see ``docs/ltv/roadmap.md``).
+
+This thin module preserves the ``write_bundle(bundle, path)`` entry point.
 """
 
 from __future__ import annotations
 
-from pathlib import Path
 from typing import TYPE_CHECKING
 
-from leadforge.exposure.filters import get_filter
-from leadforge.exposure.modes import apply_exposure
-from leadforge.narrative.dataset_card import render_dataset_card
-from leadforge.render.manifests import build_manifest, write_manifest
-from leadforge.render.relational import to_dataframes
-from leadforge.render.relational_snapshot_safe import to_dataframes_snapshot_safe
-from leadforge.render.snapshots import build_snapshot
-from leadforge.render.tasks import write_task_splits
-from leadforge.schema.dictionaries import write_feature_dictionary
-from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, redacted_columns_for
-from leadforge.schema.tables import write_parquet
-from leadforge.schema.tasks import task_manifest_for_config
+from leadforge.schemes import get_scheme
 
 if TYPE_CHECKING:
     from leadforge.core.models import WorldBundle
@@ -39,7 +31,7 @@ def write_bundle(
     path: str,
     generation_timestamp: str | None = None,
 ) -> None:
-    """Write *bundle* to disk at *path*.
+    """Write *bundle* to disk at *path* via its generation scheme.
 
     Args:
         bundle: Fully populated :class:`~leadforge.core.models.WorldBundle`.
@@ -48,119 +40,9 @@ def write_bundle(
             Pass a fixed value to produce byte-identical manifests.
 
     Raises:
-        RuntimeError: if any of ``bundle.simulation_result``,
-            ``bundle.population``, or ``bundle.world_graph`` are ``None``.
+        UnknownSchemeError: if ``bundle.spec.scheme`` is not registered.
+        RuntimeError: if the bundle is not fully populated (raised by the
+            scheme's ``write_bundle``).
     """
-    if bundle.simulation_result is None or bundle.population is None or bundle.world_graph is None:
-        raise RuntimeError("WorldBundle is not fully populated. Call Generator.generate() first.")
-
-    root = Path(path)
-    root.mkdir(parents=True, exist_ok=True)
-
-    config = bundle.spec.config
-    result = bundle.simulation_result
-    population = bundle.population
-    world_graph = bundle.world_graph
-
-    # The redaction set comes from the canonical feature spec — the same
-    # source of truth the validator uses.  It is applied uniformly to
-    # every published parquet file (relational tables AND task splits) so
-    # users doing feature engineering off the raw tables (per the
-    # README's "Option 3") cannot trivially reintroduce a redacted
-    # column by joining ``tables/leads.parquet`` to their feature set.
-    redacted = redacted_columns_for(config.exposure_mode)
-    bundle_filter = get_filter(config.exposure_mode)
-
-    # ------------------------------------------------------------------
-    # 1. Relational tables → tables/
-    #
-    # For ``student_public`` (``relational_snapshot_safe = True``) we
-    # project the full-horizon dict onto the snapshot-safe shape:
-    # ``BANNED_LEAD_COLUMNS`` / ``BANNED_OPP_COLUMNS`` are dropped, event
-    # tables are filtered per-lead to ``lead_created_at + snapshot_day``,
-    # and ``BANNED_TABLES`` (``customers`` / ``subscriptions``) are
-    # omitted entirely.  The feature-level redaction below still applies
-    # on top — the two policies operate on disjoint columns
-    # (snapshot-safe owns the structural reconstruction surface;
-    # ``redacted_columns_for`` owns near-deterministic snapshot
-    # features), so they neither double-emit nor overlap.
-    # ------------------------------------------------------------------
-    tables_dir = root / "tables"
-    tables_dir.mkdir(exist_ok=True)
-
-    dfs = to_dataframes(result, population)
-    if bundle_filter.relational_snapshot_safe:
-        if config.snapshot_day is None:
-            raise ValueError(
-                f"exposure_mode={config.exposure_mode.value!r} requires "
-                "config.snapshot_day to be set (the snapshot-safe relational "
-                "export filters event tables to lead_created_at + snapshot_day); "
-                "got snapshot_day=None.  Pin a snapshot_day on the recipe or "
-                "pass it explicitly."
-            )
-        dfs = to_dataframes_snapshot_safe(dfs, snapshot_day=config.snapshot_day)
-    table_row_counts: dict[str, int] = {}
-    for table_name, df in dfs.items():
-        if redacted:
-            cols_to_drop = [c for c in redacted if c in df.columns]
-            if cols_to_drop:
-                df = df.drop(columns=cols_to_drop)
-        write_parquet(df, tables_dir / f"{table_name}.parquet")
-        table_row_counts[table_name] = len(df)
-
-    # ------------------------------------------------------------------
-    # 2. Snapshot + task splits → tasks/
-    #
-    # Same redaction rule applied to the snapshot DataFrame before the
-    # task splits are written, so manifest SHA-256 hashes reflect the
-    # published column set without a post-write rewrite step.
-    # ------------------------------------------------------------------
-    snapshot = build_snapshot(
-        result,
-        population,
-        horizon_days=config.horizon_days,
-        snapshot_day=config.snapshot_day,
-        difficulty_params=config.difficulty_params,
-        seed=config.seed,
-    )
-    if redacted:
-        drop_cols = [c for c in redacted if c in snapshot.columns]
-        if drop_cols:
-            snapshot = snapshot.drop(columns=drop_cols)
-    visible_features = tuple(f for f in LEAD_SNAPSHOT_FEATURES if f.name not in redacted)
-
-    task = task_manifest_for_config(config.primary_task, config.label_window_days)
-    task_row_counts = write_task_splits(snapshot, root / "tasks", seed=config.seed, task=task)
-
-    # ------------------------------------------------------------------
-    # 3. Dataset card and feature dictionary
-    # ------------------------------------------------------------------
-    (root / "dataset_card.md").write_text(
-        render_dataset_card(
-            bundle.spec,
-            task_manifest=task,
-            table_counts=table_row_counts,
-            features=visible_features,
-        )
-    )
-    write_feature_dictionary(root / "feature_dictionary.csv", features=visible_features)
-
-    # ------------------------------------------------------------------
-    # 4. Exposure metadata (research_instructor only)
-    # ------------------------------------------------------------------
-    apply_exposure(bundle, root, config.exposure_mode)
-
-    # ------------------------------------------------------------------
-    # 5. Manifest
-    # ------------------------------------------------------------------
-    manifest = build_manifest(
-        config=config,
-        world_graph=world_graph,
-        table_row_counts=table_row_counts,
-        task_row_counts={task.task_id: task_row_counts},
-        bundle_root=root,
-        generation_timestamp=generation_timestamp,
-        redacted_columns=sorted(redacted),
-        relational_snapshot_safe=bundle_filter.relational_snapshot_safe,
-    )
-    write_manifest(manifest, root)
+    scheme = get_scheme(bundle.spec.scheme)
+    scheme.write_bundle(bundle, path, generation_timestamp=generation_timestamp)
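
The dispatch described in the new `api/bundle.py` docstring can be sketched in isolation. This is a minimal, self-contained illustration and not the code in this PR: `Spec`, `Bundle`, `register_scheme`, and `_ensure_builtins` are hypothetical stand-ins, the stub `LeadScoringScheme` only records its calls, and the lazy built-in loading is just one assumed way to make resolution independent of import order (the diff does not show how the real registry is hardened).

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class UnknownSchemeError(KeyError):
    """Raised when a scheme name has no registered implementation."""


@dataclass
class Spec:  # hypothetical stand-in for the real spec object
    scheme: str


@dataclass
class Bundle:  # hypothetical stand-in for WorldBundle
    spec: Spec


class GenerationScheme(Protocol):
    """Only the render half of the seam is modelled here."""

    name: str

    def write_bundle(
        self, bundle: Bundle, path: str, generation_timestamp: Optional[str] = None
    ) -> None: ...


class LeadScoringScheme:
    """Stub scheme: records calls instead of writing files."""

    name = "lead_scoring"

    def __init__(self) -> None:
        self.calls = []

    def write_bundle(self, bundle, path, generation_timestamp=None):
        self.calls.append((path, generation_timestamp))


_REGISTRY: dict = {}
_BUILTINS_LOADED = False


def register_scheme(scheme: GenerationScheme) -> None:
    _REGISTRY[scheme.name] = scheme


def _ensure_builtins() -> None:
    # Assumption: built-ins are registered lazily on first lookup, so
    # resolution never depends on which module happened to be imported first.
    global _BUILTINS_LOADED
    if not _BUILTINS_LOADED:
        _BUILTINS_LOADED = True
        register_scheme(LeadScoringScheme())


def get_scheme(name: str) -> GenerationScheme:
    _ensure_builtins()
    try:
        return _REGISTRY[name]
    except KeyError:
        raise UnknownSchemeError(name) from None


def write_bundle(bundle: Bundle, path: str, generation_timestamp: Optional[str] = None) -> None:
    """Thin dispatcher: resolve the producing scheme, then delegate."""
    scheme = get_scheme(bundle.spec.scheme)
    scheme.write_bundle(bundle, path, generation_timestamp=generation_timestamp)
```

In this sketch, calling `write_bundle(Bundle(Spec("lead_scoring")), "out")` reaches the stub scheme, while an unregistered scheme name raises `UnknownSchemeError` before anything is written, which mirrors the "unknown-scheme on `save`" test listed for `LTV-Pe`.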