13 changes: 7 additions & 6 deletions .agent-plan.md
@@ -40,12 +40,13 @@ early against the known-good lead-scoring path + physical reorg into
`schemes/`**. (Framing follows Google `lifetime_value`/ZILN and Voyantis pLTV.)

Status: `LTV-M0` landed (#102, #103, #106). `LTV-M1`: `LTV-Pb` merged (#104);
`LTV-Pc` (pLTV feature/task specs) still outstanding. `LTV-M2`: `LTV-Pd`
(scheme protocol + registry) merged (#107); `LTV-Pe` (scheme owns bundle
rendering — second half of the seam) opened as **#108** (awaiting review,
verified byte-identical). M2 reordered so the render seam precedes the physical
move. Next in M2: `LTV-Pf` (physically move lead-scoring pipeline into
`schemes/lead_scoring/`), then `LTV-Pg` (scaffold `schemes/lifecycle/`).
`LTV-Pc` (pLTV feature/task specs) still outstanding. `LTV-M2`: `LTV-Pd` (#107)
and `LTV-Pe` (#108) merged (scheme protocol + render seam). `LTV-Pf` (physical
move, **hard break / no shims** per D12) split into Pf.1 (compute core —
simulation/mechanisms/structure moved) opened as **#109**, and Pf.2 (render
move, pending). Verified byte-identical. Sibling `leadforge-datasets-private`
build scripts must update to the new import paths (breakage issue filed). Next:
`LTV-Pf.2` (render), then `LTV-Pg` (scaffold `schemes/lifecycle/`).

---

20 changes: 20 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,26 @@ Format inspired by [Keep a Changelog](https://keepachangelog.com/).

## Unreleased

### Moved — lead-scoring internals under `schemes/lead_scoring/` (breaking: internal import paths)

Part of the peer-generation-schemes architecture (`docs/ltv/design.md` §2.5).
The lead-scoring compute core was physically relocated under the new
`leadforge.schemes.lead_scoring` package. **The documented public API
(`leadforge.api`, the CLI) is unchanged**, and generated bundles are
byte-identical; only direct imports of these *internal* modules break (no
back-compat shims, by design):

| old import path | new import path |
|---|---|
| `leadforge.simulation.*` | `leadforge.schemes.lead_scoring.simulation.*` |
| `leadforge.mechanisms.*` | `leadforge.schemes.lead_scoring.mechanisms.*` |
| `leadforge.structure.*` | `leadforge.schemes.lead_scoring.structure.*` |

`render/{snapshots,relational,tasks}` and the lead-scoring `schema` specs
relocate in follow-up PRs. Consumers importing internals (e.g. the
`leadforge-datasets-private` build scripts) must update to the new paths;
the package stays on the `1.x` line (the public contract did not change).
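For consumers updating internal imports, the table above amounts to a prefix rewrite. A minimal sketch, assuming a hypothetical helper named `rewrite_import_path` (not part of leadforge) and covering only the three mappings listed:

```python
# Hypothetical migration helper; not part of leadforge.
# The mapping mirrors the "old import path / new import path" table above.
MOVED_PACKAGES = {
    "leadforge.simulation": "leadforge.schemes.lead_scoring.simulation",
    "leadforge.mechanisms": "leadforge.schemes.lead_scoring.mechanisms",
    "leadforge.structure": "leadforge.schemes.lead_scoring.structure",
}


def rewrite_import_path(dotted: str) -> str:
    """Return the post-move module path, or the input unchanged if it did not move."""
    for old, new in MOVED_PACKAGES.items():
        # Match the package itself or any submodule, but not lookalike names
        # such as "leadforge.simulation_utils".
        if dotted == old or dotted.startswith(old + "."):
            return new + dotted[len(old):]
    return dotted


print(rewrite_import_path("leadforge.simulation.engine"))
print(rewrite_import_path("leadforge.api"))
```

Paths that are already migrated, and public paths such as `leadforge.api`, pass through unchanged, so the helper can be run repeatedly over the same list of imports.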

### CLI surfaces v4 fields

- `leadforge inspect` now prints `Primary task`, `Label window`,
39 changes: 18 additions & 21 deletions CLAUDE.md
@@ -157,10 +157,13 @@ leadforge/
core/ rng.py, ids.py, time.py, enums.py, models.py, exceptions.py, ...
narrative/ spec.py, company.py, product.py, personas.py, market.py, funnel.py, dataset_card.py
schema/ entities.py, relationships.py, events.py, features.py, tasks.py, dictionaries.py
structure/ node_types.py, graph.py, motifs.py, templates.py, rewiring.py, sampler.py, constraints.py
mechanisms/ base.py, static.py, transitions.py, counts.py, categorical.py, scores.py, hazards.py, measurement.py, policies.py
simulation/ world.py, state.py, population.py, scheduler.py, engine.py, interventions.py
render/ relational.py, snapshots.py, metadata.py, manifests.py, graph_export.py, notebooks.py
schemes/ base.py (GenerationScheme protocol + SCHEME_REGISTRY);
lead_scoring/ — the lead-scoring scheme: __init__.py (build_world/
write_bundle) + simulation/, mechanisms/, structure/ (moved in
LTV-Pf.1). render/ + lead-scoring schema specs migrate here in
LTV-Pf.2 / LTV-Pg. See docs/ltv/design.md §2.5.
render/ relational.py (+ write_relational_tables), snapshots.py, manifests.py, tasks.py
# lead-scoring render still here pending LTV-Pf.2
exposure/ modes.py, filters.py, redaction.py
validation/ invariants.py, artifact_checks.py, realism.py, difficulty.py, drift.py
recipes/ registry.py, b2b_saas_procurement_v1/{recipe,narrative,schema,motifs,difficulty_profiles}.yaml
@@ -239,23 +242,17 @@ leadforge/ # Python package root
│ ├── relationships.py # FK constraints (ALL_CONSTRAINTS)
│ ├── tasks.py # SplitSpec, TaskManifest, CONVERTED_WITHIN_90_DAYS
│ └── dictionaries.py # Feature dictionary CSV writer
├── structure/ # Hidden world graph
│ ├── graph.py # WorldGraph (DAG wrapper)
│ ├── motifs.py # 5 motif families
│ ├── rewiring.py # Stochastic graph perturbation
│ └── sampler.py # sample_hidden_graph()
├── mechanisms/ # Node/edge behavior
│ ├── policies.py # assign_mechanisms() — motif → MechanismAssignment
│ ├── hazards.py # ConversionHazard
│ ├── transitions.py # StageSequence, HazardTransition
│ ├── counts.py # PoissonIntensity, RecencyDecayIntensity
│ ├── categorical.py # CategoricalInfluence, CHANNEL_QUALITY_SCORES
│ └── scores.py # LatentScore
├── simulation/ # World evolution
│ ├── engine.py # simulate_world() — 90-day daily loop
│ ├── state.py # LeadSimState (per-lead mutable state)
│ └── population.py # build_population() — accounts, contacts, leads
├── render/ # Bundle output
├── schemes/ # Generation schemes (peer pipelines) + registry
│ ├── base.py # GenerationScheme protocol + SCHEME_REGISTRY
│ └── lead_scoring/ # The lead-scoring scheme (LeadScoringScheme)
│ ├── __init__.py # build_world() + write_bundle()
│ ├── structure/ # Hidden world graph (WorldGraph, motifs, sampler)
│ ├── mechanisms/ # Node/edge behavior (policies, hazards, scores, …)
│ └── simulation/ # World evolution (engine, population, state)
│ # NOTE (LTV-M2 reorg in progress): render/{snapshots,relational,tasks}
│ # relocate under schemes/lead_scoring/ in a follow-up; schema specs split
│ # in LTV-Pg. See docs/ltv/design.md §2.5 for the target layout.
├── render/ # Bundle output (envelope + not-yet-moved lead-scoring render)
│ ├── snapshots.py # build_snapshot() — ML-ready lead table
│ ├── relational.py # to_dataframes() — 9-table dict
│ ├── tasks.py # write_task_splits() — train/valid/test Parquet
Binary file added assets/leadforge_advanced.png
Binary file added assets/leadforge_intermediate.png
Binary file added assets/leadforge_intro.png
56 changes: 48 additions & 8 deletions docs/ltv/roadmap.md
@@ -42,7 +42,7 @@ protocol + registry, with the package physically reorganized into
|-----------|------------|-----|------------|
| `LTV-M0` | Planning + design lock | `LTV-Pa` | #102, #103 (+ scheme reframe) |
| `LTV-M1` | Lifecycle schema foundation | `LTV-Pb`, `LTV-Pc` | #104 (Pb) |
| `LTV-M2` | Generation-scheme architecture + physical reorg | `LTV-Pd`, `LTV-Pe`, `LTV-Pf`, `LTV-Pg` | #107 (Pd), #108 (Pe) |
| `LTV-M2` | Generation-scheme architecture + physical reorg | `LTV-Pd`, `LTV-Pe`, `LTV-Pf`, `LTV-Pg` | #107 (Pd), #108 (Pe), #109 (Pf.1) |
| `LTV-M3` | Customer population + lifecycle world | `LTV-Ph`, `LTV-Pi` | |
| `LTV-M4` | Lifecycle simulation engine | `LTV-Pj`, `LTV-Pk` | |
| `LTV-M5` | Customer snapshots + pLTV targets (both regimes) | `LTV-Pl`, `LTV-Pm` | |
@@ -114,13 +114,23 @@ Total: ~19 PRs across 9 milestones.
`save`, base-direct resolution (footgun guard), full suite green.
- Labels: `type: refactor`, `layer: render`, `layer: api`
- [ ] **`LTV-Pf`** — `refactor: move lead-scoring pipeline to schemes/lead_scoring/`.
Physically relocate the (now fully scheme-owned) lead-scoring population/
engine/state/mechanisms/structure/snapshot/relational/task modules + its
entity/feature/task specs under `schemes/lead_scoring/`; leave shared
primitives in `schema/`, `render/` envelope, etc. Add back-compat import
shims where `scripts/` or the sibling datasets repo reference internal paths.
- Tests: full suite + hash-determinism green; public API imports unchanged;
shim coverage.
Physically relocate the (now fully scheme-owned) lead-scoring modules under
`schemes/lead_scoring/`; leave shared primitives in `schema/` and the
`render/` envelope. **Hard break, no shims** (decision D12): old internal
import paths are removed and all in-repo callers updated; the
`leadforge-datasets-private` build scripts must update in lockstep (tracked
via a breakage issue there). Public API (`leadforge.api`, CLI) unchanged;
package stays `1.x` with a CHANGELOG "Moved" note. Split into two PRs to keep
each reviewable and byte-identical:
- [x] **`LTV-Pf.1`** — compute core: `simulation/` + `mechanisms/` +
`structure/` moved as whole directories (21 file renames, all callers
rewritten). Verified byte-identical; full suite green. (**PR #109**)
- [ ] **`LTV-Pf.2`** — render: relocate `render/{snapshots,relational,tasks}`
under the scheme, splitting `render/relational.py` so the shared
`write_relational_tables` stays in the envelope while the 9-table
`to_dataframes` moves. (The lead-scoring `schema` specs split lands with
`LTV-Pg`.)
- Tests: full suite + hash-determinism green; public API imports unchanged.
- Labels: `type: refactor`, `layer: schema`, `layer: simulation`, `layer: render`
- [ ] **`LTV-Pg`** — `refactor: scaffold schemes/lifecycle/ + relocate LTV-Pb/Pc specs`.
Create `schemes/lifecycle/`; move the lifecycle entity rows (from #104) and
@@ -198,6 +208,14 @@ Total: ~19 PRs across 9 milestones.
+ windows in the manifest; bump `BUNDLE_SCHEMA_VERSION` 5 → 6 (D5); teach the
task-split writer the continuous-target path. Extend `CLAUDE.md` hard
constraints with the lifecycle snapshot-safety clause + the schemes/ layout.
- **Layering cleanup (carried debt, see `Known deferred cleanups` below):**
generalize `build_manifest` (drop the lead-scoring `world_graph` param) and
`apply_exposure` (stop hard-coding the lead-scoring hidden graph + latent
registry) so they are scheme-agnostic; with that done, remove the
`core.models` / `render.relational` **TYPE_CHECKING** back-references to
`leadforge.schemes.lead_scoring.*` introduced in `LTV-Pf.1` (a core→scheme
layering inversion), and lift the shared render orchestration out of each
scheme's `write_bundle` (the decomposition deferred in `LTV-Pe`).
- Tests: dispatch, lead-scoring path unaffected, manifest fields, regression
split writer, exposure filtering for new tables.
- Labels: `type: feature`, `layer: api`, `layer: render`
@@ -240,6 +258,28 @@ Total: ~19 PRs across 9 milestones.

---

## Known deferred cleanups (tech debt carried by M2, paid down in M6)

The peer-schemes reorg deliberately defers a few cleanups to keep each M2 PR
byte-identical and reviewable. They are tracked here and discharged in
**`LTV-Pn`** (M6), where the manifest/exposure generalization makes them clean:

1. **Shared render orchestration** — `LTV-Pe` left each scheme owning its full
`write_bundle`; only `write_relational_tables` is shared. A shared bundle
orchestrator with scheme render hooks lands once there are two schemes.
2. **`build_manifest` / `apply_exposure` are lead-scoring-coupled** —
`build_manifest` takes a `world_graph`; `apply_exposure` writes the
lead-scoring hidden graph + latent registry. Generalize both to be
scheme-agnostic.
3. **core→scheme layering inversion** — `LTV-Pf.1` introduced
`TYPE_CHECKING`-only imports of `leadforge.schemes.lead_scoring.*` in
`core.models` (`WorldBundle.world_graph: WorldGraph | None`) and
`render.relational`. Harmless at runtime (no eager import), but `core`/shared
`render` should not reference a scheme. Remove once (2) makes
`WorldBundle` hold scheme-agnostic artifacts.
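Item 3 relies on `TYPE_CHECKING`-guarded imports never executing at runtime. A minimal sketch of that pattern, assuming a made-up module `hypothetical_scheme_pkg` and an illustrative `WorldBundleSketch` class (neither is leadforge code):

```python
import sys
from dataclasses import dataclass
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; this branch never runs at runtime.
    # "hypothetical_scheme_pkg" is a made-up module and does not need to exist.
    from hypothetical_scheme_pkg.graph import WorldGraph


@dataclass
class WorldBundleSketch:
    """Illustrative stand-in for a core model that holds a scheme-owned artifact."""

    # Quoted annotation: stored as a string, so the name is never resolved here.
    world_graph: "WorldGraph | None" = None


bundle = WorldBundleSketch()
print(WorldBundleSketch.__annotations__["world_graph"])
print("hypothetical_scheme_pkg" in sys.modules)
```

The module loads even though the guarded import could not succeed, which is why the back-reference is harmless at runtime. The dependency still exists for type checkers and readers, which is the layering concern recorded above.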

---

## Dependencies

```
6 changes: 3 additions & 3 deletions leadforge/core/models.py
@@ -11,9 +11,9 @@

if TYPE_CHECKING:
from leadforge.narrative.spec import NarrativeSpec
from leadforge.simulation.engine import SimulationResult
from leadforge.simulation.population import PopulationResult
from leadforge.structure.graph import WorldGraph
from leadforge.schemes.lead_scoring.simulation.engine import SimulationResult
from leadforge.schemes.lead_scoring.simulation.population import PopulationResult
from leadforge.schemes.lead_scoring.structure.graph import WorldGraph


# Default generation scheme when a recipe/world does not declare one. Kept here
2 changes: 1 addition & 1 deletion leadforge/exposure/metadata.py
@@ -29,7 +29,7 @@ def write_metadata_dir(bundle: WorldBundle, bundle_root: Path) -> None:
bundle_root: Root directory of the written bundle.
"""
from leadforge.core.rng import RNGRoot
from leadforge.mechanisms.policies import assign_mechanisms
from leadforge.schemes.lead_scoring.mechanisms.policies import assign_mechanisms

# Callers must only invoke this after full bundle assembly; world_graph
# and population are guaranteed non-None at that point.
2 changes: 1 addition & 1 deletion leadforge/render/manifests.py
@@ -22,7 +22,7 @@

if TYPE_CHECKING:
from leadforge.core.models import GenerationConfig
from leadforge.structure.graph import WorldGraph
from leadforge.schemes.lead_scoring.structure.graph import WorldGraph

# Bump this whenever the bundle layout or manifest schema changes.
# History:
8 changes: 4 additions & 4 deletions leadforge/render/relational.py
@@ -30,8 +30,8 @@
from collections.abc import Collection
from pathlib import Path

from leadforge.simulation.engine import SimulationResult
from leadforge.simulation.population import PopulationResult
from leadforge.schemes.lead_scoring.simulation.engine import SimulationResult
from leadforge.schemes.lead_scoring.simulation.population import PopulationResult

_Source = Literal["population", "simulation"]

@@ -63,9 +63,9 @@ def to_dataframes(
"""Convert simulation output to one typed DataFrame per relational table.

Args:
result: Output of :func:`~leadforge.simulation.engine.simulate_world`.
result: Output of :func:`~leadforge.schemes.lead_scoring.simulation.engine.simulate_world`.
population: Output of
:func:`~leadforge.simulation.population.build_population`.
:func:`~leadforge.schemes.lead_scoring.simulation.population.build_population`.

Returns:
Dict mapping table name → ``pd.DataFrame`` with dtypes matching the
10 changes: 5 additions & 5 deletions leadforge/render/snapshots.py
@@ -24,12 +24,12 @@
TouchRow,
)
from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES
from leadforge.simulation.population import REVENUE_BAND_MIDPOINTS
from leadforge.schemes.lead_scoring.simulation.population import REVENUE_BAND_MIDPOINTS

if TYPE_CHECKING:
from leadforge.core.models import DifficultyParams
from leadforge.simulation.engine import SimulationResult
from leadforge.simulation.population import PopulationResult
from leadforge.schemes.lead_scoring.simulation.engine import SimulationResult
from leadforge.schemes.lead_scoring.simulation.population import PopulationResult

# Ordered column list and dtypes derived from the canonical feature spec.
_SNAPSHOT_COLUMNS = [f.name for f in LEAD_SNAPSHOT_FEATURES]
@@ -76,9 +76,9 @@ def build_snapshot(
horizon).

Args:
result: Output of :func:`~leadforge.simulation.engine.simulate_world`.
result: Output of :func:`~leadforge.schemes.lead_scoring.simulation.engine.simulate_world`.
population: Output of
:func:`~leadforge.simulation.population.build_population`.
:func:`~leadforge.schemes.lead_scoring.simulation.population.build_population`.
horizon_days: Simulation horizon length. Defaults to 90.
snapshot_day: Optional windowed snapshot day. When set, only events
with timestamps ``<= lead_created_at + timedelta(days=snapshot_day)``
18 changes: 9 additions & 9 deletions leadforge/schemes/lead_scoring/__init__.py
@@ -3,13 +3,13 @@
Owns the lead-scoring pipeline — hidden-DAG sampling, difficulty interpretation,
population, simulation, and bundle assembly — behind the single
:meth:`~leadforge.schemes.base.GenerationScheme.build_world` entry point. This
is the first scheme extracted (LTV-Pd) and the trunk the lifecycle scheme
parallels.
is the first scheme extracted, and the trunk the lifecycle scheme parallels.

The implementation modules (``population``, ``engine``, mechanisms, structure,
render) still live under their original package paths; they are physically
relocated into this package in LTV-Pe. Until then ``build_world`` calls the
current homes, keeping the lead-scoring bundle's output byte-for-byte identical.
The compute-core modules (``simulation``, ``mechanisms``, ``structure``) live
under this package as of LTV-Pf.1. The render modules (``snapshots``,
``relational``, ``tasks``) still live under ``leadforge.render`` and are
relocated in a follow-up; ``build_world`` / ``write_bundle`` import from their
current homes.
"""

from __future__ import annotations
@@ -43,9 +43,9 @@ def build_world(
"""
from leadforge.core.models import WorldBundle, WorldSpec
from leadforge.core.rng import RNGRoot
from leadforge.simulation.engine import simulate_world
from leadforge.simulation.population import build_population
from leadforge.structure.sampler import sample_hidden_graph
from leadforge.schemes.lead_scoring.simulation.engine import simulate_world
from leadforge.schemes.lead_scoring.simulation.population import build_population
from leadforge.schemes.lead_scoring.structure.sampler import sample_hidden_graph

latent_touch_intensity = bool(options.get("latent_touch_intensity", False))

Original file line number Diff line number Diff line change
@@ -135,7 +135,8 @@ def from_dict(cls, data: dict[str, Any]) -> MechanismSummary:
class MechanismAssignment:
"""Named mechanism instances consumed by the simulation engine.

All fields are populated by :func:`~leadforge.mechanisms.policies.assign_mechanisms`.
All fields are populated by
:func:`~leadforge.schemes.lead_scoring.mechanisms.policies.assign_mechanisms`.
"""

motif_family: str
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
import random
from typing import Any

from leadforge.mechanisms.base import Mechanism, MechanismContext
from leadforge.schemes.lead_scoring.mechanisms.base import Mechanism, MechanismContext


class CategoricalInfluence(Mechanism):
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@
from dataclasses import dataclass, field
from typing import Any

from leadforge.mechanisms.base import Mechanism, MechanismContext
from leadforge.schemes.lead_scoring.mechanisms.base import Mechanism, MechanismContext


@dataclass(frozen=True)
Original file line number Diff line number Diff line change
@@ -2,16 +2,17 @@

:class:`ConversionHazard` is the primary mechanism called by the simulation
engine on each day step for each active lead. It maps the merged latent state
to a daily conversion probability via a :class:`~leadforge.mechanisms.scores.LatentScore`.
to a daily conversion probability via a
:class:`~leadforge.schemes.lead_scoring.mechanisms.scores.LatentScore`.
"""

from __future__ import annotations

import random
from typing import Any

from leadforge.mechanisms.base import Mechanism, MechanismContext
from leadforge.mechanisms.scores import LatentScore
from leadforge.schemes.lead_scoring.mechanisms.base import Mechanism, MechanismContext
from leadforge.schemes.lead_scoring.mechanisms.scores import LatentScore


class ConversionHazard(Mechanism):
@@ -22,7 +23,7 @@ class ConversionHazard(Mechanism):
p_convert = clip(base_rate + scale * score, 0, max_daily_rate)

Args:
score_mech: A :class:`~leadforge.mechanisms.scores.LatentScore`
score_mech: A :class:`~leadforge.schemes.lead_scoring.mechanisms.scores.LatentScore`
instance that maps latents → [0, 1] score.
base_rate: Minimum daily conversion probability (intercept).
scale: Multiplier on the latent score.
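The `ConversionHazard` docstring in the hunk above gives the daily rate as `p_convert = clip(base_rate + scale * score, 0, max_daily_rate)`. A standalone sketch of that arithmetic follows; the function names and the numeric values are illustrative assumptions, not leadforge's implementation or defaults:

```python
def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x into the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def daily_conversion_probability(
    score: float, base_rate: float, scale: float, max_daily_rate: float
) -> float:
    """Linear hazard in the latent score, capped at max_daily_rate."""
    return clip(base_rate + scale * score, 0.0, max_daily_rate)


# Hypothetical parameters chosen only to show both regimes of the formula.
params = {"base_rate": 0.001, "scale": 0.02, "max_daily_rate": 0.015}

mid_score = daily_conversion_probability(0.5, **params)   # below the cap
high_score = daily_conversion_probability(1.0, **params)  # hits the cap
print(mid_score, high_score)
```

With these values a score of 0.5 gives roughly 0.011 (the linear regime), while a score of 1.0 would give 0.021 before clipping and is therefore capped at `max_daily_rate`.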
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
import random
from typing import Any

from leadforge.mechanisms.base import Mechanism, MechanismContext
from leadforge.schemes.lead_scoring.mechanisms.base import Mechanism, MechanismContext


def _weighted_sum(latents: dict[str, float], weights: dict[str, float], bias: float) -> float: