feat(schema): lifecycle entity rows for b2b_saas_ltv_v1 [LTV-Pb] #104
Merged
First implementation PR of the LTV workstream. Adds the schema foundation for the post-conversion lifecycle bundle, fully decoupled from the lead-scoring catalog so its output is unchanged.

New entity rows (leadforge/schema/entities.py):
- SubscriptionEventRow (subscription_events): lifecycle state changes.
- HealthSignalRow (health_signals): weekly product-usage telemetry.
- InvoiceRow (invoices): monthly billing; the unit of pLTV value.
- CustomerLifecycleRow / SubscriptionLifecycleRow: richer customers / subscriptions for the lifecycle bundle. Dedicated classes rather than in-place extension of CustomerRow / SubscriptionRow, because to_dict() emits every field and extending in place would change the lead-scoring instructor bundle's parquet schema. opportunity_id is nullable (independent generation; reserved for future chaining).

Registries kept separate from the lead-scoring catalog:
- LIFECYCLE_ROW_TYPES / LIFECYCLE_TABLE_NAMES (entities.py).
- LIFECYCLE_CONSTRAINTS (relationships.py): 6 FK edges; customers→accounts only (no customer→opportunity FK under independent generation).
- ID_PREFIXES gains subscription_event / health_signal / invoice (subev / hsig / inv).

ALL_ROW_TYPES (9), TABLE_NAMES, and ALL_CONSTRAINTS (10) are unchanged; a guard test asserts this. docs/ltv/design.md §4.2/§10 updated to record the dedicated-classes decision.

Tests: 48 new in tests/schema/test_lifecycle_entities.py (to_dict parity, empty-dataframe columns/dtypes, parquet round-trips, registry shape, lifecycle FK constraints, lead-scoring-catalog-unchanged guard, ID prefixes) + ID prefix set updated. Full suite 1480 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Check off LTV-Pb in the roadmap and link its GitHub PR (#104); update the .agent-plan.md status line. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pull request overview
This PR introduces the schema foundations for the new post-conversion lifecycle bundle (b2b_saas_ltv_v1) while explicitly keeping the existing lead-scoring schema/catalog unchanged by using dedicated lifecycle row classes and separate registries.
Changes:
- Added lifecycle-specific entity row dataclasses (CustomerLifecycleRow, SubscriptionLifecycleRow, SubscriptionEventRow, HealthSignalRow, InvoiceRow) plus lifecycle registries (LIFECYCLE_ROW_TYPES, LIFECYCLE_TABLE_NAMES) without modifying ALL_ROW_TYPES / TABLE_NAMES.
- Added lifecycle FK constraint registry (LIFECYCLE_CONSTRAINTS) separate from the lead-scoring FK graph.
- Extended ID prefix registry to cover lifecycle-only entity types and added comprehensive schema/round-trip/guard tests.
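To make the separate FK registry concrete, here is a minimal sketch of how a constraint list might be checked against generated rows. The `FKConstraint` fields, the `find_orphans` helper, the column names, and the sample IDs are all assumptions for illustration; only the customers → accounts edge and the nullable, unconstrained `opportunity_id` column come from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FKConstraint:
    """Hypothetical stand-in: child_table.child_column must exist in parent_table.parent_column."""
    child_table: str
    child_column: str
    parent_table: str
    parent_column: str


# Illustrative subset: the PR describes 6 lifecycle FK edges; only this one is named.
LIFECYCLE_CONSTRAINTS_SKETCH = (
    FKConstraint("customers", "account_id", "accounts", "account_id"),
)


def find_orphans(tables, constraints):
    """Return (constraint, value) pairs whose child value has no matching parent row."""
    orphans = []
    for fk in constraints:
        parent_keys = {row[fk.parent_column] for row in tables[fk.parent_table]}
        for row in tables[fk.child_table]:
            value = row[fk.child_column]
            if value not in parent_keys:
                orphans.append((fk, value))
    return orphans


# Sample rows (IDs are made up). opportunity_id is None everywhere and is never
# checked, because no customer -> opportunity constraint is registered.
tables = {
    "accounts": [{"account_id": "acct_1"}],
    "customers": [
        {"customer_id": "cust_1", "account_id": "acct_1", "opportunity_id": None},
        {"customer_id": "cust_2", "account_id": "acct_9", "opportunity_id": None},
    ],
}

orphans = find_orphans(tables, LIFECYCLE_CONSTRAINTS_SKETCH)
```

In this sketch `orphans` holds a single entry for the customer pointing at the missing `acct_9` account, while the null `opportunity_id` values pass untouched.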
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| leadforge/schema/entities.py | Adds lifecycle row contracts and lifecycle-only registries while preserving the lead-scoring registries unchanged. |
| leadforge/schema/relationships.py | Introduces lifecycle-only FK constraints, kept separate from the lead-scoring FK constraint set. |
| leadforge/core/ids.py | Adds lifecycle ID prefixes (subev, hsig, inv) and documents lifecycle prefix usage. |
| tests/schema/test_lifecycle_entities.py | Adds lifecycle schema contract tests (to_dict parity, empty df schema, parquet round-trips, registry shape, FK constraints, and lead-scoring invariants). |
| tests/schema/test_ids.py | Updates expected ID-prefix coverage to include lifecycle entity types. |
| docs/ltv/design.md | Updates design documentation to reflect the “dedicated lifecycle classes + separate registry” decision and lifecycle table inventory notes. |
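The ID-prefix extension in leadforge/core/ids.py can be pictured with a small sketch. Only the three prefix mappings (subev, hsig, inv) are taken from the PR; the `make_id_factory` helper and the `<prefix>_<zero-padded counter>` ID format are assumptions, not the project's actual scheme.

```python
import itertools

# These three entity -> prefix pairs are from the PR; everything else is assumed.
LIFECYCLE_ID_PREFIXES = {
    "subscription_event": "subev",
    "health_signal": "hsig",
    "invoice": "inv",
}


def make_id_factory(prefixes):
    """Build a next_id(entity) function with one independent counter per entity type."""
    counters = {entity: itertools.count(1) for entity in prefixes}

    def next_id(entity):
        if entity not in prefixes:
            raise KeyError(f"no ID prefix registered for {entity!r}")
        # Hypothetical format: prefix, underscore, six-digit counter.
        return f"{prefixes[entity]}_{next(counters[entity]):06d}"

    return next_id


next_id = make_id_factory(LIFECYCLE_ID_PREFIXES)
first_invoice = next_id("invoice")
second_invoice = next_id("invoice")
first_signal = next_id("health_signal")
```

A test over such a registry would typically also assert that prefixes are unique, which is what keeps IDs from different tables distinguishable at a glance.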
pr-agent-context report: No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #104 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.
shaypal5 added a commit that referenced this pull request on Jun 10, 2026
Acts on the maintainer decision that leadforge becomes a platform hosting two
PARALLEL, peer generation schemes (lead_scoring + lifecycle), not a
lead-scoring framework with an LTV bolt-on.
design.md:
- New §2.5 "peer generation schemes": decisions D10 (extract the
GenerationScheme abstraction EARLY, against the known-good lead-scoring path,
output byte-identical) and D11 (physically reorganize into
leadforge/schemes/{lead_scoring,lifecycle}/ now). Adds the scheme→recipe→
bundle hierarchy, the GenerationScheme protocol shape, a shared-envelope vs
per-scheme table, the target package layout, reorg safety rails for the
published 1.x package, and a note that LTV-Pb (#104) already aligns.
- §10 inventory: lifecycle modules now live under schemes/lifecycle/; adds
schemes/base.py; recipe declares scheme: lifecycle.
roadmap.md (reshaped to 9 milestones / ~18 PRs, Pa..Pr):
- New LTV-M2 "Generation-scheme architecture + physical reorg" (LTV-Pd/Pe/Pf):
protocol+registry against lead-scoring → move lead-scoring into
schemes/lead_scoring/ → scaffold schemes/lifecycle/ and relocate the
LTV-Pb/Pc specs.
- Lifecycle build milestones (population/engine/snapshots) renumbered to
M3-M5 and now land directly under schemes/lifecycle/.
- LTV-M6 registers LifecycleScheme end-to-end + recipe + manifest
generation_scheme + schema v6.
.agent-plan.md: scheme-architecture summary + revised status (M2 next; can run
in parallel with M1 since it only touches the existing lead-scoring path).
Stacked on the LTV-Pb branch (#104) because it references that work as done.
No package code in this PR.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
shaypal5 added a commit that referenced this pull request on Jun 10, 2026
…leScheme [LTV-Pg.1] (#111)

* refactor(schema): scaffold schemes/lifecycle/ + register stub LifecycleScheme [LTV-Pg.1]

First half of the schema reorg (LTV-Pg). Gives the lifecycle scheme its own home and makes it a registered peer of lead_scoring, ahead of building its pipeline (M3–M6). Byte-identical; lead-scoring catalog unchanged.

- New leadforge/schemes/lifecycle/ package:
  - entities.py: the 5 lifecycle rows (CustomerLifecycleRow, SubscriptionLifecycleRow, SubscriptionEventRow, HealthSignalRow, InvoiceRow) + LIFECYCLE_ROW_TYPES / LIFECYCLE_TABLE_NAMES, moved from schema/entities.py. AccountRow / EntityRowProtocol / _empty_df are shared and imported from leadforge.schema.entities.
  - relationships.py: LIFECYCLE_CONSTRAINTS, moved from schema/relationships.py (reuses the shared FKConstraint).
  - __init__.py: stub LifecycleScheme (build_world / write_bundle raise NotImplementedError until M3–M6); self-registers. schemes/__init__ imports it.
- schema/entities.py and schema/relationships.py: lifecycle definitions removed; breadcrumb comments point to the new home. ALL_ROW_TYPES / ALL_CONSTRAINTS unchanged.
- tests/schema/test_lifecycle_entities.py → tests/schemes/lifecycle/test_entities.py with updated imports; tests/schemes/test_registry.py gains lifecycle registration + stub-raises-NotImplementedError tests.
- CHANGELOG, CLAUDE.md (both layouts), roadmap (Pg split into Pg.1/Pg.2), agent-plan updated.

available_schemes() → ("lead_scoring", "lifecycle"). Verified byte-identical (14/14 files); full suite 1534 passed / 51 skipped; ruff + mypy clean (92 files).

Note: class-level extraction from the shared schema/entities.py can't be a git rename (multiple classes pulled from a multi-class file); the lifecycle rows were only added in #104 so the history loss is shallow.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(ltv): record LTV-Pg.1 (#111) in roadmap + agent-plan [LTV-Pg.1]

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(schema): make_empty_dataframe public (self-review) [LTV-Pg.1]

Self-review: schemes/lifecycle/entities.py imported a private symbol (`_empty_df`) across packages. A leading-underscore name signals module-internal, so importing it elsewhere is a smell. Promote it to a public shared helper `make_empty_dataframe` in leadforge.schema.entities (used by both the lead-scoring rows and the lifecycle rows); the cross-module import is now legitimate. No behaviour change (verified byte-identical, 14/14); full suite passes; ruff + mypy clean. (When LTV-Pg.2 moves lead-scoring rows out of schema/entities.py, make_empty_dataframe + EntityRowProtocol stay as the shared primitives.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
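The registration pattern this commit describes (a stub scheme that registers itself as a peer and raises until its pipeline exists) might look roughly like the sketch below. The names `GenerationScheme`, `LifecycleScheme`, `build_world`, `write_bundle`, and `available_schemes` appear in the commit messages; the method signatures, the `register_scheme` / `get_scheme` helpers, the registry dict, and the toy `LeadScoringScheme` body are assumptions.

```python
from __future__ import annotations

from typing import Protocol


class GenerationScheme(Protocol):
    """Assumed protocol shape; the real signatures are not shown in this PR."""
    name: str

    def build_world(self, config: dict) -> object: ...
    def write_bundle(self, world: object, out_dir: str) -> None: ...


_REGISTRY: dict[str, GenerationScheme] = {}


def register_scheme(scheme: GenerationScheme) -> GenerationScheme:
    """Add a scheme instance to the registry, rejecting duplicate names."""
    if scheme.name in _REGISTRY:
        raise ValueError(f"scheme {scheme.name!r} is already registered")
    _REGISTRY[scheme.name] = scheme
    return scheme


def get_scheme(name: str) -> GenerationScheme:
    return _REGISTRY[name]


def available_schemes() -> tuple[str, ...]:
    # Sorted here so the result is deterministic; the real ordering rule is unknown.
    return tuple(sorted(_REGISTRY))


class LeadScoringScheme:
    """Toy stand-in for the existing, fully implemented scheme."""
    name = "lead_scoring"

    def build_world(self, config: dict) -> object:
        return {"scheme": self.name, "config": config}

    def write_bundle(self, world: object, out_dir: str) -> None:
        pass  # the real scheme writes bundle files; omitted in this sketch


class LifecycleScheme:
    """Stub peer: registered and discoverable, but not yet runnable."""
    name = "lifecycle"

    def build_world(self, config: dict) -> object:
        raise NotImplementedError("lifecycle pipeline lands in later milestones")

    def write_bundle(self, world: object, out_dir: str) -> None:
        raise NotImplementedError("lifecycle pipeline lands in later milestones")


# In a package layout these calls would run at import time of each scheme module.
register_scheme(LeadScoringScheme())
register_scheme(LifecycleScheme())
```

Registering the stub early lets registry tests and CLI discovery treat both schemes as peers, while any attempt to generate a lifecycle bundle fails loudly instead of silently producing nothing.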
Summary
First implementation PR of the LTV workstream (LTV-Pb, milestone LTV-M1). Adds the schema foundation for the post-conversion lifecycle bundle (b2b_saas_ltv_v1), fully decoupled from the lead-scoring catalog.
New entity rows (leadforge/schema/entities.py)
- SubscriptionEventRow (subscription_events): lifecycle state changes (renewal / expansion / downgrade / churn / payment_failure / payment_recovered).
- HealthSignalRow (health_signals): weekly product-usage telemetry.
- InvoiceRow (invoices): monthly billing; the unit of pLTV value.
- CustomerLifecycleRow / SubscriptionLifecycleRow: richer customers / subscriptions for the lifecycle bundle.
Key design decision: dedicated classes, not in-place extension
The roadmap originally said "extend CustomerRow / SubscriptionRow." During implementation I found that EntityRow.to_dict() emits every dataclass field, so adding fields in place would silently change the lead-scoring instructor bundle's customers / subscriptions parquet schema (and break its contract tests). Instead, the lifecycle bundle uses dedicated CustomerLifecycleRow / SubscriptionLifecycleRow classes (reusing the logical table names) kept in a separate registry. This faithfully realizes the "lead-scoring output unchanged" requirement. docs/ltv/design.md §4.2/§10 updated to record this.
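A minimal sketch of why the dedicated class matters, assuming a `to_dict()` that walks every dataclass field as described above. The field lists are invented for illustration (only `opportunity_id` being nullable on the lifecycle row is from the PR), and `acquisition_channel` is a purely hypothetical extra column.

```python
from dataclasses import dataclass, fields
from typing import Optional


class EntityRow:
    """Simplified stand-in: to_dict() emits every dataclass field, in declaration order."""

    def to_dict(self):
        return {f.name: getattr(self, f.name) for f in fields(self)}


@dataclass
class CustomerRow(EntityRow):
    """Lead-scoring row (fields assumed). Its column set is a published contract."""
    TABLE_NAME = "customers"  # un-annotated, so not a dataclass field
    customer_id: str
    account_id: str


@dataclass
class CustomerLifecycleRow(EntityRow):
    """Dedicated lifecycle row: same logical table name, richer column set."""
    TABLE_NAME = "customers"
    customer_id: str
    account_id: str
    opportunity_id: Optional[str] = None   # nullable, reserved for future chaining
    acquisition_channel: str = "unknown"   # hypothetical lifecycle-only column


lead_cols = list(CustomerRow("cust_1", "acct_1").to_dict())
lifecycle_cols = list(CustomerLifecycleRow("cust_1", "acct_1").to_dict())
```

Had the two extra fields been added to `CustomerRow` itself, `lead_cols` would have grown with them and every lead-scoring parquet file written from that row would carry new columns. Keeping them on a separate class leaves the lead-scoring column list exactly as it was.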
Separate registries (lead-scoring catalog untouched)
- LIFECYCLE_ROW_TYPES / LIFECYCLE_TABLE_NAMES (6 tables: accounts reused + customers, subscriptions, subscription_events, health_signals, invoices).
- LIFECYCLE_CONSTRAINTS: 6 FK edges; customers → accounts only (no customer → opportunity FK under independent generation; opportunity_id is a nullable column reserved for future chaining).
- ID_PREFIXES gains subscription_event / health_signal / invoice (subev / hsig / inv).
- ALL_ROW_TYPES (9), TABLE_NAMES, and ALL_CONSTRAINTS (10) are unchanged; a guard test asserts it.
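The "catalog unchanged" guard can be sketched as a pinned-snapshot comparison. The six lifecycle table names below are from the PR; the three-table lead-scoring tuple is a deliberately shortened stand-in (the real catalog has 9 row types), and `catalog_drift` is a hypothetical helper.

```python
# Stand-in for the lead-scoring catalog; the real one has 9 row types.
TABLE_NAMES = ("accounts", "customers", "subscriptions")

# The six lifecycle tables listed in the PR description.
LIFECYCLE_TABLE_NAMES = (
    "accounts",
    "customers",
    "subscriptions",
    "subscription_events",
    "health_signals",
    "invoices",
)

# Snapshot pinned inside the test module, independent of the registry it guards.
PINNED_LEAD_SCORING_TABLES = ("accounts", "customers", "subscriptions")


def catalog_drift(current, pinned):
    """Report table names added to or removed from a catalog relative to its snapshot."""
    return {
        "added": sorted(set(current) - set(pinned)),
        "removed": sorted(set(pinned) - set(current)),
    }


def test_lead_scoring_catalog_unchanged():
    drift = catalog_drift(TABLE_NAMES, PINNED_LEAD_SCORING_TABLES)
    assert drift == {"added": [], "removed": []}, f"lead-scoring catalog drifted: {drift}"


test_lead_scoring_catalog_unchanged()

# What the guard would report if a lifecycle table leaked into the shared catalog.
leaked = catalog_drift(TABLE_NAMES + ("invoices",), PINNED_LEAD_SCORING_TABLES)
```

Pinning the expected names in the test, rather than deriving them from the registry, is what makes the guard meaningful: an accidental edit to the shared registry changes only one side of the comparison.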
Tests
- 48 new in tests/schema/test_lifecycle_entities.py: to_dict parity, empty-dataframe columns/dtypes, populated + empty parquet round-trips, registry shape, lifecycle FK constraints, lead-scoring-catalog-unchanged guard, and ID prefixes.
- tests/schema/test_ids.py expected-set updated.
- ruff check + ruff format --check: clean.
- mypy leadforge/: clean (84 files).
- BUNDLE_SCHEMA_VERSION unchanged (the schema bump to 6 lands with the recipe wiring in LTV-M5).
Scope
Schema contracts only: no simulation, render, or recipe wiring. Next: LTV-Pc (pLTV feature spec + regression task specs).
🤖 Generated with Claude Code