refactor(schema): split lead-scoring schema into schemes/lead_scoring/ [LTV-Pg.2] #112

Open
shaypal5 wants to merge 2 commits into main from refactor/split-lead-scoring-schema

@shaypal5

Summary

Completes the M2 physical reorg (LTV-Pg.2, milestone LTV-M2). Pulls
the lead-scoring-specific schema definitions out of the shared schema/ package
into schemes/lead_scoring/, leaving only genuine cross-scheme primitives behind.

What moved

New files in schemes/lead_scoring/:

| File | Content |
|---|---|
| entities.py | ContactRow … SubscriptionRow + ALL_ROW_TYPES / TABLE_NAMES |
| relationships.py | ALL_CONSTRAINTS |
| features.py | LEAD_SNAPSHOT_FEATURES + redacted_columns_for |
| tasks.py | CONVERTED_WITHIN_90_DAYS + task_manifest_for_config |

What stayed in shared schema/

| File | Retained |
|---|---|
| entities.py | EntityRowProtocol, make_empty_dataframe, AccountRow |
| features.py | FeatureSpec |
| relationships.py | FKConstraint, FKViolationError, validate_fk |
| tasks.py | SplitSpec, TaskManifest |

Why this boundary

  • AccountRow is genuinely shared: used by LIFECYCLE_ROW_TYPES (lifecycle scheme).
  • FeatureSpec is a shared primitive: lifecycle will define CUSTOMER_SNAPSHOT_FEATURES with it.
  • SplitSpec/TaskManifest are abstract descriptors: lifecycle tasks will use them.
  • Everything else is lead-scoring specific.
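
As a rough illustration of this boundary, the sketch below has one shared primitive and two scheme-owned catalogs built from it. The `FeatureSpec` fields (`name`, `dtype`), the example feature names, and the `feature_names` helper are assumptions made for illustration; this PR does not show the real definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSpec:
    """Stand-in for the shared primitive; the real fields are not shown in this PR."""

    name: str
    dtype: str


# Each scheme owns its own catalog, built from the shared primitive.
# The feature names below are hypothetical placeholders.
LEAD_SNAPSHOT_FEATURES = (FeatureSpec("example_lead_feature", "float64"),)
CUSTOMER_SNAPSHOT_FEATURES = (FeatureSpec("example_customer_feature", "float64"),)


def feature_names(catalog: tuple[FeatureSpec, ...]) -> list[str]:
    """Return the column names declared by a catalog, in declaration order."""
    return [spec.name for spec in catalog]
```

The point of the split is that only the descriptor type is shared; the catalogs themselves live with the scheme that defines them.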

Callers

36 files updated. 4 had mixed imports (FeatureSpec/validate_fk/TaskManifest stay shared but were co-imported with moved symbols) and needed manual fixing.
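
The four mixed imports were fixed by hand in this PR. For reference, the same split could be sketched with the standard-library `ast` module. The helper name `split_mixed_import` is hypothetical and is not part of leadforge; it assumes its input is a single `from ... import ...` statement.

```python
import ast


def split_mixed_import(line: str, shared_names: set[str], scheme_module: str) -> list[str]:
    """Split one from-import into a shared-module import and a scheme-module import."""
    node = ast.parse(line).body[0]
    if not isinstance(node, ast.ImportFrom):
        raise ValueError("expected a single 'from ... import ...' statement")

    shared, moved = [], []
    for alias in node.names:
        # Preserve any "as" alias on the imported name.
        text = alias.name if alias.asname is None else f"{alias.name} as {alias.asname}"
        (shared if alias.name in shared_names else moved).append(text)

    result = []
    if shared:
        # Shared primitives keep their original module.
        result.append(f"from {node.module} import {', '.join(shared)}")
    if moved:
        # Moved symbols are imported from the scheme package instead.
        result.append(f"from {scheme_module} import {', '.join(moved)}")
    return result


result = split_mixed_import(
    "from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec",
    shared_names={"FeatureSpec"},
    scheme_module="leadforge.schemes.lead_scoring.features",
)
# result[0] == "from leadforge.schema.features import FeatureSpec"
# result[1] == "from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES"
```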

Layout-lock tests

3 new assertions in tests/schemes/test_module_layout.py:

  • shared primitives importable from schema.*
  • lead-scoring specifics importable from schemes.lead_scoring.*
  • moved symbols absent from the shared schema.* namespace
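
A minimal sketch of how such layout-lock checks might look, assuming they boil down to attribute presence on imported modules. The helpers `missing_symbols` and `leaked_symbols` and the `demo_*` stand-in modules are hypothetical; the actual assertions in tests/schemes/test_module_layout.py are not shown on this page.

```python
import importlib
import sys
import types


def missing_symbols(module_name: str, names: list[str]) -> list[str]:
    """Names that should be importable from the module but are not."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


def leaked_symbols(module_name: str, names: list[str]) -> list[str]:
    """Names that should be absent from the module but are still present."""
    module = importlib.import_module(module_name)
    return [name for name in names if hasattr(module, name)]


# Stand-in modules so the sketch runs without leadforge installed.
shared = types.ModuleType("demo_shared_features")
shared.FeatureSpec = object
scheme = types.ModuleType("demo_scheme_features")
scheme.LEAD_SNAPSHOT_FEATURES = ()
sys.modules[shared.__name__] = shared
sys.modules[scheme.__name__] = scheme

# The three checks: primitives stay, specifics live in the scheme,
# and moved symbols are gone from the shared namespace.
assert missing_symbols("demo_shared_features", ["FeatureSpec"]) == []
assert missing_symbols("demo_scheme_features", ["LEAD_SNAPSHOT_FEATURES"]) == []
assert leaked_symbols("demo_shared_features", ["LEAD_SNAPSHOT_FEATURES"]) == []
```

In a real test the module names would be the `leadforge.schema.*` and `leadforge.schemes.lead_scoring.*` modules listed above.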

Verification

  • ✅ Byte-identical vs pre-reorg main (14 files).
  • Full suite: 1537 passed / 51 skipped (+3); ruff + mypy clean (96 files).
  • BUNDLE_SCHEMA_VERSION unchanged.
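
The PR does not say how the byte-identical check was run. One plausible way to do it, sketched here under the assumption that two output directories (before and after the reorg) are compared file by file, is to hash every file with SHA-256. The function names and the demo files are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def differing_files(before: Path, after: Path) -> list[str]:
    """Relative paths that are missing on one side or differ in content."""
    old, new = file_digests(before), file_digests(after)
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))


# Tiny demo with throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    before, after = Path(tmp, "before"), Path(tmp, "after")
    for root in (before, after):
        root.mkdir()
        (root / "same.csv").write_bytes(b"a,b\n1,2\n")
    (before / "changed.csv").write_bytes(b"x\n1\n")
    (after / "changed.csv").write_bytes(b"x\n2\n")
    changed = differing_files(before, after)  # ["changed.csv"]
```

An empty result from `differing_files` would correspond to the "byte-identical" claim for the compared files.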

M2 complete

With this PR, LTV-M2 (generation-scheme architecture + physical reorg) is fully
done. The package now has peer schemes with their own schema, simulation, render,
and task specifications — the foundation for LTV-M3 (lifecycle customer
population).

🤖 Generated with Claude Code

…/ [LTV-Pg.2]

Completes the M2 physical reorg (LTV-Pg). Pulls the lead-scoring-specific
schema definitions out of the shared `schema/` package, leaving only genuine
cross-scheme primitives behind.

New files (schemes/lead_scoring/):
- entities.py   — ContactRow … SubscriptionRow + ALL_ROW_TYPES / TABLE_NAMES
- relationships.py — ALL_CONSTRAINTS
- features.py  — LEAD_SNAPSHOT_FEATURES + redacted_columns_for
- tasks.py     — CONVERTED_WITHIN_90_DAYS + task_manifest_for_config

Shared schema/ after the split (only primitives remain):
- entities.py      — EntityRowProtocol, make_empty_dataframe, AccountRow
- features.py      — FeatureSpec
- relationships.py — FKConstraint, FKViolationError, validate_fk
- tasks.py         — SplitSpec, TaskManifest

All callers updated (36 files): multi-line from-import blocks rewritten via
perl; 4 mixed imports (tests/exposure/test_redaction.py,
tests/schema/test_relationships.py, tests/render/test_render.py,
tests/narrative/test_dataset_card.py) fixed manually, since
FeatureSpec/validate_fk/TaskManifest stay shared but were co-imported with
moved symbols.

tests/schemes/test_module_layout.py: 3 new tests for Pg.2 — primitives-stay,
scheme-specifics-in-scheme, removed-from-shared-schema.

CHANGELOG, CLAUDE.md (canonical layout), roadmap (Pg.2 ✓), agent-plan updated.

Verified byte-identical to pre-reorg main (14/14 files); full suite
1537 passed / 51 skipped; ruff + mypy clean (96 source files).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 11, 2026 05:19
@shaypal5 added this to the dataset: leadforge-ltv-v1 milestone Jun 11, 2026
@shaypal5 added the type: refactor, layer: schema, status: needs review, and dataset: leadforge-ltv-v1 labels Jun 11, 2026
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #112 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 27325625947 attempt 1
Comment timestamp: 2026-06-11T05:20:12.751903+00:00
PR head commit: 4caa1a3041df582f30e9d0b4921abbff48bc4672

Copilot AI left a comment

Pull request overview

This PR completes the LTV-M2 / LTV-Pg.2 physical reorg by moving lead-scoring-specific schema definitions out of leadforge/schema/ into leadforge/schemes/lead_scoring/, leaving only genuinely cross-scheme primitives in schema/. It also updates imports across the codebase and adds tests that lock in the intended module boundaries.

Changes:

  • Introduces leadforge/schemes/lead_scoring/{entities,features,relationships,tasks}.py and migrates lead-scoring schema catalogs/constants into them.
  • Refactors leadforge/schema/{entities,features,relationships,tasks}.py to retain only shared primitives and updates call sites accordingly.
  • Adds/updates tests and documentation to enforce and describe the new module layout.

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 7 comments.

| File | Description |
|---|---|
| tests/validation/test_realism.py | Updates imports to pull lead-scoring feature catalog from the scheme package. |
| tests/test_primary_task_threading.py | Updates task imports to use lead-scoring scheme tasks module. |
| tests/simulation/test_engine.py | Updates entity imports to scheme-scoped lead-scoring rows. |
| tests/scripts/test_build_v7_snapshot.py | Updates entity imports (TouchRow) to scheme module. |
| tests/schemes/test_module_layout.py | Adds layout-lock assertions for shared-vs-scheme boundaries. |
| tests/schemes/lifecycle/test_entities.py | Updates imports for lead-scoring catalogs used in lifecycle tests. |
| tests/schema/test_tasks.py | Updates imports for moved task manifest symbol(s). |
| tests/schema/test_relationships.py | Imports lead-scoring ALL_CONSTRAINTS from scheme module while keeping shared primitives in schema. |
| tests/schema/test_features.py | Updates imports for moved lead-scoring feature catalog and redaction helper. |
| tests/schema/test_entities.py | Updates imports for moved lead-scoring entity rows/catalogs. |
| tests/render/test_render.py | Updates imports to scheme-scoped entities/features/constraints. |
| tests/narrative/test_dataset_card.py | Splits shared task primitive imports vs lead-scoring task factory import. |
| tests/exposure/test_redaction.py | Updates imports to scheme-scoped lead-scoring feature catalog/redaction helper. |
| leadforge/validation/release_quality.py | Updates imports to scheme-scoped lead-scoring feature catalog. |
| leadforge/validation/realism.py | Updates imports and doc references to scheme-scoped lead-scoring feature catalog/entities. |
| leadforge/validation/invariants.py | Updates imports to scheme-scoped lead-scoring redaction helper. |
| leadforge/validation/bundle_checks.py | Updates imports to scheme-scoped lead-scoring features + constraints. |
| leadforge/schemes/lifecycle/entities.py | Updates doc references to lead-scoring entity rows under scheme module. |
| leadforge/schemes/lead_scoring/tasks.py | New: lead-scoring task manifest constant + config-to-manifest factory. |
| leadforge/schemes/lead_scoring/simulation/state.py | Updates doc reference to scheme-scoped LeadRow. |
| leadforge/schemes/lead_scoring/simulation/population.py | Updates entity imports to scheme-scoped lead-scoring rows. |
| leadforge/schemes/lead_scoring/simulation/engine.py | Updates imports/doc references to scheme-scoped entity rows. |
| leadforge/schemes/lead_scoring/render/tasks.py | Updates task imports to scheme-scoped lead-scoring tasks module. |
| leadforge/schemes/lead_scoring/render/snapshots.py | Updates entity/features imports to scheme-scoped modules. |
| leadforge/schemes/lead_scoring/render/relational.py | Updates entity imports to scheme-scoped lead-scoring rows. |
| leadforge/schemes/lead_scoring/relationships.py | New: lead-scoring FK constraint catalog. |
| leadforge/schemes/lead_scoring/features.py | New: lead-scoring feature catalog + redaction helper. |
| leadforge/schemes/lead_scoring/entities.py | New: lead-scoring entity rows + catalogs. |
| leadforge/schemes/lead_scoring/__init__.py | Updates bundle writer imports to scheme-scoped schema modules. |
| leadforge/schema/tasks.py | Removes lead-scoring task definitions; keeps shared primitives only. |
| leadforge/schema/relationships.py | Removes lead-scoring constraint catalog; keeps shared primitives only. |
| leadforge/schema/features.py | Removes lead-scoring feature catalog/redaction helper; keeps FeatureSpec only. |
| leadforge/schema/entities.py | Removes lead-scoring entity rows/catalogs; keeps shared primitives + AccountRow. |
| leadforge/schema/dictionaries.py | Updates feature catalog import source for dictionary generation. |
| leadforge/narrative/dataset_card.py | Updates imports for feature catalog used in dataset card rendering. |
| docs/ltv/roadmap.md | Marks LTV-Pg.2 as complete and links PR. |
| CLAUDE.md | Updates repository layout documentation to reflect the new shared-schema boundary. |
| .agent-plan.md | Updates planning notes to reflect Pg.2 opened/merged state. |

Comment on lines 10 to +14

from pathlib import Path

import pandas as pd

- from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec
+ from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec

import pytest

- from leadforge.schema.tasks import CONVERTED_WITHIN_90_DAYS, SplitSpec
+ from leadforge.schemes.lead_scoring.tasks import CONVERTED_WITHIN_90_DAYS, SplitSpec

Comment on lines 8 to 11

from leadforge.schema.dictionaries import feature_dictionary_df, write_feature_dictionary
- from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec
+ from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec

Comment on lines +8 to 12

from leadforge.schema.tables import read_parquet, write_parquet
from leadforge.schemes.lead_scoring.entities import (
    ALL_ROW_TYPES,
    TABLE_NAMES,
    AccountRow,

Comment on lines 14 to 17

from leadforge.schema.tables import read_parquet, write_parquet
from leadforge.schemes.lead_scoring.entities import ALL_ROW_TYPES, TABLE_NAMES, AccountRow
from leadforge.schemes.lead_scoring.relationships import ALL_CONSTRAINTS, FKConstraint
from leadforge.schemes.lifecycle.entities import (

import pandas as pd

- from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec
+ from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec

from typing import TYPE_CHECKING

- from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec
+ from leadforge.schemes.lead_scoring.features import LEAD_SNAPSHOT_FEATURES, FeatureSpec
Labels

  • dataset: leadforge-ltv-v1 (Issue/PR scoped to the b2b_saas_ltv_v1 LTV dataset workstream)
  • layer: schema (schema/ entity/event contracts)
  • status: needs review (Ready for review)
  • type: refactor (Code change with no behavior difference)
