fix(render,validation,schema): snapshot fixes + noise clamps + schema cleanup (PR 8.1)#82
Merged
Merged
Conversation
… cleanup - Fix has_open_opportunity/opportunity_estimated_acv post-snapshot leak: use closed_at > lead_created_at + snapshot_day semantics instead of close_outcome.isna() (which incorrectly treats post-snapshot closures) - Add probe_opportunity_snapshot_consistency leakage probe: verifies shipped opportunity-derived columns match correct closed_at > cutoff recomputation; registered as opt-in in PROBE_REGISTRY - Add lead_id guard in _filter_to_snapshot_window: raises ValueError with descriptive message instead of producing all-NaT cutoffs - Clamp Gaussian noise to physical ranges in _apply_difficulty_distortions: days_since_*, ACV, and count columns clamped to >= 0 after noise pass - Exempt total_touches_all from difficulty distortion: add _DISTORTION_EXEMPT_COLS to prevent up to 18% NaN on the leakage trap - Drop first_touch_channel from LEAD_SNAPSHOT_FEATURES: byte-identical to lead_source in v1; add rename in v5/v6/v7 pipeline maps for backwards compatibility - Rename touches_week_1 -> touches_days_0_7 in LEAD_SNAPSHOT_FEATURES and snapshot builder (window covers days 0-7 inclusive = 8 values) - Fix label window boundary: change < to <= in engine.py so conversion on day label_window_days is correctly labeled positive (inclusive spec) - Update contract test V5_TASK_COLUMNS_STUDENT_PUBLIC for schema changes - Update all test files referencing old column names Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
This PR tightens snapshot/label semantics and schema correctness across the render + validation pipeline, adds a targeted leakage probe to catch opportunity snapshot-time mistakes, and updates downstream tests/docs/pipelines for the renamed/removed snapshot columns.
Changes:
- Fix opportunity “open at snapshot” semantics in
build_snapshot, add an opt-in validation probe for opportunity snapshot consistency, and harden snapshot-safe relational filtering on missinglead_id. - Update snapshot schema by removing redundant
first_touch_channel, renamingtouches_week_1→touches_days_0_7, and threading backward-compat rename maps for v5/v6/v7 CSV pipelines. - Make the label window inclusive (
<=) and update tests/docs accordingly; adjust difficulty distortion behavior (clamps + exemption fortotal_touches_all).
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_primary_task_threading.py | Updates expected snapshot column name to touches_days_0_7. |
| tests/simulation/test_engine.py | Adjusts label-window boundary test and clarifies docstring for inclusive behavior. |
| tests/scripts/test_build_v7_snapshot.py | Updates synthetic snapshot fixture column to touches_days_0_7. |
| tests/scripts/test_build_v6_snapshot.py | Updates synthetic snapshot fixture column to touches_days_0_7. |
| tests/scripts/test_build_v5_snapshot.py | Updates synthetic snapshot fixture column to touches_days_0_7. |
| tests/render/test_snapshot_windowed.py | Renames windowed-touch tests to touches_days_0_7 and updates assertions. |
| tests/render/test_bundle_schema_v5_contract.py | Updates schema contract expectations (drop first_touch_channel from snapshot; rename touches column). |
| release/docs/feature_dictionary.md | Updates release docs for 31-column snapshot, removed first_touch_channel, and renamed touches feature. |
| leadforge/validation/leakage_probes.py | Adds opt-in probe_opportunity_snapshot_consistency and registers it. |
| leadforge/simulation/engine.py | Makes label window inclusive: conversion_day <= label_window_days. |
| leadforge/schema/features.py | Removes first_touch_channel from snapshot feature spec; renames touches_week_1 to touches_days_0_7. |
| leadforge/render/snapshots.py | Fixes open-opportunity snapshot semantics; adds distortion exemption + non-negativity clamping; renames touches feature in snapshot builder. |
| leadforge/render/relational_snapshot_safe.py | Adds a loud guard when snapshot-filtered tables lack lead_id. |
| leadforge/pipelines/common.py | Adds backward-compat rename from touches_days_0_7 back to touches_week_1 for v6/v7 outputs. |
| leadforge/pipelines/build_v5.py | Adds backward-compat rename from touches_days_0_7 back to touches_week_1 for v5 outputs. |
| docs/release/feature_dictionary.md | Mirrors the release feature dictionary updates in the main docs tree. |
| .agent-plan.md | Marks PR 8.1 items complete and records outcomes (AUC deltas, validation status). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+428
to
+437
| # 1b. Post-noise clamp to physical ranges. | ||
| # Float non-negative columns: clamp to >= 0. | ||
| for col in _NONNEG_FLOAT_COLS: | ||
| if col in df.columns and df[col].notna().any(): | ||
| df[col] = df[col].clip(lower=0) | ||
| # Int non-negative columns: clamp to >= 0, then restore int dtype. | ||
| for col in _NONNEG_INT_COLS: | ||
| if col in df.columns and df[col].notna().any(): | ||
| df[col] = df[col].clip(lower=0) | ||
|
|
Comment on lines
+258
to
+263
| od["_snapshot_cutoff"] = od["lead_id"].map(cutoff_map) | ||
| open_mask = od["_closed_at_ts"].isna() | (od["_closed_at_ts"] > od["_snapshot_cutoff"]) | ||
| open_opps_raw = od[open_mask][["lead_id", "estimated_acv"]] | ||
| else: | ||
| open_opps_raw = od[["lead_id", "estimated_acv"]] | ||
| open_opps = open_opps_raw.groupby("lead_id").first().reset_index() |
Comment on lines
+1044
to
+1049
| expected_open = ( | ||
| opps_open.groupby("lead_id")["estimated_acv"] | ||
| .first() | ||
| .reset_index() | ||
| .rename(columns={"estimated_acv": "_expected_acv"}) | ||
| ) |
Ten issues from senior code review addressed:
🔴 Bug: dead 'touches_week_1' entry in _NONNEG_INT_COLS removed by
replacing both frozensets with FeatureSpec-derived comprehensions
(Fix 3 makes Fix 1 obsolete as a standalone change).
🔴 Bug: _filter_to_snapshot_window guard now runs before the
len(events)==0 early-return so a misconfigured empty table still
raises instead of silently passing through.
🟠 Design: _NONNEG_FLOAT_COLS and _NONNEG_INT_COLS replaced with
FeatureSpec-derived comprehensions keyed on new non_negative field.
Added non_negative=True to all 13 count/duration/monetary features.
No more manual-maintenance second source of truth.
🟠 Comments: stale 'Mapping: column name → (lower_bound, upper_bound)'
comment removed; 'restore int dtype' wording corrected to
'clip() preserves Int64 dtype'.
🟠 Probe category: 'time_window' → 'snapshot_consistency'; taxonomy
test updated to accept the new value.
🟠 Probe ACV ordering: added .sort_values('created_at') before
groupby.first() in both production code (snapshots.py) and probe
(leakage_probes.py) so the tie-break is deterministic when a lead
has multiple open opportunities.
🟡 Collapsed redundant if len(od) > 0 branch; _cutoff now computed
once and reused for both the creation-date filter and the open/closed
gate — eliminates the duplicate lead_id.map(cutoff_map) call.
🟡 FeatureSpec description for touches_days_0_7 trimmed to one clean
sentence; rename history belongs in git log, not user-facing CSV.
🟡 first_touch_channel removal comment moved from inside the tuple
literal to a proper prose note above the adjacent FeatureSpec.
🟡 Added three fixture-based tests for probe_opportunity_snapshot_
consistency: fires on old close_outcome.isna() semantics, silent on
correct closed_at > cutoff semantics, raises on missing columns.
Cascade: audit_channel_signal.py default CHANNEL_COLUMNS updated to
drop first_touch_channel (column no longer in flat snapshot); related
test fixtures and assertions updated accordingly; regenerated
channel_signal_audit.{md,json}, release/docs mirror, kaggle metadata,
and kaggle preview HTML.
1418 tests pass, 5 skipped. Ruff clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
pr-agent-context report: This run includes unresolved review comments on PR #82 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/render/snapshots.py:428
URL: https://github.com/leadforge-dev/leadforge/pull/82#discussion_r3297192877
Root author: copilot-pull-request-reviewer
Comment:
The post-noise non-negativity clamp runs *before* outlier injection, but outlier injection can create negative values again (it adds ±5σ). This undermines the stated goal of preventing non-physical negatives (durations/ACV). Consider moving the clamp to the end of `_apply_difficulty_distortions` (after outliers), or clamping both after noise and after outliers, so the final output always respects the physical bounds.
## COPILOT-2
Location: leadforge/render/snapshots.py
URL: https://github.com/leadforge-dev/leadforge/pull/82#discussion_r3297192924
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
`opportunity_estimated_acv` is documented in the feature spec as the ACV of the *most recent* open opportunity, but the implementation drops `created_at` and then uses `groupby(...).first()`, which depends on input row order and does not guarantee “most recent”. Please make the selection deterministic and consistent with the spec (e.g., sort by `created_at` and take the latest open opp per lead, or use `idxmax` on a parsed `created_at` timestamp).
## COPILOT-3
Location: leadforge/validation/leakage_probes.py:1052
URL: https://github.com/leadforge-dev/leadforge/pull/82#discussion_r3297192948
Root author: copilot-pull-request-reviewer
Comment:
This probe computes expected `opportunity_estimated_acv` via `groupby('lead_id').first()`, which is order-dependent and may not match the snapshot builder if row ordering differs (e.g., after Parquet read) or if a lead has multiple open opportunities. To avoid false findings, make the selection deterministic (e.g., sort by `created_at` and take the most recent open opp per lead) and mirror the snapshot builder’s exact selection semantics.Run metadata: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR 8.1 — eight targeted fixes to the snapshot builder, validation probes, and schema, plus full bundle regeneration.
1. Fix
has_open_opportunity/opportunity_estimated_acvpost-snapshot leak (HIGH)File:
leadforge/render/snapshots.pyThe previous implementation used
od["close_outcome"].isna()to identify open opportunities. This is incorrect:close_outcomeis set by the post-simulation pass for all closed opportunities regardless of when they closed. An opportunity closed on day 91 (after snapshot day 30) would be falsely treated as already closed at snapshot time, artificially deflatinghas_open_opportunity.Fix: Changed to
closed_at is NaT OR closed_at > lead_created_at + snapshot_day.AUC delta: Intro LR 0.879→0.879, Intermediate 0.886→0.886, Advanced 0.886→0.886. The AUC is stable because the simulation DGP has most opportunities closed after the snapshot window being won/lost proportionally to the underlying conversion rate — the error inflates false negatives and false positives roughly symmetrically. Correctness is restored regardless.
2. Add
probe_opportunity_snapshot_consistencyleakage probe (MEDIUM)File:
leadforge/validation/leakage_probes.pyNew opt-in probe that recomputes
has_open_opportunityandopportunity_estimated_acvunder correctclosed_at > cutoffsemantics and asserts equality with the shipped column values. Registered inPROBE_REGISTRYas taxonomy=time_window,opt_in=True(requires full-horizon opportunities table, not available in public bundles). This probe would have caught bug #1 automatically.3.
to_dataframes_snapshot_safeguard on missinglead_id(MEDIUM)File:
leadforge/render/relational_snapshot_safe.py_filter_to_snapshot_windownow raisesValueErrorwith a descriptive message iflead_idis missing from an event table instead of silently producing all-NaT cutoffs.4. Clamp Gaussian noise to physical ranges (MEDIUM/LOW)
File:
leadforge/render/snapshots.pyPost-noise clamp in
_apply_difficulty_distortions:days_since_*, ACV, and monetary columns are clamped to>= 0; count columns (touch_count,session_count, etc.) are clamped to>= 0. Removes the most visible synthetic-data tell (negative durations / negative counts in Advanced tier).5. Exempt
total_touches_allfrom difficulty distortion (MEDIUM)File:
leadforge/render/snapshots.pyAdded
_DISTORTION_EXEMPT_COLS = {"total_touches_all"}. The leakage trap was getting up to 18% NaN injection in the Advanced tier, muddying the pedagogical leakage-detection exercise it's designed to deliver cleanly.6. Drop
first_touch_channelfrom task snapshot (MEDIUM)Files:
leadforge/schema/features.py,leadforge/render/snapshots.py(implicit via column list), docsfirst_touch_channelis byte-identical tolead_sourcein v1 (both are set to the same origination value inpopulation.py). Removed fromLEAD_SNAPSHOT_FEATURESand task splits. Still present in relationaltables/leads.parquet(it's aLeadRowentity field). Pipeline RENAME_MAPs for v5/v6/v7 CSVs are unaffected. Snapshot count: 32 → 31 columns.7. Rename
touches_week_1→touches_days_0_7(LOW)Files:
leadforge/schema/features.py,leadforge/render/snapshots.py, tests, docsThe feature implementation uses
_day <= 7(days 0 through 7 inclusive = 8 values), not 7 values. The nametouches_week_1implies a 7-day window;touches_days_0_7is precise. v5/v6/v7 build pipelines add atouches_days_0_7 → touches_week_1entry to their RENAME_MAP for backwards compatibility with existing CSV consumers.8. Fix label window:
<→<=(MEDIUM)File:
leadforge/simulation/engine.pystate.conversion_day < config.label_window_days→state.conversion_day <= config.label_window_days. The spec says "within 90 days" (inclusive). A conversion exactly on day 90 was previously excluded; now correctly labeled positive.Test results
1439 passed, 5 skipped — all green. Ruff lint + format clean.
Bundle regeneration
All three tiers regenerated with seed=42:
validate_release_candidate.py: PASS — 3 tiers, 5 seeds, 0 leakage findingsAUC delta from snapshot fix
Pre-fix (old): intro LR 0.879 / GBM n/a, intermediate LR 0.886, advanced LR 0.886
Post-fix (new): intro LR 0.8788 / GBM 0.8729, intermediate LR 0.8859 / GBM 0.8755, advanced LR 0.8861 / GBM 0.8726
The AUC is effectively unchanged. The opportunity leak was correcting a categorical feature (has_open_opportunity) rather than a continuous predictor, and the simulation DGP means the mis-labeled opportunity-open flags are uncorrelated with conversion in aggregate. Correctness of the feature semantics is restored regardless of the marginal AUC impact.
🤖 Generated with Claude Code