feat: Milestone 8 — render/bundle layer and end-to-end Generator.generate()#13
Conversation
…rate() Implements the full observation model that transforms hidden simulation output into the canonical output bundle structure. - render/relational.py: to_dataframes() — 9 typed DataFrames from SimulationResult + PopulationResult - render/snapshots.py: build_snapshot() — 30-column leakage-free lead snapshot; touch/session/ activity aggregates, account/contact field joins, vectorised pandas operations - render/tasks.py: write_task_splits() — deterministic 70/15/15 train/valid/test Parquet split + task_manifest.json; seeded random shuffle guarantees reproducibility - render/manifests.py: build_manifest() / write_manifest() — manifest.json with full provenance, row counts, and SHA-256 file hashes - api/bundle.py: write_bundle() — orchestrates all render steps in order - core/models.py: WorldBundle enriched with population/simulation_result/world_graph fields; WorldBundle.save(path) delegates to write_bundle() via lazy import (avoids circular import) - api/generator.py: Generator.generate() fully implemented end-to-end - tests/render/test_render.py: 31 tests covering all render modules and full bundle write smoke test - tests/api/test_generator.py: replaced stale NotImplementedError test with WorldBundle assertion 521 tests passing; ruff + mypy clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
Ten issues identified and fixed:
snapshots.py:
- Replace row-wise apply() for account/contact joins with vectorised merge;
eliminates ~40k Python function calls for a 5k-lead dataset
- Remove silent except (ValueError, TypeError): pass on dtype casts;
data-incompatible casts must fail loudly
- Eliminate duplicate hardcoded fallback column lists in if/else empty-list
guards; use entity empty_dataframe() instead so column names are always
authoritative
relational.py:
- Add EntityRowProtocol to schema/entities.py and use it as the type for
_TABLE_SOURCES values; removes both # type: ignore[attr-defined] comments
- Replace magic "population"/"simulation" strings with Literal["population",
"simulation"] alias; typos are now caught at type-check time
tasks.py:
- Replace raw random.Random(seed) with RNGRoot(seed).child("task_split_shuffle")
to honour the project's single-seeded-root design
- Remove dangling comment left after previous ruff fix
bundle.py:
- Move render_dataset_card import to module level (was buried inside function
body with no circular-import justification)
tests/render/test_render.py:
- test_unpopulated_bundle_raises: use tmp_path fixture instead of hardcoded /tmp
- test_inbound_plus_outbound_le_total → _equals_total: assert == not <=
(only two directions exist in v1; < would indicate miscategorised touches)
- Replace vague "no leakage" test with test_no_post_anchor_columns_in_snapshot:
asserts that conversion_timestamp, closed_at, close_outcome are absent
- Add test_fk_integrity: calls validate_fk() on all ALL_CONSTRAINTS against
the produced DataFrames — the core correctness property of relational export
523 tests passing; ruff + mypy clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Implements Milestone 8’s render/bundle layer so hidden simulation outputs can be rendered into canonical on-disk bundles, and wires Generator.generate() end-to-end to produce a fully populated WorldBundle.
Changes:
- Added render modules to export relational tables, build the lead snapshot, create deterministic task splits, and write provenance manifests.
- Added bundle orchestration (
write_bundle) andWorldBundle.save()delegation; implementedGenerator.generate()end-to-end. - Added comprehensive render/bundle tests and updated generator tests to assert
generate()returns a populatedWorldBundle.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
leadforge/render/relational.py |
Converts SimulationResult + PopulationResult into typed relational DataFrames. |
leadforge/render/snapshots.py |
Builds the canonical leakage-free lead snapshot DataFrame used for ML/task export. |
leadforge/render/tasks.py |
Writes deterministic train/valid/test Parquet splits plus a task manifest. |
leadforge/render/manifests.py |
Builds/writes bundle manifest.json with row counts and SHA-256 hashes. |
leadforge/api/bundle.py |
Orchestrates full bundle writing pipeline (tables → snapshot → tasks → docs → manifest). |
leadforge/api/generator.py |
Implements Generator.generate() to produce a populated WorldBundle. |
leadforge/core/models.py |
Extends WorldBundle to hold generation artefacts and adds save() method. |
tests/render/test_render.py |
Adds end-to-end and unit coverage for render/bundle components. |
tests/api/test_generator.py |
Updates generator test to expect a populated WorldBundle from generate(). |
tests/render/__init__.py |
Establishes tests.render package. |
.agent-plan.md |
Updates planning/status notes to reflect Milestone 8 completion and Milestone 9 next steps. |
Comments suppressed due to low confidence (2)
leadforge/render/snapshots.py:195
- These account/contact joins use
DataFrame.apply(..., axis=1)per field, which is O(n) Python-level work and will become a bottleneck asn_leadsgrows. Prefer vectorized joins (buildaccounts_dfkeyed byaccount_idandcontacts_dfkeyed bycontact_idandmergeonce) orSeries.mapfrom id→value dicts for each field.
# -------------------------------------------------------------------
acct_df = pd.DataFrame([a.to_dict() for a in population.accounts])[_ACCOUNT_JOIN_COLS]
cont_df = pd.DataFrame([c.to_dict() for c in population.contacts])[_CONTACT_JOIN_COLS]
lead_df = lead_df.merge(acct_df, on="account_id", how="left")
lead_df = lead_df.merge(cont_df, on="contact_id", how="left")
# -------------------------------------------------------------------
# Select and order columns per canonical feature spec; apply dtypes.
# -------------------------------------------------------------------
snapshot = lead_df[_SNAPSHOT_COLUMNS].copy()
for col, dtype in _SNAPSHOT_DTYPES.items():
if col in snapshot.columns:
snapshot[col] = snapshot[col].astype(dtype)
return snapshot
leadforge/render/snapshots.py:208
- The dtype enforcement loop suppresses
ValueError/TypeErrorduringastype()and continues silently. If a column can’t be cast to the canonical feature dtype, this will hide schema drift and can break downstream consumers unexpectedly. Consider either letting the exception propagate or raising a clearer error that includes the column name and intended dtype.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Nine issues fixed: snapshots.py: - Derive _ACCOUNT_JOIN_COLS / _CONTACT_JOIN_COLS from LEAD_SNAPSHOT_FEATURES categories instead of maintaining a separate hand-written list; new account or contact features in the spec are now included automatically - Consolidate dtype application: replace eight individual fillna+astype lines with a single bulk fillna block, then rely on the final dtype loop as the sole cast; removes duplicate astype calls on the same columns - Vectorise days_since_last_touch: pd.to_datetime returns NaT for nulls and (Timestamp - NaT) yields NaN, so the has_touch.any() branch is unnecessary; one expression replaces six lines relational.py: - Replace opaque 3-tuple with _TableSource NamedTuple (cls, origin, attr); each field is named at the call site - Drop getattr default []: a typo in _TABLE_SOURCES now raises AttributeError immediately instead of silently returning empty DataFrames manifests.py: - Remove `if abs_path.exists() else ""` guard on _sha256 calls; a missing Parquet file is a bug that should raise, not silently produce a corrupt empty-hash manifest bundle.py: - Replace hardcoded "converted_within_90_days" string with CONVERTED_WITHIN_90_DAYS.task_id; one source of truth for the task ID tasks.py: - Move detached comment inline next to the "test" slice it describes tests/render/test_render.py: - _make_narrative() now accepts and forwards seed so determinism test helpers (_snap, _run) no longer lie about which seed they're using 523 tests passing; ruff + mypy clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
COPILOT-1 (tasks.py — random.Random): already fixed in prior commit;
now uses RNGRoot(seed).child("task_split_shuffle").
COPILOT-2 (manifests.py — empty SHA on missing table file): already fixed;
_sha256() is called unconditionally, raises FileNotFoundError if absent.
COPILOT-3 (manifests.py — same for task split files, outdated): same fix
covers both table and task SHA calls.
COPILOT-4 (manifests.py — misleading module docstring): fixed by narrowing
the docstring to accurately describe what the manifest covers (Parquet
data files only — relational tables and task splits). Expanding to hash
dataset_card.md / feature_dictionary.csv / task_manifest.json is not
required by the architecture spec and is deferred.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Implements Milestone 8’s render/bundle “observation model” so a full Generator.generate() run can be rendered into a canonical on-disk bundle (Parquet tables, snapshot/task splits, and a provenance/integrity manifest).
Changes:
- Add render layer modules for relational export, lead snapshot construction, deterministic task splitting, and manifest hashing/writing.
- Add bundle writer orchestration and implement
Generator.generate()end-to-end, enrichingWorldBundleto carry simulation outputs. - Add comprehensive render/bundle tests and update generator tests to assert a populated
WorldBundle.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
leadforge/render/relational.py |
Converts SimulationResult/PopulationResult into typed DataFrames for each relational table |
leadforge/render/snapshots.py |
Builds the ML-ready, leakage-free lead snapshot with vectorized aggregates + joins |
leadforge/render/tasks.py |
Writes deterministic train/valid/test Parquet splits and a task_manifest.json |
leadforge/render/manifests.py |
Builds/writes manifest.json with row counts and SHA-256 hashes |
leadforge/api/bundle.py |
Orchestrates full bundle writing pipeline (tables → snapshot/tasks → docs → manifest) |
leadforge/api/generator.py |
Implements Generator.generate() to produce a populated WorldBundle |
leadforge/core/models.py |
Extends WorldBundle and adds save() that delegates to bundle writer |
leadforge/schema/entities.py |
Adds EntityRowProtocol and tightens typing for the entity registry |
tests/render/test_render.py |
New end-to-end tests for render layer correctness, determinism, and bundle outputs |
tests/api/test_generator.py |
Updates generator test to validate generate() returns a populated bundle |
.agent-plan.md |
Updates milestone status/roadmap notes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This comment has been minimized.
This comment has been minimized.
COPILOT-1 (generator.py — difficulty default silently overrides recipe): Default difficulty to _MISSING instead of DifficultyProfile.intermediate; the override is only applied when the caller explicitly passes a value. Generator.from_recipe(..., difficulty="advanced").generate() now keeps the advanced profile rather than silently reverting to intermediate. Added two regression tests. COPILOT-2 (models.py — WorldBundle fields typed Any): Import SimulationResult, PopulationResult, WorldGraph under TYPE_CHECKING and annotate the three fields with their concrete union types. With from __future__ import annotations the imports are lazy at runtime so there is no circular import risk. FAIL-1 (PR agent context refresh startup_failure): Infrastructure-level runner failure unrelated to code; resolved as irrelevant. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
pr-agent-context report: No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR
#13. Treat this PR as all clear unless new signals appear.Run metadata: |
Fold the brutal self-review's findings back into the PR before review. Bugs: - (#1) run_packager validate→write order — both packagers wrote README/metadata on validation failure, leaving corrupt artifacts on disk that would silently get committed. Gated on `errors == ()`; added no-write tests for both packagers. - (#2) Instructor README inlined the public 3-tier README into a 1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY` constant that links to the public dataset and describes only the instructor-specific additions (full-horizon tables, hidden DAG, latent registry, mechanism summary). - (#3) validate_upload_dir_safe also blocks strict descendants of release_dir; `--huggingface-dir release/intro` would otherwise rmtree the intro bundle. Architecture: - (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK, validate_readme_substitution, replace_file, replace_dir, load_manifest now live in scripts/_release_common.py. Both packagers reduced to imports. - (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump + a 4-line _IndentedDumper subclass. - (#7) Removed dead --owner / --dataset-slug CLI flags. - (#8) assemble_upload_dir now takes rendered_readme and writes it. - (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat preflight via _assert_tier_dir_exists. - (#10) --default-config with --variant=instructor errors loudly. CI: - (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the gated G12.3 / G12.4 / G11.3 tests install in one line. Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path, dead tag_lines), em-dash YAML round-trip parametrised for the instructor pretty_name. Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean; hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every tier; validate_release_candidate --no-rebuild exits 0. release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata .json|README.md regenerated; audit-artifact-sync tests guard them. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2: HuggingFace release packager + load_dataset smoke test
Add `scripts/package_hf_release.py` to generate `release/huggingface/README.md`
with G12.1-compliant YAML frontmatter (pretty_name, license, language,
task_categories, size_categories, tags, three configs with `default: true`
on intermediate per G12.2), inlining the rewritten `release/README.md`
body with HF-specific link rewrites. `--variant=instructor` packages the
companion repo (G12.4) from `release/intermediate_instructor/` into a
separate `release/huggingface-instructor/` upload tree. G12.3 covered
by a parametrised `load_dataset()` smoke test gated on the optional
`datasets` SDK.
Extract shared release-packaging primitives (link rewriter, dir-safety
guard, cover-image validator) into `scripts/_release_common.py`; refactor
the Kaggle packager to import them. `release/kaggle/dataset-metadata.json`
is byte-stable across the refactor.
Delete the legacy `release/HF_DATASET_CARD.md` stub — superseded by the
generated card. Gitignore `release/huggingface{,-instructor}/*` except
the committed README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 self-review fixes (Kaggle + HF packagers)
Fold the brutal self-review's findings back into the PR before review.
Bugs:
- (#1) run_packager validate→write order — both packagers wrote
README/metadata on validation failure, leaving corrupt artifacts on
disk that would silently get committed. Gated on `errors == ()`;
added no-write tests for both packagers.
- (#2) Instructor README inlined the public 3-tier README into a
1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY`
constant that links to the public dataset and describes only the
instructor-specific additions (full-horizon tables, hidden DAG,
latent registry, mechanism summary).
- (#3) validate_upload_dir_safe also blocks strict descendants of
release_dir; `--huggingface-dir release/intro` would otherwise
rmtree the intro bundle.
Architecture:
- (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK,
validate_readme_substitution, replace_file, replace_dir,
load_manifest now live in scripts/_release_common.py. Both
packagers reduced to imports.
- (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump
+ a 4-line _IndentedDumper subclass.
- (#7) Removed dead --owner / --dataset-slug CLI flags.
- (#8) assemble_upload_dir now takes rendered_readme and writes it.
- (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat
preflight via _assert_tier_dir_exists.
- (#10) --default-config with --variant=instructor errors loudly.
CI:
- (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the
gated G12.3 / G12.4 / G11.3 tests install in one line.
Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path,
dead tag_lines), em-dash YAML round-trip parametrised for the
instructor pretty_name.
Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean;
hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every
tier; validate_release_candidate --no-rebuild exits 0.
release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata
.json|README.md regenerated; audit-artifact-sync tests guard them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 Copilot-review fixes (Kaggle + HF packagers)
Fold Copilot's two real findings on the self-review revision back in.
COPILOT-1 — validate_upload_dir_safe was only invoked inside
assemble_upload_dir, which --dry-run skips. A dry-run with
--huggingface-dir release (or .) would write the README into the
unsafe path BEFORE the safety net fired. Hoist the check into
run_packager (both packagers) so it runs before any mkdir or write;
the inner assemble_upload_dir call stays as defence-in-depth for
direct callers. New tests: dry-run with unsafe upload-dir raises
without writing; the same path through main() returns rc=2.
COPILOT-2 — Cover-image path resolution was inconsistent:
validate_cover_image used cover_image as passed, while
assemble_upload_dir did a separate ``release_dir / cover_image.name``
fallback. Diverged for bare-basename inputs (false validation
failures) and two-paths-sharing-a-basename (assembler shadowing the
explicit path). Added resolve_cover_image_path() to
_release_common.py (explicit-wins, release-dir fallback);
run_packager calls it once and threads the resolved path through
validation, the metadata's image field, and assembly. New
tests/scripts/test_release_common.py covers the four resolution
branches; new packager-side tests confirm bare-basename success +
metadata field plumbing.
COPILOT-3 — outdated; already addressed by self-review fix #8 in
commit f2fc4a2. Resolved as already treated; no code change.
Verification: 1232/1232 tests pass + 5 gated skips; ruff + mypy
clean; hash determinism PASS 67/67; leakage probes rc=0 on every
tier; validate_release_candidate --no-rebuild exits 0;
BUNDLE_SCHEMA_VERSION unchanged at 5.
release/{kaggle,huggingface,huggingface-instructor}/* artifacts
regenerated byte-identically.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…dataframes history (self-review) [LTV-Pf.2] Self-review caught that the relational.py split lost git history and left two files named relational.py. Both files shared the basename, so git's same-path heuristic recorded the scheme file (which inherits to_dataframes + _TABLE_SOURCES, the bulk of the original) as a brand-new ADD and the gutted shared remnant as a MODIFY — breaking `git blame`/`log --follow` on to_dataframes, and contradicting the history-preservation standard Pf.1 set. Fix: rename the shared writer module leadforge/render/relational.py → leadforge/render/relational_io.py. With the basename collision gone, git now records the scheme file as R067 (rename of the original — to_dataframes history preserved back to M8 #13) and relational_io.py as a clean ADD. Bonus: no more ambiguous duplicate `relational.py` (the shared writer is scheme-agnostic and isn't "relational"-specific anyway). - Importers of write_relational_tables updated (scheme __init__, test). - test_module_layout.py: assert the shared writer is at render.relational_io and the flat `leadforge.render.relational` module is gone. - CHANGELOG "Moved" table + CLAUDE.md (both layout sections) updated. Verified byte-identical (14/14); full suite 1532 passed / 51 skipped; ruff + mypy clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… [LTV-Pf.2] (#110) * refactor: move lead-scoring render under schemes/lead_scoring/render/ [LTV-Pf.2] Second half of the physical reorg (LTV-Pf). Relocates the lead-scoring render modules under the scheme and splits the one genuinely-shared piece out of the envelope. Hard break, no shims (D12); byte-identical output. - git mv snapshots.py, relational_snapshot_safe.py, tasks.py → leadforge/schemes/lead_scoring/render/. - Split render/relational.py: the 9-table assembler `to_dataframes` (+ _TABLE_SOURCES) moved to schemes/lead_scoring/render/relational.py; the scheme-agnostic `write_relational_tables` stays in leadforge/render/relational.py (now a small shared writer module). `leadforge.render` remains the shared envelope (write_relational_tables + manifests). - Rewrote importers: the three moved modules repo-wide; `to_dataframes` imports to the scheme path; `write_relational_tables` import unchanged. - CHANGELOG "Moved" table extended; CLAUDE.md Repository Map + canonical layout updated; roadmap Pf.2 + agent-plan updated. - tests/schemes/test_module_layout.py: render modules added to the moved set; new tests assert the shared envelope stays and the relational split landed (to_dataframes in scheme, not in shared render). Verified byte-identical to pre-reorg main (14/14 files); full suite 1524 passed / 51 skipped; ruff + mypy clean (89 source files); scripts compile. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * docs(ltv): record LTV-Pf.2 (#110) in roadmap + agent-plan [LTV-Pf.2] Link PR #110; advance status. Next: LTV-Pg (scaffold schemes/lifecycle/). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(render): rename shared writer to relational_io; preserve to_dataframes history (self-review) [LTV-Pf.2] Self-review caught that the relational.py split lost git history and left two files named relational.py. Both files shared the basename, so git's same-path heuristic recorded the scheme file (which inherits to_dataframes + _TABLE_SOURCES, the bulk of the original) as a brand-new ADD and the gutted shared remnant as a MODIFY — breaking `git blame`/`log --follow` on to_dataframes, and contradicting the history-preservation standard Pf.1 set. Fix: rename the shared writer module leadforge/render/relational.py → leadforge/render/relational_io.py. With the basename collision gone, git now records the scheme file as R067 (rename of the original — to_dataframes history preserved back to M8 #13) and relational_io.py as a clean ADD. Bonus: no more ambiguous duplicate `relational.py` (the shared writer is scheme-agnostic and isn't "relational"-specific anyway). - Importers of write_relational_tables updated (scheme __init__, test). - test_module_layout.py: assert the shared writer is at render.relational_io and the flat `leadforge.render.relational` module is gone. - CHANGELOG "Moved" table + CLAUDE.md (both layout sections) updated. Verified byte-identical (14/14); full suite 1532 passed / 51 skipped; ruff + mypy clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Implements the full observation model (Milestone 8) that transforms hidden simulation output into the canonical output bundle on disk.
render/relational.py:to_dataframes()— convertsSimulationResult+PopulationResultto 9 typedpd.DataFrames, one per relational tablerender/snapshots.py:build_snapshot()— 30-column leakage-free lead snapshot; vectorised touch/session/activity aggregates + account/contact field joins;has_open_opportunityuses pandas.where()to avoid deprecation warningsrender/tasks.py:write_task_splits()— deterministic 70/15/15 train/valid/test Parquet split +task_manifest.jsonusing seeded shufflerender/manifests.py:build_manifest()/write_manifest()— full provenance manifest with row counts and SHA-256 file hashesapi/bundle.py:write_bundle()— orchestrates all render steps (tables → snapshot → task splits → dataset card → feature dictionary → manifest)core/models.py:WorldBundleenriched withpopulation,simulation_result,world_graph;WorldBundle.save(path)lazy-importswrite_bundleto avoid circular importsapi/generator.py:Generator.generate()fully implemented end-to-endtests/render/test_render.py: 31 tests covering relational export, snapshot correctness, task splits (disjointness, ratios, determinism), manifest integrity, and full bundle smoke testtest_generate_not_implementedwith aWorldBundleassertion521 tests passing; ruff + mypy clean.
Test plan
pytest tests/render/test_render.py— 31 new render tests passpytest— 521 total tests pass (no regressions)ruff check .— cleanruff format --check .— cleanmypy leadforge/— no issues in 63 source files🤖 Generated with Claude Code