feat: Milestone 2 — narrative layer (NarrativeSpec, WorldSpec, dataset card)#5
Merged
Conversation
…t card) - narrative/spec.py: frozen dataclasses for NarrativeSpec hierarchy (CompanySpec, ProductSpec, MarketSpec, GtmMotionSpec, PersonaSpec, FunnelStageSpec) with validated from_dict() classmethods - narrative/dataset_card.py: render_dataset_card() produces Markdown dataset card from WorldSpec (header, narrative summary, task, stubs for table inventory and feature categories, use cases, caveats) - core/models.py: WorldSpec.narrative field (NarrativeSpec | None) - api/generator.py: world_spec property; from_recipe() resolves the recipe's narrative.yaml into a NarrativeSpec and populates WorldSpec - 51 new tests covering spec validation, card rendering, and Generator integration (110 total); ruff + mypy clean Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Introduces Milestone 2’s “narrative layer” by adding typed, validated narrative spec models, wiring them into WorldSpec/Generator, and providing a Markdown dataset-card renderer with accompanying tests.
Changes:
- Add frozen dataclass hierarchy for
NarrativeSpecand sub-specs withfrom_dict()parsing/validation. - Add
render_dataset_card(WorldSpec) -> strto produce a Markdown dataset card (with stubs for future milestones). - Extend
WorldSpecwith an optionalnarrativefield and haveGenerator.from_recipe()populate/expose aworld_spec.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
leadforge/narrative/spec.py |
New narrative datamodels + validation helpers. |
leadforge/narrative/dataset_card.py |
Markdown dataset card renderer based on WorldSpec. |
leadforge/core/models.py |
Add WorldSpec.narrative (optional) + docstring update. |
leadforge/api/generator.py |
Populate/expose world_spec (incl. narrative) from recipes; update constructor. |
tests/narrative/test_spec.py |
Tests for parsing/validation, immutability, and real YAML parsing. |
tests/narrative/test_dataset_card.py |
Tests for dataset card rendering and generator integration. |
tests/narrative/__init__.py |
New test package marker. |
.agent-plan.md |
Project plan updated to reflect Milestone 2 completion and Milestone 3 next steps. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This comment has been minimized.
This comment has been minimized.
- spec.py: _require_keys now guards against non-dict input (COPILOT-3) - spec.py: NarrativeSpec.from_dict validates each personas/funnel_stages element is a dict before passing to sub-from_dict (COPILOT-3) - spec.py: GtmMotionSpec.from_dict validates channels is a list of strings, rejects bools for share floats, and enforces [0, 1] range (COPILOT-1) - spec.py: PersonaSpec.from_dict validates title_variants is a list of strings instead of silently splitting a bare string (COPILOT-2) - spec.py: ProductSpec.from_dict requires free_trial_available / demo_available to be actual bools; rejects int/str coercion (COPILOT-6) - spec.py: MarketSpec.from_dict validates icp_industries and geographies are lists of strings (COPILOT-7) - generator.py: Generator.__init__ takes only world_spec; config property derives from world_spec.config (single source of truth) (COPILOT-4) - dataset_card.py: stub text changed to "Narrative unavailable for this dataset." (COPILOT-5); test updated to match Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
pr-agent-context report: This is a refreshed snapshot of the current PR state.
This run includes an unresolved review comment on PR #5.
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/narrative/dataset_card.py:103
URL: https://github.com/leadforge-dev/leadforge/pull/5#discussion_r3115717615
Root author: copilot-pull-request-reviewer
Comment:
The "Primary task" section is hard-coded to `converted_within_90_days` and a 90-day label definition. That matches the current shipped recipe, but it will drift once additional recipes/tasks exist (or if task parameters become configurable). Consider carrying `primary_task` (and any label-window parameter) into `WorldSpec` during `Generator.from_recipe()` so the dataset card renders from resolved spec rather than literals.
~~~suggestion
primary_task = getattr(
world_spec,
"primary_task",
getattr(cfg, "primary_task", "converted_within_90_days"),
)
label_window_days = getattr(
world_spec,
"label_window_days",
getattr(cfg, "label_window_days", 90),
)
lines += [
"## Primary task",
"",
f"**Task:** `{primary_task}`",
"",
"**Label definition:** A lead is considered converted if a `closed_won` event "
f"is recorded within {label_window_days} days of the lead's snapshot anchor date. "
~~~Run metadata: |
shaypal5
added a commit
that referenced
this pull request
May 6, 2026
Fold the brutal self-review's findings back into the PR before review. Bugs: - (#1) run_packager validate→write order — both packagers wrote README/metadata on validation failure, leaving corrupt artifacts on disk that would silently get committed. Gated on `errors == ()`; added no-write tests for both packagers. - (#2) Instructor README inlined the public 3-tier README into a 1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY` constant that links to the public dataset and describes only the instructor-specific additions (full-horizon tables, hidden DAG, latent registry, mechanism summary). - (#3) validate_upload_dir_safe also blocks strict descendants of release_dir; `--huggingface-dir release/intro` would otherwise rmtree the intro bundle. Architecture: - (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK, validate_readme_substitution, replace_file, replace_dir, load_manifest now live in scripts/_release_common.py. Both packagers reduced to imports. - (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump + a 4-line _IndentedDumper subclass. - (#7) Removed dead --owner / --dataset-slug CLI flags. - (#8) assemble_upload_dir now takes rendered_readme and writes it. - (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat preflight via _assert_tier_dir_exists. - (#10) --default-config with --variant=instructor errors loudly. CI: - (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the gated G12.3 / G12.4 / G11.3 tests install in one line. Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path, dead tag_lines), em-dash YAML round-trip parametrised for the instructor pretty_name. Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean; hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every tier; validate_release_candidate --no-rebuild exits 0. release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata .json|README.md regenerated; audit-artifact-sync tests guard them. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
4 tasks
shaypal5
added a commit
that referenced
this pull request
May 6, 2026
* PR 5.2: HuggingFace release packager + load_dataset smoke test
Add `scripts/package_hf_release.py` to generate `release/huggingface/README.md`
with G12.1-compliant YAML frontmatter (pretty_name, license, language,
task_categories, size_categories, tags, three configs with `default: true`
on intermediate per G12.2), inlining the rewritten `release/README.md`
body with HF-specific link rewrites. `--variant=instructor` packages the
companion repo (G12.4) from `release/intermediate_instructor/` into a
separate `release/huggingface-instructor/` upload tree. G12.3 covered
by a parametrised `load_dataset()` smoke test gated on the optional
`datasets` SDK.
Extract shared release-packaging primitives (link rewriter, dir-safety
guard, cover-image validator) into `scripts/_release_common.py`; refactor
the Kaggle packager to import them. `release/kaggle/dataset-metadata.json`
is byte-stable across the refactor.
Delete the legacy `release/HF_DATASET_CARD.md` stub — superseded by the
generated card. Gitignore `release/huggingface{,-instructor}/*` except
the committed README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 self-review fixes (Kaggle + HF packagers)
Fold the brutal self-review's findings back into the PR before review.
Bugs:
- (#1) run_packager validate→write order — both packagers wrote
README/metadata on validation failure, leaving corrupt artifacts on
disk that would silently get committed. Gated on `errors == ()`;
added no-write tests for both packagers.
- (#2) Instructor README inlined the public 3-tier README into a
1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY`
constant that links to the public dataset and describes only the
instructor-specific additions (full-horizon tables, hidden DAG,
latent registry, mechanism summary).
- (#3) validate_upload_dir_safe also blocks strict descendants of
release_dir; `--huggingface-dir release/intro` would otherwise
rmtree the intro bundle.
Architecture:
- (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK,
validate_readme_substitution, replace_file, replace_dir,
load_manifest now live in scripts/_release_common.py. Both
packagers reduced to imports.
- (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump
+ a 4-line _IndentedDumper subclass.
- (#7) Removed dead --owner / --dataset-slug CLI flags.
- (#8) assemble_upload_dir now takes rendered_readme and writes it.
- (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat
preflight via _assert_tier_dir_exists.
- (#10) --default-config with --variant=instructor errors loudly.
CI:
- (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the
gated G12.3 / G12.4 / G11.3 tests install in one line.
Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path,
dead tag_lines), em-dash YAML round-trip parametrised for the
instructor pretty_name.
Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean;
hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every
tier; validate_release_candidate --no-rebuild exits 0.
release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata
.json|README.md regenerated; audit-artifact-sync tests guard them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 Copilot-review fixes (Kaggle + HF packagers)
Fold Copilot's two real findings on the self-review revision back in.
COPILOT-1 — validate_upload_dir_safe was only invoked inside
assemble_upload_dir, which --dry-run skips. A dry-run with
--huggingface-dir release (or .) would write the README into the
unsafe path BEFORE the safety net fired. Hoist the check into
run_packager (both packagers) so it runs before any mkdir or write;
the inner assemble_upload_dir call stays as defence-in-depth for
direct callers. New tests: dry-run with unsafe upload-dir raises
without writing; the same path through main() returns rc=2.
COPILOT-2 — Cover-image path resolution was inconsistent:
validate_cover_image used cover_image as passed, while
assemble_upload_dir did a separate ``release_dir / cover_image.name``
fallback. Diverged for bare-basename inputs (false validation
failures) and two-paths-sharing-a-basename (assembler shadowing the
explicit path). Added resolve_cover_image_path() to
_release_common.py (explicit-wins, release-dir fallback);
run_packager calls it once and threads the resolved path through
validation, the metadata's image field, and assembly. New
tests/scripts/test_release_common.py covers the four resolution
branches; new packager-side tests confirm bare-basename success +
metadata field plumbing.
COPILOT-3 — outdated; already addressed by self-review fix #8 in
commit f2fc4a2. Resolved as already treated; no code change.
Verification: 1232/1232 tests pass + 5 gated skips; ruff + mypy
clean; hash determinism PASS 67/67; leakage probes rc=0 on every
tier; validate_release_candidate --no-rebuild exits 0;
BUNDLE_SCHEMA_VERSION unchanged at 5.
release/{kaggle,huggingface,huggingface-instructor}/* artifacts
regenerated byte-identically.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
leadforge/narrative/spec.py— frozen dataclasses for the fullNarrativeSpechierarchy:CompanySpec,ProductSpec,MarketSpec,GtmMotionSpec,PersonaSpec,FunnelStageSpec. Each has a validatedfrom_dict()classmethod that rejects bools masquerading as ints, enforces list-pair shapes, and raisesInvalidRecipeErroron bad input.leadforge/narrative/dataset_card.py—render_dataset_card(world_spec: WorldSpec) -> strproducing a Markdown card with: metadata header table, narrative summary (vendor/product/market/GTM/personas), primary task + label definition, stub sections for table inventory and feature categories (v0.3.0+), suggested use cases, and caveats.leadforge/core/models.py—WorldSpecgains anarrative: NarrativeSpec | None = Nonefield.leadforge/api/generator.py—Generator.from_recipe()loads and parsesnarrative.yamlinto aNarrativeSpec, wraps it in aWorldSpec, and exposes it via theworld_specproperty.tests/narrative/test_spec.pyandtests/narrative/test_dataset_card.py— covering sub-model validation, bool rejection, frozen-dataclass immutability, real YAML round-trip, card content assertions, and Generator integration. Total: 110 tests passing.Test plan
pytest— 110 tests passruff check . && ruff format --check .— cleanmypy leadforge/— no errorstest_real_narrative_yaml_parses— verifiesb2b_saas_procurement_v1/narrative.yamlround-tripstest_card_with_narrative_contains_company_name— verifies card contains "Veridian Technologies"test_narrative_spec_frozen— confirmsFrozenInstanceErroron mutation attempt🤖 Generated with Claude Code