feat: Milestone 0 — project foundation and package skeleton (v0.1.0)#3
Bootstraps the leadforge codebase from empty to a fully installable, testable, lint-clean package skeleton. All Milestone 0 acceptance criteria pass: `pip install -e .` works, `leadforge --help` shows all four commands, `leadforge list-recipes` returns the v1 recipe, and CI is configured.

Package scaffold
- pyproject.toml: setuptools build, Typer+PyYAML runtime deps, ruff/mypy/pytest dev deps, `leadforge` CLI entry point
- Full subpackage skeleton with __init__.py stubs for every module in the canonical layout (api, cli, core, narrative, schema, structure, mechanisms, simulation, render, exposure, validation, recipes)
- leadforge/version.py: __version__ = "0.1.0"
- README.md: install, quickstart, API snippet, doc links
- .pre-commit-config.yaml: ruff (lint+format) + pre-commit-hooks

Core primitives (leadforge/core/)
- enums.py: ExposureMode (StrEnum), DifficultyProfile (StrEnum)
- exceptions.py: LeadforgeError base + 6 typed subclasses
- models.py: GenerationConfig, WorldSpec, WorldBundle dataclass stubs
- rng.py, ids.py: documented stubs for Milestone 1

Recipe system (leadforge/recipes/)
- registry.py: list_recipes() + load_recipe() reading from YAML files
- b2b_saas_procurement_v1/recipe.yaml: id, title, primary_task, supported_modes, supported_difficulty, default_population

CLI (leadforge/cli/)
- main.py: Typer app with --version flag and four registered commands
- commands/list_recipes.py: fully implemented with Rich table output
- commands/generate.py: stub with full option spec (--recipe, --seed, --mode, --out, --difficulty, --n-accounts, --n-contacts, --n-leads, --horizon-days, --override); exits 1 with "coming in v0.2.0"
- commands/inspect.py, validate.py: stubs with correct argument spec

CI (.github/workflows/ci.yml)
- Three jobs: lint (ruff check+format), typecheck (mypy), test matrix (Python 3.11 + 3.12 with pytest-cov + coverage artifact upload for pr-agent-context integration)

Tests (20 passing)
- tests/test_cli.py: help, version, list-recipes output, stub exit codes
- tests/core/test_enums.py: values and string construction
- tests/core/test_exceptions.py: hierarchy and message preservation
- tests/recipes/test_registry.py: list/load, required fields, error case

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Bootstraps the leadforge repository into an installable Python package with a Typer-based CLI, an initial YAML-backed recipe registry, CI/tooling configuration, and a baseline test suite to support future milestones.
Changes:
- Added project packaging/tooling (`pyproject.toml`, pre-commit, CI) and a full package/module skeleton.
- Implemented a first working CLI command (`list-recipes`) plus stubs for `generate`, `inspect`, `validate`.
- Added a YAML recipe registry + an initial `b2b_saas_procurement_v1` recipe and accompanying tests.
Reviewed changes
Copilot reviewed 23 out of 44 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_cli.py | CLI smoke tests for help/version and command stubs |
| tests/recipes/test_registry.py | Tests for recipe listing/loading and error case |
| tests/recipes/__init__.py | Test package marker |
| tests/core/test_exceptions.py | Exception hierarchy tests |
| tests/core/test_enums.py | Enum value/from-string tests |
| tests/core/__init__.py | Test package marker |
| tests/__init__.py | Test package marker |
| pyproject.toml | Packaging metadata, deps, ruff/mypy/pytest configuration |
| leadforge/version.py | Defines __version__ |
| leadforge/validation/__init__.py | Subpackage stub |
| leadforge/structure/__init__.py | Subpackage stub |
| leadforge/simulation/__init__.py | Subpackage stub |
| leadforge/schema/__init__.py | Subpackage stub |
| leadforge/sample_data/public/.gitkeep | Placeholder for sample data |
| leadforge/sample_data/instructor/.gitkeep | Placeholder for sample data |
| leadforge/render/__init__.py | Subpackage stub |
| leadforge/recipes/registry.py | Implements recipe discovery/loading from YAML |
| leadforge/recipes/b2b_saas_procurement_v1/recipe.yaml | Adds initial recipe metadata YAML |
| leadforge/recipes/b2b_saas_procurement_v1/__init__.py | Subpackage stub |
| leadforge/recipes/__init__.py | Subpackage stub |
| leadforge/narrative/__init__.py | Subpackage stub |
| leadforge/mechanisms/__init__.py | Subpackage stub |
| leadforge/exposure/__init__.py | Subpackage stub |
| leadforge/examples/notebooks/.gitkeep | Placeholder for examples |
| leadforge/examples/configs/.gitkeep | Placeholder for examples |
| leadforge/core/rng.py | RNG utilities stub docstring |
| leadforge/core/models.py | Dataclass stubs for config/spec/bundle |
| leadforge/core/ids.py | ID scheme stub docstring |
| leadforge/core/exceptions.py | Defines project exception hierarchy |
| leadforge/core/enums.py | Defines ExposureMode / DifficultyProfile |
| leadforge/core/__init__.py | Core subpackage stub |
| leadforge/cli/main.py | Typer app entrypoint + command registration |
| leadforge/cli/commands/validate.py | validate command stub |
| leadforge/cli/commands/list_recipes.py | list-recipes command implementation (Rich table) |
| leadforge/cli/commands/inspect.py | inspect command stub |
| leadforge/cli/commands/generate.py | generate command stub + option surface |
| leadforge/cli/commands/__init__.py | Commands subpackage stub |
| leadforge/cli/__init__.py | CLI subpackage stub |
| leadforge/api/__init__.py | API subpackage stub |
| leadforge/__init__.py | Package init exporting __version__ |
| README.md | Project README with install/quickstart/docs links |
| .pre-commit-config.yaml | Pre-commit hooks (ruff, formatting, basic checks) |
| .github/workflows/ci.yml | CI for lint/typecheck/tests with coverage artifacts |
| .agent-plan.md | Updates project plan to mark M0 complete / M1 next |
COPILOT-1: Add Generator stub to leadforge/api
- leadforge/api/generator.py: Generator class with from_recipe() and generate() raising NotImplementedError with "coming in v0.2.0" messages
- leadforge/api/__init__.py: export Generator so `from leadforge.api import Generator` resolves correctly

COPILOT-2: Sort list_recipes() by recipe id field, not path
- return sorted(recipes, key=lambda r: r["id"]) instead of relying on filesystem iteration order

COPILOT-3: Validate yaml.safe_load() result in registry
- Extract _parse_and_validate() helper; raises InvalidRecipeError if the parsed value is not a dict or is missing the required 'id' key; used by both list_recipes() and load_recipe()

COPILOT-4: Guard load_recipe() against path traversal
- Resolve the candidate path and verify it stays within _RECIPES_DIR before checking existence or opening; raises InvalidRecipeError for any recipe_id that would escape the recipes directory

COPILOT-5: Comment out unimplemented CLI commands in README
- Quickstart now shows generate/inspect/validate as commented-out examples with "Coming in v0.x.0" labels; only `list-recipes` is shown as immediately runnable
- Python API snippet annotated with "(coming in v0.2.0)"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
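The COPILOT-2 and COPILOT-3 fixes above can be illustrated with a minimal sketch. It is not the code from `registry.py`: the exception class is a stand-in, the helper takes an already-parsed value instead of calling `yaml.safe_load()` itself (to keep the example free of third-party imports), and the `parsed` mapping from source path to parsed value is a hypothetical input shape.

```python
from typing import Any


class InvalidRecipeError(Exception):
    """Stand-in for the project's exception; its real base class is not shown here."""


def parse_and_validate(data: Any, source: str) -> dict[str, Any]:
    """Check a value as returned by yaml.safe_load() before treating it as a recipe."""
    # yaml.safe_load() returns None for an empty file and may return a list or a scalar.
    if not isinstance(data, dict):
        raise InvalidRecipeError(
            f"{source}: expected a mapping, got {type(data).__name__}"
        )
    if "id" not in data:
        raise InvalidRecipeError(f"{source}: missing required key 'id'")
    return data


def list_recipes(parsed: dict[str, Any]) -> list[dict[str, Any]]:
    """Validate each parsed recipe, then sort by its 'id' field rather than by path."""
    recipes = [parse_and_validate(data, source) for source, data in parsed.items()]
    return sorted(recipes, key=lambda r: r["id"])
```

Sorting on the `id` field keeps the listing order stable even when directory names and recipe ids differ, or when the filesystem returns entries in a different order.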
Pull request overview
Copilot reviewed 25 out of 45 changed files in this pull request and generated 2 comments.
…mits Documents the mandatory step of resolving GitHub review threads via GraphQL after addressing PR comments, so the omission from PR #3 does not recur. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- registry.py: replace string-prefix path traversal guard with Path.is_relative_to() (available since Python 3.9), closing the prefix-collision bypass (e.g. recipes_evil alongside recipes)
- pyproject.toml: add "S" (bandit) ruleset to ruff select so security checks are active on non-test code; widen per-file-ignores glob from tests/* to tests/**/* to cover subdirectories; add S108 to test ignores to suppress the /tmp false-positive in test CLI invocations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pr-agent-context report: This run includes unresolved review comments on PR #3.
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/recipes/registry.py
URL: https://github.com/leadforge-dev/leadforge/pull/3#discussion_r3105050071
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The path traversal guard is bypassable because it uses a string prefix check. For example, a resolved path like `/.../recipes_evil` will still start with `/.../recipes`, so `recipe_id='../recipes_evil'` can incorrectly pass this check. Use `Path.is_relative_to()` (Py3.9+) or `recipe_dir.relative_to(base_dir)` in a try/except to ensure the resolved path is actually within `_RECIPES_DIR`.
~~~suggestion
base_dir = _RECIPES_DIR.resolve()
recipe_dir = (base_dir / recipe_id).resolve()
try:
    recipe_dir.relative_to(base_dir)
except ValueError:
~~~
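The suggestion above stops at the `except` line, so the following is a hedged, self-contained sketch of how the complete guard might look, next to the bypassable prefix check for comparison. The function names, the error message, and the `InvalidRecipeError` stand-in are assumptions for illustration, not the code merged in `registry.py`.

```python
from pathlib import Path


class InvalidRecipeError(Exception):
    """Stand-in for the project's exception; its real base class is not shown here."""


def resolve_recipe_dir(recipes_dir: Path, recipe_id: str) -> Path:
    """Return the resolved recipe directory, refusing ids that escape recipes_dir."""
    base_dir = recipes_dir.resolve()
    recipe_dir = (base_dir / recipe_id).resolve()
    try:
        # relative_to() compares whole path components, not raw characters.
        recipe_dir.relative_to(base_dir)
    except ValueError:
        raise InvalidRecipeError(
            f"recipe id escapes the recipes directory: {recipe_id!r}"
        ) from None
    return recipe_dir


def prefix_check_passes(recipes_dir: Path, recipe_id: str) -> bool:
    """The string-prefix check described in the comment, kept only for comparison."""
    base_dir = recipes_dir.resolve()
    return str((base_dir / recipe_id).resolve()).startswith(str(base_dir))
```

With a sibling directory named `recipes_evil`, `prefix_check_passes(recipes_dir, "../recipes_evil")` is true while `resolve_recipe_dir` raises. `recipe_dir.is_relative_to(base_dir)` expresses the same component-wise test as a boolean.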
## COPILOT-2
Location: pyproject.toml
URL: https://github.com/leadforge-dev/leadforge/pull/3#discussion_r3105050075
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The Ruff per-file-ignores glob `tests/*` only matches files directly under `tests/` and won’t apply to tests in subdirectories like `tests/core/...` or `tests/recipes/...`. If the intent is to ignore rules across the whole test tree, use `tests/**` (or similar). Also, `S101` is currently a no-op unless the `S` ruleset is enabled in `select`.
Pull request overview
Copilot reviewed 26 out of 46 changed files in this pull request and generated 3 comments.
* feat: carry primary_task and label_window_days into WorldSpec for dataset card
Add `primary_task` and `label_window_days` fields to `GenerationConfig` (with defaults preserving current behavior). Propagate through `Recipe.from_dict()`, `resolve_config()`, and `Generator.from_recipe()` so recipe YAML can override them. Update `render_dataset_card()` to read from `world_spec.config` instead of hard-coded string literals.
Closes #6
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update .agent-plan.md for WorldSpec task fields (PR #36)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address review feedback - tighten scope, add validation + tests
Review feedback addressed:
- Remove primary_task/label_window_days as explicit kwargs from resolve_config() and Generator.from_recipe(). These fields are resolved from recipe YAML and the override dict only, not casually overridable, since the generation pipeline doesn't yet support arbitrary task types (Copilot-1, Copilot-3, shaypal5 #1, #2)
- Add label_window_days <= horizon_days validation in GenerationConfig.__post_init__ (Copilot-2, shaypal5 #3)
- Add tests for invalid primary_task on GenerationConfig: empty string, non-string type (shaypal5 #6, pr-agent-context)
- Add tests for invalid label_window_days on Recipe.from_dict: bool, non-positive, float (shaypal5 #7, pr-agent-context)
- Add test for label_window_days > horizon_days rejection
- Fix existing test using horizon_days=30 (now conflicts with default label_window_days=90)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
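The `__post_init__` validation described in the review-feedback commit could look roughly like the sketch below. Only the two task fields and `horizon_days` are modelled. The default `label_window_days=90` comes from the commit message, while the `primary_task` and `horizon_days` defaults and the use of `ValueError` are placeholders, since the real defaults and exception type are not shown here.

```python
from dataclasses import dataclass


@dataclass
class GenerationConfig:
    """Reduced stand-in: the real class carries more fields than these three."""

    primary_task: str = "example_task"  # hypothetical default, not from the source
    label_window_days: int = 90  # default stated in the commit message
    horizon_days: int = 365  # hypothetical default, not from the source

    def __post_init__(self) -> None:
        # Reject empty strings and non-string values for the task name.
        if not isinstance(self.primary_task, str) or not self.primary_task:
            raise ValueError("primary_task must be a non-empty string")
        # A label window longer than the simulated horizon can never be observed.
        if self.label_window_days > self.horizon_days:
            raise ValueError(
                f"label_window_days ({self.label_window_days}) must be <= "
                f"horizon_days ({self.horizon_days})"
            )
```

Under these assumptions, `GenerationConfig(horizon_days=30)` fails because the default 90-day label window no longer fits, which matches the conflict noted for the existing test.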
* PR 5.2: HuggingFace release packager + load_dataset smoke test
Add `scripts/package_hf_release.py` to generate `release/huggingface/README.md`
with G12.1-compliant YAML frontmatter (pretty_name, license, language,
task_categories, size_categories, tags, three configs with `default: true`
on intermediate per G12.2), inlining the rewritten `release/README.md`
body with HF-specific link rewrites. `--variant=instructor` packages the
companion repo (G12.4) from `release/intermediate_instructor/` into a
separate `release/huggingface-instructor/` upload tree. G12.3 covered
by a parametrised `load_dataset()` smoke test gated on the optional
`datasets` SDK.
Extract shared release-packaging primitives (link rewriter, dir-safety
guard, cover-image validator) into `scripts/_release_common.py`; refactor
the Kaggle packager to import them. `release/kaggle/dataset-metadata.json`
is byte-stable across the refactor.
Delete the legacy `release/HF_DATASET_CARD.md` stub — superseded by the
generated card. Gitignore `release/huggingface{,-instructor}/*` except
the committed README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 self-review fixes (Kaggle + HF packagers)
Fold the brutal self-review's findings back into the PR before review.
Bugs:
- (#1) run_packager validate→write order — both packagers wrote
README/metadata on validation failure, leaving corrupt artifacts on
disk that would silently get committed. Gated on `errors == ()`;
added no-write tests for both packagers.
- (#2) Instructor README inlined the public 3-tier README into a
1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY`
constant that links to the public dataset and describes only the
instructor-specific additions (full-horizon tables, hidden DAG,
latent registry, mechanism summary).
- (#3) validate_upload_dir_safe also blocks strict descendants of
release_dir; `--huggingface-dir release/intro` would otherwise
rmtree the intro bundle.
Architecture:
- (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK,
validate_readme_substitution, replace_file, replace_dir,
load_manifest now live in scripts/_release_common.py. Both
packagers reduced to imports.
- (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump
+ a 4-line _IndentedDumper subclass.
- (#7) Removed dead --owner / --dataset-slug CLI flags.
- (#8) assemble_upload_dir now takes rendered_readme and writes it.
- (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat
preflight via _assert_tier_dir_exists.
- (#10) --default-config with --variant=instructor errors loudly.
CI:
- (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the
gated G12.3 / G12.4 / G11.3 tests install in one line.
Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path,
dead tag_lines), em-dash YAML round-trip parametrised for the
instructor pretty_name.
Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean;
hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every
tier; validate_release_candidate --no-rebuild exits 0.
release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata
.json|README.md regenerated; audit-artifact-sync tests guard them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
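Bug #1 in the self-review commit above is an ordering problem: artifacts were written before validation had passed. A minimal sketch of the gated order follows. The `run_packager` signature, the `files` mapping, and `validate_readme` are hypothetical simplifications, not the packagers' real interfaces.

```python
from pathlib import Path
from typing import Callable

Validator = Callable[[dict[str, str]], tuple[str, ...]]


def validate_readme(files: dict[str, str]) -> tuple[str, ...]:
    """Toy validator: report an error when the rendered README is missing or empty."""
    if not files.get("README.md", "").strip():
        return ("README.md is empty",)
    return ()


def run_packager(
    out_dir: Path, files: dict[str, str], validate: Validator
) -> tuple[str, ...]:
    """Validate rendered artifacts first and touch the disk only when there are no errors."""
    errors = validate(files)
    if errors == ():
        out_dir.mkdir(parents=True, exist_ok=True)
        for name, text in files.items():
            (out_dir / name).write_text(text, encoding="utf-8")
    return errors
```

Because nothing is created on the failure path, a no-write test only needs to assert that the output directory does not exist after a failing run.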
* PR 5.2 Copilot-review fixes (Kaggle + HF packagers)
Fold Copilot's two real findings on the self-review revision back in.
COPILOT-1 — validate_upload_dir_safe was only invoked inside
assemble_upload_dir, which --dry-run skips. A dry-run with
--huggingface-dir release (or .) would write the README into the
unsafe path BEFORE the safety net fired. Hoist the check into
run_packager (both packagers) so it runs before any mkdir or write;
the inner assemble_upload_dir call stays as defence-in-depth for
direct callers. New tests: dry-run with unsafe upload-dir raises
without writing; the same path through main() returns rc=2.
COPILOT-2 — Cover-image path resolution was inconsistent:
validate_cover_image used cover_image as passed, while
assemble_upload_dir did a separate ``release_dir / cover_image.name``
fallback. Diverged for bare-basename inputs (false validation
failures) and two-paths-sharing-a-basename (assembler shadowing the
explicit path). Added resolve_cover_image_path() to
_release_common.py (explicit-wins, release-dir fallback);
run_packager calls it once and threads the resolved path through
validation, the metadata's image field, and assembly. New
tests/scripts/test_release_common.py covers the four resolution
branches; new packager-side tests confirm bare-basename success +
metadata field plumbing.
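The "explicit-wins, release-dir fallback" rule from COPILOT-2 might be sketched as follows. The signature and the behaviour for a missing file are assumptions. The real `resolve_cover_image_path()` in `scripts/_release_common.py` is described as having four resolution branches, and this sketch models only the three that the message spells out.

```python
from pathlib import Path


def resolve_cover_image_path(cover_image: Path, release_dir: Path) -> Path:
    """Resolve the cover image once so validation and assembly agree on the file."""
    # Explicit wins: a path that exists as given is used unchanged.
    if cover_image.is_file():
        return cover_image
    # Fallback: look for a file with the same basename inside the release directory.
    fallback = release_dir / cover_image.name
    if fallback.is_file():
        return fallback
    # Nothing found: return the input so a later validation step can report it.
    return cover_image
```

Resolving once and threading the result through validation, metadata, and assembly removes the case where two code paths pick different files that share a basename.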
COPILOT-3 — outdated; already addressed by self-review fix #8 in
commit f2fc4a2. Resolved as already treated; no code change.
Verification: 1232/1232 tests pass + 5 gated skips; ruff + mypy
clean; hash determinism PASS 67/67; leakage probes rc=0 on every
tier; validate_release_candidate --no-rebuild exits 0;
BUNDLE_SCHEMA_VERSION unchanged at 5.
release/{kaggle,huggingface,huggingface-instructor}/* artifacts
regenerated byte-identically.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Bootstraps the leadforge codebase from empty to a fully installable, lint-clean, tested package skeleton. This is the complete Milestone 0 delivery.
All acceptance criteria pass:
- `pip install -e .` ✓
- `leadforge --help` shows all four commands ✓
- `leadforge list-recipes` returns `b2b_saas_procurement_v1` ✓

What's included

Package scaffold
- `pyproject.toml`: setuptools build, Typer + PyYAML runtime, ruff/mypy/pytest dev deps, `leadforge` CLI entry point
- Subpackage skeleton (`api`, `cli`, `core`, `narrative`, `schema`, `structure`, `mechanisms`, `simulation`, `render`, `exposure`, `validation`, `recipes`) with `__init__.py` stubs
- `README.md`: install, CLI quickstart, Python API snippet, doc links
- `.pre-commit-config.yaml`: ruff (lint + format) + pre-commit-hooks

Core primitives (`leadforge/core/`)
- `enums.py`: `ExposureMode` + `DifficultyProfile` as `StrEnum`
- `exceptions.py`: `LeadforgeError` base + 6 typed subclasses
- `models.py`: `GenerationConfig`, `WorldSpec`, `WorldBundle` dataclass stubs
- `rng.py`, `ids.py`: documented stubs (implemented in M1)

Recipe system (`leadforge/recipes/`)
- `registry.py`: `list_recipes()` + `load_recipe()` reading YAML files
- `b2b_saas_procurement_v1/recipe.yaml`: id, title, primary task, modes, difficulty levels

CLI (`leadforge/cli/`)
- `main.py`: Typer app with `--version` and four registered commands
- `list-recipes`: fully implemented with Rich table output
- `generate`: stub with full option spec per architecture doc; exits 1 with "coming in v0.2.0"
- `inspect`, `validate`: stubs with correct argument spec

CI (`.github/workflows/ci.yml`)
- `lint` (ruff check + format), `typecheck` (mypy), `test` matrix (Python 3.11 + 3.12)
- Coverage artifacts uploaded with the `pr-agent-context-coverage-py*` prefix for pr-agent-context integration

Tests (20 passing)
- `tests/test_cli.py`: help, version, list-recipes content, stub exit codes
- `tests/core/test_enums.py`: values + string construction
- `tests/core/test_exceptions.py`: hierarchy + message preservation
- `tests/recipes/test_registry.py`: list/load, required fields, error case

Test plan
- `leadforge --help` shows all four commands
- `leadforge list-recipes` shows `b2b_saas_procurement_v1`
- `leadforge --version` prints `leadforge 0.1.0`

🤖 Generated with Claude Code