docs: add mandatory branch/PR workflow rules#1
Merged
Conversation
Locks the rule that no work goes directly to main. Every unit of work must branch from latest main, commit there, update .agent-plan.md in the same PR, and open a GitHub PR. Also installs a git pre-push hook (.git/hooks/pre-push) that mechanically blocks any push targeting main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds explicit “Branch & PR Workflow” rules to the repo’s agent-facing documentation and updates the agent plan to reflect the new workflow constraints.
Changes:
- Adds a mandatory Branch & PR Workflow section at the top of
CLAUDE.md. - Updates
.agent-plan.mdcurrent-state text to mention the workflow rules and their enforcement.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
CLAUDE.md |
Documents mandatory branching/PR workflow rules, including “never push to main”. |
.agent-plan.md |
Updates current system state to reference the new workflow rules and claimed enforcement mechanism. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Extends the branch/PR workflow in CLAUDE.md with two required steps: apply labels (type + layer taxonomy) and assign to a milestone. Adds the full label taxonomy and a milestone map table to CLAUDE.md. GitHub side: creates 22 labels across type/layer/status groups and 6 milestones (v0.1.0 through v1.0.0) matching the roadmap release gates. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Integrates shaypal5/pr-agent-context@v4 with two workflows: - .github/workflows/pr-agent-context.yml Runs on pull_request (opened/synchronize/reopened). Calls the reusable workflow in ci mode with publish_mode=append and include_outdated_review_threads=true so stale-but-unresolved threads always appear in the generated agent context comment. Coverage artifacts are collected under the pr-agent-context-coverage prefix for use by the refresh flow via cross-run lookup. - .github/workflows/pr-agent-context-refresh.yml Triggered by pull_request_review, pull_request_review_comment, and completed non-Actions check_run events. Runs in refresh mode with publish_mode=append (new comment per refresh, previous managed comments hidden), wait_for_reviews_to_settle=true, and enable_cross_run_coverage_lookup=true to reuse coverage artifacts from the CI run on the same head SHA. Both workflows explicitly set include_outdated_review_threads=true per project configuration. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This comment has been minimized.
This comment has been minimized.
- CLAUDE.md: scope the force-push prohibition to `main` explicitly (`git push --force origin main`) and add a note clarifying that the local pre-push hook is a personal convenience, not a versioned repo-wide enforcement; GitHub branch protection is the team-level guard - .agent-plan.md: correct state description to say "local convenience hook (not versioned) and GitHub branch protection" instead of implying the hook is a reliable repo-wide mechanism Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
pr-agent-context report: This run includes unresolved review comments on PR #1.
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: .agent-plan.md
URL: https://github.com/leadforge-dev/leadforge/pull/1#discussion_r3104702550
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The statement that the workflow is “enforced via `.git/hooks/pre-push`” is misleading/inaccurate in a repo context: `.git/hooks` isn’t versioned/cloned for other developers, and this PR doesn’t add any tracked hook script that would be installed automatically. Consider replacing this with a versioned hooks directory (and documented `core.hooksPath` setup), or describing enforcement via GitHub branch protection/CI checks instead of `.git/hooks`.
## COPILOT-2
Location: CLAUDE.md
URL: https://github.com/leadforge-dev/leadforge/pull/1#discussion_r3104702559
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The examples list `git push --force` without specifying `main`, but the sentence is about commands that target `main` directly. As written, it can be read as banning force-pushes to any branch; if the intent is only to block pushes to `main`, consider tightening the wording/examples to explicitly include `main` (or split into two rules: no pushes to `main`, and separate guidance on force-push usage).
~~~suggestion
Never use `git push origin main`, `git push --force origin main`, or any variant that targets `main` directly.
~~~
## COPILOT-3
Location: .github/workflows/pr-agent-context.yml:14
URL: https://github.com/leadforge-dev/leadforge/pull/1#discussion_r3104738699
Root author: copilot-pull-request-reviewer
Comment:
This workflow calls a third-party reusable workflow pinned only to the `v4` tag. For supply-chain security and reproducibility, pin `uses:` to an immutable commit SHA (and optionally keep a comment noting the corresponding release tag).
~~~suggestion
uses: shaypal5/pr-agent-context/.github/workflows/pr-agent-context.yml@<FULL_40_CHAR_COMMIT_SHA_FOR_V4> # v4
~~~
## COPILOT-4
Location: .github/workflows/pr-agent-context-refresh.yml:39
URL: https://github.com/leadforge-dev/leadforge/pull/1#discussion_r3104738714
Root author: copilot-pull-request-reviewer
Comment:
This workflow calls a third-party reusable workflow pinned only to the `v4` tag. For supply-chain security and reproducibility, pin `uses:` to an immutable commit SHA (and optionally keep a comment noting the corresponding release tag).
~~~suggestion
uses: shaypal5/pr-agent-context/.github/workflows/pr-agent-context.yml@<FULL_40_CHAR_COMMIT_SHA_FOR_V4> # v4
~~~Run metadata: |
shaypal5
added a commit
that referenced
this pull request
May 1, 2026
Review feedback addressed: - Remove primary_task/label_window_days as explicit kwargs from resolve_config() and Generator.from_recipe() — these fields are resolved from recipe YAML and override dict only, not casually overridable, since the generation pipeline doesn't yet support arbitrary task types (Copilot-1, Copilot-3, shaypal5 #1, #2) - Add label_window_days <= horizon_days validation in GenerationConfig.__post_init__ (Copilot-2, shaypal5 #3) - Add tests for invalid primary_task on GenerationConfig: empty string, non-string type (shaypal5 #6, pr-agent-context) - Add tests for invalid label_window_days on Recipe.from_dict: bool, non-positive, float (shaypal5 #7, pr-agent-context) - Add test for label_window_days > horizon_days rejection - Fix existing test using horizon_days=30 (now conflicts with default label_window_days=90) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shaypal5
added a commit
that referenced
this pull request
May 1, 2026
* feat: carry primary_task and label_window_days into WorldSpec for dataset card Add `primary_task` and `label_window_days` fields to `GenerationConfig` (with defaults preserving current behavior). Propagate through `Recipe.from_dict()`, `resolve_config()`, and `Generator.from_recipe()` so recipe YAML can override them. Update `render_dataset_card()` to read from `world_spec.config` instead of hard-coded string literals. Closes #6 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update .agent-plan.md for WorldSpec task fields (PR #36) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review feedback — tighten scope, add validation + tests Review feedback addressed: - Remove primary_task/label_window_days as explicit kwargs from resolve_config() and Generator.from_recipe() — these fields are resolved from recipe YAML and override dict only, not casually overridable, since the generation pipeline doesn't yet support arbitrary task types (Copilot-1, Copilot-3, shaypal5 #1, #2) - Add label_window_days <= horizon_days validation in GenerationConfig.__post_init__ (Copilot-2, shaypal5 #3) - Add tests for invalid primary_task on GenerationConfig: empty string, non-string type (shaypal5 #6, pr-agent-context) - Add tests for invalid label_window_days on Recipe.from_dict: bool, non-positive, float (shaypal5 #7, pr-agent-context) - Add test for label_window_days > horizon_days rejection - Fix existing test using horizon_days=30 (now conflicts with default label_window_days=90) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This was referenced May 4, 2026
shaypal5
added a commit
that referenced
this pull request
May 4, 2026
* fix: redact current_stage from student_public bundles Add an `is_leakage_trap` flag to FeatureSpec and an exposure-layer column-redaction pass that drops `leakage_risk and not is_leakage_trap` columns from student_public snapshots, task splits, feature dictionary, and dataset card. `current_stage` (true label leak) is now removed in student_public; `total_touches_all` (intentional pedagogical trap) is preserved via the new flag. - `schema/features.py`: add `is_leakage_trap` field; mark `total_touches_all`; export `STUDENT_PUBLIC_REDACTED_COLUMNS`. - `schema/dictionaries.py`: emit `is_leakage_trap` column in `feature_dictionary.csv`. - `exposure/filters.py`: `BundleFilter.redacted_columns`; populated for student_public. - `api/bundle.py`: drop redacted columns from snapshot before splits; thread the visible feature tuple to dataset card and dictionary writers. - `narrative/dataset_card.py`: accept a `features` arg so categories / leakage section reflect what is actually published. - `validation/bundle_checks.py`: new `_check_exposure_redaction()` enforces CLAUDE.md invariant #1 — redacted columns must not appear in the published task splits or dictionary. - `validation/invariants.py`: relax exposure-monotonicity from byte identity to subset-with-shared-content for `feature_dictionary.csv` and task splits, so redaction doesn't trip the check. Tests: extend `tests/schema/test_features.py` with five new assertions covering the trap flag, the redaction set, and `current_stage`'s role. All 880 existing tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: update release docs and add end-to-end redaction tests - `tests/exposure/test_redaction.py` (19 tests): bundle-level proof that `current_stage` is absent from every student_public task split, that `total_touches_all` (the pedagogical trap) is preserved, that the research_instructor bundle still contains both, that the student feature dictionary differs from the instructor's by exactly the redacted set, that shared columns hold identical values across modes, and that `validate_bundle()` flags a tampered student_public bundle whose task split still carries `current_stage`. - `scripts/build_public_release.py`: drop `_FLAT_CSV_DROP_COLS` — the bundle writer now redacts at source, so the flat CSV inherits it. - `release/README.md`, `release/HF_DATASET_CARD.md`: update column counts (33 features + 1 trap + 1 target in student_public; 34 + 1 + 1 in the instructor companion) and rewrite the leakage section to distinguish stripped columns from the deliberate trap. - `.agent-plan.md`: flip "Known issue: current_stage leakage" to resolved with a link to the redaction mechanism. Bundles regenerated; all four pass `leadforge validate`; `scripts/verify_hash_determinism.py` confirms 73/73 files identical across two consecutive builds. Full test suite: 899 passing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor: collapse two-flag redaction into prescriptive redact_in_modes Addresses senior-dev review of the previous two commits: 1. **Single source of truth.** Replaced the awkward `leakage_risk + is_leakage_trap` flag pair with a single prescriptive field `FeatureSpec.redact_in_modes: frozenset[ExposureMode]`. `leakage_risk` reverts to purely descriptive ("post-snapshot correlated"). `current_stage` carries `redact_in_modes={student_public}`; `total_touches_all` keeps `leakage_risk=True` with empty `redact_in_modes` (deliberate trap). New `redacted_columns_for(mode, features=...)` parameterizable function replaces the frozen module-level set. 2. **Validation independent of the writer.** `_check_exposure_redaction` now derives the expected redaction set directly from `LEAD_SNAPSHOT_FEATURES` via `redacted_columns_for`, *not* from `BundleFilter`. A bug in the filter no longer agrees-with-itself past the validator. Added a real regression test that writes a real student_public bundle, mutates a parquet to reinsert `current_stage`, and asserts validation fails. 3. **Self-describing manifest.** `manifest.json` now records `redacted_columns: [...]`. The validator cross-checks the manifest's declared set against the feature-spec-derived expected set; a second new test mutates the manifest to claim nothing was redacted and asserts validation flags the disagreement. 4. **Stricter exposure monotonicity.** `check_exposure_monotonicity` now asserts that `instructor_columns - student_columns` *equals* `redacted_columns_for(student_public)` — not just "is a superset". A future column drop in student_public outside the redaction set is now caught. 5. **Reverted feature_dictionary.csv schema.** The previous commit added an `is_leakage_trap` column to the published CSV without flagging the schema change. Removed: the redaction policy is package-internal and the bundle's actual published schema is observable from the parquet files and `manifest.redacted_columns`. CSV stays back-compat with prior releases (6 columns: name, dtype, description, category, is_target, leakage_risk). 6. **De-duplicated documentation.** `release/README.md` no longer maintains a per-category feature count table by hand — that data is in the auto-generated `dataset_card.md`. Removed the duplicated table; added explicit caveats covering known structural-leakage issues (event aggregates over the label window, `is_mql` zero variance, `is_sql=False` near-deterministic for non-conversion). 7. **Documented the structural follow-up.** `.agent-plan.md` now includes a clearly-scoped "Follow-up: structural leakage" section covering: (a) windowed-snapshot fix for event aggregates; (b) measured `is_sql` leakage P(conv|is_sql=False) = 0.038/0.015/0.006 across tiers; (c) `is_mql` is constant True (zero variance). Suggests filing a tracked issue (deferred — needs user confirmation per shared-state-action rules). Removed: `BundleFilter.redacted_columns` (replaced by direct calls to `redacted_columns_for(mode)` from the bundle writer). Tests: schema tests rewritten for `redact_in_modes` semantics; redaction tests updated for the new function-based API; +2 manifest-redaction tests; +1 real regression test (mutated parquet) replacing the previous manifest-string-edit test. 902 tests passing (was 899). Hash determinism verified: 73/73 files identical across two consecutive builds. All four release bundles regenerated; `feature_dictionary.csv` is now back to its pre-PR 6-column schema. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
shaypal5
added a commit
that referenced
this pull request
May 6, 2026
Fold the brutal self-review's findings back into the PR before review. Bugs: - (#1) run_packager validate→write order — both packagers wrote README/metadata on validation failure, leaving corrupt artifacts on disk that would silently get committed. Gated on `errors == ()`; added no-write tests for both packagers. - (#2) Instructor README inlined the public 3-tier README into a 1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY` constant that links to the public dataset and describes only the instructor-specific additions (full-horizon tables, hidden DAG, latent registry, mechanism summary). - (#3) validate_upload_dir_safe also blocks strict descendants of release_dir; `--huggingface-dir release/intro` would otherwise rmtree the intro bundle. Architecture: - (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK, validate_readme_substitution, replace_file, replace_dir, load_manifest now live in scripts/_release_common.py. Both packagers reduced to imports. - (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump + a 4-line _IndentedDumper subclass. - (#7) Removed dead --owner / --dataset-slug CLI flags. - (#8) assemble_upload_dir now takes rendered_readme and writes it. - (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat preflight via _assert_tier_dir_exists. - (#10) --default-config with --variant=instructor errors loudly. CI: - (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the gated G12.3 / G12.4 / G11.3 tests install in one line. Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path, dead tag_lines), em-dash YAML round-trip parametrised for the instructor pretty_name. Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean; hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every tier; validate_release_candidate --no-rebuild exits 0. release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata .json|README.md regenerated; audit-artifact-sync tests guard them. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
4 tasks
shaypal5
added a commit
that referenced
this pull request
May 6, 2026
* PR 5.2: HuggingFace release packager + load_dataset smoke test
Add `scripts/package_hf_release.py` to generate `release/huggingface/README.md`
with G12.1-compliant YAML frontmatter (pretty_name, license, language,
task_categories, size_categories, tags, three configs with `default: true`
on intermediate per G12.2), inlining the rewritten `release/README.md`
body with HF-specific link rewrites. `--variant=instructor` packages the
companion repo (G12.4) from `release/intermediate_instructor/` into a
separate `release/huggingface-instructor/` upload tree. G12.3 covered
by a parametrised `load_dataset()` smoke test gated on the optional
`datasets` SDK.
Extract shared release-packaging primitives (link rewriter, dir-safety
guard, cover-image validator) into `scripts/_release_common.py`; refactor
the Kaggle packager to import them. `release/kaggle/dataset-metadata.json`
is byte-stable across the refactor.
Delete the legacy `release/HF_DATASET_CARD.md` stub — superseded by the
generated card. Gitignore `release/huggingface{,-instructor}/*` except
the committed README.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 self-review fixes (Kaggle + HF packagers)
Fold the brutal self-review's findings back into the PR before review.
Bugs:
- (#1) run_packager validate→write order — both packagers wrote
README/metadata on validation failure, leaving corrupt artifacts on
disk that would silently get committed. Gated on `errors == ()`;
added no-write tests for both packagers.
- (#2) Instructor README inlined the public 3-tier README into a
1-tier dataset card. Replaced with a dedicated `INSTRUCTOR_BODY`
constant that links to the public dataset and describes only the
instructor-specific additions (full-horizon tables, hidden DAG,
latent registry, mechanism summary).
- (#3) validate_upload_dir_safe also blocks strict descendants of
release_dir; `--huggingface-dir release/intro` would otherwise
rmtree the intro bundle.
Architecture:
- (#5) Finished shared-primitives extraction: SOURCE_TREE_BLOCK,
validate_readme_substitution, replace_file, replace_dir,
load_manifest now live in scripts/_release_common.py. Both
packagers reduced to imports.
- (#6) Replaced 60-line hand-rolled YAML renderer with yaml.safe_dump
+ a 4-line _IndentedDumper subclass.
- (#7) Removed dead --owner / --dataset-slug CLI flags.
- (#8) assemble_upload_dir now takes rendered_readme and writes it.
- (#9) build_config_for_tier made pure (no I/O); cheap manifest-stat
preflight via _assert_tier_dir_exists.
- (#10) --default-config with --variant=instructor errors loudly.
CI:
- (#4) Added [publish] extra (datasets>=2.14, kaggle>=1.6) so the
gated G12.3 / G12.4 / G11.3 tests install in one line.
Cleanups: visual cruft (#13–#16), test cruft (#17 — unused tmp_path,
dead tag_lines), em-dash YAML round-trip parametrised for the
instructor pretty_name.
Verification: 1223 tests pass + 5 gated skips; ruff + mypy clean;
hash determinism PASS 67/67; leakage probes 0/3 reconstruct on every
tier; validate_release_candidate --no-rebuild exits 0.
release/{kaggle,huggingface,huggingface-instructor}/dataset-metadata
.json|README.md regenerated; audit-artifact-sync tests guard them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* PR 5.2 Copilot-review fixes (Kaggle + HF packagers)
Fold Copilot's two real findings on the self-review revision back in.
COPILOT-1 — validate_upload_dir_safe was only invoked inside
assemble_upload_dir, which --dry-run skips. A dry-run with
--huggingface-dir release (or .) would write the README into the
unsafe path BEFORE the safety net fired. Hoist the check into
run_packager (both packagers) so it runs before any mkdir or write;
the inner assemble_upload_dir call stays as defence-in-depth for
direct callers. New tests: dry-run with unsafe upload-dir raises
without writing; the same path through main() returns rc=2.
COPILOT-2 — Cover-image path resolution was inconsistent:
validate_cover_image used cover_image as passed, while
assemble_upload_dir did a separate ``release_dir / cover_image.name``
fallback. Diverged for bare-basename inputs (false validation
failures) and two-paths-sharing-a-basename (assembler shadowing the
explicit path). Added resolve_cover_image_path() to
_release_common.py (explicit-wins, release-dir fallback);
run_packager calls it once and threads the resolved path through
validation, the metadata's image field, and assembly. New
tests/scripts/test_release_common.py covers the four resolution
branches; new packager-side tests confirm bare-basename success +
metadata field plumbing.
COPILOT-3 — outdated; already addressed by self-review fix #8 in
commit f2fc4a2. Resolved as already treated; no code change.
Verification: 1232/1232 tests pass + 5 gated skips; ruff + mypy
clean; hash determinism PASS 67/67; leakage probes rc=0 on every
tier; validate_release_candidate --no-rebuild exits 0;
BUNDLE_SCHEMA_VERSION unchanged at 5.
release/{kaggle,huggingface,huggingface-instructor}/* artifacts
regenerated byte-identically.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
CLAUDE.mdas the first section (before all other rules), locking the rule that no work goes directly tomain.git/hooks/pre-pushto mechanically block any push targetingmain, regardless of who or what triggers it.agent-plan.mdto reflect that these workflow rules are now lockedWorkflow rules locked
git checkout main && git pullbefore starting any workgit checkout -b <descriptive-branch-name>— always branch from latest main.agent-plan.mdto reflect state after the PR merges; commit to same branchmaindirectlyEnforcement layers
CLAUDE.md— read by Claude at session start.git/hooks/pre-push— blocks push tomainat git level🤖 Generated with Claude Code