Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 56 additions & 3 deletions .agent-plan.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,13 +6,66 @@

## Current System State

**v1.0.0 released (2026-05-02).** All milestones (M0–M13) complete. Teaching dataset series (v1–v7) approved by consumer. Package version bumped to 1.0.0 in pyproject.toml and leadforge/version.py.
**leadforge package v1.0.0 released (2026-05-02).** All framework milestones (M0–M13) complete. Teaching dataset series (v1–v7) approved by consumer. Package version bumped to 1.0.0 in pyproject.toml and leadforge/version.py.

**v0.1.0-alpha datasets shipped (2026-05-05).** Five bundles (intro / intermediate / advanced / intermediate_instructor / tiny_demo) in `release/`, packaged for external review by Gemini + ChatGPT (two iterations each). Synthesis lives under `docs/external_review/summaries/`. **Critical finding:** public relational tables reconstruct the target via joins (verified locally by ChatGPT v2 in `chatgpt_report_v2.md §0`). The v1 release work below addresses this plus the rest of the review consensus.

---

## Next Up — `leadforge-lead-scoring-v1` Curated Dataset Release

Goal: ship a best-in-class educational synthetic CRM lead-scoring dataset family to Kaggle and Hugging Face. Dataset version is decoupled from the leadforge package version (package stays at `1.x`).

**Source of truth:** `docs/release/v1_release_roadmap.md`
**Companion docs:** `docs/release/v1_release_design.md`, `docs/release/v1_acceptance_gates.md`, `docs/release/post_v1_roadmap.md`
**External review materials:** `docs/external_review/{gemini,chatgpt}/` (raw) + `docs/external_review/summaries/` (synthesized)

### Phase 1 — Audit and naming
- [ ] Reproduce relational-leakage finding on alpha bundles → `docs/release/v1_current_state_audit.md`
- [ ] Lock dataset release name `leadforge-lead-scoring-v1`

### Phase 2 — Snapshot-safe relational export
- [ ] `leadforge/render/relational_snapshot_safe.py` (new)
- [ ] `leadforge/validation/relational_leakage.py` (new)
- [ ] `BUNDLE_SCHEMA_VERSION` 4 → 5; manifest gains `relational_snapshot_safe`
- [ ] Drop `converted_within_90_days` / `conversion_timestamp` from public `leads`; drop `close_outcome` / `closed_at` from public `opportunities`; omit `customers` / `subscriptions` from public bundles
- [ ] Hash-determinism preserved on regenerated bundles

### Phase 3 — Release validation hardening
- [ ] `leadforge/validation/{release_quality,leakage_probes,reporting}.py` (new)
- [ ] `scripts/validate_release_candidate.py` (new)
- [ ] Resolve numeric `TBD-*` bands in `v1_acceptance_gates.md`
- [ ] `release/validation/validation_report.{json,md}` + figures auto-generated

### Phase 4 — Channel-signal audit + dataset card hardening
- [ ] `scripts/audit_channel_signal.py` → `docs/release/channel_signal_audit.md`
- [ ] `release/README.md` rewrite (release-grade dataset card; macro-framing paragraph; simulation-simplifications section)
- [ ] `docs/release/{generation_method,feature_dictionary}.md`

### Phase 5 — Platform packaging
- [ ] `scripts/package_kaggle_release.py` → `release/kaggle/dataset-metadata.json`
- [ ] `scripts/package_hf_release.py` → `release/huggingface/README.md` with YAML configs/default/pretty_name/tags
- [ ] `release/dataset-cover-image.png` (≥560×280)
- [ ] Local `load_dataset()` smoke test; Kaggle dry-run package validation

### Phase 6 — Notebook sequence + adversarial framing
- [ ] `release/notebooks/{02_relational_feature_engineering,03_leakage_and_time_windows,04_lift_calibration_value_ranking}.ipynb`
- [ ] Update `01_baseline_lead_scoring.ipynb` to reproduce validation report metrics
- [ ] `.github/ISSUE_TEMPLATE/{dataset_breakage_report,realism_feedback}.yml`
- [ ] `docs/release/{break_me_guide,v2_decision_log}.md`

### Phase 7 — LLM critique + publish
- [ ] `leadforge/validation/llm_critique.py` (single-provider, env-var creds, skips cleanly)
- [ ] `docs/release/llm_critique_prompt.md` + `scripts/run_llm_critique.py`
- [ ] Adjudicate any high-severity findings (resolve in code or document in `v2_decision_log.md`)
- [ ] `scripts/{publish_kaggle,publish_hf}.py` (dry-run → private/draft → public)
- [ ] Tag `leadforge-lead-scoring-v1`; `docs/release/v1_release_notes.md`

---

## Next UpPublic Kaggle/HuggingFace Release
## Alpha Releasev0.1.0-alpha (shipped; superseded by v1 work above)

First public dataset release: `leadforge-b2b-lead-scoring`. Three difficulty tiers (intro/intermediate/advanced) as full relational bundles + flat CSV convenience exports, plus a research_instructor companion for intermediate.
First public dataset alpha: `leadforge-b2b-lead-scoring`. Three difficulty tiers (intro/intermediate/advanced) as full relational bundles + flat CSV convenience exports, plus a research_instructor companion for intermediate.

### Public release — Phase 1: Dataset card improvement ✓ (in PR)

Expand Down
6 changes: 6 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -204,6 +204,7 @@ Key abstractions: `Recipe`, `GenerationConfig`, `WorldSpec`, `WorldBundle`, `Exp
## Hard Constraints — Do Not Violate
- Never use a single fixed hidden world (DGP must vary by motif family + rewiring).
- Never leak post-snapshot-anchor data into flat task features.
- **Never publish public relational tables that allow label reconstruction via joins.** Public relational exports must be snapshot-safe: every `*_timestamp` column in event tables (`touches.touch_timestamp`, `sessions.session_timestamp`, `sales_activities.activity_timestamp`) must satisfy `<= lead_created_at + snapshot_day`; `opportunities` must be filtered by `created_at <= lead_created_at + snapshot_day`; no terminal-state fields (`close_outcome`, `closed_at`, `converted_within_90_days`, `conversion_timestamp`) in public `leads`/`opportunities`; no conversion-conditional entities (`customers`, `subscriptions`) in public bundles.
- Never require external APIs for core generation.
- Never publish hidden truth in `student_public` mode.
- Never derive `converted_within_90_days` as a directly sampled label; it must emerge from simulated events.
Expand Down Expand Up @@ -360,3 +361,8 @@ The current focus is producing a v4 lead scoring intro dataset. See `docs/v4/` f
- Architecture/spec: `docs/leadforge_architecture_spec.md`
- Implementation roadmap: `docs/leadforge_implementation_plan.md`
- v4 dataset plan: `docs/v4/design.md`
- **v1 dataset release roadmap (active): `docs/release/v1_release_roadmap.md`**
- v1 release design: `docs/release/v1_release_design.md`
- v1 acceptance gates: `docs/release/v1_acceptance_gates.md`
- Post-v1 roadmap: `docs/release/post_v1_roadmap.md`
- External review synthesis: `docs/external_review/summaries/`
Empty file.
Loading
Loading