refactor: extract pipeline functions from scripts/ into leadforge package #29

@shaypal5

Description

The build pipeline functions in scripts/build_v5_snapshot.py (e.g., subsample(), inject_missingness(), derive_binary_features(), cap_expected_acv(), rename_and_select(), boost_leakage_trap()) contain testable logic that should live in a proper module under leadforge/ (e.g., leadforge/pipelines/build_v5.py).

The script itself should become a thin CLI wrapper that imports from the package, similar to how scripts/validate_lead_scoring_dataset.py already delegates to leadforge.validation.lead_scoring.
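A minimal sketch of the thin-wrapper split, for illustration only. The body and signature of `cap_expected_acv()` here are hypothetical stand-ins (the real function is not shown in this issue), and the `--cap` flag and positional `values` argument are invented for the example. In the proposed layout the first part would live in leadforge/pipelines/build_v5.py and the script would import it instead of defining it.

```python
import argparse


# --- Would live in leadforge/pipelines/build_v5.py (path suggested above) ---
def cap_expected_acv(values, cap):
    """Hypothetical stand-in: clip each value to at most `cap`.

    The real function's signature and behavior are assumptions here.
    """
    return [min(v, cap) for v in values]


# --- Would remain in scripts/build_v5_snapshot.py as the thin CLI layer ---
def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Build snapshot (sketch).")
    # Hypothetical flag and positional argument, for illustration only.
    parser.add_argument("--cap", type=float, default=100_000.0)
    parser.add_argument("values", nargs="+", type=float)
    return parser.parse_args(argv)


def main(argv=None):
    # The wrapper only parses arguments and delegates; no logic lives here.
    args = parse_args(argv)
    print(cap_expected_acv(args.values, args.cap))
    return 0

# A real script would end with the usual `if __name__ == "__main__":` guard
# that calls main(); it is omitted so the sketch can be imported and tested.
```

With this shape, tests can target `cap_expected_acv()` directly through a normal import, and the script needs at most a smoke test of `main()`.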

This would:

  • Eliminate the importlib hack needed to test these functions (see tests/scripts/test_build_v5_snapshot.py)
  • Make the functions importable by other code
  • Keep scripts/ as thin CLI entry points
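For context, the sketch below shows the general kind of importlib-based loading a test has to do when the code under test lives in a standalone script rather than an importable package. It is an assumption about the pattern, not a copy of tests/scripts/test_build_v5_snapshot.py; the helper name `load_module_from_path` and the toy `subsample()` body are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(name, path):
    """Load a Python file as a module without it being on sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Toy stand-in for scripts/build_v5_snapshot.py (body is hypothetical).
    script = Path(tmp) / "build_v5_snapshot.py"
    script.write_text("def subsample(rows, n):\n    return rows[:n]\n")

    # The workaround: load the script by file path instead of importing it.
    mod = load_module_from_path("build_v5_snapshot", script)
    result = mod.subsample([1, 2, 3, 4], 2)
```

Once the functions move into the package, the loader helper would no longer be needed and the test could use an ordinary import such as `from leadforge.pipelines.build_v5 import subsample`, assuming the module path proposed in this issue is adopted.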

Identified during review of PR #28.

Metadata

    Labels

    layer: recipes (recipes/ recipe assets and registry)
    type: refactor (Code change with no behavior difference)
