Add tests for build pipeline scripts (build_v4_snapshot.py, validate_v4_dataset.py) #23

@shaypal5

Problem

scripts/build_v4_snapshot.py (213 lines) and scripts/validate_v4_dataset.py (334 lines) have no automated tests. The validation script is itself a form of integration test for the generated CSV, but individual functions like subsample(), inject_missingness(), and the validation checks themselves are untested in isolation.

Suggested scope

  • Unit tests for subsample() edge cases (insufficient positives/negatives)
  • Property tests for inject_missingness() (rates within expected bounds)
  • Unit tests for each validation check function
  • Integration test: build_v4_dataset(), then validate() returns 0
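As a rough illustration of the property-test bullet, here is a minimal sketch of a "rate within expected bounds" check. The real signature of inject_missingness() is not shown in this issue, so the code uses a hypothetical stand-in (inject_missingness_standin) that blanks one column with a fixed probability; the helper names, the list-of-dicts row format, and the 5-sigma tolerance are all assumptions made for the example.

```python
import random
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def inject_missingness_standin(
    rows: List[Row], column: str, rate: float, seed: int
) -> List[Row]:
    """HYPOTHETICAL stand-in for inject_missingness(): set `column` to None
    in each row independently with probability `rate`. Returns new rows."""
    rng = random.Random(seed)
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not mutated
        if rng.random() < rate:
            new_row[column] = None
        out.append(new_row)
    return out


def observed_missing_rate(rows: List[Row], column: str) -> float:
    """Fraction of rows whose `column` value is None."""
    return sum(r[column] is None for r in rows) / len(rows)


def check_rate_within_bounds(
    rate: float, n_rows: int = 5000, seed: int = 0, z: float = 5.0
) -> bool:
    """Property: the observed missing rate stays within z binomial standard
    errors of the requested rate."""
    rows: List[Row] = [{"x": float(i)} for i in range(n_rows)]
    masked = inject_missingness_standin(rows, "x", rate, seed)
    tolerance = z * (rate * (1 - rate) / n_rows) ** 0.5
    return abs(observed_missing_rate(masked, "x") - rate) <= tolerance
```

In a real test the stand-in would be swapped for the imported function, and a property-testing library such as Hypothesis could generate the rates and seeds instead of a fixed list.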

These scripts live in scripts/ (not leadforge/), so they need explicit test discovery or a test helper that imports them.
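One possible shape for such a test helper, sketched with the standard library's importlib: it loads a script file by path and returns it as a module. The helper name load_script is hypothetical, and the sketch assumes the scripts guard their entry point with `if __name__ == "__main__":`, since otherwise importing them would run the full build.

```python
import importlib.util
import sys
from pathlib import Path
from types import ModuleType
from typing import Optional


def load_script(path: Path, module_name: Optional[str] = None) -> ModuleType:
    """Import a standalone .py file as a module (hypothetical test helper).

    The module name defaults to the file stem, e.g. "build_v4_snapshot".
    """
    path = Path(path)
    name = module_name or path.stem
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load script from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code that looks itself up in sys.modules
    # (dataclasses, pickling) works during import.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module
```

A test could then call, for example, `load_script(Path("scripts/build_v4_snapshot.py")).subsample` (path relative to the repository root). An alternative is to put scripts/ on the import path through the test runner's configuration, which avoids the helper but makes the scripts importable everywhere in the test session.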

Context

Identified in self-review of PR #21. The scripts will stabilize further in v4-M2 (release), making that a natural point to add tests.

Metadata

Labels

enhancement (New feature or request), type: test (Test additions or fixes)
