pipelines: align RNG interface with RNGRoot convention #32

@shaypal5

Context

Pipeline functions in leadforge/pipelines/build_v5.py accept np.random.RandomState directly. The rest of the package standardizes on leadforge.core.rng.RNGRoot with named substreams for deterministic derivation.

Now that these functions are in-package, they should follow the same convention.

What to do

  1. Change pipeline function signatures to either accept a seed (int) and derive per-step sub-RNGs via RNGRoot, or accept an RNGRoot and call .child("subsample") etc.
  2. Update the orchestration in scripts/build_v5_snapshot.py to pass a seed/RNGRoot instead of a raw RandomState
  3. Verify determinism is preserved (same seed → same output)
  4. Update tests accordingly
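
A rough sketch of the first option (seed in, named sub-RNG derived per step), to make the determinism requirement concrete. SketchRNGRoot, child_seed, and subsample below are hypothetical stand-ins written for illustration; they are not the real leadforge.core.rng.RNGRoot API or the actual functions in build_v5.py, and the hash-based derivation is only one possible scheme.

```python
from __future__ import annotations

import hashlib
import random


class SketchRNGRoot:
    """Hypothetical stand-in; NOT the real leadforge.core.rng.RNGRoot."""

    def __init__(self, seed: int) -> None:
        self.seed = seed

    def child_seed(self, name: str) -> int:
        # Hash (seed, name) so each named substream is stable and does not
        # depend on the order in which substreams are requested.
        digest = hashlib.sha256(f"{self.seed}:{name}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def child(self, name: str) -> random.Random:
        # One independent stdlib RNG per named step.
        return random.Random(self.child_seed(name))


def subsample(rows: list[int], seed: int, k: int) -> list[int]:
    """Hypothetical pipeline step that takes a seed instead of a RandomState."""
    rng = SketchRNGRoot(seed).child("subsample")
    return rng.sample(rows, k)


rows = list(range(100))
first = subsample(rows, seed=42, k=5)
second = subsample(rows, seed=42, k=5)

# Same seed and same step name must give the same output.
print(first == second)
```

Because each step derives its own stream by name, adding or reordering steps should not change the output of existing steps, which is what the updated tests could assert.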

Considerations

  • np.random.RandomState is legacy; np.random.Generator is preferred for new code
  • RNGRoot currently wraps random.Random, so it may need a method that returns a numpy substream
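
One possible shape for a numpy substream, sketched under assumptions: numpy_child is a hypothetical helper (not an existing RNGRoot method), and mixing the seed with a hash of the substream name through np.random.SeedSequence is just one way to derive a np.random.Generator. The seed is assumed to be a non-negative int, as SeedSequence requires.

```python
import hashlib

import numpy as np


def numpy_child(seed: int, name: str) -> np.random.Generator:
    """Hypothetical helper: derive a named numpy Generator from a root seed."""
    # Turn the substream name into a stable non-negative integer key.
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    name_key = int.from_bytes(digest[:8], "big")
    # SeedSequence accepts a sequence of non-negative ints as entropy.
    return np.random.default_rng(np.random.SeedSequence([seed, name_key]))


a = numpy_child(7, "subsample").integers(0, 1000, size=5)
b = numpy_child(7, "subsample").integers(0, 1000, size=5)

# Same seed and name reproduce the same draws.
print(np.array_equal(a, b))
```

Note that switching from RandomState to Generator changes the underlying bit stream, so outputs would be deterministic per seed but not identical to what the current RandomState-based code produces.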

Origin

Copilot review comment on PR #30.

Metadata

    Labels

    layer: core - core/ primitives (RNG, IDs, models, exceptions)
    layer: pipelines - pipelines/ dataset build transformations
    type: refactor - Code change with no behavior difference
