NeuroEval reproducibility/transparency improvements (#765) #766

Merged
Hananel-Hazan merged 16 commits into master from neuroeval/wo-04-05-citation-version on Jun 15, 2026

@Hananel-Hazan

Collaborator

Addresses the NeuroEval reproducibility/transparency scorecard in #765, plus carries two bugfixes already present in the local base.

#765 artifacts

  • DATA.md — central dataset & stimulus declaration
  • REPRODUCING.md — paper→script→command→seed map, with a measured eth_mnist reference (0.81 all-activity / 0.82 proportion, seed 0, 20k-train/10k-test subset)
  • CITATION.cff + Zenodo software DOI wiring (v0.3.4)
  • CHANGELOG.md, neural model-spec docs page, breakout pretrained-artifact provenance, seeded smoke-reproduction test, README version/badge cleanup

Bugfixes

  • MulticompartmentConnection._apply now moves feature and learning-rule tensors to the target device; previously the GPU path (e.g. eth_mnist/DiehlAndCook2015) crashed with a cpu/cuda device mismatch (cfb4ee1b)
  • batch_eth_mnist.py: removed the duplicate device=device kwarg that raised a SyntaxError (09f4577a)

Also includes (already committed in the local base, not yet on master)

59/59 tests pass.

🤖 Generated with Claude Code

Hananel-Hazan and others added 16 commits June 14, 2026 10:37
Merge pull request #763 from BindsNET/hananel
Covers the fix in #761 (commit a9f7e43): when a preallocated Monitor (time set)
ran for fewer steps than the preallocated duration, placeholder lists were left
in the recording, which crashed torch.cat in Monitor.get. The new
TestMonitorShortRun asserts that get() returns a tensor truncated to the actual
run length, and still returns the full length once the buffer is filled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
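
The short-run behaviour described in the Monitor commit above can be illustrated with a torch-free toy. This is only a sketch: `ToyMonitor` is a hypothetical stand-in, not the BindsNET `Monitor`, and the pop/append buffer layout is an assumption made for illustration. It shows why untouched placeholders must be sliced away before concatenation.

```python
class ToyMonitor:
    """Hypothetical stand-in for a recorder with a preallocated buffer."""

    def __init__(self, time):
        self.time = time
        # Preallocate one empty-list placeholder per expected step.
        self.recording = [[] for _ in range(time)]
        self.steps = 0  # number of samples actually recorded

    def record(self, sample):
        # Assumed layout: drop the oldest slot, append the newest sample.
        self.recording.pop(0)
        self.recording.append(sample)
        self.steps += 1

    def get(self):
        # Return only slots that hold real samples, never the placeholders.
        n = min(self.steps, self.time)
        return self.recording[self.time - n:]


monitor = ToyMonitor(time=5)
for sample in (0, 1, 2):
    monitor.record(sample)
short_run = monitor.get()   # truncated to the 3 recorded steps

for sample in (3, 4, 5, 6):
    monitor.record(sample)
full_run = monitor.get()    # buffer is full, so all 5 slots are real
```

Without the slice in `get()`, a run shorter than `time` would hand the leading `[]` placeholders to whatever concatenates the samples.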
bernoulli_loader read kwargs.get("dt") instead of kwargs.get("max_prob").
Since dt is a named parameter it never lands in **kwargs, so max_prob was
silently ignored and the spike rate stayed at 1.0 regardless of the argument.

Adds test_bernoulli_loader_max_prob, which asserts the empirical spike rate
tracks max_prob (the existing test only checked output shape, so the bug went
undetected). Fix mirrors PR #743.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
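
The `bernoulli_loader` bug above comes down to a general Python rule: an argument that matches a named parameter is bound to that parameter and never appears in `**kwargs`. The sketch below reproduces the pattern with hypothetical functions (`buggy_loader`, `fixed_loader`, `spike_rate`); the signatures and the 1.0 default are assumptions for illustration, not the BindsNET code.

```python
import random


def buggy_loader(data, time, dt=1.0, **kwargs):
    # Bug pattern: "dt" is a named parameter, so it is never inside kwargs
    # and this lookup always falls back to the default.
    return kwargs.get("dt", 1.0)


def fixed_loader(data, time, dt=1.0, **kwargs):
    # Fix: read the key the caller actually passes through **kwargs.
    return kwargs.get("max_prob", 1.0)


def spike_rate(max_prob, n=10_000, seed=0):
    # Empirical rate of Bernoulli draws with probability max_prob.
    rng = random.Random(seed)
    return sum(rng.random() < max_prob for _ in range(n)) / n


buggy_prob = buggy_loader(None, 10, dt=1.0, max_prob=0.25)  # 1.0: max_prob ignored
fixed_prob = fixed_loader(None, 10, dt=1.0, max_prob=0.25)  # 0.25
```

A shape-only test passes for both loaders. Checking that `spike_rate(fixed_prob)` lands near 0.25 is what separates them, which is the idea behind `test_bernoulli_loader_max_prob`.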
…onsistency

Addresses two reproducibility/transparency gaps flagged in #765:

- WO-05: README "Requirements" said Python >=3.9,<3.12 while pyproject.toml
  and CI target >=3.11,<3.14 (tested 3.11/3.12/3.13). Align the README and
  add a "Reproducible install" note pointing at poetry.lock and Dockerfile.
- WO-04: add machine-readable CITATION.cff (CFF 1.2.0) with preferred-citation
  to the Frontiers paper (DOI 10.3389/fninf.2018.00089). Software Zenodo DOI
  left as a TODO pending WO-03.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Declares every dataset and synthetic stimulus used by the examples, benchmarks,
and dataset loaders — source, retrieval method, license pointer, and spike-encoding
preprocessing. Targets the Data & Stimulus Disclosure and Dataset Resolvability
axes flagged in #765. Linked from README and the docs index.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Maps each shipped model / published claim to model class, example script, exact
command, seed, and expected output, with determinism notes. Verified against the
source; accuracy/timing cells are honestly marked as not-yet-measured rather than
asserted. Documents that the multi-simulator benchmark script is not a single-command
repro. Targets the Model-to-Code Traceability axis flagged in #765. Linked from
README and docs index.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
New docs/source/models_spec.rst documenting the difference equations and default
parameters for every neuron model (IF/LIF/CurrentLIF/AdaptiveLIF/DiehlAndCook/
Izhikevich/SRM0/CSRM/McCullochPitts) and the PostPre learning rule, transcribed
from source; remaining learning rules summarized with source pointers. Added to the
docs toctree. Targets the Neural Model Spec Clarity axis flagged in #765.

Note: docs build not run locally (sphinx/docutils unavailable here); RST uses only
standard directives. Verify via the Read the Docs build.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add examples/breakout/README.md describing the shipped pretrained artifact
(Linear(6400,1000)->ReLU->Linear(1000,4) Breakout Q-network), how it is consumed by
play_breakout_from_ANN.py (ANN->SNN transplant), and that no training script ships.
Cross-linked from DATA.md. Targets the Resolvability/Traceability axes flagged in #765.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Keep a Changelog format with an Unreleased section sourced from git log since the
0.3.3 tag (2024-10-18) and a pointer from CONTRIBUTING.md. Older history links to the
GitHub releases page rather than being reconstructed. Targets the Reproducibility
Package Quality axis flagged in #765.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Uses the concept DOI (10.5281/zenodo.20695115), which always resolves to the
latest release on Zenodo.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Completes the Zenodo software-archive work (release 0.3.4):
- CITATION.cff: add concept DOI (10.5281/zenodo.20695115) and version DOI
  (10.5281/zenodo.20695116); bump version to 0.3.4.
- README: add a "Citing the software" block alongside the DOI badge.
- pyproject.toml: bump version 0.3.3 -> 0.3.4 to match the released tag.
- CHANGELOG: add the released [0.3.4] section with the archival DOIs.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a build-status (monitor) badge for the "BindsNET build status" workflow
alongside the existing CodeQL, docs, and DOI badges.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…cs version

- test/repro/test_smoke_repro.py: deterministic end-to-end CPU run of a tiny network
  asserting an exact pre-measured output (179 spikes), verified to pass and to be
  stable across repeated runs. Runs as part of the existing pytest CI.
- REPRODUCING.md: reference the automated smoke test.
- models_spec.rst: bump stated version to 0.3.4.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
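
The seeded smoke-test pattern described above can be sketched without torch. `run_tiny_network` is a hypothetical stand-in for the tiny CPU network, and its parameters are assumptions. The actual test pins a literal pre-measured value (179 spikes); this sketch measures its reference once instead.

```python
import random


def run_tiny_network(seed, steps=100, n_neurons=5, rate=0.3):
    """Hypothetical stand-in: count 'spikes' from a seeded random process."""
    rng = random.Random(seed)  # all randomness flows from this one seed
    return sum(rng.random() < rate for _ in range(steps * n_neurons))


# Measured once; a real smoke test would hard-code this number as a literal.
REFERENCE_SPIKES = run_tiny_network(seed=0)


def test_smoke_repro():
    # Exact equality, not a tolerance: any drift in the pipeline fails loudly.
    assert run_tiny_network(seed=0) == REFERENCE_SPIKES
    # Repeat to confirm the run is stable across invocations.
    assert run_tiny_network(seed=0) == REFERENCE_SPIKES


test_smoke_repro()
```

Pinning an exact count only works when every source of randomness is seeded and the run stays on one device type, which is presumably why the real test is restricted to CPU.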
…a()/.cpu()

AbstractMulticompartmentConnection stores its features (and their learning-rule
state) in a plain `self.pipeline` list of non-Module objects, so their tensors
(e.g. Weight.value, learning-rule nu/traces/eligibility) were not relocated by
torch.nn.Module._apply. As a result, `network.to('cuda')` left feature tensors on
the CPU and `compute`/`update` raised "Expected all tensors to be on the same
device" — the GPU path for models built on MulticompartmentConnection (e.g.
DiehlAndCook2015 / examples/mnist/eth_mnist.py) crashed.

Override `_apply` on AbstractMulticompartmentConnection to also relocate each
feature's tensors and its learning rule's tensor state. The weight `value` is moved
in place via `.data` so it stays aliased to the learning rule's cached reference
(which is updated in place during training). Verified: eth_mnist now runs end-to-end
on GPU; full test suite passes on CPU.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
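
The aliasing concern in the `_apply` commit above can be shown with a torch-free toy. Every class here (`ToyTensor`, `ToyLearningRule`, `ToyFeature`) is a hypothetical stand-in, not BindsNET or PyTorch API, and the device is only a string tag. The sketch contrasts relocating a value by rebinding the attribute with relocating it in place.

```python
class ToyTensor:
    """Hypothetical tensor stand-in: a payload plus a device tag."""

    def __init__(self, data, device="cpu"):
        self.data = list(data)
        self.device = device


class ToyLearningRule:
    def __init__(self, weight):
        self.weight = weight  # cached reference to the feature's value


class ToyFeature:
    """Plain object held in a list, so a module-style walk would not see it."""

    def __init__(self, data):
        self.value = ToyTensor(data)
        self.learning_rule = ToyLearningRule(self.value)


def relocate_by_rebinding(feature, device):
    # Creates a new object: the learning rule still points at the old one.
    feature.value = ToyTensor(feature.value.data, device)


def relocate_in_place(feature, device):
    # Mutates the existing object: every cached reference sees the move.
    feature.value.device = device


rebound = ToyFeature([1.0, 2.0])
relocate_by_rebinding(rebound, "cuda")
stale_device = rebound.learning_rule.weight.device    # still "cpu"

in_place = ToyFeature([1.0, 2.0])
relocate_in_place(in_place, "cuda")
shared_device = in_place.learning_rule.weight.device  # "cuda"
```

In the toy, rebinding leaves the learning rule updating a stale copy on the old device. That is the failure the commit avoids by moving the weight through `.data`.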
Overnight GPU reference run (RTX 2070, torch 2.6, seed 0, --n_train 20000
--n_test 10000) measured DiehlAndCook2015 / eth_mnist at 0.81 (all-activity) and
0.82 (proportion-weighting) test accuracy. Recorded in REPRODUCING.md with the
exact config, replacing the "(not measured here)" placeholder.

Refs #765

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…5 call

`device=device` was passed twice to DiehlAndCook2015(...), making
examples/mnist/batch_eth_mnist.py raise SyntaxError (repeated keyword argument)
and fail to run at all. Remove the duplicate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
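
A repeated keyword argument is rejected when Python parses the file, before any line runs, which is why the script failed to start at all. The sketch below checks this with `compile()`. The call strings are hypothetical reductions of the real call, and `compile()` only parses them, so `DiehlAndCook2015` and `device` need not be defined.

```python
# Hypothetical, reduced call strings; only the duplicated kwarg matters here.
BROKEN_CALL = "DiehlAndCook2015(device=device, device=device)"
FIXED_CALL = "DiehlAndCook2015(device=device)"


def parses(source: str) -> bool:
    """Return True if the expression compiles, False on a SyntaxError."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


broken_ok = parses(BROKEN_CALL)  # False: keyword argument repeated
fixed_ok = parses(FIXED_CALL)    # True
```

Because the failure happens at compile time, a byte-compile pass or a linter catches it without running the example.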
@Hananel-Hazan Hananel-Hazan merged commit 00d870d into master Jun 15, 2026
10 checks passed