Add equivalent np.convolve (just full mode) #7

Merged
Oceania2018 merged 1 commit into SciSharp:master from dotChris90:master
Oct 17, 2018

@dotChris90

Member

@Oceania2018

Member

Hi @dotChris90, the np.convolve looks great. Could you please add convolve to the Implemented APIs list in the README?

@Oceania2018

Member

Keeping this open because the "same" and "valid" modes are not finished yet.
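
For readers unfamiliar with the three modes, the relationship between them can be sketched in pure Python. This is an illustrative sketch, not the NumSharp code from this PR: the function names are hypothetical, and the slicing offsets are written to follow NumPy's documented behavior for np.convolve.

```python
def convolve_full(a, v):
    """Full discrete convolution; output length is len(a) + len(v) - 1."""
    if not a or not v:
        raise ValueError("inputs must be non-empty")
    out = [0.0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            out[i + j] += x * y
    return out


def convolve(a, v, mode="full"):
    """Derive 'same' and 'valid' by slicing the 'full' result."""
    full = convolve_full(a, v)
    longer, shorter = max(len(a), len(v)), min(len(a), len(v))
    if mode == "full":
        return full
    if mode == "same":
        # Centered window with the length of the longer input.
        start = (shorter - 1) // 2
        return full[start:start + longer]
    if mode == "valid":
        # Only positions where the two inputs overlap completely.
        start = shorter - 1
        return full[start:start + longer - shorter + 1]
    raise ValueError(f"unknown mode: {mode!r}")
```

With a = [1, 2, 3] and v = [0, 1, 0.5], the full result has five elements, "same" keeps the middle three, and "valid" keeps only the single fully overlapping element.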

@Oceania2018 Oceania2018 merged commit 5d9b3c7 into SciSharp:master Oct 17, 2018
Nucs added a commit that referenced this pull request Apr 23, 2026
Implements fixes detailed in docs/NPYITER_FIXES_REQUIRED.md to improve
NumPy compatibility of the NpyIter implementation.

Fix #1: Coalescing Always Runs
- Changed NpyIterRef.Initialize() to always coalesce axes after
  construction unless MULTI_INDEX flag is set
- Matches NumPy's nditer_constr.c lines 395-396 behavior
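
As a rough illustration of what coalescing means here, the sketch below merges adjacent axes whenever every operand can cross both with a single stride. It is a simplified Python model, not the NpyIterRef code: the function name is hypothetical, axes are listed outermost first, strides are in elements, and size-1 axes get no special treatment.

```python
def coalesce_axes(shape, operand_strides):
    """Merge adjacent axes that all operands can walk with one stride.

    shape: iteration shape, outermost axis first.
    operand_strides: one stride list per operand, same length as shape.
    Returns (new_shape, new_operand_strides).
    """
    if not shape:
        return [], [[] for _ in operand_strides]
    out_shape = [shape[0]]
    out_strides = [[s[0]] for s in operand_strides]
    for axis in range(1, len(shape)):
        inner_len = shape[axis]
        # The outer axis must step exactly one full inner run for every operand.
        can_merge = all(
            merged[-1] == strides[axis] * inner_len
            for merged, strides in zip(out_strides, operand_strides)
        )
        if can_merge:
            out_shape[-1] *= inner_len
            for merged, strides in zip(out_strides, operand_strides):
                merged[-1] = strides[axis]
        else:
            out_shape.append(inner_len)
            for merged, strides in zip(out_strides, operand_strides):
                merged.append(strides[axis])
    return out_shape, out_strides
```

A contiguous (2, 3, 4) operand with strides (12, 4, 1) collapses to a single axis of length 24, while a broadcast operand with strides (0, 1) next to a contiguous one keeps both axes. Only adjacent axes are considered, with no reordering beforehand.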

Fix #2: Inner Stride Cache
- Added InnerStrides[MaxOperands] array to NpyIterState
- Added UpdateInnerStrides() method to gather inner strides
- GetInnerStrideArray() now returns contiguous array matching
  NumPy's NpyIter_GetInnerStrideArray() format

Fix #3: op_axes Parameter Implementation
- Added ApplyOpAxes() method to support axis remapping
- Supports -1 entries for broadcast/reduction axes
- Enables reduction operations via custom axis mapping
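
The effect of a -1 entry can be shown with a small Python model. This is a sketch under assumptions, not the ApplyOpAxes() implementation: the helper names are hypothetical, strides are in elements, and a -1 entry is modeled as a stride of 0 so the operand stays on the same element along that iterator axis.

```python
from itertools import product


def apply_op_axes(op_strides, op_axes):
    """Map one operand's strides onto the iterator axes.

    op_axes[i] names the operand axis used for iterator axis i;
    -1 means the operand has no such axis, modeled here as stride 0.
    """
    return [0 if ax == -1 else op_strides[ax] for ax in op_axes]


def offset(index, strides):
    """Flat element offset for a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


def sum_over_last_axis(data, shape):
    """Reduce a row-major 2-D buffer over its last axis."""
    rows, cols = shape
    in_strides = apply_op_axes([cols, 1], [0, 1])  # input uses both axes
    out_strides = apply_op_axes([1], [0, -1])      # output is pinned on axis 1
    out = [0] * rows
    for index in product(range(rows), range(cols)):
        out[offset(index, out_strides)] += data[offset(index, in_strides)]
    return out
```

Because the output's stride along the reduced axis is 0, every element of a row accumulates into the same output slot, which is how an axis mapping alone can express a reduction.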

Fix #4: Multi-Index Support
- Added GetMultiIndex(Span<long>) for coordinate retrieval
- Added GotoMultiIndex(ReadOnlySpan<long>) for coordinate jumping
- Added HasMultiIndex property
- HASMULTIINDEX flag tracked during construction
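
The coordinate conversions behind these two methods can be sketched as follows. This is a minimal Python model assuming row-major (C-order) iteration; the function names are hypothetical and are not the NumSharp API.

```python
def unravel(iter_index, shape):
    """Flat iteration index -> coordinates (what a GetMultiIndex-style call reports)."""
    coords = [0] * len(shape)
    for axis in range(len(shape) - 1, -1, -1):
        iter_index, coords[axis] = divmod(iter_index, shape[axis])
    return coords


def ravel(coords, shape):
    """Coordinates -> flat iteration index (what a GotoMultiIndex-style call needs)."""
    index = 0
    for c, n in zip(coords, shape):
        if not 0 <= c < n:
            raise IndexError(f"coordinate {c} out of range for axis of length {n}")
        index = index * n + c
    return index
```

The two functions are inverses of each other for any in-range index, so jumping to a coordinate and reading it back returns the same coordinate.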

Fix #5: Ranged Iteration
- Added ResetToIterIndexRange(start, end) for parallel chunking
- Added IterStart, IterEnd, and IsRanged properties
- RANGE flag tracks ranged iteration mode
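
One way a caller might produce the (start, end) pairs for parallel chunking is sketched below. The helper is hypothetical and assumes half-open ranges [start, end) over the flat iteration index; the commit does not say how callers are expected to choose chunk boundaries.

```python
def split_iter_range(iter_size, workers):
    """Split 0..iter_size into at most `workers` contiguous half-open ranges."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(iter_size, workers)
    ranges = []
    start = 0
    for w in range(workers):
        # The first `extra` workers take one additional element each.
        end = start + base + (1 if w < extra else 0)
        if end > start:
            ranges.append((start, end))
        start = end
    return ranges
```

Each pair would then be handed to its own iterator copy through a ResetToIterIndexRange-style call, so the workers cover the iteration space without overlapping.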

Fix #6: Buffer Copy Type Dispatch
- Added non-generic CopyToBuffer/CopyFromBuffer overloads
- Runtime dtype dispatch for all 12 NumSharp types
- Enables dtype-agnostic iteration code

Fix #7: Flag Bit Positions Documented
- Added documentation explaining NumSharp's flag bit layout
- Legacy compatibility flags use bits 0-7
- NumPy-equivalent flags use bits 8-15
- Semantic meaning matches NumPy, positions differ

Fix #8: MaxDims Increased to 64
- Changed MaxDims from 32 to 64 to match NPY_MAXDIMS
- Supports high-dimensional array iteration

Test coverage:
- 13 new tests for coalescing, multi-index, ranged iteration,
  inner strides, and MaxDims validation
- All 5666 non-OpenBugs tests pass

Note: Full axis reordering before coalescing (for complete 1D
coalescing of contiguous arrays) not yet implemented. Current
implementation coalesces adjacent compatible axes only.
Nucs added a commit that referenced this pull request Apr 23, 2026
Make the hierarchy explicit: Tier A/B/C were always sub-tiers of Layer 3
(the baked-ufunc layer). Numbering them `3A/3B/3C` makes the
relationship visible at a glance:

  Layer 1  —  ForEach (delegate)
  Layer 2  —  ExecuteGeneric (struct-generic)
  Layer 3  —  ExecuteBinary / Unary / ...  (baked)
  Tier 3A  —  ExecuteRawIL              (sub-tier: custom IL)
  Tier 3B  —  ExecuteElementWise        (sub-tier: templated)
  Tier 3C  —  ExecuteExpression / Call  (sub-tier: DSL)

100 references touched across 6 files:
  docs/website-src/docs/NDIter.md  — prose, TOC, anchor links, worked-
                                    example heading anchors (#6, #7, #8)
  src/NumSharp.Core/Backends/Iterators/NpyExpr.cs       — header comment
  src/NumSharp.Core/Backends/Iterators/NpyIter.Execution.Custom.cs
    — file header, region comments for each tier entry point
  src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.InnerLoop.cs
    — factory method docstrings
  test/NumSharp.UnitTest/Backends/Iterators/NpyIterCustomOpTests.cs
    — class docstring, region comments, 10 test method names
    (TierA_* → Tier3A_*, TierB_* → Tier3B_*, TierC_* → Tier3C_*)
  test/NumSharp.UnitTest/Backends/Iterators/NpyIterCustomOpEdgeCaseTests.cs
    — region comments, 2 test method names (Validate_TierA_* →
    Validate_Tier3A_*)

No behavior changes. 264/264 NpyExpr + custom-op tests pass on net8 +
net10. Full suite still green (0 regressions).
Nucs added a commit that referenced this pull request Jun 6, 2026
Adds docs/FUZZ_FINDINGS.md: every NumSharp-vs-NumPy-2.4.2 divergence the differential fuzzer
surfaced, each bit-exact verified, with minimal NumPy-vs-NumSharp reproductions, root cause where
known, and disposition (FIXED / BUG / INTENDED / SCOPING) mapped to tasks #7-#12.
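
For context, a bit-exact check differs from ordinary equality in how it treats NaN and signed zero. The sketch below is one plausible reading of "bit-exact" for float64 values, written in pure Python; it is not the fuzzer's actual comparison code, and the function names are hypothetical.

```python
import struct


def float64_bits(x):
    """Raw IEEE 754 bit pattern of a Python float as an unsigned 64-bit int."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bit_exact(expected, actual):
    """True when both sequences have identical length and bit patterns."""
    return len(expected) == len(actual) and all(
        float64_bits(e) == float64_bits(a) for e, a in zip(expected, actual)
    )
```

Under this comparison a NaN matches an identical NaN, while 0.0 and -0.0 count as a divergence even though they compare equal with ==.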

22 findings: 1 fixed (complex->bool), 18 confirmed bugs (integer ÷0/mod0, float //0,
mixed-precision mod, complex power; NaN <=/>=; NEP50 unary promotion, negative-unsigned,
reciprocal, complex unary; reduction NaN propagation, complex-axis throw, bool min/max,
summation precision, result dtype; complex where throw; bool arithmetic, size-1 collapse,
complex binary cancellation), 2 intended Misaligned (NEP50 weak-scalar, complex-divide ULP),
1 scoping note (ops vs raw offset!=0 / junk size-1 strides).
Nucs added a commit that referenced this pull request Jun 6, 2026
…ion north star)

Adds docs/ROADMAP.md consolidating the entire path from the current baseline to the
/np-function goal (every np.* bit-identical to NumPy 2.4.2 OR >=1.5x faster), correctness-first
with the differential fuzzer as the regression net for the perf refactor.

Five phases:
  0  Baseline (Plan A done) — harness, T1-T6 matrices, random fuzzer, CI gate.
  1  Correctness backlog — fix the 22 findings, grouped by shared root (F1 div-by-zero, F2 NaN,
     F3 NEP50 promotion, F4 unsupported throws, F5 complex algorithms, F6 bool, F7 size-1 shape,
     F8 summation precision, F9 representation). Each fix flips its classifier branch and re-arms
     the gate; F5/F8/F9 are implement-vs-document judgment calls.
  2  Coverage breadth — 2A finish #2 sections C+E (out=/where=/aliasing/overlap/mask), 2B op
     tiers T7-T15 (manipulation, matmul/dot, bitwise, nan-aware, cumulative, stat, logic,
     sorting, multi-output) = the ~75 untested transformation ops.
  3  NpyIter behavioral parity (#3 / Plan B) — port test_nditer.py; de-risks Phase 5.
  4  Depth — params (order=/dtype=/ddof/axis), SIMD-tail & large shapes, error parity (#4),
     unmanaged lifecycle (#5), metamorphic invariants (#7).
  5  Performance — the >=1.5x-NumPy mission (#6): the DirectILKernelGenerator -> ILKernelGenerator
     (NpyIter-driven) migration in CLAUDE.md priority order, benchmark ledger, perf CI gates;
     the differential matrices keep the kernel rewrite from breaking parity.

Includes a dependency graph, value-weighted recommended order, and an effort-shape table. Detail
for phases 2A/3 lives in FUZZ_PLAN_NEXT.md; findings in FUZZ_FINDINGS.md.
Nucs added a commit that referenced this pull request Jun 6, 2026
…cs (Phase 1 F3a)

NumSharp promoted every integer input to float64 for the float-producing unary
ufuncs (sqrt/cbrt/exp/log/trig/...), regardless of input width. NumPy NEP50 uses
WIDTH-BASED promotion: bool/int8/uint8 -> float16, int16/uint16 -> float32,
int32/uint32/int64/uint64 -> float64 (float/complex preserved). FUZZ_FINDINGS #7.
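
The width-based rule stated above can be written as a small lookup. This Python sketch only restates the mapping from this commit message; the function name and the use of dtype name strings are assumptions for illustration, not the ResolveUnaryFloatReturnType signature.

```python
# Mapping taken from the rule described in this commit message.
_INT_TO_FLOAT = {
    "bool": "float16", "int8": "float16", "uint8": "float16",
    "int16": "float32", "uint16": "float32",
    "int32": "float64", "uint32": "float64",
    "int64": "float64", "uint64": "float64",
}


def unary_float_result_dtype(dtype):
    """Result dtype of a float-producing unary ufunc for a given input dtype name."""
    if dtype in _INT_TO_FLOAT:
        return _INT_TO_FLOAT[dtype]
    if dtype.startswith(("float", "complex")):
        return dtype  # float and complex inputs are preserved
    raise TypeError(f"unsupported dtype: {dtype}")
```

For example, sqrt on a uint8 array would report float16 under this rule, where the old behavior reported float64.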

Fix: new DefaultEngine.ResolveUnaryFloatReturnType implements the width-based rule;
the 20 transcendental Default.<Op>.cs files (ACos/ASin/ATan/Cbrt/Cos/Cosh/Deg2Rad/
Exp/Exp2/Expm1/Log/Log10/Log1p/Log2/Rad2Deg/Sin/Sinh/Sqrt/Tan/Tanh) now call it
instead of ResolveUnaryReturnType (which widened to float64 via GetComputingType).
The int->Half / int->Single unary kernels already existed, so this is a pure
result-dtype routing change; int32/int64 are unaffected (already float64).

The dtype-PRESERVING ufuncs (square/floor/ceil/trunc/round/reciprocal) are
intentionally left on the old resolver pending F3b (they need integer identity /
x*x / int-reciprocal kernels to preserve the integer dtype).

Verification:
  * 364 of 494 unary dtype divergences clear bit-exact; the remaining 130 are the
    F3b-pending preserve-dtype ops, now scoped in MisalignedRegistry so a
    transcendental promotion regression fails the gate (not silently excused).
  * Half/Single transcendental values stay within 2 ULP of NumPy's float16/float32
    libm (documented algorithm difference, same class as complex-divide ULP).
  * DtypeCoverageTests.Sqrt_IntegerDtypes updated: it asserted the old uniform
    float64 (codified the bug); now asserts the NumPy width-based dtype per input.
  * Full net10.0 suite green: 9422 passed / 0 failed; FuzzMatrix 17/17.
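
The "within 2 ULP" criterion can be made concrete with a float32 distance function. This is a hedged sketch in pure Python, not the test harness's code: the function names are hypothetical, and NaN and infinity are not handled.

```python
import struct


def float32_key(x):
    """Map a float32 value to an integer that is monotonic in the value."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    magnitude = bits & 0x7FFFFFFF
    # Negative floats are stored sign-magnitude, so negate the magnitude.
    return -magnitude if bits & 0x80000000 else magnitude


def ulp_distance(a, b):
    """Number of representable float32 values between a and b."""
    return abs(float32_key(a) - float32_key(b))


def within_ulp(a, b, max_ulp=2):
    """True when a and b differ by at most max_ulp float32 steps."""
    return ulp_distance(a, b) <= max_ulp
```

Two results that differ only in the last bit of the float32 mantissa have a distance of 1, and 0.0 and -0.0 have a distance of 0.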

Clears the transcendental half of FUZZ_FINDINGS #7 (and #15 dtype for those ops).