Add equivalent np.convolve (just full mode) #7
Merged
dotChris90
commented
Oct 17, 2018
Member
- According to issue "implement numpy.convolve" (#6)
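For readers unfamiliar with the "full" mode mentioned in the title, here is a minimal pure-Python sketch of what a full discrete convolution computes. It is not the NumSharp code from this PR; the function name `convolve_full` is hypothetical and the inputs are assumed to be non-empty 1-D sequences.

```python
def convolve_full(a, v):
    """Full discrete convolution of two non-empty 1-D sequences.

    The result has len(a) + len(v) - 1 entries; entry k is the sum of
    a[i] * v[j] over all index pairs with i + j == k.
    """
    if len(a) == 0 or len(v) == 0:
        raise ValueError("inputs must not be empty")
    out = [0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            out[i + j] += x * y  # every product lands at offset i + j
    return out


result = convolve_full([1, 2, 3], [0, 1, 0.5])
```

The double loop is quadratic in the input lengths, which is fine for illustrating the definition but not a statement about how the merged implementation performs.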
Member
Hi @dotChris90, the np.convolve looks great. Could you please add convolve to the Implemented APIs list in the README?
Member
Keeping this open because the same and valid modes are not finished yet.
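The two unfinished modes can be understood as slices of the full result. The sketch below follows the output lengths documented for `numpy.convolve` (`same` returns max(M, N) values, `valid` returns max(M, N) - min(M, N) + 1); the function name `convolve` and the slicing approach are illustrative assumptions, not the NumSharp implementation.

```python
def convolve(a, v, mode="full"):
    """1-D convolution with 'full', 'same', and 'valid' modes (sketch)."""
    if len(a) == 0 or len(v) == 0:
        raise ValueError("inputs must not be empty")
    # Compute the full result first: length len(a) + len(v) - 1.
    full = [0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            full[i + j] += x * y

    longer = max(len(a), len(v))
    shorter = min(len(a), len(v))
    if mode == "full":
        return full
    if mode == "same":
        # Centered window with the length of the longer input.
        start = (shorter - 1) // 2
        return full[start:start + longer]
    if mode == "valid":
        # Only positions where the two inputs overlap completely.
        start = shorter - 1
        return full[start:start + longer - shorter + 1]
    raise ValueError(f"unknown mode: {mode!r}")
```

Slicing the full result is the simplest way to show the relationship between the modes; a production version would more likely compute only the requested window.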
Nucs
added a commit
that referenced
this pull request
Apr 23, 2026
Implements fixes detailed in docs/NPYITER_FIXES_REQUIRED.md to improve NumPy compatibility of the NpyIter implementation.
Fix #1: Coalescing Always Runs
- Changed NpyIterRef.Initialize() to always coalesce axes after construction unless MULTI_INDEX flag is set
- Matches NumPy's nditer_constr.c line 395-396 behavior
Fix #2: Inner Stride Cache
- Added InnerStrides[MaxOperands] array to NpyIterState
- Added UpdateInnerStrides() method to gather inner strides
- GetInnerStrideArray() now returns contiguous array matching NumPy's NpyIter_GetInnerStrideArray() format
Fix #3: op_axes Parameter Implementation
- Added ApplyOpAxes() method to support axis remapping
- Supports -1 entries for broadcast/reduction axes
- Enables reduction operations via custom axis mapping
Fix #4: Multi-Index Support
- Added GetMultiIndex(Span<long>) for coordinate retrieval
- Added GotoMultiIndex(ReadOnlySpan<long>) for coordinate jumping
- Added HasMultiIndex property
- HASMULTIINDEX flag tracked during construction
Fix #5: Ranged Iteration
- Added ResetToIterIndexRange(start, end) for parallel chunking
- Added IterStart, IterEnd, and IsRanged properties
- RANGE flag tracks ranged iteration mode
Fix #6: Buffer Copy Type Dispatch
- Added non-generic CopyToBuffer/CopyFromBuffer overloads
- Runtime dtype dispatch for all 12 NumSharp types
- Enables dtype-agnostic iteration code
Fix #7: Flag Bit Positions Documented
- Added documentation explaining NumSharp's flag bit layout
- Legacy compatibility flags use bits 0-7
- NumPy-equivalent flags use bits 8-15
- Semantic meaning matches NumPy, positions differ
Fix #8: MaxDims Increased to 64
- Changed MaxDims from 32 to 64 to match NPY_MAXDIMS
- Supports high-dimensional array iteration
Test coverage:
- 13 new tests for coalescing, multi-index, ranged iteration, inner strides, and MaxDims validation
- All 5666 non-OpenBugs tests pass
Note: Full axis reordering before coalescing (for complete 1D coalescing of contiguous arrays) is not yet implemented. The current implementation coalesces adjacent compatible axes only.
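To illustrate what "coalesces adjacent compatible axes only" means, here is a small sketch for a single operand. It is an assumption-laden simplification, not the NpyIterRef code: the name `coalesce_axes` is hypothetical, strides are in elements, axes are listed outermost first, and a real multi-operand iterator would need the merge condition to hold for every operand.

```python
def coalesce_axes(shape, strides):
    """Merge adjacent axes that together form one linear run (sketch).

    An outer axis and the inner axis that follows it can be merged when
    outer_stride == inner_dim * inner_stride. Axes are never reordered.
    """
    out_shape, out_strides = [], []
    for dim, stride in zip(shape, strides):
        if dim == 1:
            continue  # a size-1 axis never advances the pointer
        if out_shape and out_strides[-1] == dim * stride:
            out_shape[-1] *= dim      # fold this axis into the previous one
            out_strides[-1] = stride  # merged axis steps by the inner stride
        else:
            out_shape.append(dim)
            out_strides.append(stride)
    if not out_shape:
        return [1], [0]  # everything was size 1: a single element
    return out_shape, out_strides


contiguous = coalesce_axes([2, 3, 4], [12, 4, 1])   # collapses to one axis
transposed = coalesce_axes([3, 2], [1, 3])          # stays two axes
```

The transposed case shows the limitation described in the note: the memory is contiguous, but without reordering the axes first, the adjacent-pair test never fires.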
Nucs
added a commit
that referenced
this pull request
Apr 23, 2026
Explicit the hierarchy — Tier A/B/C were always sub-tiers of Layer 3
(the baked-ufunc layer). Numbering them `3A/3B/3C` makes the
relationship visible at a glance:
Layer 1 — ForEach (delegate)
Layer 2 — ExecuteGeneric (struct-generic)
Layer 3 — ExecuteBinary / Unary / ... (baked)
Tier 3A — ExecuteRawIL (sub-tier: custom IL)
Tier 3B — ExecuteElementWise (sub-tier: templated)
Tier 3C — ExecuteExpression / Call (sub-tier: DSL)
100 references touched across 6 files:
docs/website-src/docs/NDIter.md — prose, TOC, anchor links, worked-
example heading anchors (#6, #7, #8)
src/NumSharp.Core/Backends/Iterators/NpyExpr.cs — header comment
src/NumSharp.Core/Backends/Iterators/NpyIter.Execution.Custom.cs
— file header, region comments for each tier entry point
src/NumSharp.Core/Backends/Kernels/ILKernelGenerator.InnerLoop.cs
— factory method docstrings
test/NumSharp.UnitTest/Backends/Iterators/NpyIterCustomOpTests.cs
— class docstring, region comments, 10 test method names
(TierA_* → Tier3A_*, TierB_* → Tier3B_*, TierC_* → Tier3C_*)
test/NumSharp.UnitTest/Backends/Iterators/NpyIterCustomOpEdgeCaseTests.cs
— region comments, 2 test method names (Validate_TierA_* →
Validate_Tier3A_*)
No behavior changes. 264/264 NpyExpr + custom-op tests pass on net8 +
net10. Full suite still green (0 regressions).
Nucs
added a commit
that referenced
this pull request
Jun 6, 2026
Adds docs/FUZZ_FINDINGS.md: every NumSharp-vs-NumPy-2.4.2 divergence the differential fuzzer surfaced, each bit-exact verified, with minimal NumPy-vs-NumSharp reproductions, root cause where known, and disposition (FIXED / BUG / INTENDED / SCOPING) mapped to tasks #7-#12.
22 findings:
- 1 fixed (complex->bool)
- 18 confirmed bugs:
  - integer ÷0/mod0, float //0, mixed-precision mod, complex power
  - NaN <=/>=
  - NEP50 unary promotion, negative-unsigned, reciprocal, complex unary
  - reduction NaN propagation, complex-axis throw, bool min/max, summation precision, result dtype
  - complex where throw
  - bool arithmetic, size-1 collapse, complex binary cancellation
- 2 intended Misaligned (NEP50 weak-scalar, complex-divide ULP)
- 1 scoping note (ops vs raw offset!=0 / junk size-1 strides)
Nucs
added a commit
that referenced
this pull request
Jun 6, 2026
…ion north star)
Adds docs/ROADMAP.md consolidating the entire path from the current baseline to the
/np-function goal (every np.* bit-identical to NumPy 2.4.2 OR >=1.5x faster), correctness-first
with the differential fuzzer as the regression net for the perf refactor.
Five phases:
0 Baseline (Plan A done) — harness, T1-T6 matrices, random fuzzer, CI gate.
1 Correctness backlog — fix the 22 findings, grouped by shared root (F1 div-by-zero, F2 NaN,
F3 NEP50 promotion, F4 unsupported throws, F5 complex algorithms, F6 bool, F7 size-1 shape,
F8 summation precision, F9 representation). Each fix flips its classifier branch and re-arms
the gate; F5/F8/F9 are implement-vs-document judgment calls.
2 Coverage breadth — 2A finish #2 sections C+E (out=/where=/aliasing/overlap/mask), 2B op
tiers T7-T15 (manipulation, matmul/dot, bitwise, nan-aware, cumulative, stat, logic,
sorting, multi-output) = the ~75 untested transformation ops.
3 NpyIter behavioral parity (#3 / Plan B) — port test_nditer.py; de-risks Phase 5.
4 Depth — params (order=/dtype=/ddof/axis), SIMD-tail & large shapes, error parity (#4),
unmanaged lifecycle (#5), metamorphic invariants (#7).
5 Performance — the >=1.5x-NumPy mission (#6): the DirectILKernelGenerator -> ILKernelGenerator
(NpyIter-driven) migration in CLAUDE.md priority order, benchmark ledger, perf CI gates;
the differential matrices keep the kernel rewrite from breaking parity.
Includes a dependency graph, value-weighted recommended order, and an effort-shape table. Detail
for phases 2A/3 lives in FUZZ_PLAN_NEXT.md; findings in FUZZ_FINDINGS.md.
Nucs
added a commit
that referenced
this pull request
Jun 6, 2026
…cs (Phase 1 F3a)
NumSharp promoted every integer input to float64 for the float-producing unary ufuncs (sqrt/cbrt/exp/log/trig/...), regardless of input width. NumPy NEP50 uses WIDTH-BASED promotion: bool/int8/uint8 -> float16, int16/uint16 -> float32, int32/uint32/int64/uint64 -> float64 (float/complex preserved). FUZZ_FINDINGS #7.
Fix: new DefaultEngine.ResolveUnaryFloatReturnType implements the width-based rule; the 20 transcendental Default.<Op>.cs files (ACos/ASin/ATan/Cbrt/Cos/Cosh/Deg2Rad/Exp/Exp2/Expm1/Log/Log10/Log1p/Log2/Rad2Deg/Sin/Sinh/Sqrt/Tan/Tanh) now call it instead of ResolveUnaryReturnType (which widened to float64 via GetComputingType). The int->Half / int->Single unary kernels already existed, so this is a pure result-dtype routing change; int32/int64 are unaffected (already float64).
The dtype-PRESERVING ufuncs (square/floor/ceil/trunc/round/reciprocal) are intentionally left on the old resolver pending F3b (they need integer identity / x*x / int-reciprocal kernels to preserve the integer dtype).
Verification:
* 364 of 494 unary dtype divergences clear bit-exact; the remaining 130 are the F3b-pending preserve-dtype ops, now scoped in MisalignedRegistry so a transcendental promotion regression fails the gate (not silently excused).
* Half/Single transcendental values stay within 2 ULP of NumPy's float16/float32 libm (documented algorithm difference, same class as complex-divide ULP).
* DtypeCoverageTests.Sqrt_IntegerDtypes updated: it asserted the old uniform float64 (codified the bug); now asserts the NumPy width-based dtype per input.
* Full net10.0 suite green: 9422 passed / 0 failed; FuzzMatrix 17/17.
Clears the transcendental half of FUZZ_FINDINGS #7 (and #15 dtype for those ops).
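The width-based rule stated in this commit message can be written down as a lookup table. The sketch below only restates the mapping quoted above; the function name `resolve_unary_float_return_type` and the use of dtype name strings are assumptions made for illustration and do not mirror the signature of DefaultEngine.ResolveUnaryFloatReturnType.

```python
# Result dtype for float-producing unary ufuncs, keyed by input dtype name.
# Mapping taken from the commit message; names are plain strings here.
UNARY_FLOAT_RESULT = {
    "bool": "float16", "int8": "float16", "uint8": "float16",
    "int16": "float32", "uint16": "float32",
    "int32": "float64", "uint32": "float64",
    "int64": "float64", "uint64": "float64",
}


def resolve_unary_float_return_type(dtype):
    """Return the result dtype name for a float-producing unary op (sketch)."""
    if dtype in UNARY_FLOAT_RESULT:
        return UNARY_FLOAT_RESULT[dtype]
    if dtype.startswith(("float", "complex")):
        return dtype  # float and complex inputs keep their dtype
    raise ValueError(f"unsupported dtype: {dtype!r}")


sqrt_of_uint8 = resolve_unary_float_return_type("uint8")
```

A table keeps the rule auditable against the fuzzer's expectations: each row corresponds to one input width named in the message.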