Implement Portable bucketize#20287

Open
Gallinator wants to merge 28 commits into pytorch:main from Gallinator:bucketize

Conversation

@Gallinator

@Gallinator Gallinator commented Jun 15, 2026

Fixes #20270

Summary

Add portable scalar and tensor bucketize operator, based on the PyTorch implementation.
Key differences from the PyTorch implementation:

  • Implemented only bucketize, no searchsorted
  • The boundaries tensor's size is not checked against the output type: boundaries must be 1D and SizesType is an int32_t, so its length always fits in an int32 output
  • In the original implementation the input scalar is wrapped into a tensor; here two sets of functions are used instead
  • Supports both NHWC and NCHW as it is a pointwise kernel and there is no sorter argument. This avoids creating contiguous tensors
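As a rough illustration of the semantics being ported, the sketch below models bucketize in plain Python with the standard `bisect` module. It is not the kernel code from this PR: the names `bucketize_scalar` and `bucketize_list` are hypothetical and only mirror the idea of separate scalar and tensor entry points, with a flat list standing in for a tensor.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def bucketize_scalar(value: float, boundaries: Sequence[float], right: bool = False) -> int:
    """Reference semantics for one value (hypothetical helper, not the PR's code)."""
    # right=False: boundaries[i-1] <  value <= boundaries[i]  (lower bound)
    # right=True:  boundaries[i-1] <= value <  boundaries[i]  (upper bound)
    search = bisect_right if right else bisect_left
    return search(boundaries, value)


def bucketize_list(values: Sequence[float], boundaries: Sequence[float], right: bool = False) -> List[int]:
    """Pointwise variant: each output element depends on exactly one input element."""
    return [bucketize_scalar(v, boundaries, right) for v in values]


boundaries = [1, 3, 5, 7, 9]
print(bucketize_list([3, 6, 9], boundaries))              # [1, 3, 4]
print(bucketize_list([3, 6, 9], boundaries, right=True))  # [2, 3, 5]
```

Because each output depends on a single input element, the list variant is just the scalar routine applied element by element, which is why no layout conversion is needed.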

Release notes: ops & kernels

Test plan

C++ tests are extensive and include failure and edge-case tests.
Python tests compare the output of an exported module to the PyTorch implementation only for normal and edge cases, similar to #15893.

cc @larryliu0820 @manuelcandales

@pytorch-bot

pytorch-bot Bot commented Jun 15, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20287

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

⚠️ 12 Awaiting Approval

As of commit 7c7b547 with merge base 0378fc4 (image):

AWAITING APPROVAL - The following workflows need approval before CI can run:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla

meta-cla Bot commented Jun 15, 2026

Hi @Gallinator!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@linux-foundation-easycla

linux-foundation-easycla Bot commented Jun 15, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

@meta-cla meta-cla Bot added the CLA Signed label Jun 16, 2026
@meta-cla

meta-cla Bot commented Jun 16, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@Gallinator

Author

@pytorchbot label "release notes: ops & kernels"

@pytorch-bot pytorch-bot Bot added the release notes: ops & kernels label Jun 16, 2026
@Gallinator Gallinator marked this pull request as ready for review June 17, 2026 11:33
@nil-is-all nil-is-all added the module: kernels label Jun 22, 2026
@Gasoonjia

Contributor

@claude review this PR

@claude

claude Bot commented Jun 22, 2026

Claude finished @Gasoonjia's task in 5m 32s


PR Review: Implement Portable bucketize

  • Read changed files (op_bucketize.cpp, tests, functions.yaml, build files)
  • Analyze correctness, conventions, and tests

Nice work — the kernel is clean, the C++ test coverage is genuinely thorough (edge cases, inf, empty, dtype matrix, failure paths), and the choice to support both NHWC/NCHW via the matching-dim-order check is correct: since self and out are required to have the same shape and the same dim order, iterating linear physical indices i for both loads and stores maps to the same logical element, so no contiguous copy is needed. Below are the issues I found.
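The layout argument above can be checked with a small model. This is only a sketch under stated assumptions: tensors are flat Python lists, `strides_for_dim_order` and `physical_index` are hypothetical helpers (not ExecuTorch APIs), and `bisect_left` stands in for the pointwise bucketize step.

```python
from bisect import bisect_left
from itertools import product
from typing import List, Sequence


def strides_for_dim_order(sizes: Sequence[int], dim_order: Sequence[int]) -> List[int]:
    """Element strides when dims are stored in dim_order, outermost first."""
    strides = [0] * len(sizes)
    running = 1
    for dim in reversed(dim_order):  # the innermost dimension gets stride 1
        strides[dim] = running
        running *= sizes[dim]
    return strides


def physical_index(coord: Sequence[int], strides: Sequence[int]) -> int:
    """Map a logical coordinate to its offset in the flat buffer."""
    return sum(c * s for c, s in zip(coord, strides))


sizes = (1, 2, 2, 3)        # logical N, C, H, W
nhwc_order = (0, 2, 3, 1)   # channels-last dim order
strides = strides_for_dim_order(sizes, nhwc_order)
coords = list(product(*(range(s) for s in sizes)))

# Fill an NHWC buffer so the value at logical (n, c, h, w) is n + c + h + w.
src = [0.0] * len(coords)
for coord in coords:
    src[physical_index(coord, strides)] = float(sum(coord))

# Pointwise kernel: walk physical indices only, never consulting strides.
boundaries = [1.0, 2.0, 3.0]
out = [bisect_left(boundaries, x) for x in src]

# out shares sizes and dim order with src, so reading it back through the
# same strides gives the expected bucket at every logical coordinate.
matches = all(
    out[physical_index(c, strides)] == bisect_left(boundaries, float(sum(c)))
    for c in coords
)
print(matches)  # True
```

The same loop would be wrong if `out` used a different dim order than `src`, which is why the matching-dim-order check matters.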

Bugs

1. test_bucketize_scalar_out_int32 doesn't test int32 (kernels/test/test_bucketize.py:126-131)
The "int32" test constructs BucketizeModule(out_int32=False, ...), so it's an exact duplicate of the int64 test and never exercises the int32 path. Set out_int32=True.

def test_bucketize_scalar_out_int32(self):
    model = BucketizeModule(out_int32=False, right=False)  # should be True

Conventions

2. Missing BSD license header (op_bucketize.cpp, op_bucketize_test.cpp, test_bucketize.py)
All three new files start directly with includes/code. Every other file under kernels/portable/cpu/ and kernels/test/ carries the standard Copyright (c) Meta Platforms, Inc. and affiliates. ... BSD-style license header. Please add it.

3. bucketize_tensor_out requires a pre-sized out instead of resizing it (op_bucketize.cpp:177-180)
The kernel checks tensors_have_same_shape(self, out) but never calls resize_tensor(out, self.sizes()). Most portable out-variant kernels (e.g. op_clamp.cpp:129) resize out to the input size so they work with dynamic / upper-bound shapes where the planner allocates out at a max size. With a hard same-shape check this op will fail under dynamic shapes. Consider:

ET_KERNEL_CHECK(
    context, resize_tensor(out, self.sizes()) == Error::Ok, InvalidArgument, out);

(For the scalar variant, the out.dim() == 0 check is fine as-is.)
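To make the dynamic-shape concern concrete, here is a toy model of the two policies. Everything in it is hypothetical: `BoundedTensor`, `check_same_shape`, and `resize_then_run` are illustrative names, not ExecuTorch APIs, and the model assumes only that an out tensor planned at an upper bound can be resized to any shape that fits its allocation.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass
class BoundedTensor:
    """Hypothetical stand-in for an out tensor planned at an upper-bound size."""
    capacity: int            # number of elements the planner allocated
    sizes: Tuple[int, ...]   # current logical shape

    def resize(self, new_sizes: Tuple[int, ...]) -> bool:
        # Succeeds only if the new shape fits in the planned allocation.
        if prod(new_sizes) > self.capacity:
            return False
        self.sizes = new_sizes
        return True


def check_same_shape(self_sizes: Tuple[int, ...], out: BoundedTensor) -> bool:
    """Strict policy: out must already match the input shape."""
    return out.sizes == tuple(self_sizes)


def resize_then_run(self_sizes: Tuple[int, ...], out: BoundedTensor) -> bool:
    """Resizing policy: adopt the input shape, fail only if it cannot fit."""
    return out.resize(tuple(self_sizes))


out = BoundedTensor(capacity=16, sizes=(16,))  # planned for up to 16 elements
runtime_input_sizes = (10,)                    # the actual input is smaller

print(check_same_shape(runtime_input_sizes, out))  # False: strict check rejects
print(resize_then_run(runtime_input_sizes, out))   # True: out becomes (10,)
```

Under this model the strict check rejects a perfectly usable buffer, while the resizing policy still fails when the input exceeds the planned capacity.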

Minor / style

  • op_bucketize.cpp:69,108: const bool& right passes a bool by const reference; pass by value (bool right).
  • op_bucketize.cpp:80,116: boundaries.sizes().back() can be boundaries.numel() since boundaries is already validated as 1D; slightly clearer.
  • test_bucketize.py — every test docstring says "Test bucketize.Tensor_out", including the scalar tests (lines 120, 126, etc.). Worth correcting the scalar ones.

Things I checked that look correct

  • Compute-type promotion (promote_type_with_scalar for scalar, promoteTypes for tensor) and the REALHBF16 dtype switch.
  • out_int32 ↔ output dtype validation, complex-type and N-D-boundaries rejection paths.
  • Binary-search cus_lower_bound/cus_upper_bound for right/left semantics, empty boundaries (returns 0), and out-of-range / inf inputs — all covered by tests.
  • Build registration in functions.yaml, targets.bzl, and CMakeLists.txt.
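For readers who want to see the edge cases in the binary-search item spelled out, the following is a minimal hand-written sketch of the two searches. The functions `lower_bound` and `upper_bound` are illustrative Python, not the `cus_lower_bound`/`cus_upper_bound` code from the PR, and assume sorted boundaries.

```python
import math
from typing import Sequence


def lower_bound(boundaries: Sequence[float], value: float) -> int:
    """First index i with boundaries[i] >= value (the right=False case)."""
    lo, hi = 0, len(boundaries)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if boundaries[mid] < value:
            lo = mid + 1
        else:
            hi = mid
    return lo


def upper_bound(boundaries: Sequence[float], value: float) -> int:
    """First index i with boundaries[i] > value (the right=True case)."""
    lo, hi = 0, len(boundaries)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if boundaries[mid] <= value:
            lo = mid + 1
        else:
            hi = mid
    return lo


b = [1.0, 3.0, 5.0]
print(lower_bound(b, 3.0), upper_bound(b, 3.0))  # 1 2: a tie lands left or right
print(lower_bound([], 2.0))                      # 0: empty boundaries
print(lower_bound(b, math.inf))                  # 3: above every boundary
print(upper_bound(b, -math.inf))                 # 0: below every boundary
```

The two functions differ only in the comparison (`<` versus `<=`), which is what decides where a value equal to a boundary is placed.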

Overall this is close to merge-ready; only item #1 is a real functional gap, and #2 is required for CI conventions.
· branch bucketize

@Gasoonjia

Contributor

Thanks for the contribution! I left a question in the issue; would you mind checking it?

Comment thread kernels/test/test_bucketize.py Outdated
# LICENSE file in the root directory of this source tree.

"""
Test for bucketize operations in ExecuTorch.

Contributor

Please do not put the export test script under kernels/test. The runtime test alone should be fine.

@Gallinator Gallinator Jun 22, 2026

Author

@Gasoonjia So should I remove test_bucketize.py? I see that #15893 added an export test as well.

@Gallinator Gallinator requested a review from Gasoonjia June 24, 2026 19:32
Labels

  • CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
  • module: kernels (Issues related to kernel libraries and utilities, and code under kernels/)
  • release notes: ops & kernels (Changes to the opset and any new / changed kernel implementations)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kernel 'aten::bucketize.Tensor_out' not found.

3 participants