Manual merge of PRs #20394–#20397 (slice_copy + permute_copy) by JulianCloudNTH · Pull Request #20550 · pytorch/executorch

JulianCloudNTH · 2026-06-26T18:05:26Z

Summary

Manual merge of four WebGPU-delegate op PRs that landed internally but could not auto-merge
to main. These are stacked ghstack PRs — when the lower PRs in the stack merged, their head
branches were deleted and these four PRs' base branches were orphaned, so the orig-PR
proposer failed with 422 base invalid. This PR re-lands the same four commits (identical
content to the originals, flat test layout) as a clean stack on top of current main:

- #20394: Add slice_copy op (aten.slice_copy.Tensor)
- #20395: slice_copy op test suite (cases.py op-test framework)
- #20396: Add permute_copy + IntList graph support (aten.permute_copy.default)
- #20397: permute_copy op test suite (cases.py op-test framework)

Test plan

Each op ships with its cases.py op-test suite (exported via VulkanPartitioner, compared
to a torch golden on Dawn) plus an export-delegation smoke test, exercised by the WebGPU
op-test CI (etvk-*). Verified internally; content is identical to the original four PRs.

@diff-train-skip-merge

Pull Request resolved: pytorch#20394

Adds `aten.slice_copy.Tensor` to the WebGPU delegate as a gather: each output element is mapped back to its source input element along the sliced dim via `start + coord * step`.

Composition (single compute dispatch):
- `runtime/ops/slice/Slice.cpp`: reads `args = [self, dim, start, end, step, out]` via `read_scalar` (static `Int`/`Null`-sentinel default; throws on dynamic `SymInt`); normalizes negative `dim`/`start`, clamps `start` to `[0, in_size]`; builds two `TensorMeta` UBOs + a `SliceParams{dim, start, step}` uniform; guards fp32; dispatches over `compute_1d_workgroup_count(out.numel)` with `override wg_size`; releases all uniforms after the bind group.
- `runtime/ops/slice/slice.wgsl`: delinearizes the output index over the contiguous output strides, maps the sliced-dim coordinate back to the input (`start + coord*step`), relinearizes over the input strides.

ghstack-source-id: 397026527
@exported-using-ghexport

Differential Revision: [D108793168](https://our.internmc.facebook.com/intern/diff/D108793168/)
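The index mapping described for `slice.wgsl` can be sketched on the CPU. This is a minimal illustration in plain Python over flat row-major lists, not the shader or the `Slice.cpp` code: the helper names `contiguous_strides` and `slice_gather` are hypothetical, the output shape is assumed to be supplied by the caller (standing in for the `out` tensor), and the input is assumed to be contiguous.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a contiguous tensor of `shape`."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def slice_gather(inp, in_shape, out_shape, dim, start, step):
    """Gather-style slice: map every output element back to its input element.

    Hypothetical CPU reference; `out_shape` is assumed to be known up front.
    """
    ndim = len(in_shape)
    if dim < 0:                      # normalize a negative dim
        dim += ndim
    in_size = in_shape[dim]
    if start < 0:                    # normalize a negative start
        start += in_size
    start = max(0, min(start, in_size))  # clamp start to [0, in_size]

    in_strides = contiguous_strides(in_shape)
    out_strides = contiguous_strides(out_shape)
    out_numel = 1
    for s in out_shape:
        out_numel *= s

    out = [0.0] * out_numel
    for out_idx in range(out_numel):     # one iteration per output element
        rem = out_idx
        in_idx = 0
        for d in range(ndim):
            coord = rem // out_strides[d]    # delinearize over output strides
            rem %= out_strides[d]
            if d == dim:
                coord = start + coord * step  # map sliced-dim coord to input
            in_idx += coord * in_strides[d]   # relinearize over input strides
        out[out_idx] = inp[in_idx]
    return out


# Step-2 slice along dim 1 of a 2x8 input, i.e. x[:, 0:8:2].
x = [float(i) for i in range(16)]
STEP2 = slice_gather(x, [2, 8], [2, 4], dim=1, start=0, step=2)
# Negative start along the last dim, i.e. x[:, -3:].
TAIL = slice_gather(x, [2, 8], [2, 3], dim=-1, start=-3, step=1)
```

Each loop iteration plays the role of one shader invocation, which is why the work is sized by the output element count rather than the input.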

…work)

Pull Request resolved: pytorch#20395

Registers `aten.slice_copy.Tensor` in the `cases.py` op-test framework: a `_slice_suite` of 4 configs (leading-dim slice `[:,1:5]`, last-dim slice `[...,1:3]`, step-2 `[:,0:8:2]`, negative-end `[:,1:-1]`) that `generate_op_tests` exports via `VulkanPartitioner` and compares to a torch golden on Dawn.

Also adds `test/ops/slice/test_slice.py` (`SliceModule` + `CONFIGS` + export-delegation/eager smoke test) and the `aten.slice_copy.Tensor` partitioner-allowlist entry in `tester.py`.

ghstack-source-id: 397026537
@exported-using-ghexport

Differential Revision: [D108793151](https://our.internmc.facebook.com/intern/diff/D108793151/)
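For orientation, the four slice expressions named in the suite can be normalized with Python's built-in `slice.indices`, which follows the same indexing rules the golden comparison relies on. This is only a sketch: the sliced dimension size of 8 is an assumption made for illustration (the actual test shapes are not stated here), and `ASSUMED_DIM_SIZE`, `CONFIGS`, and `normalized` are hypothetical names, not the contents of `cases.py`.

```python
# Dimension size is an assumption for illustration only.
ASSUMED_DIM_SIZE = 8

# Slice applied to the sliced dimension in each quoted config.
CONFIGS = {
    "[:,1:5]": slice(1, 5),
    "[...,1:3]": slice(1, 3),
    "[:,0:8:2]": slice(0, 8, 2),
    "[:,1:-1]": slice(1, -1),
}


def normalized(sl, size):
    """Return (start, stop, step, output_length) for `sl` on a dim of `size`."""
    start, stop, step = sl.indices(size)  # resolves negatives and defaults
    return start, stop, step, len(range(start, stop, step))


RESULTS = {name: normalized(sl, ASSUMED_DIM_SIZE) for name, sl in CONFIGS.items()}

for name, (start, stop, step, out_len) in RESULTS.items():
    print(f"{name}: start={start} stop={stop} step={step} out_len={out_len}")
```

Under that assumed size, the negative-end case resolves `-1` to `7`, so the configs together cover a plain range, a strided range, and an end that needs normalization.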

…ermute_copy.default)

Pull Request resolved: pytorch#20396

Adds `aten.permute_copy.default` (a coordinate-reorder gather) to the WebGPU delegate, and the `IntList` graph value type it needs to read its `dims` argument.

Composition:
- `runtime/WebGPUGraph.{h,cpp}`: adds `ValueType::IntList` backed by `std::vector<std::vector<int64_t>> int_lists_` + `get_int_list(int)`; `build()` deserializes `vkgraph::GraphTypes::IntList` via `value_as_IntList()->items()` (int64, matching the FlatBuffer `[long]`); mirrors the existing scalar value plumbing.
- `runtime/ops/permute/Permute.cpp`: reads the permutation via `get_int_list`, normalizes negative dims, validates it is a permutation of `[0, ndim)`, builds two `TensorMeta` UBOs + a `PermuteParams{perm: vec4<u32>}` uniform, guards fp32 + rank≤4, dispatches over `compute_1d_workgroup_count(out.numel)` with `override wg_size`; releases all uniforms after the bind group.
- `runtime/ops/permute/permute.wgsl`: delinearizes the output index over the contiguous output strides, reads `input` at `in.strides[perm[d]]` per dim (mirrors Vulkan `permute_buffer.glsl`).
- Registers both `aten.permute_copy.default` and `aten.permute.default` to the same handler.

ghstack-source-id: 397026548
@exported-using-ghexport

Differential Revision: [D108793162](https://our.internmc.facebook.com/intern/diff/D108793162/)
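The permutation handling and the stride lookup described above can be sketched in plain Python. This is an illustrative CPU reference under stated assumptions, not the `Permute.cpp` or `permute.wgsl` code: `normalize_perm` and `permute_gather` are hypothetical names, tensors are modeled as flat row-major lists, and the input is assumed to be contiguous.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a contiguous tensor of `shape`."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def normalize_perm(perm, ndim):
    """Wrap negative dims and check the result is a permutation of [0, ndim)."""
    norm = [p + ndim if p < 0 else p for p in perm]
    if sorted(norm) != list(range(ndim)):
        raise ValueError(f"{perm} is not a permutation of [0, {ndim})")
    return norm


def permute_gather(inp, in_shape, perm):
    """Coordinate-reorder gather; returns (flat_output, out_shape)."""
    ndim = len(in_shape)
    if ndim > 4:  # mirrors the rank <= 4 guard mentioned in the description
        raise ValueError("rank > 4 is not supported in this sketch")
    perm = normalize_perm(perm, ndim)

    out_shape = [in_shape[p] for p in perm]
    in_strides = contiguous_strides(in_shape)
    out_strides = contiguous_strides(out_shape)

    out = [0.0] * len(inp)
    for out_idx in range(len(inp)):      # one iteration per output element
        rem = out_idx
        in_idx = 0
        for d in range(ndim):
            coord = rem // out_strides[d]        # delinearize output index
            rem %= out_strides[d]
            in_idx += coord * in_strides[perm[d]]  # output dim d reads input dim perm[d]
        out[out_idx] = inp[in_idx]
    return out, out_shape


# 2D transpose of a 2x3 input laid out as [[0, 1, 2], [3, 4, 5]].
x = [float(i) for i in range(6)]
TRANSPOSED, TRANSPOSED_SHAPE = permute_gather(x, [2, 3], [1, 0])
# A negative dim is normalized before validation: [-1, 0] equals [1, 0] here.
NEGATIVE, _ = permute_gather(x, [2, 3], [-1, 0])
```

Validating the permutation before dispatch keeps a malformed `dims` argument from producing out-of-range stride lookups inside the per-element loop.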

…mework)

Pull Request resolved: pytorch#20397

Registers `aten.permute_copy.default` in the `cases.py` op-test framework: a `_permute_suite` of 4 configs (3D rotation, 4D middle-dim transpose, 2D transpose, full 4D shuffle) that `generate_op_tests` exports via `VulkanPartitioner` and compares to a torch golden on Dawn.

Also adds `test/ops/permute/test_permute.py` (`PermuteModule` + `CONFIGS` + `_op_delegated` smoke test) and the `aten.permute_copy.default` partitioner-allowlist entry in `tester.py`.

ghstack-source-id: 397026550
@exported-using-ghexport

Differential Revision: [D108793156](https://our.internmc.facebook.com/intern/diff/D108793156/)

pytorch-bot · 2026-06-26T18:05:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20550

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 5 Pending

As of commit dde991a with merge base b919db7:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-06-26T18:06:18Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

psiddh

Approving this to unblock the diff train

JulianCloudNTH added 4 commits June 26, 2026 11:02

JulianCloudNTH requested a review from psiddh June 26, 2026 18:05

JulianCloudNTH requested review from kirklandsign and larryliu0820 as code owners June 26, 2026 18:05

meta-cla Bot added the CLA Signed label Jun 26, 2026. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

Merge branch 'main' into webgpu-slice-permute-manual-merge

dde991a

psiddh approved these changes Jun 26, 2026

JulianCloudNTH merged commit a03f97b into pytorch:main Jun 26, 2026
181 checks passed

JulianCloudNTH merged 5 commits into pytorch:main from JulianCloudNTH:webgpu-slice-permute-manual-merge

2 participants
