[branch-54] refactor: wrap HigherOrderUDFImpl in a concrete HigherOrderUDF struct (#22593) #22635

Open
alamb wants to merge 1 commit into branch-54 from alamb/backport_22593

@alamb alamb commented May 30, 2026

This PR:

Note on conflict resolution

The cherry-pick had conflicts in two test areas:

  • datafusion/functions-nested/src/array_any_match.rs — only the test-module use line conflicted; branch-54's test does not reference the renamed symbols, so the existing import was kept (the production HigherOrderUDF -> HigherOrderUDFImpl rename applied cleanly).
  • datafusion/substrait/tests/cases/roundtrip_logical_plan.rs — the upstream PR modified the roundtrip_array_transform_higher_order_function test and ArrayTransform helper, but that test was added to main after branch-54 was cut and does not exist on branch-54. It is out of scope for this refactor backport, so the branch-54 file was left unchanged.

…#22593)


Part of #21172

`HigherOrderUDF` was the only UDF kind defined as a trait that callers
used directly via `Arc<dyn HigherOrderUDF>`. The other UDF kinds
(`ScalarUDF`, `AggregateUDF`, `WindowUDF`) are concrete structs that
wrap their respective `*Impl` trait, which makes inherent methods like
`with_aliases` ergonomic to call on the function object. With the
trait-only setup, adding aliases to an existing higher-order function
required an extension trait import or a free helper function.

This PR brings higher-order functions in line with the other UDFs so the
same `with_aliases` pattern works.

- Rename the `HigherOrderUDF` trait to `HigherOrderUDFImpl`, matching
`ScalarUDFImpl`/`AggregateUDFImpl`.
- Add a concrete `HigherOrderUDF` struct wrapping `Arc<dyn
HigherOrderUDFImpl>`, with the same shape as `ScalarUDF`: `new_from_impl`,
`new_from_shared_impl`, `inner`, `with_aliases`, `From<F: HigherOrderUDFImpl>`,
and delegate methods for every trait method.
- `with_aliases` is backed by a private `AliasedHigherOrderUDFImpl`
decorator (same pattern as `AliasedScalarUDFImpl`).
- Update `Expr::HigherOrderFunction`, `FunctionRegistry`, the
`create_higher_order!` singleton macro, and all consumer files (across
several crates) to use `Arc<HigherOrderUDF>` instead of `Arc<dyn
HigherOrderUDF>`.
- Existing impls (`ArrayFilter`, `ArrayTransform`, `ArrayAnyMatch`) now
implement `HigherOrderUDFImpl`; their public constructors continue to
return `Arc<HigherOrderUDF>` so external call sites need no changes.
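The wrapper-plus-decorator shape described in the list above can be sketched roughly as follows. This is a simplified, standalone illustration and not the DataFusion source: the trait here has only `name` and `aliases` (the real trait has more methods), `with_aliases` takes `&'static str` items for brevity, and the `ArrayFilter` stand-in is a placeholder with no behavior.

```rust
use std::sync::Arc;

/// Simplified stand-in for the renamed trait (assumption: only two methods).
pub trait HigherOrderUDFImpl: Send + Sync {
    fn name(&self) -> &str;
    /// Defaults to no aliases.
    fn aliases(&self) -> &[String] {
        &[]
    }
}

/// Concrete struct wrapping the trait object, so inherent methods
/// such as `with_aliases` can be called directly on the function object.
#[derive(Clone)]
pub struct HigherOrderUDF {
    inner: Arc<dyn HigherOrderUDFImpl>,
}

impl HigherOrderUDF {
    pub fn new_from_impl<F: HigherOrderUDFImpl + 'static>(fun: F) -> Self {
        Self::new_from_shared_impl(Arc::new(fun))
    }

    pub fn new_from_shared_impl(fun: Arc<dyn HigherOrderUDFImpl>) -> Self {
        Self { inner: fun }
    }

    pub fn inner(&self) -> &Arc<dyn HigherOrderUDFImpl> {
        &self.inner
    }

    // Delegate methods forward to the wrapped implementation.
    pub fn name(&self) -> &str {
        self.inner.name()
    }

    pub fn aliases(&self) -> &[String] {
        self.inner.aliases()
    }

    /// Returns a new function object that also answers to `aliases`.
    pub fn with_aliases(self, aliases: impl IntoIterator<Item = &'static str>) -> Self {
        Self::new_from_impl(AliasedHigherOrderUDFImpl::new(self.inner, aliases))
    }
}

impl<F: HigherOrderUDFImpl + 'static> From<F> for HigherOrderUDF {
    fn from(fun: F) -> Self {
        Self::new_from_impl(fun)
    }
}

/// Private decorator: delegates to `inner` but reports an extended alias list.
struct AliasedHigherOrderUDFImpl {
    inner: Arc<dyn HigherOrderUDFImpl>,
    aliases: Vec<String>,
}

impl AliasedHigherOrderUDFImpl {
    fn new(
        inner: Arc<dyn HigherOrderUDFImpl>,
        new_aliases: impl IntoIterator<Item = &'static str>,
    ) -> Self {
        // Keep any aliases the wrapped function already had, then append.
        let mut aliases = inner.aliases().to_vec();
        aliases.extend(new_aliases.into_iter().map(|s| s.to_string()));
        Self { inner, aliases }
    }
}

impl HigherOrderUDFImpl for AliasedHigherOrderUDFImpl {
    fn name(&self) -> &str {
        self.inner.name()
    }

    fn aliases(&self) -> &[String] {
        &self.aliases
    }
}

/// Placeholder implementation used only for this sketch.
struct ArrayFilter;

impl HigherOrderUDFImpl for ArrayFilter {
    fn name(&self) -> &str {
        "array_filter"
    }
}

fn main() {
    let udf = HigherOrderUDF::from(ArrayFilter).with_aliases(["filter"]);
    println!("{} {:?}", udf.name(), udf.aliases());
}
```

Because the alias list lives in a decorator rather than in each implementation, existing impls need no changes to gain aliases.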

Callers can now write:

`array_filter_higher_order_function().with_aliases(["filter"])`

exactly like the existing scalar pattern:

`make_array_udf().as_ref().clone().with_aliases(["array_construct"])`

Covered by existing tests

Yes, any code referring to `Arc<dyn HigherOrderUDF>` needs to become
`Arc<HigherOrderUDF>`, and any code that wrote `impl HigherOrderUDF for
MyHOF` needs to write `impl HigherOrderUDFImpl for MyHOF`. Constructing
a `HigherOrderUDF` from an impl is `HigherOrderUDF::new_from_impl(my_impl)`
(or `my_impl.into()`).
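As a rough illustration of the migration described above, the sketch below uses simplified stand-ins for the trait and struct (a single `name` method) and a plain `HashMap` as a hypothetical registry. None of it is the actual DataFusion API surface; `Registry` and `register` are names introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-ins for this sketch; the real definitions have more methods.
trait HigherOrderUDFImpl: Send + Sync {
    fn name(&self) -> &str;
}

struct HigherOrderUDF {
    inner: Arc<dyn HigherOrderUDFImpl>,
}

impl HigherOrderUDF {
    fn new_from_impl<F: HigherOrderUDFImpl + 'static>(fun: F) -> Self {
        Self { inner: Arc::new(fun) }
    }

    fn name(&self) -> &str {
        self.inner.name()
    }
}

impl<F: HigherOrderUDFImpl + 'static> From<F> for HigherOrderUDF {
    fn from(fun: F) -> Self {
        Self::new_from_impl(fun)
    }
}

// Before: `impl HigherOrderUDF for MyHOF`.
// After: implement the renamed trait instead.
struct MyHOF;

impl HigherOrderUDFImpl for MyHOF {
    fn name(&self) -> &str {
        "my_hof"
    }
}

// Before: values of type `Arc<dyn HigherOrderUDF>`.
// After: values of type `Arc<HigherOrderUDF>` (hypothetical registry type).
type Registry = HashMap<String, Arc<HigherOrderUDF>>;

fn register(registry: &mut Registry, udf: Arc<HigherOrderUDF>) {
    registry.insert(udf.name().to_string(), udf);
}

fn main() {
    let mut registry = Registry::new();

    // Two equivalent ways to build the struct from an impl.
    let explicit = HigherOrderUDF::new_from_impl(MyHOF);
    let via_into: HigherOrderUDF = MyHOF.into();
    assert_eq!(explicit.name(), via_into.name());

    register(&mut registry, Arc::new(explicit));
    println!("registered: {:?}", registry.keys().collect::<Vec<_>>());
}
```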

Labels

core Core DataFusion crate datasource Changes to the datasource crate execution Related to the execution crate ffi Changes to the ffi crate functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates proto Related to proto crate spark sql SQL Planner
