feat: Add CometBroadcastExchangeExec to support broadcasting the result of Comet native operator#80

Merged
viirya merged 1 commit into apache:main from viirya:broadcast_exec
Feb 21, 2024

@viirya viirya commented Feb 21, 2024

Which issue does this PR close?

Closes #81.

Rationale for this change

This patch adds the CometBroadcastExchangeExec operator, which broadcasts the result of a Comet native operator. This is another step towards supporting BroadcastHashJoinExec.

Like Spark's BroadcastExchangeExec, CometBroadcastExchangeExec is not invoked through doExecute or doExecuteColumnar as other operators are. It is used when the Spark query planner inserts a BroadcastDistribution on top of an operator in order to broadcast it. The operator that consumes CometBroadcastExchangeExec, e.g., BroadcastHashJoinExec, calls the executeBroadcast method of CometBroadcastExchangeExec to execute the query plan and broadcast its results.
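
The calling convention described above can be illustrated with a small, dependency-free toy model. This is only a sketch: it is not Spark or Comet code, every class name (`PlanNode`, `NativeScan`, `ToyBroadcastExchange`, `ToyBroadcastHashJoin`) is hypothetical, and the "broadcast" is just a cached in-memory value. It assumes that the exchange rejects the regular execute path and that its consumer obtains the data through a separate broadcast entry point.

```python
from typing import List, Optional, Tuple


class Broadcast:
    """Hypothetical stand-in for a broadcast variable: wraps an immutable value."""

    def __init__(self, value: Tuple[int, ...]) -> None:
        self.value = value


class PlanNode:
    """Toy physical operator with two entry points, loosely mirroring the PR text."""

    def do_execute(self) -> List[int]:
        raise NotImplementedError

    def execute_broadcast(self) -> Broadcast:
        raise NotImplementedError(f"{type(self).__name__} cannot be broadcast")


class NativeScan(PlanNode):
    """Leaf operator that simply returns its rows."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    def do_execute(self) -> List[int]:
        return list(self.rows)


class ToyBroadcastExchange(PlanNode):
    """Runs its child once and hands the result out as a Broadcast."""

    def __init__(self, child: PlanNode) -> None:
        self.child = child
        self.runs = 0  # how many times the child plan was actually executed
        self._cached: Optional[Broadcast] = None

    def do_execute(self) -> List[int]:
        # Assumption for this sketch: the regular execute path is unsupported.
        raise RuntimeError("use execute_broadcast() instead of do_execute()")

    def execute_broadcast(self) -> Broadcast:
        if self._cached is None:
            self.runs += 1
            self._cached = Broadcast(tuple(self.child.do_execute()))
        return self._cached


class ToyBroadcastHashJoin(PlanNode):
    """Consumer: keeps streamed rows whose value appears on the broadcast side."""

    def __init__(self, streamed: PlanNode, build: PlanNode) -> None:
        self.streamed = streamed
        self.build = build

    def do_execute(self) -> List[int]:
        # The join asks the exchange for a broadcast rather than executing it.
        build_keys = set(self.build.execute_broadcast().value)
        return [row for row in self.streamed.do_execute() if row in build_keys]


exchange = ToyBroadcastExchange(NativeScan([2, 3, 5]))
join = ToyBroadcastHashJoin(NativeScan([1, 2, 3, 4]), exchange)
result = join.do_execute()
```

In this toy, executing the join a second time reuses the cached broadcast, so the child of the exchange runs only once. The caching is a simplification chosen for the sketch and does not describe how Comet serializes or ships batches.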

What changes are included in this PR?

How are these changes tested?

@sunchao sunchao left a comment

LGTM, this has been reviewed internally before.

@viirya viirya merged commit 637dba9 into apache:main Feb 21, 2024

viirya commented Feb 21, 2024

Merged. Thanks.

schenksj added a commit to schenksj/datafusion-comet that referenced this pull request Jun 9, 2026
…pache#85)

The design docs had drifted from the kernel-read refactor (apache#76/apache#77/apache#80/apache#81/apache#82/
apache#84/#2/apache#78/apache#86). Audited all 13 docs against current code and corrected:

- Removed the deleted ParquetSource + DV-sweep + DeltaSyntheticColumnsExec read
  stack as the "current" path everywhere; it is now kernel-read only (apache#50/apache#82),
  with DeltaKernelScanExec doing in-worker synthesis. The old stack is kept only
  as clearly-labeled history / rejected alternatives.
- delta_scan.rs is a ~72-line shim delegating to comet_contrib_delta::planner
  (apache#77); column-mapping physicalisation dropped, kernel ships the schemas (apache#76).
- CDF (readChangeFeed) is kernel-native via TableChanges -> CometDeltaCdfScanExec,
  split multi-partition (apache#84/#2) -- corrected docs that called it unsupported,
  declined, or a synthetic-columns fallback.
- 08-known-limitations.md: removed all of Part B (B1-B9 were development-time
  regressions, all now fixed + guarded) and A3 (path-based CDF now engages
  native, apache#84); kept only genuine current limitations (A1 DPP residual, A2e
  credential residual, A4 VARIANT, A5 decline gates, A6 INT96 kernel gap, A7
  CM-id repoint). 466 -> 230 lines.
- Fixed config keys, build/module layout, JNI symbols, file paths, CI workflow
  references, and supported-feature lists (added CDF, _metadata, INT96) across
  the build / README / user-guide docs.

Every claim verified against code; markdown passes prettier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Add CometBroadcastExchangeExec to support broadcasting the result of Comet native operator

2 participants