Skip to content

fix: Fix compilation error for CometBroadcastExchangeExec#86

Merged
viirya merged 1 commit into
apache:mainfrom
viirya:fix_broadcast_compile
Feb 22, 2024
Merged

fix: Fix compilation error for CometBroadcastExchangeExec#86
viirya merged 1 commit into
apache:mainfrom
viirya:fix_broadcast_compile

Conversation

@viirya

@viirya viirya commented Feb 22, 2024

Copy link
Copy Markdown
Member

Which issue does this PR close?

Closes #.

Rationale for this change

There are some compilation error on CometBroadcastExchangeExec in Spark 3.2/3.2:

17:38:27.154 [�[1;31mERROR�[m] 
/spark/src/main/scala/org/apache/spark/sql/comet/CometBroadcastExchangeExec.scala:115: value broadcastInternal is not a member of org.apache.spark.SparkContext
    17:38:27.154 possible cause: maybe a semicolon is missing before `value broadcastInternal'?
    17:38:27.154 [�[1;31mERROR�[m] 
/spark/src/main/scala/org/apache/spark/sql/b comet oson/CometBroadcastExchangeExec.scala:115: not found: value serializedOnly

This is due to SPARK-39983 which is 3.4 only patch.
And, QueryExecutionErrors.notEnoughMemoryToBuildAndBroadcastTableError adds a parameter in Spark 3.4.

What changes are included in this PR?

How are these changes tested?

@sunchao sunchao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@viirya viirya merged commit 0cca52e into apache:main Feb 22, 2024
@viirya

viirya commented Feb 22, 2024

Copy link
Copy Markdown
Member Author

Merged. Thanks.

schenksj added a commit to schenksj/datafusion-comet that referenced this pull request Jun 9, 2026
apache#86)

The delta-contrib/spark-3.5.8 CI cell failed 8/146 tests, all
SparkFileNotFoundException on bare deletion-vector UUIDs. Root cause: under
`Utils.isTesting`, Delta prepends `spark.databricks.delta.testOnly.dvFileNamePrefix`
(default `test%dv%prefix-` on Delta 3.3.2, empty on 4.x) to every DV file it
writes -> `<prefix>deletion_vector_<uuid>.bin`. delta-kernel-rs has no knowledge
of that JVM-only conf, so for "u" (UUID-relative) storage it resolves the
un-prefixed name and the executor-side DV read fails. It passed on 4.x only
because the effective prefix is empty there.

Fix: ship the conf to native and splice it into kernel's own resolved DV path
for "u" descriptors, reading through an absolute-path ("p") descriptor.

- DeltaReflection.dvFileNamePrefix reads the value via the `ConfigEntry`
  (DeltaSQLConf.TEST_DV_NAME_PREFIX) -- the prefix is the entry's testing-gated
  DEFAULT, not a session setting, so getConfString(key, "") would always miss it.
  Reflection (no compile dep on delta-spark); empty when the entry is absent.
- New DeltaScanCommon.dv_file_name_prefix proto field carries it to the executor.
- dv_reader::read_dv_indexes splices the prefix in front of the
  `deletion_vector_<uuid>.bin` filename (only for "u" storage; "p"/"i" untouched)
  via kernel's absolute_path, operating on the decoded fs path so the literal `%`
  re-encodes correctly. Empty prefix (production) is a no-op.

Validated: full contrib.delta package 153/0 on BOTH spark-3.5/delta-3.3.2 (was
9/1 on the repro suite, 8 failures across the package) and spark-4.1/delta-4.1.0
(no regression). Plus 3 splice unit tests. The existing
CometDeltaFilterPushdownAuditSuite "DV + range filter" test is the red->green
guard on the 3.5 matrix cell.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
schenksj added a commit to schenksj/datafusion-comet that referenced this pull request Jun 9, 2026
…pache#85)

The design docs had drifted from the kernel-read refactor (apache#76/apache#77/apache#80/apache#81/apache#82/
apache#84/#2/apache#78/apache#86). Audited all 13 docs against current code and corrected:

- Removed the deleted ParquetSource + DV-sweep + DeltaSyntheticColumnsExec read
  stack as the "current" path everywhere; it is now kernel-read only (apache#50/apache#82),
  with DeltaKernelScanExec doing in-worker synthesis. The old stack is kept only
  as clearly-labeled history / rejected alternatives.
- delta_scan.rs is a ~72-line shim delegating to comet_contrib_delta::planner
  (apache#77); column-mapping physicalisation dropped, kernel ships the schemas (apache#76).
- CDF (readChangeFeed) is kernel-native via TableChanges -> CometDeltaCdfScanExec,
  split multi-partition (apache#84/#2) -- corrected docs that called it unsupported,
  declined, or a synthetic-columns fallback.
- 08-known-limitations.md: removed all of Part B (B1-B9 were development-time
  regressions, all now fixed + guarded) and A3 (path-based CDF now engages
  native, apache#84); kept only genuine current limitations (A1 DPP residual, A2e
  credential residual, A4 VARIANT, A5 decline gates, A6 INT96 kernel gap, A7
  CM-id repoint). 466 -> 230 lines.
- Fixed config keys, build/module layout, JNI symbols, file paths, CI workflow
  references, and supported-feature lists (added CDF, _metadata, INT96) across
  the build / README / user-guide docs.

Every claim verified against code; markdown passes prettier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants