fix: Fix compilation error for CometBroadcastExchangeExec#86
Merged
Conversation
Member
Author
|
Merged. Thanks. |
schenksj
added a commit
to schenksj/datafusion-comet
that referenced
this pull request
Jun 9, 2026
apache#86) The delta-contrib/spark-3.5.8 CI cell failed 8/146 tests, all SparkFileNotFoundException on bare deletion-vector UUIDs. Root cause: under `Utils.isTesting`, Delta prepends `spark.databricks.delta.testOnly.dvFileNamePrefix` (default `test%dv%prefix-` on Delta 3.3.2, empty on 4.x) to every DV file it writes -> `<prefix>deletion_vector_<uuid>.bin`. delta-kernel-rs has no knowledge of that JVM-only conf, so for "u" (UUID-relative) storage it resolves the un-prefixed name and the executor-side DV read fails. It passed on 4.x only because the effective prefix is empty there. Fix: ship the conf to native and splice it into kernel's own resolved DV path for "u" descriptors, reading through an absolute-path ("p") descriptor. - DeltaReflection.dvFileNamePrefix reads the value via the `ConfigEntry` (DeltaSQLConf.TEST_DV_NAME_PREFIX) -- the prefix is the entry's testing-gated DEFAULT, not a session setting, so getConfString(key, "") would always miss it. Reflection (no compile dep on delta-spark); empty when the entry is absent. - New DeltaScanCommon.dv_file_name_prefix proto field carries it to the executor. - dv_reader::read_dv_indexes splices the prefix in front of the `deletion_vector_<uuid>.bin` filename (only for "u" storage; "p"/"i" untouched) via kernel's absolute_path, operating on the decoded fs path so the literal `%` re-encodes correctly. Empty prefix (production) is a no-op. Validated: full contrib.delta package 153/0 on BOTH spark-3.5/delta-3.3.2 (was 9/1 on the repro suite, 8 failures across the package) and spark-4.1/delta-4.1.0 (no regression). Plus 3 splice unit tests. The existing CometDeltaFilterPushdownAuditSuite "DV + range filter" test is the red->green guard on the 3.5 matrix cell. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
schenksj
added a commit
to schenksj/datafusion-comet
that referenced
this pull request
Jun 9, 2026
…pache#85) The design docs had drifted from the kernel-read refactor (apache#76/apache#77/apache#80/apache#81/apache#82/ apache#84/#2/apache#78/apache#86). Audited all 13 docs against current code and corrected: - Removed the deleted ParquetSource + DV-sweep + DeltaSyntheticColumnsExec read stack as the "current" path everywhere; it is now kernel-read only (apache#50/apache#82), with DeltaKernelScanExec doing in-worker synthesis. The old stack is kept only as clearly-labeled history / rejected alternatives. - delta_scan.rs is a ~72-line shim delegating to comet_contrib_delta::planner (apache#77); column-mapping physicalisation dropped, kernel ships the schemas (apache#76). - CDF (readChangeFeed) is kernel-native via TableChanges -> CometDeltaCdfScanExec, split multi-partition (apache#84/#2) -- corrected docs that called it unsupported, declined, or a synthetic-columns fallback. - 08-known-limitations.md: removed all of Part B (B1-B9 were development-time regressions, all now fixed + guarded) and A3 (path-based CDF now engages native, apache#84); kept only genuine current limitations (A1 DPP residual, A2e credential residual, A4 VARIANT, A5 decline gates, A6 INT96 kernel gap, A7 CM-id repoint). 466 -> 230 lines. - Fixed config keys, build/module layout, JNI symbols, file paths, CI workflow references, and supported-feature lists (added CDF, _metadata, INT96) across the build / README / user-guide docs. Every claim verified against code; markdown passes prettier. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #.
Rationale for this change
There are some compilation error on
CometBroadcastExchangeExecin Spark 3.2/3.2:This is due to SPARK-39983 which is 3.4 only patch.
And,
QueryExecutionErrors.notEnoughMemoryToBuildAndBroadcastTableErroradds a parameter in Spark 3.4.What changes are included in this PR?
How are these changes tested?