Skip to content

test(workflow-operator): fill uncovered branches in filter, reader, and dual-port UDF classes#6057

Merged
aglinxinyuan merged 2 commits into
apache:mainfrom
aglinxinyuan:test-operator-branch-coverage
Jul 2, 2026
Merged

test(workflow-operator): fill uncovered branches in filter, reader, and dual-port UDF classes#6057
aglinxinyuan merged 2 commits into
apache:mainfrom
aglinxinyuan:test-operator-branch-coverage

Conversation

@aglinxinyuan

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

Fill uncovered code paths in three workflow-operator classes, selected from the Codecov report. No production-code changes.

File Codecov before Missed lines targeted
FilterPredicate.java 40.0% 26 — the remaining evaluate branches: GE/LT operators, null-field short-circuit, IS_NULL both ways, boolean (case-insensitive), double, long, and timestamp evaluators, the numeric-string fast path, and the equals identity/null/other-type branches
BufferedBlockReader.java 0% 51 (whole class) — delimiter splitting, EOF flush (incl. kept-filter bypass), CR terminators and the CRLF single-null artifact, empty fields → null, fields spanning buffer reads, delimiter-at-buffer-start carry, kept-index filtering, hasNext block-boundary semantics, close
DualInputPortsPythonUDFOpDescV2.scala 40.0% 27 — the entire getPhysicalOp: worker precondition, env-name resolution (default/blank-reject/trim), single vs parallel wiring, unknown-partition derivation, and the port-1-drives-output schema propagation incl. the duplicate-column rejection

Extends the existing FilterPredicateSpec and adds BufferedBlockReaderSpec (with a chunked-stream stub to force buffer-boundary paths) and DualInputPortsPythonUDFOpDescV2Spec.

Any related issues, documentation, discussions?

Follow-up to the review feedback on #6043: prioritize tests that fill uncovered code paths.

How was this PR tested?

  • sbt "WorkflowOperator/testOnly *FilterPredicateSpec *BufferedBlockReaderSpec *DualInputPortsPythonUDFOpDescV2Spec" — 39 tests, all green
  • sbt "WorkflowOperator/Test/scalafmtCheck" and sbt "WorkflowOperator/scalafixAll --check" — clean
  • CI + Codecov delta to confirm

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8 [1M context])

…nd dual-port UDF classes

Targets Codecov-missed lines: FilterPredicate (26), BufferedBlockReader (51,
previously 0%), DualInputPortsPythonUDFOpDescV2 (27). Extends the existing
FilterPredicateSpec and adds BufferedBlockReaderSpec and
DualInputPortsPythonUDFOpDescV2Spec.
Copilot AI review requested due to automatic review settings July 2, 2026 04:56
@github-actions github-actions Bot added the common label Jul 2, 2026
@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • No candidates found from git blame history.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR increases unit-test coverage in common/workflow-operator by adding/expanding specs to exercise previously uncovered branches in filter evaluation, buffered line parsing, and dual-input Python UDF physical-op/schema logic, without changing production code.

Changes:

  • Extend FilterPredicateSpec to cover remaining evaluate branches across comparison types and attribute types, plus equals edge cases.
  • Add BufferedBlockReaderSpec to cover delimiter splitting, EOF flushing, CR/CRLF handling artifacts, buffer-boundary behavior, kept-field filtering, hasNext, and close.
  • Add DualInputPortsPythonUDFOpDescV2Spec to cover getPhysicalOp wiring/preconditions/env-name resolution and schema propagation behaviors.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/udf/python/DualInputPortsPythonUDFOpDescV2Spec.scala New spec covering getPhysicalOp validation/wiring and schema propagation (incl. duplicate-column rejection) for the dual-input Python UDF operator.
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/BufferedBlockReaderSpec.scala New spec exercising BufferedBlockReader parsing and boundary conditions, including a chunked-stream stub for buffer-edge paths.
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/filter/FilterPredicateSpec.scala Expanded spec covering remaining FilterPredicate.evaluate branches across operators/types and equals edge cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov-commenter

codecov-commenter commented Jul 2, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.62%. Comparing base (322babf) to head (1a01f35).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6057      +/-   ##
============================================
+ Coverage     56.37%   56.62%   +0.25%     
- Complexity     2989     3035      +46     
============================================
  Files          1129     1129              
  Lines         43794    43802       +8     
  Branches       4743     4743              
============================================
+ Hits          24689    24804     +115     
+ Misses        17654    17547     -107     
  Partials       1451     1451              
Flag Coverage Δ *Carryforward flag
access-control-service 70.00% <ø> (ø)
agent-service 44.59% <ø> (ø) Carriedforward from 46322d5
amber 57.95% <ø> (+0.65%) ⬆️
computing-unit-managing-service 0.00% <ø> (ø)
config-service 52.30% <ø> (ø)
file-service 62.81% <ø> (ø)
frontend 50.12% <ø> (ø) Carriedforward from 46322d5
notebook-migration-service 78.57% <ø> (ø)
pyamber 90.20% <ø> (ø) Carriedforward from 46322d5
python 90.76% <ø> (ø) Carriedforward from 46322d5
workflow-compiling-service 55.14% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 2 better · 🔴 3 worse · ⚪ 10 noise (<±5%) · 0 without baseline

Compared against main 878eb8a benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 372 0.227 27,057/35,335/35,335 us 🔴 +21.8% / 🔴 +139.0%
bs=100 sw=10 sl=64 790 0.482 123,543/147,472/147,472 us ⚪ within ±5% / 🔴 +38.0%
bs=1000 sw=10 sl=64 930 0.568 1,067,832/1,109,928/1,109,928 us ⚪ within ±5% / 🔴 -9.1%
Baseline details

Latest main 878eb8a from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 372 tuples/sec 403 tuples/sec 786.27 tuples/sec -7.7% -52.7%
bs=10 sw=10 sl=64 MB/s 0.227 MB/s 0.246 MB/s 0.48 MB/s -7.7% -52.7%
bs=10 sw=10 sl=64 p50 27,057 us 22,211 us 12,495 us +21.8% +116.5%
bs=10 sw=10 sl=64 p95 35,335 us 39,849 us 14,784 us -11.3% +139.0%
bs=10 sw=10 sl=64 p99 35,335 us 39,849 us 18,468 us -11.3% +91.3%
bs=100 sw=10 sl=64 throughput 790 tuples/sec 819 tuples/sec 991.49 tuples/sec -3.5% -20.3%
bs=100 sw=10 sl=64 MB/s 0.482 MB/s 0.5 MB/s 0.605 MB/s -3.6% -20.4%
bs=100 sw=10 sl=64 p50 123,543 us 120,624 us 100,929 us +2.4% +22.4%
bs=100 sw=10 sl=64 p95 147,472 us 146,642 us 106,894 us +0.6% +38.0%
bs=100 sw=10 sl=64 p99 147,472 us 146,642 us 114,085 us +0.6% +29.3%
bs=1000 sw=10 sl=64 throughput 930 tuples/sec 943 tuples/sec 1,023 tuples/sec -1.4% -9.1%
bs=1000 sw=10 sl=64 MB/s 0.568 MB/s 0.576 MB/s 0.624 MB/s -1.4% -9.0%
bs=1000 sw=10 sl=64 p50 1,067,832 us 1,057,576 us 983,835 us +1.0% +8.5%
bs=1000 sw=10 sl=64 p95 1,109,928 us 1,100,863 us 1,023,777 us +0.8% +8.4%
bs=1000 sw=10 sl=64 p99 1,109,928 us 1,100,863 us 1,053,883 us +0.8% +5.3%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,538.04,200,128000,372,0.227,27057.19,35334.64,35334.64
1,100,10,64,20,2532.40,2000,1280000,790,0.482,123543.27,147472.25,147472.25
2,1000,10,64,20,21494.73,20000,12800000,930,0.568,1067831.60,1109927.91,1109927.91

- bound ChunkedInputStream.read by the buffer length, carrying the chunk
  remainder into the next call per the InputStream contract
- drop the ambiguous "(default)" qualifier from the schema-propagation
  test name (the annotation defaultValue disagrees with the field initializer)
@aglinxinyuan aglinxinyuan requested a review from mengw15 July 2, 2026 05:09

@mengw15 mengw15 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aglinxinyuan aglinxinyuan added this pull request to the merge queue Jul 2, 2026
Merged via the queue into apache:main with commit 3d7486b Jul 2, 2026
23 checks passed
@aglinxinyuan aglinxinyuan deleted the test-operator-branch-coverage branch July 2, 2026 07:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants