test(workflow-operator): add unit test coverage for operator support utilities by aglinxinyuan · Pull Request #6077 · apache/texera

aglinxinyuan · 2026-07-03T07:03:24Z

What changes were proposed in this PR?

Add unit test coverage for three workflow-operator support utilities, selected from the Codecov report. No production-code changes.

File	Codecov before	Missed lines targeted
`TestOperators.scala`	26.3%	56 — every desc-factory helper (CSV/JSONL scans, hash-join, keyword-search, aggregate + group-by, AsterixDB source, python & python-source UDFs); asserts the returned descriptors' fields
`SpecialPhysicalOpFactory.scala`	0%	17 — `newSourcePhysicalOp`: decoded-port identity derivation, layer-name interpolation (incl. dash→underscore), empty input ports, output port + schema propagation, and the no-port-id failure
`FileScanUtils.scala`	62.9%	11 — the `extract == true` archive branch of `createTuplesFromFile`: zip-entry iteration, `__MACOSX` filtering, single-string vs per-line attribute paths

TestOperators is itself a shared test-helper factory (in src/main) used by many specs, but nothing asserted on its output, so its bodies read as uncovered. All tests run in-memory or against local temp files / committed test resources.

Any related issues, documentation, discussions?

Follow-up to the review feedback on #6043: prioritize tests that fill uncovered code paths.

How was this PR tested?

sbt "WorkflowOperator/testOnly *TestOperatorsSpec *SpecialPhysicalOpFactorySpec *FileScanUtilsSpec" — 16 tests, all green
sbt "WorkflowOperator/Test/scalafmtCheck" and sbt "WorkflowOperator/scalafixAll --check" — clean

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8 [1M context])

…utilities Targets Codecov-missed lines: TestOperators (56, the shared desc-factory helper), SpecialPhysicalOpFactory (17, source physical-op wiring), and the extract branch of FileScanUtils.createTuplesFromFile (11). All in-memory or local temp files.

github-actions · 2026-07-03T07:03:39Z

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

No candidates found from git blame history.

Copilot

Pull request overview

Adds new ScalaTest unit specs in common/workflow-operator to increase coverage of previously-uncovered support utilities (no production-code changes).

Changes:

Add descriptor-construction assertions for TestOperators helpers (CSV/JSONL scans, join, keyword search, aggregate/group-by, AsterixDB, Python UDFs).
Add coverage for SpecialPhysicalOpFactory.newSourcePhysicalOp, including layer-name derivation and missing-port-id failure behavior.
Add coverage for FileScanUtils.createTuplesFromFile’s extract = true zip-archive branch.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/TestOperatorsSpec.scala	New unit spec asserting key fields on operator descriptors produced by `TestOperators` helpers.
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/SpecialPhysicalOpFactorySpec.scala	New unit spec validating `newSourcePhysicalOp` identity/port/schema wiring and error behavior.
common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/file/FileScanUtilsSpec.scala	New unit spec exercising zip extraction paths for single-string and per-line tuple creation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

codecov-commenter · 2026-07-03T07:06:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.39%. Comparing base (a53d95a) to head (21a2ba9).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #6077      +/-   ##
============================================
+ Coverage     57.21%   57.39%   +0.18%     
+ Complexity     3103     3100       -3     
============================================
  Files          1130     1130              
  Lines         43825    43825              
  Branches       4747     4747              
============================================
+ Hits          25075    25154      +79     
+ Misses        17313    17232      -81     
- Partials       1437     1439       +2

Flag	Coverage Δ		*Carryforward flag
access-control-service	`70.00% <ø> (ø)`
agent-service	`44.59% <ø> (ø)`		Carriedforward from fa09f73
amber	`59.95% <ø> (+0.47%)`	⬆️
computing-unit-managing-service	`0.00% <ø> (ø)`		Carriedforward from fa09f73
config-service	`52.30% <ø> (ø)`
file-service	`62.81% <ø> (ø)`
frontend	`50.12% <ø> (ø)`		Carriedforward from fa09f73
notebook-migration-service	`78.57% <ø> (ø)`
pyamber	`91.15% <ø> (ø)`		Carriedforward from fa09f73
python	`90.69% <ø> (ø)`		Carriedforward from fa09f73
workflow-compiling-service	`55.14% <ø> (ø)`

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2026-07-03T07:09:18Z

⚠️ Benchmark changes need a look

🟢 2 better · 🔴 5 worse · ⚪ 8 noise (<±5%) · 0 without baseline

Compared against main a53d95a benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

	config	throughput	MB/s	latency	max Δ latest / 7d
🔴	bs=10 sw=10 sl=64	360	0.22	26,582/38,577/38,577 us	🔴 +11.3% / 🔴 +152.4%
🟢	bs=100 sw=10 sl=64	813	0.496	123,011/137,755/137,755 us	🟢 -9.0% / 🔴 +27.1%
⚪	bs=1000 sw=10 sl=64	913	0.557	1,099,530/1,157,884/1,157,884 us	⚪ within ±5% / 🔴 +11.7%

Baseline details

Latest main a53d95a from same runner

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	360 tuples/sec	398 tuples/sec	770.95 tuples/sec	-9.5%	-53.3%
bs=10 sw=10 sl=64	MB/s	0.22 MB/s	0.243 MB/s	0.471 MB/s	-9.5%	-53.2%
bs=10 sw=10 sl=64	p50	26,582 us	24,637 us	12,775 us	+7.9%	+108.1%
bs=10 sw=10 sl=64	p95	38,577 us	34,659 us	15,286 us	+11.3%	+152.4%
bs=10 sw=10 sl=64	p99	38,577 us	34,659 us	18,795 us	+11.3%	+105.3%
bs=100 sw=10 sl=64	throughput	813 tuples/sec	806 tuples/sec	976.93 tuples/sec	+0.9%	-16.8%
bs=100 sw=10 sl=64	MB/s	0.496 MB/s	0.492 MB/s	0.596 MB/s	+0.8%	-16.8%
bs=100 sw=10 sl=64	p50	123,011 us	121,438 us	102,557 us	+1.3%	+19.9%
bs=100 sw=10 sl=64	p95	137,755 us	151,388 us	108,383 us	-9.0%	+27.1%
bs=100 sw=10 sl=64	p99	137,755 us	151,388 us	115,249 us	-9.0%	+19.5%
bs=1000 sw=10 sl=64	throughput	913 tuples/sec	928 tuples/sec	1,009 tuples/sec	-1.6%	-9.5%
bs=1000 sw=10 sl=64	MB/s	0.557 MB/s	0.566 MB/s	0.616 MB/s	-1.6%	-9.6%
bs=1000 sw=10 sl=64	p50	1,099,530 us	1,080,227 us	997,695 us	+1.8%	+10.2%
bs=1000 sw=10 sl=64	p95	1,157,884 us	1,103,798 us	1,036,731 us	+4.9%	+11.7%
bs=1000 sw=10 sl=64	p99	1,157,884 us	1,103,798 us	1,069,334 us	+4.9%	+8.3%

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,555.15,200,128000,360,0.220,26581.96,38577.39,38577.39
1,100,10,64,20,2460.03,2000,1280000,813,0.496,123010.99,137754.73,137754.73
2,1000,10,64,20,21913.51,20000,12800000,913,0.557,1099530.11,1157884.33,1157884.33

- assert the actual decoded entry contents (not just tuple counts) for the extract, __MACOSX-filter, and per-line flat-map cases - fix a stray apostrophe in the SpecialPhysicalOpFactory layerName comment

mengw15

LGTM

Copilot AI review requested due to automatic review settings July 3, 2026 07:03

github-actions Bot assigned aglinxinyuan Jul 3, 2026

github-actions Bot added the common label Jul 3, 2026

Copilot started reviewing on behalf of aglinxinyuan July 3, 2026 07:03 View session

Copilot AI reviewed Jul 3, 2026

View reviewed changes

test(workflow-operator): assert extracted content in FileScanUtils tests

21a2ba9

- assert the actual decoded entry contents (not just tuple counts) for the extract, __MACOSX-filter, and per-line flat-map cases - fix a stray apostrophe in the SpecialPhysicalOpFactory layerName comment

aglinxinyuan requested a review from mengw15 July 3, 2026 07:41

mengw15 approved these changes Jul 3, 2026

View reviewed changes

aglinxinyuan added this pull request to the merge queue Jul 3, 2026

Merged via the queue into apache:main with commit ad6ea02 Jul 3, 2026
38 of 40 checks passed

aglinxinyuan deleted the test-operator-factory-helpers branch July 3, 2026 09:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test(workflow-operator): add unit test coverage for operator support utilities#6077

test(workflow-operator): add unit test coverage for operator support utilities#6077
aglinxinyuan merged 2 commits into
apache:mainfrom
aglinxinyuan:test-operator-factory-helpers

aglinxinyuan commented Jul 3, 2026

Uh oh!

github-actions Bot commented Jul 3, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Jul 3, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jul 3, 2026 •

edited

Loading

Uh oh!

mengw15 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

aglinxinyuan commented Jul 3, 2026

What changes were proposed in this PR?

Any related issues, documentation, discussions?

How was this PR tested?

Was this PR authored or co-authored using generative AI tooling?

Uh oh!

github-actions Bot commented Jul 3, 2026

Automated Reviewer Suggestions

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Benchmark changes need a look

Uh oh!

mengw15 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Jul 3, 2026 •

edited

Loading

github-actions Bot commented Jul 3, 2026 •

edited

Loading