test(workflow-operator): add unit test coverage for operator support utilities#6077
Conversation
…utilities Targets Codecov-missed lines: TestOperators (56, the shared desc-factory helper), SpecialPhysicalOpFactory (17, source physical-op wiring), and the extract branch of FileScanUtils.createTuplesFromFile (11). All in-memory or local temp files.
Automated Reviewer SuggestionsBased on the
|
There was a problem hiding this comment.
Pull request overview
Adds new ScalaTest unit specs in common/workflow-operator to increase coverage of previously-uncovered support utilities (no production-code changes).
Changes:
- Add descriptor-construction assertions for
TestOperatorshelpers (CSV/JSONL scans, join, keyword search, aggregate/group-by, AsterixDB, Python UDFs). - Add coverage for
SpecialPhysicalOpFactory.newSourcePhysicalOp, including layer-name derivation and missing-port-id failure behavior. - Add coverage for
FileScanUtils.createTuplesFromFile’sextract = truezip-archive branch.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/TestOperatorsSpec.scala | New unit spec asserting key fields on operator descriptors produced by TestOperators helpers. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/SpecialPhysicalOpFactorySpec.scala | New unit spec validating newSourcePhysicalOp identity/port/schema wiring and error behavior. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/source/scan/file/FileScanUtilsSpec.scala | New unit spec exercising zip extraction paths for single-string and per-line tuple creation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #6077 +/- ##
============================================
+ Coverage 57.21% 57.39% +0.18%
+ Complexity 3103 3100 -3
============================================
Files 1130 1130
Lines 43825 43825
Branches 4747 4747
============================================
+ Hits 25075 25154 +79
+ Misses 17313 17232 -81
- Partials 1437 1439 +2
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 360 | 0.22 | 26,582/38,577/38,577 us | 🔴 +11.3% / 🔴 +152.4% |
| 🟢 | bs=100 sw=10 sl=64 | 813 | 0.496 | 123,011/137,755/137,755 us | 🟢 -9.0% / 🔴 +27.1% |
| ⚪ | bs=1000 sw=10 sl=64 | 913 | 0.557 | 1,099,530/1,157,884/1,157,884 us | ⚪ within ±5% / 🔴 +11.7% |
Baseline details
Latest main a53d95a from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 360 tuples/sec | 398 tuples/sec | 770.95 tuples/sec | -9.5% | -53.3% |
| bs=10 sw=10 sl=64 | MB/s | 0.22 MB/s | 0.243 MB/s | 0.471 MB/s | -9.5% | -53.2% |
| bs=10 sw=10 sl=64 | p50 | 26,582 us | 24,637 us | 12,775 us | +7.9% | +108.1% |
| bs=10 sw=10 sl=64 | p95 | 38,577 us | 34,659 us | 15,286 us | +11.3% | +152.4% |
| bs=10 sw=10 sl=64 | p99 | 38,577 us | 34,659 us | 18,795 us | +11.3% | +105.3% |
| bs=100 sw=10 sl=64 | throughput | 813 tuples/sec | 806 tuples/sec | 976.93 tuples/sec | +0.9% | -16.8% |
| bs=100 sw=10 sl=64 | MB/s | 0.496 MB/s | 0.492 MB/s | 0.596 MB/s | +0.8% | -16.8% |
| bs=100 sw=10 sl=64 | p50 | 123,011 us | 121,438 us | 102,557 us | +1.3% | +19.9% |
| bs=100 sw=10 sl=64 | p95 | 137,755 us | 151,388 us | 108,383 us | -9.0% | +27.1% |
| bs=100 sw=10 sl=64 | p99 | 137,755 us | 151,388 us | 115,249 us | -9.0% | +19.5% |
| bs=1000 sw=10 sl=64 | throughput | 913 tuples/sec | 928 tuples/sec | 1,009 tuples/sec | -1.6% | -9.5% |
| bs=1000 sw=10 sl=64 | MB/s | 0.557 MB/s | 0.566 MB/s | 0.616 MB/s | -1.6% | -9.6% |
| bs=1000 sw=10 sl=64 | p50 | 1,099,530 us | 1,080,227 us | 997,695 us | +1.8% | +10.2% |
| bs=1000 sw=10 sl=64 | p95 | 1,157,884 us | 1,103,798 us | 1,036,731 us | +4.9% | +11.7% |
| bs=1000 sw=10 sl=64 | p99 | 1,157,884 us | 1,103,798 us | 1,069,334 us | +4.9% | +8.3% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,555.15,200,128000,360,0.220,26581.96,38577.39,38577.39
1,100,10,64,20,2460.03,2000,1280000,813,0.496,123010.99,137754.73,137754.73
2,1000,10,64,20,21913.51,20000,12800000,913,0.557,1099530.11,1157884.33,1157884.33- assert the actual decoded entry contents (not just tuple counts) for the extract, __MACOSX-filter, and per-line flat-map cases - fix a stray apostrophe in the SpecialPhysicalOpFactory layerName comment
What changes were proposed in this PR?
Add unit test coverage for three
workflow-operatorsupport utilities, selected from the Codecov report. No production-code changes.TestOperators.scalaSpecialPhysicalOpFactory.scalanewSourcePhysicalOp: decoded-port identity derivation, layer-name interpolation (incl. dash→underscore), empty input ports, output port + schema propagation, and the no-port-id failureFileScanUtils.scalaextract == truearchive branch ofcreateTuplesFromFile: zip-entry iteration,__MACOSXfiltering, single-string vs per-line attribute pathsTestOperatorsis itself a shared test-helper factory (insrc/main) used by many specs, but nothing asserted on its output, so its bodies read as uncovered. All tests run in-memory or against local temp files / committed test resources.Any related issues, documentation, discussions?
Follow-up to the review feedback on #6043: prioritize tests that fill uncovered code paths.
How was this PR tested?
sbt "WorkflowOperator/testOnly *TestOperatorsSpec *SpecialPhysicalOpFactorySpec *FileScanUtilsSpec"— 16 tests, all greensbt "WorkflowOperator/Test/scalafmtCheck"andsbt "WorkflowOperator/scalafixAll --check"— cleanWas this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8 [1M context])