test(amber): move live-catalog iceberg specs to the integration suite#6045
Conversation
Automated Reviewer SuggestionsBased on the
|
There was a problem hiding this comment.
Pull request overview
Moves Iceberg specs that require a live Iceberg catalog out of common/workflow-core unit tests and into Amber’s integration test suite, so the amber unit CI job no longer implicitly depends on a catalog being provisioned (preparing for the catalog default flip in #6034).
Changes:
- Add
@IntegrationTesttagging toIcebergDocumentSpec,IcebergDocumentConsoleMessagesSpec, andIcebergTableStatsSpecso they run only under the integration filter. - Relocate the shared
VirtualDocumentSpecbase intoamber/src/test/integration/...alongside the Iceberg integration specs. - Keep package names unchanged so this is effectively a move/retag for CI routing rather than a behavior change.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| amber/src/test/integration/org/apache/texera/amber/storage/result/iceberg/IcebergTableStatsSpec.scala | Tags the spec as @IntegrationTest so it’s routed to the integration job. |
| amber/src/test/integration/org/apache/texera/amber/storage/result/iceberg/IcebergDocumentSpec.scala | Tags the spec as @IntegrationTest so it’s routed to the integration job. |
| amber/src/test/integration/org/apache/texera/amber/storage/result/iceberg/IcebergDocumentConsoleMessagesSpec.scala | Tags the spec as @IntegrationTest so it’s routed to the integration job. |
| amber/src/test/integration/org/apache/texera/amber/core/storage/model/VirtualDocumentSpec.scala | Provides the shared VirtualDocument test behavior used by the moved Iceberg specs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 386 | 0.236 | 25,428/36,315/36,315 us | 🔴 +16.7% / 🔴 +145.6% |
| ⚪ | bs=100 sw=10 sl=64 | 794 | 0.485 | 124,580/150,705/150,705 us | ⚪ within ±5% / 🔴 +41.0% |
| ⚪ | bs=1000 sw=10 sl=64 | 917 | 0.559 | 1,088,306/1,162,631/1,162,631 us | ⚪ within ±5% / 🔴 +13.6% |
Baseline details
Latest main baca3f9 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 386 tuples/sec | 446 tuples/sec | 786.27 tuples/sec | -13.5% | -50.9% |
| bs=10 sw=10 sl=64 | MB/s | 0.236 MB/s | 0.272 MB/s | 0.48 MB/s | -13.2% | -50.8% |
| bs=10 sw=10 sl=64 | p50 | 25,428 us | 21,791 us | 12,495 us | +16.7% | +103.5% |
| bs=10 sw=10 sl=64 | p95 | 36,315 us | 34,730 us | 14,784 us | +4.6% | +145.6% |
| bs=10 sw=10 sl=64 | p99 | 36,315 us | 34,730 us | 18,468 us | +4.6% | +96.6% |
| bs=100 sw=10 sl=64 | throughput | 794 tuples/sec | 820 tuples/sec | 991.49 tuples/sec | -3.2% | -19.9% |
| bs=100 sw=10 sl=64 | MB/s | 0.485 MB/s | 0.5 MB/s | 0.605 MB/s | -3.0% | -19.9% |
| bs=100 sw=10 sl=64 | p50 | 124,580 us | 119,644 us | 100,929 us | +4.1% | +23.4% |
| bs=100 sw=10 sl=64 | p95 | 150,705 us | 156,383 us | 106,894 us | -3.6% | +41.0% |
| bs=100 sw=10 sl=64 | p99 | 150,705 us | 156,383 us | 114,085 us | -3.6% | +32.1% |
| bs=1000 sw=10 sl=64 | throughput | 917 tuples/sec | 946 tuples/sec | 1,023 tuples/sec | -3.1% | -10.4% |
| bs=1000 sw=10 sl=64 | MB/s | 0.559 MB/s | 0.577 MB/s | 0.624 MB/s | -3.1% | -10.5% |
| bs=1000 sw=10 sl=64 | p50 | 1,088,306 us | 1,049,787 us | 983,835 us | +3.7% | +10.6% |
| bs=1000 sw=10 sl=64 | p95 | 1,162,631 us | 1,116,900 us | 1,023,777 us | +4.1% | +13.6% |
| bs=1000 sw=10 sl=64 | p99 | 1,162,631 us | 1,116,900 us | 1,053,883 us | +4.1% | +10.3% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,517.77,200,128000,386,0.236,25428.09,36315.25,36315.25
1,100,10,64,20,2519.43,2000,1280000,794,0.485,124580.33,150704.91,150704.91
2,1000,10,64,20,21819.64,20000,12800000,917,0.559,1088305.87,1162631.23,1162631.232f879d4 to
285dd05
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #6045 +/- ##
============================================
- Coverage 56.92% 56.12% -0.81%
+ Complexity 3029 2986 -43
============================================
Files 1129 1126 -3
Lines 43794 43356 -438
Branches 4743 4669 -74
============================================
- Hits 24931 24333 -598
- Misses 17388 17603 +215
+ Partials 1475 1420 -55
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
IcebergDocumentSpec, IcebergDocumentConsoleMessagesSpec and IcebergTableStatsSpec resolve a live Iceberg catalog in their per-test setup (DocumentFactory / IcebergCatalogInstance.getInstance()), so every case needs a catalog. Move them from common/workflow-core's test sources into amber/src/test/integration -- where the @IntegrationTEST filter (AMBER_TEST_FILTER) applies and IcebergRestCatalogIntegrationSpec already lives -- and tag the three concrete specs @IntegrationTEST. Tagging in place would not work: common/workflow-core/build.sbt does not read AMBER_TEST_FILTER. The shared VirtualDocumentSpec base the first two extend moves with them, into amber/src/test/scala -- it is a generic AnyFlatSpec harness (not an integration test), so it sits in the regular test sources rather than the integration set. IcebergUtilSpec stays in common/workflow-core: it has no catalog setup and its cases are pure conversions plus one REST failure check documented as passing with or without Lakekeeper reachable, so it must keep running in the amber unit job. Part of apache#6034.
285dd05 to
cc55ed0
Compare
What changes were proposed in this PR?
Three Iceberg specs —
IcebergDocumentSpec,IcebergDocumentConsoleMessagesSpec, andIcebergTableStatsSpec— resolve a live Iceberg catalog in their per-test setup (beforeEach/beforeAllcreate documents throughDocumentFactory/IcebergCatalogInstance.getInstance()), so none of their cases can run without a catalog. Today the amber unit job runs them against the postgres catalog it provisions; once that job stops provisioning one they would fail.This PR moves them from
common/workflow-core/src/test/scala/...toamber/src/test/integration/...— where the@IntegrationTestfilter (AMBER_TEST_FILTER) already applies andIcebergRestCatalogIntegrationSpecalready lives — and tags the three concrete specs@IntegrationTest.Tagging in place is not sufficient:
common/workflow-core/build.sbtdoes not readAMBER_TEST_FILTER, so@IntegrationTestwould be ignored there.The shared
VirtualDocumentSpecbase (extended by the first two specs) moves with them, intoamber/src/test/scala— it is a genericAnyFlatSpecharness rather than an integration test, so it lives in amber's regular test sources, not the integration set. (Keeping it incommonand exposing it viaWorkflowCore % "test->test"was tried but reverted: WorkflowCore's test scope pulls Hadoop →com.sun.jersey1.19, whose bundled JAX-RS 1.xResponseshadows amber'sjavax.ws.rs-api:2.1.1and breaksHuggingFaceModelResourceSpec. Longer term, generalizingamber-integrationinto a sharedintegrationCI job would let the specs live incommonnext to the code they exercise.)IcebergUtilSpecintentionally stays incommon/workflow-core. Unlike the three above itextends AnyFlatSpecwith no catalog setup and never callsgetInstance(); its cases are pure Texera↔Iceberg type/schema/tuple conversions plus one REST-endpoint failure check whose own comment documents it as passing "with or without Lakekeeper reachable." It is a unit test and must keep running in the amber unit job.Any related issues, documentation, discussions?
Part of #6034 (Task 1 of 3). Does not close it — Tasks 2 (flip the
storage.confcatalog default torest) and 3 (mirror prod in the CI integration suite) follow, and Task 1 must land first.How was this PR tested?
Test-file relocation with no production-code change, so verification is at the compile/placement level:
sbt WorkflowExecutionService/Test/compileandsbt WorkflowCore/Test/compilepass locally (the specs move cleanly and amber's test scope compiles).sbt scalafmtCheckAllpasses.build / amberandbuild / amber-integration(ubuntu + macos) are green on CI, compiling the integration source set and running the specs against a live catalog.amber-integrationjob.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (claude-opus-4-8)