Skip to content

test(amber): move live-catalog iceberg specs to the integration suite#6045

Merged
mengw15 merged 2 commits into
apache:mainfrom
mengw15:test-6034-iceberg-specs-to-integration
Jul 1, 2026
Merged

test(amber): move live-catalog iceberg specs to the integration suite#6045
mengw15 merged 2 commits into
apache:mainfrom
mengw15:test-6034-iceberg-specs-to-integration

Conversation

@mengw15

@mengw15 mengw15 commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

Three Iceberg specs — IcebergDocumentSpec, IcebergDocumentConsoleMessagesSpec, and IcebergTableStatsSpec — resolve a live Iceberg catalog in their per-test setup (beforeEach/beforeAll create documents through DocumentFactory / IcebergCatalogInstance.getInstance()), so none of their cases can run without a catalog. Today the amber unit job runs them against the postgres catalog it provisions; once that job stops provisioning one they would fail.

This PR moves them from common/workflow-core/src/test/scala/... to amber/src/test/integration/... — where the @IntegrationTest filter (AMBER_TEST_FILTER) already applies and IcebergRestCatalogIntegrationSpec already lives — and tags the three concrete specs @IntegrationTest.

Tagging in place is not sufficient: common/workflow-core/build.sbt does not read AMBER_TEST_FILTER, so @IntegrationTest would be ignored there.

The shared VirtualDocumentSpec base (extended by the first two specs) moves with them, into amber/src/test/scala — it is a generic AnyFlatSpec harness rather than an integration test, so it lives in amber's regular test sources, not the integration set. (Keeping it in common and exposing it via WorkflowCore % "test->test" was tried but reverted: WorkflowCore's test scope pulls Hadoop → com.sun.jersey 1.19, whose bundled JAX-RS 1.x Response shadows amber's javax.ws.rs-api:2.1.1 and breaks HuggingFaceModelResourceSpec. Longer term, generalizing amber-integration into a shared integration CI job would let the specs live in common next to the code they exercise.)

IcebergUtilSpec intentionally stays in common/workflow-core. Unlike the three above it extends AnyFlatSpec with no catalog setup and never calls getInstance(); its cases are pure Texera↔Iceberg type/schema/tuple conversions plus one REST-endpoint failure check whose own comment documents it as passing "with or without Lakekeeper reachable." It is a unit test and must keep running in the amber unit job.

Any related issues, documentation, discussions?

Part of #6034 (Task 1 of 3). Does not close it — Tasks 2 (flip the storage.conf catalog default to rest) and 3 (mirror prod in the CI integration suite) follow, and Task 1 must land first.

How was this PR tested?

Test-file relocation with no production-code change, so verification is at the compile/placement level:

  • sbt WorkflowExecutionService/Test/compile and sbt WorkflowCore/Test/compile pass locally (the specs move cleanly and amber's test scope compiles).
  • sbt scalafmtCheckAll passes.
  • build / amber and build / amber-integration (ubuntu + macos) are green on CI, compiling the integration source set and running the specs against a live catalog.
  • Behavior is unchanged; the three specs continue to run in the amber-integration job.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-8)

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @Yicong-Huang, @Ma77Ball, @aglinxinyuan
    You can notify them by mentioning @Yicong-Huang, @Ma77Ball, @aglinxinyuan in a comment.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Moves Iceberg specs that require a live Iceberg catalog out of common/workflow-core unit tests and into Amber’s integration test suite, so the amber unit CI job no longer implicitly depends on a catalog being provisioned (preparing for the catalog default flip in #6034).

Changes:

  • Add @IntegrationTest tagging to IcebergDocumentSpec, IcebergDocumentConsoleMessagesSpec, and IcebergTableStatsSpec so they run only under the integration filter.
  • Relocate the shared VirtualDocumentSpec base into amber/src/test/integration/... alongside the Iceberg integration specs.
  • Keep package names unchanged so this is effectively a move/retag for CI routing rather than a behavior change.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

File Description
amber/src/test/integration/org/apache/texera/amber/storage/result/iceberg/IcebergTableStatsSpec.scala Tags the spec as @IntegrationTest so it’s routed to the integration job.
amber/src/test/integration/org/apache/texera/amber/storage/result/iceberg/IcebergDocumentSpec.scala Tags the spec as @IntegrationTest so it’s routed to the integration job.
amber/src/test/integration/org/apache/texera/amber/storage/result/iceberg/IcebergDocumentConsoleMessagesSpec.scala Tags the spec as @IntegrationTest so it’s routed to the integration job.
amber/src/test/integration/org/apache/texera/amber/core/storage/model/VirtualDocumentSpec.scala Provides the shared VirtualDocument test behavior used by the moved Iceberg specs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 3 worse · ⚪ 12 noise (<±5%) · 0 without baseline

Compared against main baca3f9 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 386 0.236 25,428/36,315/36,315 us 🔴 +16.7% / 🔴 +145.6%
bs=100 sw=10 sl=64 794 0.485 124,580/150,705/150,705 us ⚪ within ±5% / 🔴 +41.0%
bs=1000 sw=10 sl=64 917 0.559 1,088,306/1,162,631/1,162,631 us ⚪ within ±5% / 🔴 +13.6%
Baseline details

Latest main baca3f9 from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 386 tuples/sec 446 tuples/sec 786.27 tuples/sec -13.5% -50.9%
bs=10 sw=10 sl=64 MB/s 0.236 MB/s 0.272 MB/s 0.48 MB/s -13.2% -50.8%
bs=10 sw=10 sl=64 p50 25,428 us 21,791 us 12,495 us +16.7% +103.5%
bs=10 sw=10 sl=64 p95 36,315 us 34,730 us 14,784 us +4.6% +145.6%
bs=10 sw=10 sl=64 p99 36,315 us 34,730 us 18,468 us +4.6% +96.6%
bs=100 sw=10 sl=64 throughput 794 tuples/sec 820 tuples/sec 991.49 tuples/sec -3.2% -19.9%
bs=100 sw=10 sl=64 MB/s 0.485 MB/s 0.5 MB/s 0.605 MB/s -3.0% -19.9%
bs=100 sw=10 sl=64 p50 124,580 us 119,644 us 100,929 us +4.1% +23.4%
bs=100 sw=10 sl=64 p95 150,705 us 156,383 us 106,894 us -3.6% +41.0%
bs=100 sw=10 sl=64 p99 150,705 us 156,383 us 114,085 us -3.6% +32.1%
bs=1000 sw=10 sl=64 throughput 917 tuples/sec 946 tuples/sec 1,023 tuples/sec -3.1% -10.4%
bs=1000 sw=10 sl=64 MB/s 0.559 MB/s 0.577 MB/s 0.624 MB/s -3.1% -10.5%
bs=1000 sw=10 sl=64 p50 1,088,306 us 1,049,787 us 983,835 us +3.7% +10.6%
bs=1000 sw=10 sl=64 p95 1,162,631 us 1,116,900 us 1,023,777 us +4.1% +13.6%
bs=1000 sw=10 sl=64 p99 1,162,631 us 1,116,900 us 1,053,883 us +4.1% +10.3%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,517.77,200,128000,386,0.236,25428.09,36315.25,36315.25
1,100,10,64,20,2519.43,2000,1280000,794,0.485,124580.33,150704.91,150704.91
2,1000,10,64,20,21819.64,20000,12800000,917,0.559,1088305.87,1162631.23,1162631.23

@mengw15 mengw15 requested a review from Yicong-Huang July 1, 2026 07:29
@mengw15 mengw15 force-pushed the test-6034-iceberg-specs-to-integration branch from 2f879d4 to 285dd05 Compare July 1, 2026 08:34
@github-actions github-actions Bot added dependencies Pull requests that update a dependency file common labels Jul 1, 2026
@codecov-commenter

codecov-commenter commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.12%. Comparing base (baca3f9) to head (c2c54d3).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6045      +/-   ##
============================================
- Coverage     56.92%   56.12%   -0.81%     
+ Complexity     3029     2986      -43     
============================================
  Files          1129     1126       -3     
  Lines         43794    43356     -438     
  Branches       4743     4669      -74     
============================================
- Hits          24931    24333     -598     
- Misses        17388    17603     +215     
+ Partials       1475     1420      -55     
Flag Coverage Δ *Carryforward flag
access-control-service 70.00% <ø> (ø) Carriedforward from 551cf35
agent-service 44.59% <ø> (ø) Carriedforward from 551cf35
amber 57.27% <ø> (-1.52%) ⬇️
computing-unit-managing-service 0.00% <ø> (ø) Carriedforward from 551cf35
config-service 52.30% <ø> (ø) Carriedforward from 551cf35
file-service 62.81% <ø> (ø) Carriedforward from 551cf35
frontend 49.39% <ø> (-0.71%) ⬇️ Carriedforward from 551cf35
notebook-migration-service 78.57% <ø> (ø) Carriedforward from 551cf35
pyamber 90.20% <ø> (ø) Carriedforward from 551cf35
python 90.76% <ø> (ø) Carriedforward from 551cf35
workflow-compiling-service 55.14% <ø> (ø) Carriedforward from 551cf35

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

IcebergDocumentSpec, IcebergDocumentConsoleMessagesSpec and
IcebergTableStatsSpec resolve a live Iceberg catalog in their per-test
setup (DocumentFactory / IcebergCatalogInstance.getInstance()), so every
case needs a catalog. Move them from common/workflow-core's test sources
into amber/src/test/integration -- where the @IntegrationTEST filter
(AMBER_TEST_FILTER) applies and IcebergRestCatalogIntegrationSpec already
lives -- and tag the three concrete specs @IntegrationTEST. Tagging in
place would not work: common/workflow-core/build.sbt does not read
AMBER_TEST_FILTER.

The shared VirtualDocumentSpec base the first two extend moves with them,
into amber/src/test/scala -- it is a generic AnyFlatSpec harness (not an
integration test), so it sits in the regular test sources rather than the
integration set.

IcebergUtilSpec stays in common/workflow-core: it has no catalog setup and
its cases are pure conversions plus one REST failure check documented as
passing with or without Lakekeeper reachable, so it must keep running in
the amber unit job.

Part of apache#6034.
@mengw15 mengw15 force-pushed the test-6034-iceberg-specs-to-integration branch from 285dd05 to cc55ed0 Compare July 1, 2026 09:37
@github-actions github-actions Bot added engine and removed dependencies Pull requests that update a dependency file common labels Jul 1, 2026
@mengw15 mengw15 added this pull request to the merge queue Jul 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jul 1, 2026
@mengw15 mengw15 added this pull request to the merge queue Jul 1, 2026
Merged via the queue into apache:main with commit a1a7eb0 Jul 1, 2026
18 checks passed
@mengw15 mengw15 deleted the test-6034-iceberg-specs-to-integration branch July 1, 2026 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants