feat(huggingface): add qa and ranking tasks #5574
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #5574 +/- ##
============================================
+ Coverage 54.09% 54.10% +0.01%
- Complexity 2817 2818 +1
============================================
Files 1103 1104 +1
Lines 42650 42679 +29
Branches 4588 4591 +3
============================================
+ Hits 23070 23090 +20
- Misses 18236 18244 +8
- Partials 1344 1345 +1
/request-review @Ma77Ball
| | config | throughput (tuples/sec) | MB/s | latency p50/p95/p99 | max Δ latest / 7d |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 377 | 0.23 | 25,621/32,668/32,668 us | 🔴 +20.7% / 🔴 -8.3% |
| 🔴 | bs=100 sw=10 sl=64 | 804 | 0.491 | 120,319/172,171/172,171 us | 🔴 +6.8% / 🔴 +23.2% |
| ⚪ | bs=1000 sw=10 sl=64 | 932 | 0.569 | 1,072,268/1,137,669/1,137,669 us | ⚪ within ±5% / 🔴 +11.2% |
Baseline details
Latest main 439ea72 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 377 tuples/sec | 442 tuples/sec | 410.82 tuples/sec | -14.7% | -8.2% |
| bs=10 sw=10 sl=64 | MB/s | 0.23 MB/s | 0.27 MB/s | 0.251 MB/s | -14.8% | -8.3% |
| bs=10 sw=10 sl=64 | p50 | 25,621 us | 21,222 us | 23,785 us | +20.7% | +7.7% |
| bs=10 sw=10 sl=64 | p95 | 32,668 us | 34,399 us | 34,980 us | -5.0% | -6.6% |
| bs=10 sw=10 sl=64 | p99 | 32,668 us | 34,399 us | 34,980 us | -5.0% | -6.6% |
| bs=100 sw=10 sl=64 | throughput | 804 tuples/sec | 809 tuples/sec | 891.94 tuples/sec | -0.6% | -9.9% |
| bs=100 sw=10 sl=64 | MB/s | 0.491 MB/s | 0.494 MB/s | 0.544 MB/s | -0.6% | -9.8% |
| bs=100 sw=10 sl=64 | p50 | 120,319 us | 119,336 us | 112,277 us | +0.8% | +7.2% |
| bs=100 sw=10 sl=64 | p95 | 172,171 us | 161,155 us | 139,802 us | +6.8% | +23.2% |
| bs=100 sw=10 sl=64 | p99 | 172,171 us | 161,155 us | 139,802 us | +6.8% | +23.2% |
| bs=1000 sw=10 sl=64 | throughput | 932 tuples/sec | 935 tuples/sec | 1,041 tuples/sec | -0.3% | -10.5% |
| bs=1000 sw=10 sl=64 | MB/s | 0.569 MB/s | 0.571 MB/s | 0.635 MB/s | -0.4% | -10.4% |
| bs=1000 sw=10 sl=64 | p50 | 1,072,268 us | 1,073,831 us | 972,714 us | -0.1% | +10.2% |
| bs=1000 sw=10 sl=64 | p95 | 1,137,669 us | 1,103,851 us | 1,023,057 us | +3.1% | +11.2% |
| bs=1000 sw=10 sl=64 | p99 | 1,137,669 us | 1,103,851 us | 1,023,057 us | +3.1% | +11.2% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,530.56,200,128000,377,0.230,25620.89,32667.95,32667.95
1,100,10,64,20,2488.02,2000,1280000,804,0.491,120319.12,172171.24,172171.24
2,1000,10,64,20,21465.84,20000,12800000,932,0.569,1072268.20,1137668.58,1137668.58
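The derived columns in the tables above can be recomputed from the raw CSV. The sketch below is not the benchmark's own code: the helper names `derive` and `pct_delta` are hypothetical, and the 1024 * 1024 divisor for MB/s is an assumption, chosen because it reproduces the reported figures.

```python
import csv
import io

# Raw CSV rows copied from the benchmark comment above.
RAW_CSV = """config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,530.56,200,128000,377,0.230,25620.89,32667.95,32667.95
1,100,10,64,20,2488.02,2000,1280000,804,0.491,120319.12,172171.24,172171.24
2,1000,10,64,20,21465.84,20000,12800000,932,0.569,1072268.20,1137668.58,1137668.58
"""


def derive(row):
    """Recompute throughput and MB/s from the totals in one CSV row."""
    seconds = float(row["total_ms"]) / 1000.0
    tuples_per_sec = round(int(row["total_tuples"]) / seconds)
    # Assumption: "MB" here means 1024 * 1024 bytes.
    mb_per_sec = round(int(row["total_bytes"]) / seconds / (1024 * 1024), 3)
    return tuples_per_sec, mb_per_sec


def pct_delta(pr_value, baseline):
    """Relative change of the PR value against a baseline, in percent."""
    return round((pr_value - baseline) / baseline * 100, 1)


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
derived = [derive(row) for row in rows]

for row, (tps, mbps) in zip(rows, derived):
    print(f"bs={row['batch_size']}: {tps} tuples/sec, {mbps} MB/s")

# Example: bs=10 throughput, PR (377) against latest main (442).
print(pct_delta(377, 442))
```

Under these assumptions the recomputed values agree with the `tuples_per_sec` and `mb_per_sec` columns, and `pct_delta` matches the "Δ latest" entries when fed the rounded table values.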
/request-review @Ma77Ball
Pull request overview
Adds additional HuggingFace task-family support to the workflow-operator HuggingFace inference operator by extending the Scala descriptor/codegen dispatcher and the shared Python codegen template, plus a small JWT parsing hardening improvement in auth.
Changes:
- Add QA/ranking/classification task family support via new `QaRankingCodegen` and new descriptor/context fields (`contextColumn`, `candidateLabels`, `sentencesColumn`).
- Extend shared `PythonCodegenBase` to validate new task-specific inputs and to support additional payload/parse paths (including audio/media-gen paths present in this stacked diff).
- Harden JWT claim extraction by returning `Optional.empty()` when required custom claims are missing, plus add regression tests.
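The JWT hardening in the last bullet can be illustrated with a minimal Python analogue. This is a sketch only: the real change is in Scala/Java and returns `Optional.empty()`, and the claim names `userId` and `email`, the `SessionUser` fields, and the function name below are hypothetical stand-ins rather than the project's actual schema.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Hypothetical required custom claims; the real claim names are not given here.
REQUIRED_CLAIMS = ("userId", "email")


@dataclass(frozen=True)
class SessionUser:
    user_id: int
    email: str


def claims_to_optional_session_user(claims: Mapping[str, Any]) -> Optional[SessionUser]:
    """Return None (the analogue of Optional.empty()) instead of raising
    when a required custom claim is absent."""
    if any(claims.get(name) is None for name in REQUIRED_CLAIMS):
        return None
    return SessionUser(user_id=int(claims["userId"]), email=str(claims["email"]))


# A token carrying only standard claims yields no user rather than an exception.
print(claims_to_optional_session_user({"sub": "alice"}))
print(claims_to_optional_session_user({"userId": 1, "email": "a@example.org"}))
```

Returning an empty result lets the authentication layer treat such tokens as unauthenticated without a failure path for malformed but validly signed tokens.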
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/huggingFace/HuggingFaceInferenceOpDescSpec.scala | Adds dispatcher/codegen coverage tests for QA/ranking (and stacked audio/media-gen cases). |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/HuggingFaceInferenceOpDesc.scala | Adds new operator JSON fields and registers new task-family codegens in the dispatcher. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/TaskCodegen.scala | Extends CodegenContext with QA/ranking/audio-related fields. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/QaRankingCodegen.scala | Implements payload + response parsing snippets for QA/ranking/classification tasks. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/PythonCodegenBase.scala | Extends shared Python template with validation, audio/media handling, and response normalization. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/MediaGenCodegen.scala | Adds task-family codegen for prompt-driven media generation parsing. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/AudioTaskCodegen.scala | Adds task-family codegen for audio tasks (raw-binary + JSON TTS). |
| common/auth/src/test/scala/org/apache/texera/auth/JwtParserSpec.scala | Adds tests ensuring missing required claims produce empty auth results. |
| common/auth/src/main/scala/org/apache/texera/auth/JwtParser.scala | Introduces claimsToOptionalSessionUser and uses it from parseToken. |
| amber/src/main/scala/org/apache/texera/web/auth/UserAuthenticator.scala | Switches to optional-claims parsing for Dropwizard auth adapter. |
Ma77Ball
left a comment
Overall LGTM. Added small nit below.
## What changes were proposed in this PR?
Adds the QA/ranking/classification task family (5 HF pipeline tasks)
as a new `TaskCodegen` plugged into the dispatcher established by the
text-generation PR:
- QA tasks: `question-answering`, `table-question-answering`
- Classification/ranking tasks: `zero-shot-classification`,
`sentence-similarity`, `text-ranking`
`codegen/QaRankingCodegen.scala` supplies the per-task payload + parse
Python branches for all 5 tasks.
`CodegenContext` is extended with `contextColumn`, `candidateLabels`,
and `sentencesColumn` (`EncodableString`).
`HuggingFaceInferenceOpDesc.scala` gains 3 new `@JsonProperty` fields
and registers `QaRankingCodegen` in the dispatcher.
`PythonCodegenBase.scala` grows to host the shared QA/ranking
infrastructure:
- Per-row validation for the new column-named fields.
- `question-answering` payload handling with prompt + context.
- `table-question-answering` payload handling with table data.
- `zero-shot-classification` payload handling with candidate labels.
- `sentence-similarity` and `text-ranking` payload handling with
sentence inputs.
- Response parsing for QA/ranking outputs.
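As a rough illustration of the per-task payload branches listed above, the sketch below builds one request body per task. It is not the generated code from `QaRankingCodegen`: `build_payload`, `_require`, and the keyword names are hypothetical. The first four shapes follow the Hugging Face Inference API conventions as commonly documented; the `text-ranking` shape is an assumption, and the operator's actual payload may differ.

```python
from typing import Any, Dict, Optional, Sequence


def _require(value: Any, name: str, task: str) -> Any:
    """Per-row validation: reject a missing or empty task-specific input."""
    if value is None or (hasattr(value, "__len__") and len(value) == 0):
        raise ValueError(f"{task} requires a non-empty {name}")
    return value


def build_payload(
    task: str,
    prompt: str,
    *,
    context: Optional[str] = None,
    table: Optional[Dict[str, Sequence[str]]] = None,
    candidate_labels: Optional[Sequence[str]] = None,
    sentences: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable request body for one input row."""
    if task == "question-answering":
        return {"inputs": {"question": prompt,
                           "context": _require(context, "context", task)}}
    if task == "table-question-answering":
        return {"inputs": {"query": prompt,
                           "table": _require(table, "table", task)}}
    if task == "zero-shot-classification":
        labels = list(_require(candidate_labels, "candidate_labels", task))
        return {"inputs": prompt, "parameters": {"candidate_labels": labels}}
    if task == "sentence-similarity":
        others = list(_require(sentences, "sentences", task))
        return {"inputs": {"source_sentence": prompt, "sentences": others}}
    if task == "text-ranking":
        # Assumption: query plus candidate texts; not confirmed by the PR.
        texts = list(_require(sentences, "sentences", task))
        return {"inputs": {"query": prompt, "texts": texts}}
    raise ValueError(f"unsupported task: {task}")


print(build_payload("question-answering", "Who wrote it?", context="Ada wrote it."))
```

Keeping validation next to payload construction means a row with a missing context or label list fails with a task-specific message before any request is sent.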
User-input strings continue to flow through `pyb"..."` +
`EncodableString` so they reach Python as
`self.decode_python_template('<base64>')` rather than raw literals.
`PythonCodeRawInvalidTextSpec` still passes: 117/117 descriptors compile
cleanly under `py_compile`.
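The reason for routing user input through base64 can be sketched as follows. This is an illustrative approximation, not the project's implementation: in the generated code `decode_python_template` is a method on the operator (`self.decode_python_template(...)`), whereas the free function and the `embed_user_string` helper here are hypothetical.

```python
import base64


def decode_python_template(encoded: str) -> str:
    """Stand-in for the operator-side decoder: base64 text back to the user's string."""
    return base64.b64decode(encoded).decode("utf-8")


def embed_user_string(value: str) -> str:
    """Codegen side: emit a Python expression that carries only base64 characters,
    so quotes or newlines in the user's value cannot break out of the literal."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"decode_python_template('{encoded}')"


# A value that would break or hijack the generated code if pasted as a raw literal.
hostile = "x'); import os; os.remove('/tmp/f') #"
expr = embed_user_string(hostile)

# The generated expression still compiles, which is what a py_compile scan checks.
compile(expr, "<generated>", "eval")

# Evaluating it (for demonstration only) recovers the original string unchanged.
roundtrip = eval(expr, {"decode_python_template": decode_python_template})
print(roundtrip == hostile)
```

Because the base64 alphabet contains no quote, backslash, or newline characters, the embedded literal is syntactically inert whatever the user typed.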
## Any related issues, documentation, or discussions?
Tracking issue: Add HuggingFace question answering and ranking tasks
apache#5292
Closes apache#5292
Stacked on: PR 4 audio/media generation tasks / `hf/04-audio-mediagen`
Parent issue: Add Hugging Face inference operator apache#5041
Closed sibling issue: Add HuggingFaceModelResource REST endpoints for HF
operator UI apache#5134
## How was this PR tested?
`sbt "WorkflowOperator/compile; WorkflowOperator/Test/compile"` clean.
`sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.huggingFace.HuggingFaceInferenceOpDescSpec org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"`:
31 focused tests pass, including HuggingFace QA/ranking task coverage
and the raw Python descriptor scan.
`sbt "WorkflowOperator / scalafmtCheck"` clean.
`sbt "WorkflowOperator / Test / scalafmtCheck"` clean.
`PythonCodeRawInvalidTextSpec`: 117/117 descriptors compile cleanly under
`py_compile` with the new operator code paths, with no marker leaks.
## Was this PR authored or co-authored using generative AI tooling?
Yes, co-authored with generative AI tooling (Codex).
…nent (apache#5566)
⚠️ This PR is stacked on apache#5574. Until that lands, the diff below may also include PR 5's QA/ranking task changes depending on which base GitHub is showing. The new code in this PR is the HuggingFaceComponent (task selector + model browser) under frontend/src/app/workspace/component/hugging-face/, plus the formly registration in formly-config.ts and the declaration in app.module.ts. Once PR apache#5574 merges and this PR is retargeted to main, the diff should auto-clean to the PR 6a frontend selector changes only.
### What changes were proposed in this PR?
Add `HuggingFaceComponent`, a custom formly field type (`huggingface`) that provides:
- A task dropdown listing all supported HuggingFace inference tasks (fetched from the Texera backend's `/huggingface/tasks` endpoint, with a static fallback list)
- A paginated model list with client-side search, fetched from the Texera backend's `/huggingface/models` endpoint (which proxies HuggingFace Hub)
- Per-task field state preservation: when switching tasks, previously entered values (modelId, promptColumn, etc.) are saved and restored
This PR registers the component in `formly-config.ts` and declares it in `AppModule`. The component is not yet wired into the HuggingFace operator's property editor; the `jsonSchemaMapIntercept` mapping that routes the `modelId` field to this component is added in the follow-up property-editor PR (PR 7).
### Any related issues, documentation, discussions?
- Tracking issue: apache#5314
- Closes: apache#5314
- Stacked on: PR tracked in issue apache#5292
- Parent issue: apache#5041
### How was this PR tested?
7 unit tests added in `hugging-face.component.spec.ts` covering:
- Static task list is non-empty and contains expected tasks (text-generation, image tasks, audio tasks, QA/ranking tasks)
- Task tags are unique
- Cache invalidation does not throw
Run with `ng test`.
### Was this PR authored or co-authored using generative AI tooling?
Co-authored with Claude Opus 4.7
---------
Signed-off-by: Elliot Lin <36275109+ELin2025@users.noreply.github.com>
Co-authored-by: Elliot <36275109+Falcons-Royale@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>