feat(huggingFace): add image task family via ImageTaskCodegen #5320
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5320 +/- ##
============================================
- Coverage 53.28% 53.27% -0.02%
Complexity 2663 2663
============================================
Files 1095 1098 +3
Lines 42398 42532 +134
Branches 4560 4575 +15
============================================
+ Hits 22590 22657 +67
- Misses 18476 18544 +68
+ Partials 1332 1331 -1
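The headline percentages in the diff above appear to be hits divided by total lines, where total lines is hits + misses + partials. A minimal sketch under that assumption follows; `coverage_pct` is a hypothetical helper for illustration and is not part of Codecov.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Line coverage in percent, assuming coverage = hits / (hits + misses + partials)."""
    total = hits + misses + partials
    return round(100.0 * hits / total, 2)


# Figures copied from the coverage diff: main first, then this PR.
main_cov = coverage_pct(hits=22590, misses=18476, partials=1332)
pr_cov = coverage_pct(hits=22657, misses=18544, partials=1331)
print(main_cov, pr_cov)
```

Under this formula the two totals come out to 42398 and 42532 lines, matching the `Lines` row of the report.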
/request-review @Ma77Ball
| status | config | throughput (tuples/sec) | MB/s | latency p50/p95/p99 | max Δ latest / 7d |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 419 | 0.256 | 21,904/33,847/33,847 us | 🔴 -8.7% / 🟢 -7.9% |
| 🔴 | bs=100 sw=10 sl=64 | 911 | 0.556 | 105,748/141,073/141,073 us | 🔴 +20.0% / 🟢 -5.8% |
| ⚪ | bs=1000 sw=10 sl=64 | 1,096 | 0.669 | 902,642/987,959/987,959 us | ⚪ within ±5% / 🟢 -7.2% |
Baseline details
Latest main 45d28ce from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 419 tuples/sec | 459 tuples/sec | 410.82 tuples/sec | -8.7% | +2.0% |
| bs=10 sw=10 sl=64 | MB/s | 0.256 MB/s | 0.28 MB/s | 0.251 MB/s | -8.6% | +2.1% |
| bs=10 sw=10 sl=64 | p50 | 21,904 us | 20,145 us | 23,785 us | +8.7% | -7.9% |
| bs=10 sw=10 sl=64 | p95 | 33,847 us | 34,224 us | 34,980 us | -1.1% | -3.2% |
| bs=10 sw=10 sl=64 | p99 | 33,847 us | 34,224 us | 34,980 us | -1.1% | -3.2% |
| bs=100 sw=10 sl=64 | throughput | 911 tuples/sec | 956 tuples/sec | 891.94 tuples/sec | -4.7% | +2.1% |
| bs=100 sw=10 sl=64 | MB/s | 0.556 MB/s | 0.584 MB/s | 0.544 MB/s | -4.8% | +2.1% |
| bs=100 sw=10 sl=64 | p50 | 105,748 us | 103,485 us | 112,277 us | +2.2% | -5.8% |
| bs=100 sw=10 sl=64 | p95 | 141,073 us | 117,584 us | 139,802 us | +20.0% | +0.9% |
| bs=100 sw=10 sl=64 | p99 | 141,073 us | 117,584 us | 139,802 us | +20.0% | +0.9% |
| bs=1000 sw=10 sl=64 | throughput | 1,096 tuples/sec | 1,080 tuples/sec | 1,041 tuples/sec | +1.5% | +5.3% |
| bs=1000 sw=10 sl=64 | MB/s | 0.669 MB/s | 0.659 MB/s | 0.635 MB/s | +1.5% | +5.3% |
| bs=1000 sw=10 sl=64 | p50 | 902,642 us | 925,648 us | 972,714 us | -2.5% | -7.2% |
| bs=1000 sw=10 sl=64 | p95 | 987,959 us | 1,026,714 us | 1,023,057 us | -3.8% | -3.4% |
| bs=1000 sw=10 sl=64 | p99 | 987,959 us | 1,026,714 us | 1,023,057 us | -3.8% | -3.4% |
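The Δ columns in the table above are consistent with a plain relative change of the PR value against each baseline. A minimal sketch, assuming that definition; `pct_delta` is a hypothetical name and not taken from the benchmark tooling.

```python
def pct_delta(pr_value: float, baseline: float) -> float:
    """Relative change of the PR value against a baseline, in percent (one decimal)."""
    return round(100.0 * (pr_value - baseline) / baseline, 1)


# bs=10 throughput: PR 419 vs latest main 459 and 7d average 410.82 tuples/sec.
throughput_vs_main = pct_delta(419, 459)
throughput_vs_7d = pct_delta(419, 410.82)

# bs=100 p95 latency: PR 141,073 us vs latest main 117,584 us.
p95_vs_main = pct_delta(141073, 117584)

print(throughput_vs_main, throughput_vs_7d, p95_vs_main)
```

Note that the sign reads differently per metric: a negative delta is a regression for throughput, while a positive delta is a regression for latency.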
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,477.63,200,128000,419,0.256,21903.90,33847.30,33847.30
1,100,10,64,20,2194.36,2000,1280000,911,0.556,105748.27,141073.26,141073.26
2,1000,10,64,20,18251.20,20000,12800000,1096,0.669,902641.62,987958.61,987958.61

Plugs the 9-task image family into the dispatcher pattern established
in PR 2:
  image-only      image-classification, object-detection,
                  image-segmentation, image-to-text
  image + prompt  visual-question-answering, document-question-answering,
                  zero-shot-image-classification, image-text-to-text,
                  image-to-image
- ImageTaskCodegen supplies payload + parse Python for all 9 tasks
- TaskCodegen trait gains a `tasks: Set[String]` default method so a
single codegen can register under multiple task strings; the
dispatcher map in HuggingFaceInferenceOpDesc is built from
registeredCodegens.tasks.flatMap(...)
- CodegenContext extended with imageInput + inputImageColumn
(EncodableString)
- HuggingFaceInferenceOpDesc gains 2 new @JsonProperty fields and
registers ImageTaskCodegen
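The multi-task registration described in the list above can be sketched in Python. The real code is a Scala trait, so everything here is a hypothetical stand-in: the class shapes, the `build_dispatcher` helper, the `TextGenerationCodegen` name, and the duplicate-task check are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List


class TaskCodegen:
    """Hypothetical Python stand-in for the Scala TaskCodegen trait."""

    task: str = ""

    @property
    def tasks(self) -> FrozenSet[str]:
        # Default: a codegen registers under its single primary task string.
        return frozenset({self.task})


class TextGenerationCodegen(TaskCodegen):
    task = "text-generation"


class ImageTaskCodegen(TaskCodegen):
    task = "image-classification"

    @property
    def tasks(self) -> FrozenSet[str]:
        # Override: one codegen serves all 9 image task strings.
        return frozenset({
            "image-classification", "object-detection",
            "image-segmentation", "image-to-text",
            "visual-question-answering", "document-question-answering",
            "zero-shot-image-classification", "image-text-to-text",
            "image-to-image",
        })


def build_dispatcher(codegens: List[TaskCodegen]) -> Dict[str, TaskCodegen]:
    """Flat-map every codegen's task set into a task -> codegen lookup."""
    dispatcher: Dict[str, TaskCodegen] = {}
    for codegen in codegens:
        for task in codegen.tasks:
            if task in dispatcher:  # assumption: duplicate registration is an error
                raise ValueError(f"task {task!r} registered twice")
            dispatcher[task] = codegen
    return dispatcher


image_codegen = ImageTaskCodegen()
dispatcher = build_dispatcher([TextGenerationCodegen(), image_codegen])
print(len(dispatcher))
```

The default `tasks` keeps single-task codegens unchanged, so only a codegen that serves several task strings needs to override it.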
PythonCodegenBase grows to host the shared image infrastructure:
- image_only_tasks / image_prompt_tasks / image_tasks tuples and
image_headers in process_table
- per-row image bytes resolution from upload (self._read_image_input)
or input column (self._read_binary_value + self._compress_image_bytes)
- use_raw_binary_body / raw_binary_headers state threaded through
_post_with_fallback (signature extended)
- _post_with_fallback adds the image-text-to-text chat-completions
branch and the model-author vision branch
- _call_provider adds branches for zai-org's custom API, Replicate
predictions + polling, Fal-ai, Wavespeed submit+poll, and image
embedding in OpenAI-compatible / unknown-provider fallbacks
- image-content-type response handling returns data:image URLs
- image helpers added: _read_image_input, _compress_image_bytes,
_image_input_as_base64, _read_binary_value, _looks_like_html,
_html_to_image_bytes, _extract_json_arg, _url_to_data_url
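The image-content-type response handling in the list above can be illustrated with a small sketch. This is not the PR's generated Python: `image_response_to_data_url` is a hypothetical name, and the assumption is that any `image/*` response body is wrapped as a base64 data URL.

```python
import base64


def image_response_to_data_url(content_type: str, body: bytes) -> str:
    """Wrap raw image bytes in a data: URL; raise for non-image content types."""
    # Drop parameters such as "; charset=binary" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if not media_type.startswith("image/"):
        raise ValueError(f"not an image response: {content_type!r}")
    encoded = base64.b64encode(body).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


sample = image_response_to_data_url("image/png; charset=binary", b"\x89PNG")
print(sample.split(",", 1)[0])
```

Returning a data URL keeps the output a plain string, so image results can sit in the same column type as text results.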
User-input strings continue to flow through pyb"..." + EncodableString
so they reach Python as self.decode_python_template('<base64>') rather
than raw literals. PythonCodeRawInvalidTextSpec still passes
(117/117 descriptors py_compile cleanly).
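The reason for routing user strings through a base64 template can be shown with a short sketch. It assumes a module-level `decode_python_template` and a hypothetical `emit_string_expr` emitter; the PR describes a `self.decode_python_template(...)` method, and the real `pyb"..."` interpolator is Scala, so this only mirrors the idea.

```python
import base64


def decode_python_template(encoded: str) -> str:
    """Hypothetical stand-in for the runtime decoder named in the commit message."""
    return base64.b64decode(encoded).decode("utf-8")


def emit_string_expr(user_input: str) -> str:
    """Return Python source that evaluates to user_input without embedding it literally."""
    encoded = base64.b64encode(user_input.encode("utf-8")).decode("ascii")
    # The base64 alphabet has no quotes or backslashes, so the literal cannot be escaped.
    return f"decode_python_template('{encoded}')"


# A hostile value that would break out of a naively quoted string literal.
hostile = "'); import os; os.system('echo pwned'); ('"
source = f"prompt = {emit_string_expr(hostile)}\n"

code = compile(source, "<generated>", "exec")
namespace = {"decode_python_template": decode_python_template}
exec(code, namespace)
print(namespace["prompt"] == hostile)
```

The generated source only ever contains base64 characters inside the quotes, which is why the descriptors can still pass `py_compile` regardless of what the user typed.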
Frontend integration adds only the HF lines (no agent / dataset
noise from the source branch):
- HuggingFaceImageUploadComponent declared in app.module.ts
- huggingface-image-upload formly type registered in formly-config.ts
- Image upload component .ts/.html/.scss cherry-picked from huggingFace
- HuggingFace.png + sample-image.png assets
PR 3 of a stacked 9-PR series. Stacks on hf/02-operator-textgen.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… tests in HuggingFaceInferenceOpDescSpec for the fixes
dd644d4 to 5e0df3e
Pull request overview
Adds HuggingFace image task family support (9 HF pipeline tasks) to the Amber HuggingFace inference operator by introducing a new ImageTaskCodegen, extending the shared Python codegen infrastructure for image payload/response handling and provider routing, and wiring an image-upload field into the Angular property panel.
Changes:
- Add `ImageTaskCodegen` and register it for 9 image task strings via the dispatcher (`TaskCodegen.tasks`).
- Extend `PythonCodegenBase` to support image input resolution (upload/column), raw-binary posting, provider-specific image routing, and image response parsing.
- Add a frontend `HuggingFaceImageUploadComponent` and register it as a Formly field type; add/extend operator spec coverage for image tasks.
Reviewed changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| frontend/src/app/workspace/component/hugging-face-image-upload/hugging-face-image-upload.component.ts | New Formly field component for image upload + client-side compression. |
| frontend/src/app/workspace/component/hugging-face-image-upload/hugging-face-image-upload.component.html | Template for image selection, preview, and clear action. |
| frontend/src/app/workspace/component/hugging-face-image-upload/hugging-face-image-upload.component.scss | Styling for upload UI and preview. |
| frontend/src/app/workspace/component/hugging-face-image-upload/hugging-face-image-upload.component.spec.ts | Unit tests for derived state, rejection paths, and clear behavior. |
| frontend/src/app/common/formly/formly-config.ts | Register huggingface-image-upload Formly type. |
| frontend/src/app/app.module.ts | Include the new upload component in the app module imports. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/HuggingFaceInferenceOpDesc.scala | Add imageInput/inputImageColumn fields and register ImageTaskCodegen tasks. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/TaskCodegen.scala | Add tasks: Set[String] default method and extend CodegenContext with image fields. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/PythonCodegenBase.scala | Add shared image task infrastructure, helpers, and provider routing/parse behavior. |
| common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/ImageTaskCodegen.scala | New per-task payload/parse snippets for all 9 image tasks. |
| common/workflow-operator/src/test/scala/org/apache/texera/amber/operator/huggingFace/HuggingFaceInferenceOpDescSpec.scala | Add image-task routing/behavior tests and dispatcher coverage. |
xuang7 left a comment
Overall this PR looks good to me. Left one small security-related concern. Please also feel free to apply the Copilot suggestions if they are useful.
…lumn Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: Prateek Ganigi <91584519+PG1204@users.noreply.github.com>
…s (data/URL/bytes only)
…, and unused import in codegen
All Copilot suggestions addressed in de50344.
xuang7 left a comment
New changes look good to me! Could we drop http:// (https-only), and possibly block private/metadata IPs and add a size cap?
…te/metadata IPs, cap size
Done: https-only, private/metadata IPs blocked, and a 25 MB size cap, all via one shared fetch helper used by every remote fetch.
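The three guards named in the reply above (https-only, private/metadata IP blocking, size cap) could look roughly like the sketch below. It is not the PR's helper: `check_remote_url`, `read_capped`, and `MAX_IMAGE_BYTES` are hypothetical names, reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption, and the sketch only checks literal IP hosts (DNS names would need the same check applied to their resolved addresses).

```python
import ipaddress
from typing import BinaryIO
from urllib.parse import urlsplit

MAX_IMAGE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as binary megabytes


def check_remote_url(url: str) -> str:
    """Validate the scheme and any literal-IP host; return the hostname."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https:// URLs are allowed")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return host  # a DNS name; its resolved addresses are not checked here
    # Covers RFC 1918 ranges, loopback, and 169.254.0.0/16 (cloud metadata).
    if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
        raise ValueError(f"blocked address: {addr}")
    return host


def read_capped(stream: BinaryIO, limit: int = MAX_IMAGE_BYTES) -> bytes:
    """Read up to `limit` bytes and raise if the stream holds more than that."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized body
    if len(data) > limit:
        raise ValueError("response exceeds size cap")
    return data
```

Keeping both checks in one helper means every remote fetch path gets the same policy. A production version would also need to re-validate after redirects and to loop on short reads from network streams.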
…#5320)

### What changes were proposed in this PR?

Adds the image task family (9 HF pipeline tasks) as the second `TaskCodegen` plugged into the dispatcher established by apache#5278:

image-only: image-classification, object-detection, image-segmentation, image-to-text
image + prompt: visual-question-answering, document-question-answering, zero-shot-image-classification, image-text-to-text, image-to-image

- `codegen/ImageTaskCodegen.scala` supplies the per-task payload + parse Python branches for all 9 tasks.
- `TaskCodegen` trait gains a `tasks: Set[String]` default method (defaults to `Set(task)`) so a single codegen can register under multiple task strings; `ImageTaskCodegen` is the first multi-task codegen to use it.
- `CodegenContext` extended with `imageInput` + `inputImageColumn` (`EncodableString`).
- `HuggingFaceInferenceOpDesc.scala` gains 2 new `@JsonProperty` fields and registers `ImageTaskCodegen` via the new `tasks` flat-map.

`PythonCodegenBase.scala` grows to host the shared image infrastructure:

- Task-family tuples (`image_only_tasks`, `image_prompt_tasks`, `image_tasks`) + `image_headers` in `process_table`.
- Per-row image-bytes resolution from upload or column with `_read_image_input` / `_read_binary_value` / `_compress_image_bytes`.
- `_post_with_fallback` extended with `raw_binary_headers` + `use_raw_binary_body`; adds image-text-to-text chat-completions and model-author vision branches.
- `_call_provider` gains zai-org, Replicate predictions + polling, Fal-ai, Wavespeed submit+poll branches, and image embedding for OpenAI-compatible / unknown-provider fallbacks.
- Image content-type response handling returns `data:image/...;base64,...` URLs.
- Image helpers added: `_read_image_input`, `_compress_image_bytes`, `_image_input_as_base64`, `_read_binary_value`, `_looks_like_html`, `_html_to_image_bytes`, `_extract_json_arg`, `_url_to_data_url`.

Frontend integration (HF lines only, no agent / dataset noise): `HuggingFaceImageUploadComponent` declared in `app.module.ts`, `huggingface-image-upload` formly type registered, image upload component .ts/.html/.scss + `HuggingFace.png` + `sample-image.png` assets.

User-input strings continue to flow through `pyb"..."` + `EncodableString` so they reach Python as `self.decode_python_template('<base64>')` rather than raw literals. `PythonCodeRawInvalidTextSpec` still passes (117/117 descriptors `py_compile` cleanly).

### Any related issues, documentation, or discussions?

- Tracking issue: apache#5319
- Closes: apache#5319
- Stacked on: apache#5278 (operator + text-generation, issue apache#5277)
- Parent issue: apache#5041
- Closed sibling issue: apache#5134 (REST resource, landed via apache#5124)

### How was this PR tested?

- `sbt "WorkflowOperator/compile; WorkflowOperator/Test/compile"` clean.
- `sbt scalafmtCheck` clean.
- `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.huggingFace.HuggingFaceInferenceOpDescSpec"`: 18/18 pass (PR 2's 13 spec tests + 5 new image-task tests: image-only routing, VQA / document-QA payload, image-text-to-text chat-completions, image-to-image data-URL parse, all-9-tasks dispatcher coverage).
- `sbt "WorkflowOperator/testOnly org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"`: 117/117 descriptors `py_compile` cleanly with the new operator code paths, no marker leaks.
- Generated Python verified via `python3 -m py_compile` on sample image-task outputs.

### Was this PR authored or co-authored using generative AI tooling?

Yes, co-authored with Claude Opus 4.7.

---------

Signed-off-by: Prateek Ganigi <91584519+PG1204@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>