Stand up dual stable-ABI build harness and port nms as the first op.#9524
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9524
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ✅ You can merge normally! (1 Unrelated Failure)As of commit 8f2660f with merge base 5c6ea42 ( BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
Hi @adabeyta! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks! |
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
|
Hey @NicolasHug! You merged this PR, but no labels were added. |
This PR stands up the stable-ABI build path and ports nms (CPU, CUDA, quantized-CPU, MPS) as the first op. The intent is to validate the harness/per-backend patterns on nms.
Structure: Dual Extension
torchvision._C: the existing extension, now built from all not-yet-ported sources.torchvision._C_stable: the new extension, built only from migrated sources (currently nms).Why two extensions?
For a per op rollout a separate compilation unit helps to scope the pin and keep the tree buildable at every step of an incremental port (torchcodec migrated in place because it converted everything in one shot). Once all ops are stable remove legacy unstable compilation.
Layout Changes
setup.py: splits sources between the two extensions.STABLE_SOURCES + _stable()/_not_stable()helpers split the source globs, so migrated files compile into _C_stable and everything else stays in _C.make_C_stable_extension(): builds the new target with the -DTORCH_TARGET_VERSION pin.ensure_hipified()hoisted to a single idempotent step both extensions call (runs once/with no build-order dependency between them).Registration: schema declaration/per-backend kernel binding.StableABICompat.h: shared compat header one place for ATen ops not yet wrapped in stable ops.h (placeholder until upstream.vision_stable.cpp: extension-root file for_C_stable, the stable mirror ofvision.cpp.PyInit__C_stablestub. This is extension-level glue, not tied to any op, so it lives here rather than in a kernel file.STABLE_SOURCESso it builds into_C_stableand stays out of_C.Missing Pytorch NMS MPS Stable-ABI Port Support
MPS scalar arguments:
The nms schema takes a float iou_threshold, which the Metal kernel needs as a scalar argument (constant float & iou_threshold [[buffer(3)]]). With no scalar-arg setter in the shim, that float is currently carried as a 1-element float32 tensor bound via set_arg_tensor. shim_mps.h exposes only aoti_torch_mps_set_arg_tensor and aoti_torch_mps_set_arg_int
Missing:
TODO: Workaround (for now) packs iou_threshold into a 1-element float32 tensor (new_empty + fill_) and bind it at buffer(3) via set_arg_tensor. The kernel reads the same 4 bytes as constant float&, so it's behavior-identical. File PR for missing mps_set_args.
Metal source isolation.
Made nms Metal source into a standalone nms_metal_shader.h. The shared mps_kernels.h can't be used under the pin because it pulls in ATen/native/mps/OperationUtils.h and instantiates at::native::mps::MetalShaderLibrary. Derives from same isolation rationale as the dual extension (legacy _C keeps using the full mps_kernels.h)
Stable-ABI audit (
torch-abi-audit)Ran
torch-abi-auditagainst the builttorchvisionpackage to verify the migrated extension stays on the stable surface and never reaches intoat::/c10::/torch::jit::internals._C_stable.soaudits asSTABLE— 67 stable-shim symbols, 0 unstable. The legacy_C.soand theunrelated
image.sostayUNSTABLE, which is expected: onlynmshas migrated so far, so thepackage-level verdict remains
UNSTABLEuntil the remaining ops move into_C_stable(samemid-migration shape as torchcodec).