build: pre-flight tag existence check + cleanup on downstream failure

## Problem

**Run:** [tenant-manager · Build Pipeline · `2.2.0-beta.6`](https://github.com/LerianStudio/tenant-manager/actions/runs/27234829990)

This is a two-attempt failure chain — each attempt has a distinct root cause.

---

### Attempt #1 — Cosign signing failed (Rekor 404)

Job: [`build / Build tenant-manager` (attempt 1)](https://github.com/LerianStudio/tenant-manager/actions/runs/27234829990/job/80424032972)

Build and push to both DockerHub and GHCR succeeded. Cosign signing then failed on all 3 retry attempts with:

```
error during command execution: signing [docker.io/lerianstudio/tenant-manager@sha256:49758a04...]:
signing digest: [GET /api/v1/log/entries/{entryUUID}][404] getLogEntryByUuidNotFound
```

Root cause: transient Rekor (Sigstore public transparency log) `404 getLogEntryByUuidNotFound`. The cosign client successfully retrieved the SCT (Signed Certificate Timestamp) but then failed to confirm the entry in Rekor, indicating a momentary inconsistency in the public log service.

There is also a secondary template error at the end of this step:

```
The template is not valid. ...build.yml@v1.31.0 (Line: 372, Col: 28): Unexpected value ''
```

This suggests the `continue-on-error` expression or a similar field on line 372 evaluates to an empty string under some conditions, which is itself a bug.

---

### Attempt #2 — Docker push denied (tag immutability)

Job: [`build / Build tenant-manager` (attempt 2)](https://github.com/LerianStudio/tenant-manager/actions/runs/27234829990/job/80426431165)

The pipeline was re-run to recover from the cosign failure. The full Docker build ran again (~19 s), then:

```
ERROR: failed to solve: failed to push lerianstudio/tenant-manager:2.2.0-beta.6:
denied: requested access to the resource is denied — tag 2.2.0-beta.6 is already
assigned to an image in this repository and cannot be updated due to immutability settings.
```

Root cause: DockerHub has tag immutability enabled. The tag was already published in attempt #1; the re-run had no way to detect this before spending time on a full rebuild.

---

## Proposed Fixes

### Fix 1 — Resilience to transient Rekor failures (attempt #1 root cause)

The 3-attempt retry with exponential backoff exists but is insufficient for Rekor intermittency, which can last several minutes. Options:

- **Increase `cosign_max_attempts` default** from 3 to a higher value (e.g. 5) and increase the backoff ceiling.
- **Add jitter** to the retry delay to avoid thundering-herd if multiple jobs hit Rekor simultaneously.
- **Fix the template error on line 372**: The `Unexpected value ''` error means a field receives an empty string where a boolean or defined value is expected. This should be investigated and fixed — it may also mask error propagation silently in other scenarios.
- **Consider honouring `continue_gitops_on_signing_failure`** more broadly: if Rekor is down, the image is still valid and the signing certificate was issued; only the retrieval of the transparency log entry failed. Blocking the entire pipeline (and forcing a re-run that will fail for a different reason) is a disproportionate response to a Sigstore outage.
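
The retry-related options above could be combined roughly as follows. This is a minimal sketch, not the current `build.yml` logic: `backoff_delay`, `retry_with_jitter`, `BASE_DELAY` and `MAX_DELAY` are hypothetical names, and "equal jitter" (half fixed, half random) is one assumed choice among several jitter strategies.

```bash
# Sketch only: these helpers are hypothetical and do not exist in build.yml.

# backoff_delay ATTEMPT BASE CAP
# Prints a delay in seconds using "equal jitter": half of the capped
# exponential ceiling is fixed, the other half is random.
backoff_delay() {
  local attempt=$1 base=$2 cap=$3 ceiling half rand
  ceiling=$(( base * (1 << (attempt - 1)) ))
  [ "$ceiling" -gt "$cap" ] && ceiling=$cap
  half=$(( ceiling / 2 ))
  # Two random bytes from the kernel, read as one unsigned 16-bit integer.
  rand=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
  echo $(( half + rand % (ceiling - half + 1) ))
}

# retry_with_jitter MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry_with_jitter() {
  local max_attempts=$1 attempt=1 delay
  shift
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" && return 0
    if [ "$attempt" -lt "$max_attempts" ]; then
      delay=$(backoff_delay "$attempt" "${BASE_DELAY:-5}" "${MAX_DELAY:-120}")
      echo "::warning::Attempt $attempt/$max_attempts failed; retrying in ${delay}s" >&2
      sleep "$delay"
    fi
    attempt=$(( attempt + 1 ))
  done
  return 1
}
```

A signing step might then call something like `retry_with_jitter 5 cosign sign --yes "$IMAGE@$DIGEST"`, where `IMAGE` and `DIGEST` are assumed to be set by earlier steps. The random half of the delay spreads out jobs that hit Rekor at the same moment, while the fixed half guarantees a minimum wait between attempts.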

---

### Fix 2 — Pre-flight tag existence check (attempt #2 root cause)

Before starting the Docker build, check whether the target tag already exists in each enabled registry. If it does, either skip the build (idempotent re-run behaviour) or fail fast with a clear, early error — not after a full build.

Suggested new input:

```yaml
on_existing_tag: 'fail' | 'skip' | 'warn'   # default: 'fail'
```

Implementation sketch (DockerHub):

```bash
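# Assumptions: DOCKERHUB_USERNAME / DOCKERHUB_TOKEN are available as secrets,
# and ORG, IMAGE and TAG are set by the workflow.
TOKEN=$(curl -fsS -X POST -H 'Content-Type: application/json' \
  -d "{\"username\":\"$DOCKERHUB_USERNAME\",\"password\":\"$DOCKERHUB_TOKEN\"}" \
  https://hub.docker.com/v2/users/login | jq -r .token)
# Ask only for the HTTP status of the tag endpoint: 200 = exists, 404 = free.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $TOKEN" \
  "https://hub.docker.com/v2/repositories/$ORG/$IMAGE/tags/$TAG")
if [ "$STATUS" = "200" ]; then
  echo "::warning::Tag $TAG already exists; skipping (immutable registry)."
  exit 0
fi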
```

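For GHCR: `docker manifest inspect ghcr.io/$ORG/$IMAGE:$TAG` exits with status 0 when the tag exists and non-zero otherwise (private packages require a prior `docker login ghcr.io`).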
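
How the three `on_existing_tag` values could translate into pipeline behaviour is sketched below. `resolve_existing_tag` is a hypothetical helper. The meaning of `warn` (log a warning and build anyway) and the fail-open handling of unexpected HTTP statuses are assumptions, since the proposal does not define them.

```bash
# Sketch only: resolve_existing_tag is a hypothetical helper.
# Usage: resolve_existing_tag HTTP_STATUS POLICY TAG
# Prints the action to take (build | skip | fail) on stdout and
# writes workflow annotations to stderr.
resolve_existing_tag() {
  local status=$1 policy=${2:-fail} tag=$3
  case "$status" in
    404) echo build; return 0 ;;   # tag is free, proceed normally
    200) ;;                        # tag exists, apply the policy below
    *)   # Assumption: an inconclusive lookup should not block the build.
         echo "::warning::Tag lookup returned HTTP $status; building anyway." >&2
         echo build; return 0 ;;
  esac
  case "$policy" in
    skip) echo "::notice::Tag $tag already exists; skipping build." >&2
          echo skip ;;
    warn) echo "::warning::Tag $tag already exists; the push may be rejected." >&2
          echo build ;;
    fail) echo "::error::Tag $tag already exists." >&2
          echo fail; return 1 ;;
    *)    echo "::error::Unknown on_existing_tag value: $policy" >&2
          echo fail; return 1 ;;
  esac
}
```

A pre-flight step could capture the result with `ACTION=$(resolve_existing_tag "$STATUS" "$ON_EXISTING_TAG" "$TAG")` and expose it as a step output, so that the build and push steps run only when the action is `build`. `ON_EXISTING_TAG` is an assumed environment variable carrying the new input.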

---

### Fix 3 — Cleanup pushed images on downstream failure (suggestion)

If the build+push succeeds but a later step fails (cosign, GitOps artifact upload, Helm dispatch), the image is left in the registry unsigned and without a GitOps record. No rollback exists today.

Suggestion: an optional cleanup job or step that runs under `if: failure()`, gated by a new input:

```yaml
cleanup_on_failure: true | false   # default: false
```

- DockerHub: `DELETE /v2/repositories/{namespace}/{repository}/tags/{tag}` (requires delete-scope token).
- GHCR: `gh api -X DELETE /orgs/{org}/packages/container/{package}/versions/{version_id}`.
- Target only tags published in the current run.
- If the registry has immutability and deletion is not possible, emit a warning with the digest and manual remediation steps.
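
The last two points could be sketched as a loop that touches only the references pushed by the current run and downgrades a failed deletion to a warning. `cleanup_pushed_tags` and the delete callback are hypothetical names; the callback is assumed to wrap whichever registry call applies (for example the DockerHub `DELETE` request listed above).

```bash
# Sketch only: cleanup_pushed_tags is a hypothetical helper.
# Usage: cleanup_pushed_tags DELETE_CMD REF...
# DELETE_CMD is invoked once per REF and must return non-zero on failure.
cleanup_pushed_tags() {
  local delete_cmd=$1 ref failed=0
  shift
  for ref in "$@"; do
    if "$delete_cmd" "$ref"; then
      echo "::notice::Removed $ref"
    else
      # Deletion can be refused, e.g. by immutability settings.
      echo "::warning::Could not remove $ref; manual cleanup is required."
      failed=$(( failed + 1 ))
    fi
  done
  echo "cleanup finished: $failed of $# reference(s) left in place"
  return 0   # cleanup problems never fail the job on their own
}
```

Passing the delete operation as a callback keeps the loop independent of the registry, so DockerHub and GHCR can share it. The warning text would still need to be extended with the image digest and remediation steps, which this sketch leaves out.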

---

## Checklist

- [ ] Investigate and fix `Unexpected value ''` on line 372 of `build.yml`
- [ ] Review `cosign_max_attempts` default and retry backoff ceiling
- [ ] Add jitter to cosign retry delay
- [ ] Add `on_existing_tag` input (`fail` / `skip` / `warn`) with pre-flight check for DockerHub and GHCR
- [ ] Add `cleanup_on_failure` input with registry cleanup on downstream step failure
- [ ] Cleanup emits a warning (not hard failure) when registry deletion is not possible
