Always display Windows ESP error when software install fails #47522

Draft
getvictor wants to merge 4 commits into main from victor/45948-windows-esp-continue-anyway

Conversation

@getvictor

@getvictor getvictor commented Jun 12, 2026

Member

Related issue: Resolves #45948

Checklist for submitter

If some of the following don't apply, delete the relevant line.

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.
    See Changes files for more information.

  • Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements), JS inline code is prevented especially for url redirects, and untrusted data interpolated into shell scripts/commands is validated against shell metacharacters.

  • Timeouts are implemented and retries are limited to avoid infinite loops

  • If paths of existing endpoints are modified without backwards compatibility, checked the frontend/CLI for any necessary changes

Testing

For unreleased bug fixes in a release candidate, one of:

  • Confirmed that the fix is not expected to adversely impact load test results
  • Alerted the release DRI if additional load testing is needed

Database migrations

  • Checked schema for all modified tables for columns that will auto-update timestamps during migration.
  • Confirmed that updating the timestamps is acceptable, and will not cause unwanted side effects.
  • Ensured the correct collation is explicitly set for character columns (COLLATE utf8mb4_unicode_ci).

New Fleet configuration settings

  • Setting(s) is/are explicitly excluded from GitOps

If you didn't check the box above, follow this checklist for GitOps-enabled settings:

  • Verified that the setting is exported via fleetctl generate-gitops
  • Verified the setting is documented in a separate PR to the GitOps documentation
  • Verified that the setting is cleared on the server if it is not supplied in a YAML file (or that it is documented as being optional)
  • Verified that any relevant UI is disabled when GitOps mode is enabled

fleetd/orbit/Fleet Desktop

  • Verified compatibility with the latest released version of Fleet (see Must rule)
  • If the change applies to only one platform, confirmed that runtime.GOOS is used as needed to isolate changes
  • Verified that fleetd runs on macOS, Linux and Windows
  • Verified auto-update works from the released version of component to the new version (see tools/tuf/test)

Summary by CodeRabbit

  • New Features
    • Updated the critical Windows ESP failure text to use clearer “Reset your device to try again…” wording.
    • When not all apps are required, the ESP flow can now present a “Reset PC and Continue Anyway” option and show a continuable error that lists failed software names.
    • Improved user-facing formatting for failed-software lists (supports single/multiple entries and truncation with “N more”).
  • Tests
    • Added and expanded coverage for continuable error text formatting, including edge cases and truncation behavior.
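
The list formatting summarized above (single entry, multiple entries, truncation with "N more") could look roughly like the sketch below. This is only an illustration: `formatFailedSoftware`, the cap of three names, and the exact separator wording are assumptions, not the PR's actual helper or copy.

```go
package main

import (
	"fmt"
	"strings"
)

// maxListedNames is a hypothetical cap; the PR's real limit is not shown in this thread.
const maxListedNames = 3

// formatFailedSoftware joins software names for display and truncates long
// lists with an "N more" suffix. The wording here is assumed, not the PR's copy.
func formatFailedSoftware(names []string) string {
	if len(names) == 0 {
		return ""
	}
	if len(names) <= maxListedNames {
		return strings.Join(names, ", ")
	}
	shown := strings.Join(names[:maxListedNames], ", ")
	return fmt.Sprintf("%s, and %d more", shown, len(names)-maxListedNames)
}

func main() {
	fmt.Println(formatFailedSoftware([]string{"Zoom"}))
	fmt.Println(formatFailedSoftware([]string{"Zoom", "Slack", "Chrome", "Firefox", "1Password"}))
}
```

Capping the list keeps the message short enough for the ESP screen while still telling the user how many items were omitted.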

@getvictor

Member Author

@coderabbitai full review

@getvictor

Member Author

/agentic_review

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor
✅ Action performed

Full review finished.

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 12, 2026


Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0) 🔗 Cross-repo conflicts (0)



Action required

1. Unescaped ESP error text ✓ Resolved 🐞 Bug ☼ Reliability
Description
handleESPRelease now builds CustomErrorText from software display names (e.g., "A&B"), but SyncML
<Data> is populated via an innerxml field without escaping. Any name containing XML metacharacters
can break the SyncML payload, causing the device to reject the command and potentially leaving ESP
finalization stuck/misconfigured.
Code

server/service/microsoft_mdm.go[R2617-2623]

+		// List the failed software when we have any; otherwise (require_all=false timeout with nothing failed) show the
+		// timeout text. Both keep the "Continue anyway" option so the user is never stuck.
+		errorText := microsoft_mdm.ESPTimeoutErrorText
+		if hasSoftwareFailure {
+			errorText = microsoft_mdm.ESPSoftwareFailureContinuableErrorText(failedSoftwareNames)
+		}
+		cmds = buildESPBlockCommands(provID, errorText, espBlockButtonsResetAndContinue)
Evidence
The PR introduces dynamic errorText derived from software names, but the SyncML encoder path inserts
Data as raw inner XML (innerxml), so special characters are not escaped automatically and can
break the outgoing XML.

server/service/microsoft_mdm.go[2617-2623]
server/service/microsoft_mdm.go[2783-2795]
server/service/microsoft_mdm.go[3295-3335]
server/fleet/microsoft_mdm.go[1192-1203]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`CustomErrorText` is now built dynamically from software display names, but SyncML command building stores `Data` using an `innerxml` field. This requires manual XML escaping; otherwise names containing `&`, `<`, `>` (or similar) can produce malformed SyncML and lead to command rejection on the Windows client.
## Issue Context
- The new soft-block path uses `ESPSoftwareFailureContinuableErrorText(failedSoftwareNames)` (dynamic, team-configurable names).
- `newSyncMLCmdText` ultimately stores the value into `RawXmlData.Content` which is serialized with `xml:",innerxml"`.
## Fix Focus Areas
- server/service/microsoft_mdm.go[3457-3463]
- server/service/microsoft_mdm.go[3295-3335]
- server/fleet/microsoft_mdm.go[1192-1203]
## Suggested fix
- Update `newSyncMLCmdText` (and ideally other non-XML formats) to XML-escape `cmdDataValue` before storing it in `RawXmlData.Content` (e.g., via `html.EscapeString`).
- Add/adjust a unit test that marshals a SyncML command whose text contains `&`/`<` and asserts the generated XML is well-formed and contains escaped entities (e.g., `AT&amp;T`).




Remediation recommended

2. ESP uses raw names ✓ Resolved 🐞 Bug ≡ Correctness
Description
handleESPRelease builds the soft-block CustomErrorText from SetupExperienceStatusResult.Name while
calling ListSetupExperienceResultsByHostUUID with teamID=0, so team-specific display-name enrichment
is skipped. This can surface internal/unexpected names on the ESP failure UI instead of the
configured software display names.
Code

server/service/microsoft_mdm.go[R2514-2516]

+				if r.Name != "" {
+					failedSoftwareNames = append(failedSoftwareNames, r.Name)
+				}
Evidence
The service explicitly passes teamID=0 to the datastore list call, then collects r.Name into
failedSoftwareNames and uses it to render CustomErrorText for the soft-block UI. The datastore
implementation uses the teamID parameter to populate DisplayName for UI display; therefore, the
soft-block message is currently bypassing the intended display-name enrichment.

server/service/microsoft_mdm.go[2466-2474]
server/service/microsoft_mdm.go[2511-2516]
server/service/microsoft_mdm.go[2576-2579]
server/datastore/mysql/setup_experience.go[647-727]
server/fleet/setup_experience.go[37-70]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new Windows ESP soft-block message lists failed software using `r.Name`, but Stage 3 calls `ListSetupExperienceResultsByHostUUID(..., teamID=0)`, which prevents team-based display-name enrichment. As a result, end users may see internal/raw names rather than the intended display names.
### Issue Context
`ListSetupExperienceResultsByHostUUID` enriches results with `DisplayName` (and icons) based on the `teamID` argument. The ESP soft-block is user-facing, so it should prefer `DisplayName` when available and pass the host’s real team ID into the datastore call (falling back to `0` for “no team”).
### Fix Focus Areas
- server/service/microsoft_mdm.go[2466-2474]
- server/service/microsoft_mdm.go[2511-2516]
- server/service/microsoft_mdm.go[2576-2579]
### Implementation notes
- Compute `teamIDForQuery` for the host (similar to the existing logic used when results are empty).
- Call `ListSetupExperienceResultsByHostUUID(ctx, seHostUUID, teamIDForQuery)` (instead of `0`).
- When collecting failed names, prefer `r.DisplayName` when non-empty, else fall back to `r.Name`.
- Update/extend tests if needed (e.g., add a test where `DisplayName` differs from `Name` to ensure the ESP message uses the display name).
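
The display-name fallback described in the implementation notes might be sketched as follows. The `result` struct and its `Failed` field are hypothetical stand-ins for `SetupExperienceStatusResult`, whose real fields are not shown in this thread.

```go
package main

import "fmt"

// result is a hypothetical stand-in for SetupExperienceStatusResult.
type result struct {
	Name        string
	DisplayName string
	Failed      bool
}

// failedNames collects user-facing names of failed items, preferring
// DisplayName and falling back to Name when no display name is set.
func failedNames(results []result) []string {
	var names []string
	for _, r := range results {
		if !r.Failed {
			continue
		}
		name := r.DisplayName
		if name == "" {
			name = r.Name
		}
		if name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	results := []result{
		{Name: "zoom.msi", DisplayName: "Zoom", Failed: true},
		{Name: "slack.msi", Failed: true},
		{Name: "chrome.msi", DisplayName: "Chrome", Failed: false},
	}
	fmt.Println(failedNames(results))
}
```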


@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b99b3e9d-fd35-4065-948d-0e2ad34b1369

📥 Commits

Reviewing files that changed from the base of the PR and between e9ba8ed and a97ce81.

📒 Files selected for processing (5)
  • server/mdm/microsoft/esp_csp.go
  • server/mdm/microsoft/esp_csp_test.go
  • server/service/microsoft_mdm.go
  • server/service/microsoft_mdm_property_test.go
  • server/service/microsoft_mdm_test.go

Walkthrough

This pull request implements soft-block finalization for Windows setup experience (ESP) when software installations fail without requiring all software to succeed. When a setup-experience item fails and require_all_software_windows is disabled, the system now shows a user-facing error listing the specific failed software names, with "Reset PC and Continue Anyway" recovery buttons instead of a hard reset-only block. The change introduces message-formatting utilities for dynamic error text, refactors the ESP command-selection logic to distinguish hard block, soft block, and release paths, and adds comprehensive test coverage for the new soft-block decision and its UI text behavior.
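
The three paths named in the walkthrough can be pictured with a small decision function. This is a simplified sketch under stated assumptions: the `espAction` type and `decideESPAction` are hypothetical, timeout handling is deliberately left out, and the real logic in `handleESPRelease` may weigh more inputs.

```go
package main

import "fmt"

// espAction is a hypothetical label for the three finalization paths.
type espAction string

const (
	espRelease   espAction = "release"
	espSoftBlock espAction = "soft block (Reset PC + Continue Anyway)"
	espHardBlock espAction = "hard block (Reset PC only)"
)

// decideESPAction picks a path from two inputs only. requireAll mirrors the
// require_all_software_windows setting; timeouts are not modeled here.
func decideESPAction(requireAll, hasSoftwareFailure bool) espAction {
	switch {
	case hasSoftwareFailure && requireAll:
		return espHardBlock
	case hasSoftwareFailure:
		return espSoftBlock
	default:
		return espRelease
	}
}

func main() {
	fmt.Println(decideESPAction(true, true))
	fmt.Println(decideESPAction(false, true))
	fmt.Println(decideESPAction(false, false))
}
```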

Possibly related PRs

  • fleetdm/fleet#44740: Modifies server/service/microsoft_mdm.go ESP finalization logic in handleESPRelease, particularly timeout path behavior around canceling and presenting setup-experience outcomes, so command and test expectations may overlap.
🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is largely incomplete, containing only the unchecked template with no substantive explanation of changes, rationale, or specific implementation details beyond the standard checklist. | Add a detailed description explaining the key changes: soft-block vs hard-block behavior, dynamic failed software display, the new continuable error text function, and how the changes fulfill the linked issue requirements. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Out of Scope Changes check | ❓ Inconclusive | The PR description lacks detail about implementation scope, making it unclear whether changes to error message wording and button behavior represent intended evolution or scope creep. | Clarify in the description whether modifications to error message text and button labels align with the original issue scope or represent additional improvements discovered during implementation. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: displaying Windows ESP errors when software installation fails, which is directly related to the changeset's focus on soft-block behavior and failure messaging. |
| Linked Issues check | ✅ Passed | The code changes implement the core requirements from issue #45948: soft-block behavior with dynamic failed software name display, distinct error messages for soft vs hard block paths, and dual user experience paths based on the require_all_software setting. |


@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/service/microsoft_mdm.go (1)

2342-2344: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve software-failure detection when the 3-hour timeout has already elapsed.

Line 2342 short-circuits the Stage 3 scan whenever timedOut is true, so hasSoftwareFailure stays false even if some setup-experience rows are already in Failure. That makes the require_all_software_windows=false path fall through to silent release instead of surfacing the continuable ESP error, which contradicts the feature goal of always showing an ESP error when software install failures occurred. Re-list or inspect terminal Stage 3 results before deciding shouldWarn/release on timeout so timeout+failure still lands on the failure UI.

Also applies to: 2554-2561
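
The ordering the comment asks for, scanning for failures before branching on the timeout, might look like the sketch below. Everything here is illustrative: the `row` type, the `"failure"` status string, and the returned labels are hypothetical placeholders, and in-flight handling is omitted.

```go
package main

import "fmt"

// row is a hypothetical stand-in for a Stage 3 setup-experience result.
type row struct {
	Status string
}

// anyFailed reports whether at least one row is in a failure state.
func anyFailed(rows []row) bool {
	for _, r := range rows {
		if r.Status == "failure" {
			return true
		}
	}
	return false
}

// nextStep evaluates failures first, so a timeout can no longer hide them.
// The returned labels are placeholders, not names from the Fleet codebase.
func nextStep(timedOut bool, rows []row) string {
	failed := anyFailed(rows) // runs even when timedOut is true
	switch {
	case failed:
		return "failure-ui"
	case timedOut:
		return "timeout-path"
	default:
		return "keep-waiting"
	}
}

func main() {
	rows := []row{{Status: "success"}, {Status: "failure"}}
	fmt.Println(nextStep(true, rows))
	fmt.Println(nextStep(true, []row{{Status: "success"}}))
}
```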

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/service/microsoft_mdm.go` around lines 2342 - 2344, The timeout
short-circuits the Stage 3 scan (variable timedOut computed from
device.AwaitingConfigurationAt) and returns before evaluating
hasSoftwareFailure, so setups with terminal ESP rows in Failure are ignored;
change the logic so that when timedOut is true you still inspect/re-list the
Stage 3 setup-experience rows (or otherwise evaluate terminal Stage 3 results)
to set hasSoftwareFailure before deciding shouldWarn/release for the
require_all_software_windows=false path—ensure the check that computes
hasSoftwareFailure runs even when timedOut is true (same fix applied to the
similar block around the code handling lines 2554-2561).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/service/microsoft_mdm_test.go`:
- Around line 1920-1922: Update the test expectation so the continuable error
message exactly matches the acceptance criteria: modify the expected string used
in the assertion (the literal compared to errCmd.Items[0].Data.Content or the
variable CustomErrorText used in this test) to include the missing sentence "If
unavailable, contact your IT admin." and restore the exact reset phrasing
required by the spec; ensure the full multi-sentence message matches the product
text character-for-character so the assertion against
errCmd.Items[0].Data.Content passes.

In `@server/service/microsoft_mdm.go`:
- Around line 2566-2578: The switch currently uses
microsoft_mdm.ESPTimeoutErrorText / ESPSoftwareFailureErrorText and
ESPSoftwareFailureContinuableErrorText but the actual user copy is wrong; update
the hard-block branch (shouldBlock) to pass the exact required hard-block string
(or update ESPTimeoutErrorText/ESPSoftwareFailureErrorText constants) and update
the soft-block branch (shouldWarn) so ESPSoftwareFailureContinuableErrorText (or
its constant) includes the final sentence "If unavailable, contact your IT
admin." Ensure you only change the text constants or the strings passed to
buildESPBlockCommands(provID, ..., espBlockButtonsReset) and
buildESPBlockCommands(provID, ..., espBlockButtonsResetAndContinue) so the
displayed copy matches the spec.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c83690ce-e907-490d-9504-1c65be52f9bb

📥 Commits

Reviewing files that changed from the base of the PR and between e9ba8ed and d273506.

📒 Files selected for processing (5)
  • server/mdm/microsoft/esp_csp.go
  • server/mdm/microsoft/esp_csp_test.go
  • server/service/microsoft_mdm.go
  • server/service/microsoft_mdm_property_test.go
  • server/service/microsoft_mdm_test.go

Comment thread server/service/microsoft_mdm_test.go
Comment thread server/service/microsoft_mdm.go Outdated

Copilot AI left a comment

Contributor


Pull request overview

This PR updates the Windows Enrollment Status Page (ESP) finalization logic so that when setup-experience software installs fail and require_all_software_windows is false, Windows still shows the ESP failure UI with a “Continue anyway” option and a dynamic list of failed software names.

Changes:

  • Add a “soft block” ESP path (failure UI + Continue anyway) when software fails but require_all_software_windows=false.
  • Introduce helper formatting for dynamic failed-software lists with truncation.
  • Expand unit + property-based tests to cover warn/soft-block behavior and invariants.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| server/service/microsoft_mdm.go | Adds soft-block decision path and configurable ESP recovery buttons (Reset vs Reset+Continue). |
| server/service/microsoft_mdm_test.go | Adds example-based tests for soft-block behavior, in-flight waiting, and timeout release behavior. |
| server/service/microsoft_mdm_property_test.go | Extends PBT model to include warn/soft-block decision + command-shape expectations. |
| server/mdm/microsoft/esp_csp.go | Adds dynamic error-text builder for continuable software failures (name list + truncation). |
| server/mdm/microsoft/esp_csp_test.go | Adds tests for name-joining and truncation behavior for the new error text helper. |


Comment thread server/service/microsoft_mdm.go Outdated
Comment thread server/mdm/microsoft/esp_csp.go Outdated
Comment thread server/service/microsoft_mdm_test.go
Comment thread server/mdm/microsoft/esp_csp_test.go Outdated
@codecov

codecov Bot commented Jun 12, 2026


Codecov Report

❌ Patch coverage is 83.33333% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.20%. Comparing base (60cd943) to head (ba1c686).
⚠️ Report is 40 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| server/service/microsoft_mdm.go | 77.08% | 6 Missing and 5 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #47522      +/-   ##
==========================================
+ Coverage   67.19%   67.20%   +0.01%     
==========================================
  Files        3489     3492       +3     
  Lines      228536   228774     +238     
  Branches    11909    11909              
==========================================
+ Hits       153557   153745     +188     
- Misses      61152    61187      +35     
- Partials    13827    13842      +15     
| Flag | Coverage Δ | |
| --- | --- | --- |
| backend | 68.84% <83.33%> (+0.01%) | ⬆️ |


@getvictor

Member Author

@coderabbitai full review

@getvictor

Member Author

/agentic_review

@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Contributor
✅ Action performed

Full review finished.

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 14, 2026


Code review by qodo was updated up to the latest commit a97ce81

Comment thread server/service/microsoft_mdm.go

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread server/service/microsoft_mdm.go Outdated

Development

Successfully merging this pull request may close these issues.

Windows setup experience: Cancel setup but still let end users through

2 participants