Summary
After the host-ref canvas-leak fix (#5143), the JavaScript screenshot suite reaches 94/94 matched in ~75% of CI runs, but the remaining ~25% of runs hard-stall at ButtonThemeScreenshotTest (the first DualAppearanceBaseTest theme test, ~suite index 85) and time out (exit 5, SUITE:FINISHED never emitted).
Symptom
The worker goes completely silent immediately after CN1SS:INFO:suite starting test=ButtonThemeScreenshotTest — no further console output, not even other scheduler tasks. Because the cooperative scheduler is single-threaded, total silence means a synchronous loop with no yield (a parked host-call would let other tasks keep logging). The in-worker per-test watchdog (awaitTestCompletion) therefore can't fire, so the whole run times out regardless of CN1SS_ALLOWED_MISSING.
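The silence argument can be illustrated with a toy model. The sketch below is not the real scheduler: it assumes tasks can be modelled as generators whose `yield` is the only hand-off point, and every name in it (`runScheduler`, `heartbeat`, `parkedHostCall`, `noYieldParse`) is hypothetical. The no-yield loop is bounded so the example terminates; the real wedge presumably never exits.

```javascript
// Toy cooperative scheduler: tasks are generators and `yield` is the only
// point where another task can run. All names here are illustrative only.
function runScheduler(taskFactories, maxSteps = 100) {
  const log = [];
  const queue = taskFactories.map((make) => make(log));
  for (let step = 0; step < maxSteps && queue.length > 0; step++) {
    const task = queue.shift();
    if (!task.next().done) queue.push(task); // unfinished tasks go to the back
  }
  return log;
}

// Stand-in for "other scheduler tasks" that would normally keep logging.
function* heartbeat(log) {
  for (let i = 0; i < 3; i++) {
    log.push("heartbeat " + i);
    yield;
  }
}

// A parked host call: it waits by yielding, so other tasks still run.
function* parkedHostCall(log) {
  log.push("host-call start");
  yield; // still waiting for the host reply
  yield;
  log.push("host-call done");
}

// A synchronous loop with no yield: nothing else can log until it exits.
function* noYieldParse(log) {
  log.push("parse start");
  for (let i = 0; i < 100000; i++) {
    // bounded stand-in for the suspected parse loop
  }
  log.push("parse end");
}

// Log entries strictly between two markers.
function entriesBetween(log, start, end) {
  return log.slice(log.indexOf(start) + 1, log.indexOf(end));
}

const parkedLog = runScheduler([parkedHostCall, heartbeat]);
const wedgedLog = runScheduler([noYieldParse, heartbeat]);

// Heartbeats interleave with the parked call but not with the no-yield loop.
console.log(entriesBetween(parkedLog, "host-call start", "host-call done"));
console.log(entriesBetween(wedgedLog, "parse start", "parse end"));
```

In this model the parked call shows heartbeat entries between its start and end markers, while the no-yield loop shows none. That matches the total silence seen in the stalled CI logs.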
Root cause (diagnosis)
The stall is inside DualAppearanceBaseTest.installModernThemeIfRequested() → Resources.open(in) parsing the modern .res, where in comes from the JS port's getBundledAssetAsDataURL asset-read bridge. Deep in the suite the worker↔host bridge intermittently returns a degraded receiver — the chartDocStaleness family where getDocument() / createElement returns null (or a coerced Number), re-wrapping the Document and wiping its cached __class. A degraded asset stream then wedges Resources.open in a parse loop.
This is the same degradation behind the historical chart-tail issues (partially addressed in 08b1248, 5dce6a2) and is orthogonal to the canvas leak fixed in #5143.
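One way to confirm or rule out the degraded-asset hypothesis is to validate the bridge result before it reaches the parser, so a bad value fails fast instead of wedging. The sketch below is a diagnostic aid, not the port's actual code. It assumes the asset bridge returns a base64 `data:` URL string, which the name getBundledAssetAsDataURL suggests but this issue does not confirm, and `decodeAssetDataUrl` is a hypothetical helper.

```javascript
// Hypothetical fail-fast check for an asset returned as a base64 data URL.
// Assumption: a healthy bridge result looks like "data:<type>;base64,<payload>".
function decodeAssetDataUrl(value) {
  if (typeof value !== "string") {
    // Covers the degraded cases named above: null or a coerced Number.
    const kind = value === null ? "null" : typeof value;
    throw new TypeError("asset bridge returned " + kind + " instead of a data URL");
  }
  const match = /^data:([^,]*?)(;base64)?,(.*)$/s.exec(value);
  if (!match) throw new Error("asset bridge result is not a data URL");
  if (!match[2]) throw new Error("expected a base64-encoded payload");
  const binary = atob(match[3]); // throws on malformed base64
  if (binary.length === 0) throw new Error("asset payload is empty");
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Usage with a tiny three-byte payload (bytes 0, 1, 2).
const sample = decodeAssetDataUrl("data:application/octet-stream;base64,AAEC");
console.log(sample.length);
```

If a stalled run throws here instead of going silent, that would support the diagnosis. If the value passes validation and the run still wedges, the parse loop has a different cause.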
Attempts that BACKFIRED (don't repeat)
- Pre-warming the modern .res at suite start (read+cache once, before pressure): made it worse. The suite-start Resources.open did host-bridge round-trips that wiped the Document class early, turning the one late ButtonTheme wedge into early animation_grid_failed=NullPointerException (hung at index 11/21/25). Reverted.
- Trimming the per-test settle (1500→700ms): perturbed animation-grid capture timing → same early grid NPE. Reverted.
Real lever
Fix the worker↔host Document/canvas wrapper degradation itself (in browser_bridge.js / parparvm_runtime.js — wrapJsObject class-wipe on null re-resolve, and/or the host bridge returning a Number/degenerate for getDocument/getContext under load). This has resisted multiple prior partial fixes and needs a dedicated investigation.
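As a starting point for that investigation, the sketch below shows one possible guard shape: keep the last good wrapper when a re-resolve comes back degraded, and record the event for diagnosis. It is not how wrapJsObject works today. `isDegradedReceiver`, `createWrapperCache`, and the `classTag` field (standing in for the cached `__class`) are all hypothetical names.

```javascript
// Hypothetical check for the degraded values described in this issue:
// null/undefined, or a primitive such as a coerced Number.
function isDegradedReceiver(value) {
  if (value === null || value === undefined) return true;
  return typeof value !== "object" && typeof value !== "function";
}

// Hypothetical wrapper cache. A degraded re-resolve returns the cached
// wrapper unchanged instead of re-wrapping and losing the class tag.
function createWrapperCache() {
  const wrappers = new Map();
  const degradedEvents = []; // kept for diagnosis, e.g. to log from CI
  return {
    degradedEvents,
    resolve(key, fetchFromHost, classTag) {
      const raw = fetchFromHost();
      const cached = wrappers.get(key);
      if (isDegradedReceiver(raw)) {
        degradedEvents.push({ key, kind: raw === null ? "null" : typeof raw });
        if (cached) return cached; // preserve the wrapper and its class tag
        throw new Error("degraded receiver for " + key + " with no cached wrapper");
      }
      if (cached && cached.target === raw) return cached;
      const wrapper = { target: raw, classTag };
      wrappers.set(key, wrapper);
      return wrapper;
    },
  };
}

// Usage: one healthy resolve followed by two degraded ones.
const cache = createWrapperCache();
const fakeDocument = { nodeName: "#document" };
const first = cache.resolve("document", () => fakeDocument, "Document");
const afterNull = cache.resolve("document", () => null, "Document");
const afterNumber = cache.resolve("document", () => 42, "Document");
console.log(first === afterNull, first === afterNumber, cache.degradedEvents.length);
```

Returning a stale wrapper only masks the degradation, so the recorded events matter more than the fallback. They would show how often and at which suite index the host bridge starts returning degenerate values.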
Follow-up to #5143.