feat(doctor): probe agent auth, install source, and version freshness#762
feat(doctor): probe agent auth, install source, and version freshness#762matt2e wants to merge 16 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9a76a94816
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| let (auth_status, auth_output) = match info.auth_status_command { | ||
| Some(cmd) => match execute_command_blocking(cmd) { |
There was a problem hiding this comment.
Require the main CLI before running auth probes
For agents with a separate main_command (Claude, Codex, Amp), this branch is reached whenever the ACP bridge is found, even if the main CLI is missing. In that bridge-only state, the auth probe shells out to commands such as codex login status; a command-not-found error is then reported as “Installed, not authenticated” with an auth fix (codex login) instead of telling the user to install the missing main CLI, so the offered fix cannot work. Gate auth probing on main_path being present or return an install-main warning for this scenario.
Useful? React with 👍 / 👎.
| if let (Some(readout), Some(path)) = (&check.main, check.path.as_deref()) { | ||
| let (package_id, latest_source) = | ||
| resolve_package(&check.id, readout.install_source.as_ref()); |
There was a problem hiding this comment.
Key freshness lookups by binary slot
When a split agent's main CLI and bridge have the same install source, this lookup cannot tell which binary is being probed. For example, a npm-installed Codex CLI under main is resolved with the ai-agent-codex + Npm package entry, which package_ids.rs defines for @zed-industries/codex-acp, so the main readout reads the wrong package id/latest version (and package.json name checks may fail entirely). Include the slot/command in the package mapping instead of using only check id and install source.
Useful? React with 👍 / 👎.
Phase 1 of the doctor reunification plan — purely additive schema changes so downstream consumers can begin depending on the new fields ahead of Phase 2's behavior changes. - Add AuthStatus and InstallSource enums. - Add Auth variant to FixType (no-op match arm in lookup_fix_command). - Extend DoctorCheck with auth_status, installed_version, latest_version, update_available, install_source — all Option-defaulting-None. - Extend ResolvedBinary with install_source (Option, default None). - Extend AgentCheckInfo with auth_command and auth_status_command, initialized to None for every existing AI_AGENT_CHECKS entry. The staged Tauri app compiles unchanged because every new field is Option<...> defaulting to None — no behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 2 of the doctor reunification plan — data-only widening of the AI agent table. Switch the Claude bridge to its canonical npm package name `@agentclientprotocol/claude-agent-acp` (the `@zed-industries/...` package is deprecated/renamed, not a competing fork). Add ai-agent-copilot and ai-agent-cursor entries, and populate auth_command / auth_status_command for Claude, Codex, Amp, Copilot, and Cursor. The runtime does not yet read these — Phase 3 wires them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 2 follow-up — switch Claude's auth_command from `claude /login` to `claude auth login` so the string is byte-equivalent to goose-internal's agent_setup.rs, keeping Phase 6 reunification a true no-op rather than a behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 3a of the doctor reunification plan — runtime auth probing. For auth-aware agents, bridge presence is now necessary-but-not-sufficient for Pass: a failed auth probe downgrades the check to Warn with AuthStatus::NotAuthenticated, and surfaces the agent's auth_command via fix_type/fix_command so the UI can offer a one-click fix. Agents with an auth_command but no auth_status_command (copilot) report NotApplicable; agents with neither (goose, pi) keep auth_status = None. execute_fix(_, FixType::Auth) is now wired through lookup_fix_command, which returns the matching agent's auth_command for the new arm. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 3b of the doctor reunification plan — closes a binary-presence resolution gap so doctor sees what goose-internal's ACP inventory sees. Adds a third resolve strategy after login-shell `which` and the hardcoded /opt/homebrew, /usr/local, etc. list: ~/.npm-global/bin, ~/.npm/bin, all ~/.nvm/versions/node/*/bin entries, the macOS Homebrew nvm symlink at /opt/homebrew/opt/nvm/versions/node/*/bin, and finally `npm prefix -g`/bin as a one-subprocess last resort. Mirrors the dirs in goose-internal's services::path_env (and upstream goose's SearchPaths::with_npm) so npm-only bridge installs no longer surface as false "agent missing" warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 3c of the doctor reunification plan — populate `InstallSource` on resolved binaries. Path-prefix heuristics only (Brew, Cargo, Mise, Asdf, Npm, System) — no subprocess probes for v1; ambiguous targets like ~/.local/bin curl-pipe installs report Unknown rather than risk a false positive. `detect_install_source` runs on every successful `resolve_binary` hit and the value flows through `ResolvedBinary` into each `DoctorCheck` site (git, gh, git-lfs, clonefile, AI agents — bridge preferred over main). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 3d of the doctor reunification plan — installed/latest version detection gated behind RunChecksOptions::check_freshness. Default run_checks() is unchanged: no version probes, no network, no new subprocess fan-out — the staged Tauri app sees the same latency profile as before this commit. Latest-version sources covered: brew (brew info --json=v2), npm (npm view <pkg> version), and crates.io (HTTP GET via reqwest). Results are cached under <cache_dir>/doctor/freshness.json with a 1-hour TTL; read once at the start of run_checks_with_options, persisted atomically at the end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase 5 of the doctor reunification plan.
execute_fix is now a |_| {} wrapper over execute_fix_streaming, which
invokes a callback once per stdout/stderr line so downstream Tauri apps
can emit per-line install/auth progress (goose-internal's existing
agent-setup:output event pattern). Line ordering across stdout/stderr
is best-effort — within each stream lines arrive in order, but
interleaving depends on scheduling.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
…ommands Restore BOT-686 / goose-internal PR #358 parity that the better-doctor migration dropped: direct npmjs.org access is blocked by Cloudflare WARP, so npm install/view must be routed through Block's internal Artifactory proxy via --registry=<url>. This wires an optional, caller-supplied registry through the crate without baking in any Block-specific URL. - Add `npm_registry: Option<String>` to RunChecksOptions (default None). - Add `apply_npm_registry(command, registry)` helper that appends ` --registry=<url>` only to npm-backed commands (npm install / npm view); curl-pipe installers, auth commands, and the git-clonefile fix pass through unchanged. - Apply the registry to the displayed fix_command: threaded through run_checks_with_options -> collect_base_report -> check_single_ai_agent so the shown command matches what runs. - Apply the registry to the freshness probe: latest_npm now adds --registry <url>, threaded via populate_freshness -> fetch_version_info. - Apply the registry to the execute path: new execute_fix_with_options / execute_fix_streaming_with_options resolve the command, run it through apply_npm_registry, then shell out. The existing execute_fix / execute_fix_streaming delegate with None. Purely additive: every new field/param is Option-defaulting-None, so RunChecksOptions::default() reproduces today's exact commands and the staged Tauri app compiles and behaves unchanged. The caller (goose-internal) supplies the Block Artifactory URL. Not in this commit (follow-ups): per-OS/platform install recipes (amp's CLI installer) and per-package name overrides. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Replace the split closure `|c: char| c == '-' || c == '+' || c == ' '` with the char-array pattern `['-', '+', ' ']`. Same delimiters, same behavior — clears the new clippy::manual_pattern_char_comparison lint (Rust 1.93.0) that was failing crates-lint in the pre-push hook. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase A of the per-agent install-method plan — fix install-source detection for curl/native installs (Claude native, Cursor, Amp) that land in ~/.local/bin (or ~/bin) and currently collapse to Unknown. Path-only heuristics can't classify these, so add two layers on top of the existing path-prefix detector, both keeping Unknown as the honest fallback: - Filesystem fingerprinting (resolve.rs): when path-prefix detection yields Unknown for a binary in a user-local bin dir, fingerprint_curl_pipe recognises a curl/native install via two low-false-positive signals — a known installer footprint marker under $HOME (CURL_INSTALLER_FOOTPRINTS for claude/cursor-agent/amp), or a symlink into a versioned install dir under $HOME (the Cursor/Claude native layout). Only read_link/exists/ canonicalize — no subprocess or network. detect_install_source now upgrades Unknown to CurlPipe on a match; the pure path-only inner (detect_install_source_with_home) is unchanged. - Per-tool override hook (agents.rs): AgentCheckInfo gains an optional install_source_override, applied only when the resolved binary's source is Unknown (apply_install_source_override). Claude, Amp, and Cursor declare CurlPipe; a positively-detected source (Brew/Npm) and a missing binary (None) are left untouched, so registry installs are never mislabelled. Purely additive: the new field is Option-defaulting-None and latest-version lookup still returns None for CurlPipe (Phase B territory), so default run_checks() behaviour and the staged Tauri app are unchanged. Adds unit tests for both the fingerprint signals and the override precedence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase B of the per-agent install-method plan — give curl/native installs a "latest version" path and stop nagging on tools that update themselves. latest_version previously returned None for any source other than Brew/Npm/Cargo, so CurlPipe/Unknown/Mise/Asdf installs (the very tools Phase A just learned to fingerprint) had no freshness signal at all. This adds: - GitHub Releases fetcher: latest_github_releases(owner/repo) mirrors the existing latest_crates_io reqwest pattern — GET releases/latest, parse tag_name (leading v stripped). Degrades gracefully to None on missing token, rate limits, network errors, or unparseable payloads; an optional GITHUB_TOKEN/GH_TOKEN bearer credential relaxes the anonymous rate limit but is never required. JSON parsing is split into parse_github_release_tag for unit testing without a network call. - Explicit per-package latest source: PACKAGE_IDS entries now carry a LatestSource (Brew/Npm/CratesIo/GitHubReleases) alongside the install source, so the fetcher no longer infers the mechanism from InstallSource. This lets Cursor's CurlPipe install resolve to a GitHub-releases lookup (best-effort repo slug, report-only) without overloading CurlPipe to always mean GitHub. Claude native keeps a clear TODO: its channel manifest isn't parsed yet, so it stays report-only. - SelfUpdating marker: is_self_updating(source) treats CurlPipe installs (Claude native, Cursor, Amp-curl) as tool-managed. Per the resolved decision, these are report-only — installed/latest are surfaced for display but update_available is forced to None (never a stale-version nag), and the new DoctorCheck.self_updating flag carries "managed by tool". A brew/npm install of the same agent is not CurlPipe, so it stays user-managed and actionable. Additive and gated behind check_freshness: default run_checks() still does no network/subprocess version probes (self_updating defaults None), so the staged Tauri app's latency profile is unchanged. Adds unit tests for the GitHub tag parser, the self-updating predicate, and the Cursor→GitHubReleases wiring. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Phase C of the per-agent install-method plan — stop collapsing an agent's
own CLI and its ACP bridge into a single install_source/version. Per the
resolved decision, surface both independently.
- Add `AgentVersionInfo { install_source, installed_version, latest_version,
update_available, self_updating }` (Default, camelCase) and two
Option-defaulting-None fields on `DoctorCheck`: `main` (the agent CLI / the
single binary for agents with no separate bridge) and `bridge` (the ACP
bridge, when distinct). Non-agent checks and absent binaries leave the
corresponding readout `None`.
- `check_single_ai_agent` already resolves both binaries; it previously kept
only the bridge's source and discarded the main CLI's. It now builds both
readouts: when the agent has a separate `main_command`, the curl/native
install-source override applies to the main CLI while the bridge keeps its
positively-detected source; when there is no separate bridge, the single
resolved binary is reported under `main`.
- Freshness fans out to both readouts: `populate_freshness` builds per-readout
targets (Main/Bridge) keyed by each readout's own install source, so e.g.
Codex probes brew for its main CLI and npm for its bridge in the same pass.
`PACKAGE_IDS` already carried the multiple `(source, package_id, LatestSource)`
pairs this needs — no table redesign. Non-agent checks keep a Flat target.
- Backward compat: the flat version fields stay populated by mirroring the
headline readout (bridge if present, else main) — byte-identical to the
pre-split pass. Existing consumers (and the staged Tauri app) are unaffected.
Additive and gated behind `check_freshness`: default `run_checks()` still does
no network/subprocess version probes (readouts carry only install_source on the
cheap path), so the staged app's latency profile is unchanged. Adds unit tests
for the two-readout split, override-on-main-only, single-binary-as-main, the
unresolved case, and the no-freshness invariant on the readouts.
The TypeScript `DoctorCheck` mirror lives in goose-internal and is updated when
that repo re-pins the crate (Phases D/E) — out of scope here.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
Fix the freshness review finding that `populate_freshness` hardcoded `--version` as the installed-version probe for every target. npm- distributed ACP bridges are `node` scripts that don't honor it (`codex-acp --version` errors with a clap "unexpected argument"; `amp-acp --version` exits 0 but prints nothing), so installed_version stayed None while latest_version was still fetched — leaving the bridge readout with a latest and no installed counterpart, and update_available uncomputable. Make the installed-version mechanism data-driven off each readout's already-known install source: - freshness.rs: add `InstalledProbe` (Cli(args) vs NpmPackageJson) and a pure-filesystem `installed_version_from_package_json` that canonicalizes the binary (resolving npm's bin/ symlink), then walks up a bounded six levels to the owning `node_modules/<pkg>/package.json`. When the package id is known the file's `name` must match, so a nested dependency's package.json is never picked up; scoped packages resolve correctly. No subprocess, no network. `fetch_version_info` now takes an `InstalledProbe` instead of `&[&str]`, branching to the package.json reader for npm targets and the existing `--version` CLI probe otherwise. Add `select_installed_probe(install_source, package_id)` so the choice is unit-testable. - lib.rs: carry each readout's `install_source` into `FreshnessTarget` (from `readout.install_source` for Main/Bridge, `check.install_source` for Flat) and select the probe via `select_installed_probe`. Codex's bridge (npm) now reads package.json while its main CLI (brew) keeps `--version`, in the same pass — the Phase C per-readout split already gives each readout the right source. Purely under `check_freshness`; the default `run_checks()` path does no version probing, so its latency profile is unchanged, and no PACKAGE_IDS redesign is needed. Adds unit tests for the package.json reader (symlink resolution, mismatched-name walk-past, scoped packages, missing file) and the probe selector (npm vs brew/curl/cargo/none). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
…pawn failures; use resolved bin dir in auth PATH
Four tightly-coupled bugs that all surface in the Claude Code row when
goose-internal (or any embedder) is launched with a restricted PATH
— the macOS Finder/launchd case where the child shell only sees
/usr/bin:/bin:/usr/sbin:/sbin. Bundled because the intermediate states
would each leave the Claude row showing different misleading output.
1. Add main-vs-bridge role to PACKAGE_IDS lookup. The table was keyed
only by (check_id, InstallSource); Claude's main CLI under nvm
(tagged Npm) collided with the bridge entry and produced the wrong
latest_version (and None installed_version, since the package.json
name mismatched in freshness.rs). Each entry now carries a `Role`
(Main/Bridge/Any) and `lookup_package_id` takes the readout's role
as a third argument. `ReadoutSlot::Main`/`Bridge`/`Flat` in
populate_freshness map straight to `Role::Main`/`Bridge`/`Any`.
Registered `@anthropic-ai/claude-code` as Claude main on npm and
kept `@agentclientprotocol/claude-agent-acp` as Claude bridge.
Codex/amp have distinct install sources for main vs bridge so their
flat lookups were already unambiguous; they're tagged for clarity
but otherwise unchanged.
2. Prepend the main CLI's resolve trace to `raw_output` on the happy
path. `check_single_ai_agent` previously only included
`resolved_main.search_output` on the "bridge missing" and "neither
found" branches, so a successful Claude row showed a main `path`
with no trace explaining how it was found. Now happy-path
raw_output renders both traces with `# main CLI (claude):` and `#
ACP bridge (claude-agent-acp):` section headers, main first.
3. Distinguish "command not found / spawn failure" from "not
authenticated". Added `AuthStatus::Unknown` and a new
`execute_command_with_path_prefix` returning an `ExecOutcome`
(`Ok` / `Spawn(io::Error)` / `Exit { code, stderr }`). The auth
probe maps `Exit { code: Some(127), .. }` and `Spawn(_)` to
`Unknown` (PATH-shadowed agents shouldn't read as signed out); any
other non-zero exit stays `NotAuthenticated`. `fix_command` /
`fix_type` are no longer surfaced for `Unknown` — `claude auth
login` doesn't help if claude isn't on PATH. Raw output labels the
not-found case explicitly. Downstream TS mirrors should add a
`unknown` arm to their AuthStatus union and UI handling.
4. Build a PATH prefix from the resolved binary directories
(`resolved_main.path` parent + each `resolved_cmds[*].path`
parent, deduped) and thread it into the auth-probe exec. The
shared login-shell setup is now in `build_shell_command`, which
adds the prefix dirs ahead of a conservative fallback
`/usr/local/bin:/opt/homebrew/bin:/usr/bin:/bin:/usr/sbin:/sbin`
when supplied. Existing streaming callers pass an empty prefix and
see the same env layout as before.
Follow-up: the same PATH prefix isn't yet plumbed to `execute_fix` /
`execute_fix_streaming` for `FixType::Auth`, because the fix-execution
path doesn't currently re-resolve the binary or carry resolved-bin
info. `claude auth login` has the same PATH problem and will benefit
from this once the resolved bin dir is plumbed through `execute_fix`.
Tests cover: the new role-tagged package_id lookups (claude main and
bridge resolving to different packages despite sharing InstallSource),
`installed_version_from_package_json` finding
`@anthropic-ai/claude-code` for a typical npm layout, exit-127
mapping to `Unknown` semantics via `execute_command_with_path_prefix`,
a scripted command findable only via the supplied PATH prefix, and
the main/bridge resolve-trace ordering in `raw_output`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
Clicking *Update* on an AI agent today invokes the doctor crate with `FixType::Command`, which looks up the static `install_command` from `AI_AGENT_CHECKS`. For Claude that's the native curl-pipe installer — so if the user has Claude installed via npm under nvm, clicking Update silently side-loads `~/.local/bin/claude` with a fresh native install while the npm install (the one we meant to update) stays stale. Bridge has the same shape. Derive the update command from each readout's already-known `(install_source, package_id)` instead, and route execution through an explicit `command_override` so the per-readout update string isn't forced through the static install-time recipe table. - `FixType` gains `UpdateMain` / `UpdateBridge` variants for the per-readout update slots. Serde `rename_all` switches to camelCase so the new variants serialize as `updateMain` / `updateBridge`; the existing single-word variants are identical under both rules. - `AgentVersionInfo` gains `update_command` + `update_fix_type` (Option, default None). Both populated together by the freshness pass when an update is computable and actionable. - `derive_update_command(install_source, package_id)` in `agents.rs` maps Npm/Brew/Cargo to their canonical update recipes (`npm install -g <pkg>@latest`, `brew upgrade <pkg>`, `cargo install --force <pkg>`); CurlPipe/Mise/Asdf/Unknown/System and missing package ids return None. `apply_npm_registry` runs over the final command downstream, so the npm form picks up the Block Artifactory registry append automatically. - `apply_freshness` (in `lib.rs`) now takes the readout's slot and package id alongside the freshness result. When the readout is Main/Bridge, the update is actionable (`update_available == Some(true)` and not self-updating), and `derive_update_command` returns Some, the readout's `update_command` + `update_fix_type` are filled in. Flat (non-agent) readouts never get an update command — out of scope. - `execute_fix_with_options` / `execute_fix_streaming_with_options` gain a `command_override: Option<String>` parameter. When Some, the static `lookup_fix_command` is bypassed and the override is run verbatim (still threaded through `apply_npm_registry`). The Update variants have no static recipe — they can only dispatch through the override path — so `lookup_fix_command` returns None for them. The convenience wrappers (`execute_fix`, `execute_fix_streaming`) pass None and behave as before. Tests cover: `derive_update_command` per InstallSource variant, `lookup_fix_command` returning None for the Update variants, `apply_freshness` emitting npm/brew update commands on actionable Main/Bridge readouts, CurlPipe never producing an update command (belt-and-braces with the existing self-updating suppression), no update available leaving fields None, and the executor running an override for `UpdateMain` (where the static lookup has nothing) vs. erroring out without an override. The downstream goose-internal pin bump, TypeScript mirror update, and frontend rewire are a separate session. Signed-off-by: Matt Toohey <contact@matttoohey.com>
Substantially expands the
doctorcrate to give a richer, more accurate readout of AI agent CLIs and their install state.What's new
package.json.execute_fixnow streams subprocess output.DoctorCheckfields for auth/version/install-source, plus supportingpackage_ids,checks, andfreshnessmodules.Notes
manual_pattern_char_comparison) in freshness.