[WIP]sql: EXPLAIN ANALYZE with Routine with plan and count by ZhouXing19 · Pull Request #171362 · cockroachdb/cockroach

ZhouXing19 · 2026-06-02T16:25:53Z

No description provided.

Extract the body-building loop from `buildRoutine` into a standalone method `buildSQLRoutineBodyStmts`. No behavioral change. A follow-up commit will add a deferred-build path that skips this call and instead captures the ASTs for later building at execution time.

Release note: None

Add a `RoutineBodyBuilder` interface in `memo`, modeled after `PostQueryBuilder`, to defer building of SQL routine body statements to execution time. Add a `BodyBuilder` field to `UDFDefinition`.

Implement `sqlRoutineBodyBuilder` in `optbuilder/routine.go`, which captures metadata at plan time (parameter types, privilege context, statement tree snapshot, ASTs) and builds body RelExprs in a fresh Builder at execution time, following the `buildTriggerCascadeHelper` pattern used for FK cascades and AFTER triggers.

Add `GetInitFnForDeferredRoutine` to `statementTree`. Unlike `GetInitFnForPostQuery`, which excludes the current stack level, this captures ALL levels. The difference is that post-queries are children of the current-level mutation (e.g. a cascade triggered by a DELETE), while deferred routines are siblings. For example:

    UPDATE t SET x = my_udf();
    -- my_udf() body: INSERT INTO t VALUES (1)

Both the outer UPDATE and the UDF body mutate `t`. Without capturing the current level, the deferred body would see an empty statement tree and miss the conflict.

Add nil-Body guards across execbuilder, memo formatter, and norm factory so existing code tolerates a deferred-build `UDFDefinition`. Nothing uses these yet, so there is no behavioral change.

Release note: None
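The deferred-build shape described in the `RoutineBodyBuilder` commit can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's code: the `BuildBody` method name and signature, the `RelExpr` string stand-in, the `sqlBodyBuilder` type, and the `bodyFor` helper are all assumptions made for the example.

```go
package main

import "fmt"

// RelExpr is a hypothetical stand-in for an optimizer relational expression.
type RelExpr string

// RoutineBodyBuilder defers building routine body statements.
// The method name and signature are assumptions for illustration.
type RoutineBodyBuilder interface {
	BuildBody() ([]RelExpr, error)
}

// UDFDefinition holds either an eagerly built Body or a BodyBuilder.
type UDFDefinition struct {
	Name        string
	Body        []RelExpr
	BodyBuilder RoutineBodyBuilder
}

// sqlBodyBuilder captures statement text at plan time and builds it later.
type sqlBodyBuilder struct {
	stmts []string
	built int // number of times BuildBody has run
}

func (b *sqlBodyBuilder) BuildBody() ([]RelExpr, error) {
	b.built++
	out := make([]RelExpr, 0, len(b.stmts))
	for _, s := range b.stmts {
		out = append(out, RelExpr("plan("+s+")"))
	}
	return out, nil
}

// bodyFor tolerates a nil Body (the "nil-Body guard") and falls back to
// the deferred builder when no eager body is present.
func bodyFor(def *UDFDefinition) ([]RelExpr, error) {
	if def.Body != nil {
		return def.Body, nil
	}
	if def.BodyBuilder == nil {
		return nil, fmt.Errorf("routine %q has neither body nor builder", def.Name)
	}
	return def.BodyBuilder.BuildBody()
}

func main() {
	def := &UDFDefinition{
		Name:        "my_udf",
		BodyBuilder: &sqlBodyBuilder{stmts: []string{"INSERT INTO t VALUES (1)"}},
	}
	body, err := bodyFor(def)
	fmt.Println(body, err)
}
```

Nothing is built until `bodyFor` is called, which mirrors the plan-time capture and execution-time build split described above.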

With deferred UDF body optimization, body statements are not built into RelExprs at plan time. The execution layer needs to know whether a routine can mutate before execution to choose between LeafTxn and RootTxn (via PlanFlagContainsMutation).

Fix this by computing the CanMutate property at CREATE FUNCTION time from the optimizer's transitive Relational().CanMutate logical property (which covers direct DML, mutations in CTEs/subqueries, and nested mutating UDF calls) and persisting it on the function descriptor. At query time, the persisted value is read through the Overload and UDFDefinition and used to set PlanFlagContainsMutation without needing to build the body.

The descriptor field uses a three-way enum (UNKNOWN_CAN_MUTATE, CAN_MUTATE, CANNOT_MUTATE) rather than a bool. The zero value UNKNOWN_CAN_MUTATE means "not yet determined" and causes consumers to fall back to inspecting the eagerly-built body RelExprs. This handles pre-existing function descriptors created before this field was introduced without requiring a migration: they naturally have the zero value, which triggers the correct fallback behavior. Functions created or replaced after the version gate is active get CAN_MUTATE or CANNOT_MUTATE, allowing consumers to skip the body inspection.

For anonymous routines (DO blocks and trigger functions), CanMutate is derived directly from the body expression at build time, since these have no descriptor.

The version gate on writing CanMutate is needed for rollback safety: if the field were written before finalization and the cluster rolled back, old binaries would not reset it during CREATE OR REPLACE, leaving stale values that could cause correctness issues after re-upgrade.

Release note: None
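The three-way enum and its zero-value fallback can be illustrated with a small sketch. The Go identifiers below (`CanMutate`, `routineMutates`, the `inspectBody` callback) are hypothetical names chosen for the example; only the three states and the fallback rule come from the commit message.

```go
package main

import "fmt"

// CanMutate mirrors the three-way descriptor enum described above.
// The Go names are illustrative assumptions.
type CanMutate int

const (
	UnknownCanMutate CanMutate = iota // zero value: not yet determined
	CanMutateYes
	CanMutateNo
)

// routineMutates uses the persisted value when it is known and otherwise
// falls back to inspecting the eagerly built body via inspectBody.
func routineMutates(persisted CanMutate, inspectBody func() bool) bool {
	switch persisted {
	case CanMutateYes:
		return true
	case CanMutateNo:
		return false
	default:
		// Descriptors written before the field existed carry the zero value.
		return inspectBody()
	}
}

func main() {
	var old CanMutate // pre-existing descriptor, never written
	fmt.Println(routineMutates(old, func() bool { return true }))
	fmt.Println(routineMutates(CanMutateNo, func() bool { return true }))
}
```

Using the zero value as "unknown" is what lets old descriptors work without a migration: a bool could not distinguish "cannot mutate" from "never computed".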

Enable deferred body building for SQL routines: body RelExprs are now built at execution time rather than plan time.

Two cases still require eager build:

- AnyTuple return type (RECORD without OUT params), because the actual return type must be inferred from the body.
- Inlineable UDFs (single-statement, non-volatile, non-set-returning), because expression indexes and partial index predicates depend on the inlined body at plan time. Without this, CREATE INDEX on an IMMUTABLE UDF expression would fail. This restriction is overly conservative for regular DML queries; cockroachdb#169459 tracks loosening it to only force eager build in contexts that actually require plan-time inlining.

EXPLAIN respects the deferred execution flow: rather than forcing eager build, a BuildDeferredBody callback on ExprFmtCtx builds deferred bodies during formatting, showing the full plan structure inline. For EXPLAIN (OPT, ENV), table refs from deferred body memos are unioned into the outer metadata so schemas and stats are collected.

A side effect of deferred build is that privilege checks now match PostgreSQL: EXECUTE on the function is checked before SELECT on tables referenced in the body (previously reversed because eager build resolved table refs first).

Release note (performance improvement): SQL routine (UDF/procedure) body statements are now built at execution time rather than plan time.
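The two eager-build exceptions listed in this commit amount to a small predicate. The sketch below is an assumption-laden illustration: the `routineInfo` struct, its field names, and the `inlineable` and `needsEagerBuild` helpers are invented for the example and are not identifiers from the PR.

```go
package main

import "fmt"

// routineInfo holds the properties the commit message names as relevant.
// The struct and its field names are hypothetical.
type routineInfo struct {
	ReturnsAnyTuple bool // RECORD without OUT params
	NumStmts        int
	Volatile        bool
	SetReturning    bool
}

// inlineable follows the description: single-statement, non-volatile,
// non-set-returning.
func inlineable(r routineInfo) bool {
	return r.NumStmts == 1 && !r.Volatile && !r.SetReturning
}

// needsEagerBuild reports whether the body must still be built at plan time.
func needsEagerBuild(r routineInfo) bool {
	return r.ReturnsAnyTuple || inlineable(r)
}

func main() {
	volatileUDF := routineInfo{NumStmts: 1, Volatile: true}
	immutableUDF := routineInfo{NumStmts: 1}
	fmt.Println(needsEagerBuild(volatileUDF), needsEagerBuild(immutableUDF))
}
```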

…bodies

When SQL routine body building is deferred to execution time, the plan-time memo lacks body RelExprs and table references. This causes EXPLAIN ANALYZE (DEBUG) bundles to miss optimizer detail and table stats/schema for tables referenced inside deferred routines.

This commit propagates execution-time metadata back to the bundle collector:

- Add DeferredRoutineOptPlans and DeferredRoutineTableRefs fields to eval.Context, initialized when bundle collection is active.
- After deferred body building in buildRoutinePlanGenerator, capture the formatted optimizer plan (opt-vv level with redaction markers) and all table references from the execution-time memo.
- In the bundle collector, emit opt-vv-deferred-<func>.txt files and union deferred table refs with plan-time metadata for stats/schema collection.

Note: EXPLAIN ANALYZE already uses deferred build with no special handling needed. The conn_executor intercepts the ExplainAnalyze AST before the optbuilder runs, strips the EXPLAIN ANALYZE wrapper, and passes the inner statement through the normal build path where deferred build is active. Output is generated after execution by walking the explain.Plan tree (not the memo), so deferred bodies are transparent.

Release note: None
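The "union deferred table refs with plan-time metadata" step can be pictured as a de-duplicating merge. The sketch below is only an illustration under stated assumptions: `tableID` and `unionTableRefs` are hypothetical, and the sort is added here purely to make the example's output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// tableID is a hypothetical stand-in for a table reference.
type tableID int

// unionTableRefs merges plan-time and execution-time (deferred) table
// references, dropping duplicates.
func unionTableRefs(planTime, deferred []tableID) []tableID {
	seen := make(map[tableID]struct{})
	var out []tableID
	for _, refs := range [][]tableID{planTime, deferred} {
		for _, id := range refs {
			if _, ok := seen[id]; ok {
				continue
			}
			seen[id] = struct{}{}
			out = append(out, id)
		}
	}
	// Sorting is an assumption of this sketch, for stable output.
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	fmt.Println(unionTableRefs([]tableID{3, 1}, []tableID{1, 2}))
}
```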

With deferred routine body building, volatile UDF bodies are not built at plan time, so test output previously showed `body (deferred)` with raw AST text instead of the full RelExpr plan. This created a test coverage gap for deferred routine body plans.

Set the `BuildDeferredBody` callback in `OptTester.FormatExpr` so that deferred bodies are built during formatting and tests show full plan structure inline. The callback builds the body into the outer memo's factory so column IDs are globally unique across the outer query and all UDF bodies. This follows the same pattern used for post-query (cascade/trigger) test formatting in `OptTester.PostQueries`, which also passes the outer factory to `Build()` for the same reason. Note that production code (both EXPLAIN and normal execution) correctly uses a fresh memo since the outer memo may be cached or shared.

Also move `checkExpectedRules` from `postProcess` into a new `FormatAndCheck` method that runs after formatting, so that rules fired during deferred body building (e.g. `NormalizeArrayFlattenToAgg`) are tracked in `appliedRules` before `expect=`/`expect-not=` are checked.

Test data changes fall into two categories:

1. Column ID renumbering: deferred UDF bodies previously showed body-memo column IDs starting from :1 (which could collide with outer query columns). Now that bodies are built into the outer memo, body column IDs continue from where the outer memo left off, producing globally unique IDs.
2. Outer query column renumbering: with eager build, body columns were allocated before some outer query columns, affecting the outer column numbering. With deferred build, the outer query columns are allocated first (body isn't built yet), so outer columns may get lower IDs than before.

Release note: None

The udf_mutations subtest in logprops/udf relied on the opt test catalog providing a correct CanMutate value on function overloads. With deferred UDF body building, the test catalog can no longer derive this from the body RelExprs, and the test catalog doesn't persist CanMutate (it bypasses the optbuilder's buildCreateFunction). Move the test to a logic test where the production pipeline (DSC with CanMutate on the descriptor) handles it correctly.

Epic: none
Release note: None
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Previously, EXPLAIN ANALYZE's "rows decoded from KV" line only reflected KV reads from the outer plan's trace spans. Reads from SQL routine body executions (UDFs, stored procedures) were silently dropped because inner plans run through runPlanInsidePlan with separate flow metadata that is never added to the outer plan's distSQLFlowInfos.

Fix this by tracking routine KV stats (rows and bytes read) separately on the planner. After each routine body statement executes, its stats are accumulated and added to the trace-derived queryLevelStats before rendering the EXPLAIN ANALYZE output.

Note that this fix only updates the top-level "rows decoded from KV" aggregate. Per-node stats in the expanded plan tree still reflect only the outer plan's reads; individual routine invocations are *not* yet surfaced in the plan output (to be covered in a follow-up PR). Additionally, `KVPairsRead` and `BatchRequestsIssued` for inner routines remain untracked because the `ProducerMetadata.Metrics` proto does not carry those fields.

Fixes: cockroachdb#170398

Release note (bug fix): Fixed a bug where EXPLAIN ANALYZE's "rows decoded from KV" line did not include KV reads performed inside UDF and stored procedure bodies, causing the reported count to be lower than actual.
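The accumulate-then-merge flow for routine KV stats might look roughly like the sketch below. It is a simplified model, not the PR's implementation: `kvStats`, the `planner` struct, `recordRoutineStmt`, and `queryLevelTotals` are hypothetical names used only to show the arithmetic.

```go
package main

import "fmt"

// kvStats carries the two quantities the commit says are tracked for
// routines: rows and bytes read. The type is a hypothetical stand-in.
type kvStats struct {
	RowsRead  int64
	BytesRead int64
}

// planner accumulates stats from routine body statements separately from
// the outer plan's trace-derived stats.
type planner struct {
	routineStats kvStats
}

// recordRoutineStmt is called after each routine body statement executes.
func (p *planner) recordRoutineStmt(s kvStats) {
	p.routineStats.RowsRead += s.RowsRead
	p.routineStats.BytesRead += s.BytesRead
}

// queryLevelTotals adds the accumulated routine stats to the outer plan's
// trace-derived stats before the output is rendered.
func queryLevelTotals(traceStats kvStats, p *planner) kvStats {
	return kvStats{
		RowsRead:  traceStats.RowsRead + p.routineStats.RowsRead,
		BytesRead: traceStats.BytesRead + p.routineStats.BytesRead,
	}
}

func main() {
	p := &planner{}
	p.recordRoutineStmt(kvStats{RowsRead: 2, BytesRead: 64})
	p.recordRoutineStmt(kvStats{RowsRead: 3, BytesRead: 96})
	fmt.Printf("%+v\n", queryLevelTotals(kvStats{RowsRead: 10, BytesRead: 400}, p))
}
```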

Previously, EXPLAIN ANALYZE only showed the top-level query plan with no visibility into the plans used by non-inlined SQL routine bodies (volatile UDFs, PL/pgSQL functions, stored procedures). This made it difficult to understand performance characteristics of queries that invoke routines, since the routine body plans are built at execution time (deferred building) and were invisible to the explain infrastructure.

This commit captures routine body explain plans during execution and renders them as additional "routine" sections beneath the main plan tree in EXPLAIN ANALYZE output. Each routine section shows the routine name and the plan tree for each body statement, including the SQL text.

The implementation wraps exec.Factory with explain.Factory during routine body building in buildRoutinePlanGenerator, captures the explain nodes, and stores them on eval.Context. After execution completes, instrumentationHelper.populateRoutinePlans() transfers the captured plans into the explain.Plan for rendering. A dedup mechanism using {routineName, planGistVector} keys ensures each unique plan variant is shown only once, while genuinely different plans (e.g., from NULL vs non-NULL arguments) each get their own section.

Resolves: cockroachdb#170448
Epic: CRDB-42655

Release note (sql change): EXPLAIN ANALYZE now shows the execution plans of non-inlined SQL routine bodies (volatile UDFs, PL/pgSQL functions, stored procedures) as additional "routine" sections beneath the main query plan. Each section displays the routine name and the plan tree for each body statement, making it easier to diagnose performance issues in queries that invoke routines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
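The dedup on {routineName, planGistVector} keys can be sketched as a set keyed by a comparable struct. This is an illustrative assumption rather than the PR's data structure: the gist vector is flattened to a single string here, and `planKey`, `capturedPlan`, and `collector` are hypothetical names.

```go
package main

import "fmt"

// planKey identifies one plan variant of one routine. The gist vector
// (one gist per body statement) is flattened to a string in this sketch.
type planKey struct {
	RoutineName string
	PlanGists   string
}

// capturedPlan is a hypothetical captured explain plan for a routine body.
type capturedPlan struct {
	RoutineName string
	PlanGists   string
	Tree        string
}

// collector keeps the first plan seen for each key.
type collector struct {
	seen  map[planKey]struct{}
	plans []capturedPlan
}

func newCollector() *collector {
	return &collector{seen: make(map[planKey]struct{})}
}

// capture stores p unless the same {name, gists} pair was already captured.
// It reports whether the plan was newly stored.
func (c *collector) capture(p capturedPlan) bool {
	k := planKey{RoutineName: p.RoutineName, PlanGists: p.PlanGists}
	if _, ok := c.seen[k]; ok {
		return false
	}
	c.seen[k] = struct{}{}
	c.plans = append(c.plans, p)
	return true
}

func main() {
	c := newCollector()
	c.capture(capturedPlan{RoutineName: "f", PlanGists: "g1", Tree: "scan t"})
	c.capture(capturedPlan{RoutineName: "f", PlanGists: "g1", Tree: "scan t"})
	c.capture(capturedPlan{RoutineName: "f", PlanGists: "g2", Tree: "norows"})
	fmt.Println(len(c.plans))
}
```

Repeated invocations with the same plan shape collapse into one section, while a different gist for the same routine is kept as its own entry.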

…ALYZE

Wire up execution stats (KV time, rows decoded, actual row count, etc.) for routine body plan nodes in EXPLAIN ANALYZE output.

Two fixes:

1. Set associateNodeWithComponents on the inner PlanningCtx in runPlanInsidePlan so routine body exec.Nodes get mapped to component IDs in the shared trace metadata.
2. Call populateRoutinePlans before annotateExplain (and walk the routine plans during annotation) so stats are attached to the already-populated plan nodes.

This is WIP: currently only the first execution's stats are shown for each {name, gist} combo. A follow-up commit will aggregate stats across all invocations.

Epic: CRDB-46498
Release note: None
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…LAIN ANALYZE

Routine body plans in EXPLAIN ANALYZE now show how many times each routine variant was invoked (`invocations: N`), and when a routine has multiple distinct plan shapes (e.g. point lookup vs norows due to NULL args), each is labeled with `plan variant: X of Y`.

The implementation changes CapturedRoutineGists from a set to a counter map, increments on every invocation, and populates InvocationCount and variant numbering in populateRoutinePlans. Routine plans are also sorted by name for deterministic output ordering.

Epic: none
Release note (sql change): EXPLAIN ANALYZE now shows invocation counts and plan variant labels for routine body plans.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
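Switching from a set to a counter map, then deriving `invocations: N` and `plan variant: X of Y`, could be sketched as below. The types and the `summarize` helper are hypothetical; in particular, breaking ties by gist within a routine name is an assumption of this sketch so that its output is fully deterministic (the commit only states that plans are sorted by name).

```go
package main

import (
	"fmt"
	"sort"
)

// gistKey identifies one plan variant of one routine (hypothetical type).
type gistKey struct {
	Name string
	Gist string
}

// routinePlan is the per-variant summary rendered in the output.
type routinePlan struct {
	Name        string
	Gist        string
	Invocations int
	Variant     int // 1-based index among this routine's variants
	NumVariants int
}

// summarize turns the counter map into sorted, labeled plan entries.
func summarize(counts map[gistKey]int) []routinePlan {
	plans := make([]routinePlan, 0, len(counts))
	perName := make(map[string]int)
	for k, n := range counts {
		plans = append(plans, routinePlan{Name: k.Name, Gist: k.Gist, Invocations: n})
		perName[k.Name]++
	}
	// Sort by name; the gist tie-break is an assumption of this sketch.
	sort.Slice(plans, func(i, j int) bool {
		if plans[i].Name != plans[j].Name {
			return plans[i].Name < plans[j].Name
		}
		return plans[i].Gist < plans[j].Gist
	})
	next := make(map[string]int)
	for i := range plans {
		next[plans[i].Name]++
		plans[i].Variant = next[plans[i].Name]
		plans[i].NumVariants = perName[plans[i].Name]
	}
	return plans
}

func main() {
	counts := make(map[gistKey]int)
	// Increment on every invocation rather than only recording presence.
	for _, k := range []gistKey{{"g", "a"}, {"f", "b"}, {"f", "a"}, {"f", "a"}} {
		counts[k]++
	}
	for _, p := range summarize(counts) {
		fmt.Printf("%s invocations: %d, plan variant: %d of %d\n",
			p.Name, p.Invocations, p.Variant, p.NumVariants)
	}
}
```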

trunk-io · 2026-06-02T16:25:58Z

Merging to master in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

blathers-crl · 2026-06-02T16:26:00Z

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

cockroach-teamcity · 2026-06-02T16:26:44Z

This change is

ZhouXing19 and others added 12 commits May 15, 2026 10:44

some fix

d3812c0

ZhouXing19 wants to merge 12 commits into cockroachdb:master from ZhouXing19:udf-explain-count
