
fix(query): refresh ATTACH table schema in system.columns/statistics #19966

Draft
TCeason wants to merge 2 commits into
databendlabs:main from
TCeason:attach_system


@TCeason TCeason commented Jun 5, 2026

Collaborator

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

system.columns, system.statistics and information_schema.columns do not reflect columns added to the source table of a read-only ATTACH table. DESC shows the new columns, but these system tables keep reporting the schema as it was at ATTACH time.

system.tables keeps refresh disabled on purpose: it only needs meta-server level table info and never reads the snapshot.

Add the setting enable_table_schema_refresh (default 0). When it is enabled, system.columns and system.statistics re-fetch the schema from storage for each ATTACH table, picking up source-side schema changes.
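
To illustrate the gating described above, here is a minimal sketch, not the code in this PR. `ListedTable` and `tables_to_refresh` are hypothetical names invented for the example; only the setting name and its default of 0 come from the description.

```rust
/// Hypothetical, simplified view of a table as listed from the catalog.
#[derive(Debug, Clone, PartialEq)]
pub struct ListedTable {
    pub name: String,
    /// True for read-only ATTACH tables, whose current schema lives in the
    /// source table's snapshot rather than on the meta server.
    pub is_attached: bool,
}

/// Returns the names of the tables whose schema should be re-fetched from
/// storage. `enable_table_schema_refresh` mirrors the setting value:
/// 0 (the default) means no table is refreshed.
pub fn tables_to_refresh(
    tables: &[ListedTable],
    enable_table_schema_refresh: u64,
) -> Vec<String> {
    if enable_table_schema_refresh == 0 {
        // Setting disabled: keep the cached schema for every table.
        return Vec::new();
    }
    // Setting enabled: only ATTACH tables need a storage round trip.
    tables
        .iter()
        .filter(|t| t.is_attached)
        .map(|t| t.name.clone())
        .collect()
}
```

Under these assumptions, normal tables never trigger a storage read, and ATTACH tables do so only when the setting is non-zero.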

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


system.columns, system.statistics and the information_schema.columns view
expose per-column metadata via table.schema(). For read-only ATTACH tables
the schema is not persisted on the meta server; it is derived from the source
table's latest snapshot and only becomes current after a refresh. dump_tables
disabled refresh for performance, so these tables reported the schema frozen
at ATTACH time, hiding columns added to the source afterwards (while DESC,
which refreshes, showed them).

Naively dropping disable_catalog_refresh would regress the resilience added in
"avoid SHOW TABLES refresh failures": a single unreachable ATTACH table makes
its whole-database refresh fail, dropping the columns of healthy sibling
tables. So keep listing through the refresh-disabled catalog (fast, resilient,
zero S3 for normal tables), then refresh ATTACH tables individually through
the original catalog, falling back to the cached schema with a warning on
failure so one broken table never drops its siblings' columns.
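
The per-table refresh with fallback described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: `TableEntry`, `refresh_attached` and the `refresh_one` callback are hypothetical, and the callback stands in for refreshing a single ATTACH table through the original catalog.

```rust
/// Hypothetical table record as returned by the refresh-disabled listing.
#[derive(Debug, Clone, PartialEq)]
pub struct TableEntry {
    pub name: String,
    pub is_attached: bool,
    /// Column names from the cached (possibly stale) schema.
    pub columns: Vec<String>,
}

/// Refreshes each ATTACH table on its own. A failed refresh keeps the cached
/// schema and records a warning, so one broken table cannot affect the
/// columns reported for its siblings.
pub fn refresh_attached<F>(
    listed: Vec<TableEntry>,
    mut refresh_one: F,
) -> (Vec<TableEntry>, Vec<String>)
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    let mut warnings = Vec::new();
    let tables: Vec<TableEntry> = listed
        .into_iter()
        .map(|mut t| {
            if t.is_attached {
                let refreshed = refresh_one(&t.name);
                match refreshed {
                    // Refresh succeeded: use the up-to-date columns.
                    Ok(cols) => t.columns = cols,
                    // Refresh failed: fall back to the cached schema.
                    Err(e) => warnings.push(format!(
                        "refresh of {} failed, using cached schema: {}",
                        t.name, e
                    )),
                }
            }
            t
        })
        .collect();
    (tables, warnings)
}
```

In this sketch a normal table is returned untouched, a healthy ATTACH table gets its refreshed columns, and an unreachable one keeps its cached columns while contributing exactly one warning.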

system.tables keeps refresh disabled on purpose: it only needs meta-server
level table info and never reads the snapshot.
@github-actions github-actions Bot added the pr-bugfix label (this PR patches a bug in codebase) Jun 5, 2026
@TCeason TCeason requested review from dantengsky and wubx June 5, 2026 04:47
@TCeason TCeason marked this pull request as draft June 5, 2026 11:07
