Skip to content

fix #849 fix #1428 need rebind replica zk path when path already exists explicitly to cover some corner cases#1440

Open
Slach wants to merge 7 commits into
masterfrom
fix/1428-rebind-replica-path-opt-in
Open

fix #849 fix #1428 need rebind replica zk path when path already exists explicitly to cover some corner cases#1440
Slach wants to merge 7 commits into
masterfrom
fix/1428-rebind-replica-path-opt-in

Conversation

@Slach

@Slach Slach commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

fix #849 fix #1428

The #849 stale-path detection (commit 0d8279e, v2.7.1) rebound a restored ReplicatedMergeTree to default_replica_path whenever the table-level ZK path still had children but our own replica entry was absent. In an HA restore the live sibling replica's children are observationally identical to stale leftovers, so the second replica was wrongly rebound -> split-brain with total_replicas=1 on both nodes (#1428).

Stale/foreign leftovers (#849) and a live sibling (#1428) cannot be told apart from ZK state — the discriminator is operator intent. So default to the HA-safe join behavior and put the #849 rebind behind an explicit opt-in: clickhouse.rebind_replica_path_if_exists (env CLICKHOUSE_REBIND_REPLICA_PATH_IF_EXISTS, CLI --rebind-replica-path-if-exists on restore/restore_remote).

Add TestRebindReplicaPathIfExists covering both branches.

@Slach Slach added this to the 2.7.3 milestone Jun 25, 2026
…le (#1428)

The #849 stale-path detection (commit 0d8279e, v2.7.1) rebound a restored
ReplicatedMergeTree to default_replica_path whenever the table-level ZK path
had children but our own replica entry was absent. In an HA restore the live
sibling's children look identical to stale leftovers, so the second replica was
wrongly rebound -> split-brain with total_replicas=1 on both nodes (#1428).

The discriminator is whether a *different local* table already uses that exact
resolved path: system.replicas is node-local, so a hit means a foreign/renamed
table occupies the path on THIS node (the real #849 case) and we rebind
automatically; a genuine HA sibling lives on another node and is absent from
the local system.replicas, so we join (HA-safe). The residual case (ZK children
but no local table -- async-stale after DROP, indistinguishable from a
briefly-offline sibling) stays behind the opt-in flag:
clickhouse.rebind_replica_path_if_exists (env CLICKHOUSE_REBIND_REPLICA_PATH_IF_EXISTS,
CLI --rebind-replica-path-if-exists on restore/restore_remote).

Add TestRebindReplicaPathIfExists covering auto-rebind, HA-join and flag cases.
@Slach Slach force-pushed the fix/1428-rebind-replica-path-opt-in branch from 39aaada to 1c8499f Compare June 25, 2026 19:33
Slach and others added 6 commits June 26, 2026 06:27
Signed-off-by: slach <bloodjazman@gmail.com>
…ints (#1428)

Wire the rebind-replica-path-if-exists override into POST /backup/restore
and /backup/restore_remote: both now read the per-request reloaded config
and set ClickHouse.RebindReplicaPathIfExists=true when the
rebind_replica_path_if_exists / rebind-replica-path-if-exists query param
is present, mirroring the CLI flag override semantics (only forces true,
never false).

The restore handler previously built the Backuper from the stale api.config
instead of the freshly reloaded cfg; switch it to cfg. Also fix
httpCleanHandler to reload config from disk per request like every other
HTTP handler and its actionsCleanHandler sibling, using a closure defer so
the recorded status captures errors from both ReloadConfig and Clean.

Document both query params in ReadMe.md and add testAPIRebindReplicaPath to
the integration TestServerAPI suite, exercising flag-off (join shared ZK
path) vs flag-on (rebind to default_replica_path) through the API.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The SYNC keyword for DROP TABLE only lands in ClickHouse 21.x; on 20.8
it fails with 'Syntax error ... Expected one of: NO DELAY'. Mirror the
version gate already used in replicationPath_test.go: the test is gated
to >= 20.8, where 'DROP TABLE ... NO DELAY' provides the same synchronous
drop needed to deregister the replica ZK entry before restore.
TestFIPS intermittently fails on the first create_remote with 'another
operation is currently running' (ErrAPILocked, HTTP 500). Root cause: the
fips server runs ResumeOperationsAfterRestart on startup, which scans
/var/lib/clickhouse/backup/*/*.state2 and resumes any leftover upload/download
state, holding the API lock while the test fires its first action. env.Cleanup
does not remove resumable state, so it leaks between tests sharing a pooled env.

Add a diagnostic check at the start of TestFIPS (before it creates any backup)
that lists *.state2 and fails with the offending paths, so the leaking test
and backup name are identified instead of surfacing as a flaky lock error.
@coveralls

Copy link
Copy Markdown

Coverage Report for CI Build 28297144703

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage increased (+0.08%) to 65.104%

Details

  • Coverage increased (+0.08%) from the base build.
  • Patch coverage: 12 uncovered changes across 2 files (48 of 60 lines covered, 80.0%).
  • 7 coverage regressions across 4 files.

Uncovered Changes

File Changed Covered %
pkg/backup/restore.go 24 17 70.83%
pkg/server/server.go 28 23 82.14%
Total (3 files) 60 48 80.0%

Coverage Regressions

7 previously-covered lines in 4 files lost coverage.

File Lines Losing Coverage Coverage
pkg/backup/create.go 2 74.72%
pkg/backup/list.go 2 57.58%
pkg/clickhouse/clickhouse.go 2 78.86%
pkg/server/server.go 1 59.64%

Coverage Stats

Coverage Status
Relevant Lines: 21656
Covered Lines: 14099
Line Coverage: 65.1%
Coverage Strength: 37104.56 hits per line

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment