fix(db): retry initial Mongo connection to avoid worker crash-loop #559

Merged: neSpecc merged 5 commits into master from fix/mongo-connect-retry on Jun 12, 2026

Kuchizu (Member) commented Jun 3, 2026

DatabaseController.connect() now retries the initial handshake instead of throwing on the first failure.

@neSpecc neSpecc requested a review from Copilot June 4, 2026 19:03

Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the MongoDB DatabaseController.connect() startup behavior so workers don’t crash-loop when MongoDB is temporarily unreachable, by retrying the initial connection handshake with a fixed delay and a bounded server-selection timeout.

Changes:

  • Add configurable retry loop for the initial MongoDB connection attempt (MONGO_RECONNECT_TRIES, MONGO_RECONNECT_INTERVAL).
  • Add serverSelectionTimeoutMS to ensure each attempt fails fast during outages.
  • Adjust connect() to return the existing Db instance when already connected.
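
The retry behavior described in the changes above can be sketched roughly as follows. This is a minimal illustration, not the code from lib/db/controller.ts: `connectWithRetry`, `RetryOptions`, and `sleep` are hypothetical names, the interval is assumed to be in milliseconds, and the connection attempt is passed in as a function so the sketch runs without a MongoDB driver.

```typescript
// Hypothetical sketch of a bounded retry loop with a fixed delay.
// None of these names are taken from the actual PR diff.
interface RetryOptions {
  tries: number;      // maximum number of attempts (cf. MONGO_RECONNECT_TRIES)
  intervalMs: number; // fixed delay between attempts (cf. MONGO_RECONNECT_INTERVAL; unit assumed)
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function connectWithRetry<T>(
  attempt: () => Promise<T>,
  { tries, intervalMs }: RetryOptions
): Promise<T> {
  // Always make at least one attempt, even if `tries` is misconfigured.
  const maxTries = Math.max(1, tries);
  let lastError: unknown;

  for (let i = 1; i <= maxTries; i++) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final failure.
      if (i < maxTries) {
        await sleep(intervalMs);
      }
    }
  }

  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

In a real worker, `attempt` would presumably wrap the driver's connect call with `serverSelectionTimeoutMS` set, so that each attempt fails within a bounded time instead of hanging for the driver's default timeout during an outage.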

Comment thread lib/db/controller.ts Outdated
Comment thread lib/db/controller.ts Outdated
Comment thread lib/db/controller.ts
neSpecc previously approved these changes Jun 10, 2026
Comment thread lib/db/controller.ts Outdated
@neSpecc neSpecc merged commit 330cad9 into master Jun 12, 2026
5 checks passed
@neSpecc neSpecc deleted the fix/mongo-connect-retry branch June 12, 2026 15:57
Kuchizu added a commit that referenced this pull request Jun 12, 2026
* feat(grouper): add slow handle diagnostics (#549)

* feat(grouper): add slow handle diagnostics

* refactor(grouper): extract slow handle diagnostics into session

* fix(grouper): use monotonic time and exclusive timings in slow handle diagnostics

* fix(db): retry initial Mongo connection to avoid worker crash-loop (#559)

* fix(db): retry initial Mongo connection to avoid worker crash-loop

* fix(db): clamp Mongo reconnect env vars and test retry loop

* refactor(db): move positiveIntEnv to utils

* fix(task-manager): use event._id instead of groupHash in issue event URL (#569)

* Initial plan

* fix: use event._id instead of groupHash in task-manager event URL

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

---------

Co-authored-by: Kuchizu <70284260+Kuchizu@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
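
The commit list above mentions clamping the Mongo reconnect env vars and a `positiveIntEnv` helper moved to utils. The helper's actual signature and behavior are not shown in this conversation, so the following is only a guess at what such a helper might look like: it takes the raw string value and a fallback, and returns the fallback for anything that does not parse as a positive integer.

```typescript
// Hypothetical sketch; the real positiveIntEnv in the repository may differ
// in signature and in how it treats invalid values.
function positiveIntEnv(raw: string | undefined, fallback: number): number {
  // parseInt yields NaN for undefined-like or non-numeric input.
  const parsed = Number.parseInt(raw ?? "", 10);

  // Accept only integers greater than zero; otherwise use the fallback.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
```

Under these assumptions, a caller could write something like `positiveIntEnv(process.env.MONGO_RECONNECT_TRIES, 5)`, where the default of 5 is purely illustrative.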