fix: allow app to start when read replica is unreachable#3136
Open
danoswaltCL wants to merge 4 commits into
Open
fix: allow app to start when read replica is unreachable#3136danoswaltCL wants to merge 4 commits into
danoswaltCL wants to merge 4 commits into
Conversation
- Replace Promise.all with Promise.allSettled in typeormLoader so a replica connection failure no longer prevents the app from booting. Main DB failure still throws (critical). Replica failure is logged and the app continues; affected analytics/export requests will error at query time rather than at startup. - Resolve two TODO comments by wiring up UpgradeLogger (winstonLoader runs before typeormLoader so Winston is always ready). - Add 22 unit tests covering all key scenarios: both succeed, replica only fails, main only fails (DB_UNREACHABLE), migration fails (MIGRATION_ERROR), both fail, and edge cases (undefined settings, synchronize mode, ECS mode). Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates the backend TypeORM bootstrap logic so a failure to connect to the read-replica/export DataSource no longer prevents the application from starting, and adds unit tests to validate the new startup behavior.
Changes:
- Make replica/export DataSource initialization non-fatal by logging failures instead of aborting boot.
- Add structured error logging in
typeormLoaderfor DB startup failures. - Add unit tests covering main DB failures, replica failures, and migration failure classification.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
packages/backend/src/loaders/typeormLoader.ts |
Switches replica init handling to be optional (log on failure) and adds loader-level logging/error classification. |
packages/backend/test/unit/loaders/typeormLoader.test.ts |
Adds focused unit tests for startup behavior across main DB vs replica failures and migration errors. |
Comments suppressed due to low confidence (1)
packages/backend/src/loaders/typeormLoader.ts:115
- The catch block logs
{ message: 'Database connection failed' }for all failures, including migration errors (code === '42P07') and any other runtime errors thrown in the loader. This makes the log message misleading and harder to triage. Consider logging a message that matches the classified error (e.g., "Database migration failed" for 42P07) or a more general "Database initialization failed".
const error = err as any;
log.error({ message: 'Database connection failed', error });
if (error.code === 'ECONNREFUSED') {
error.type = SERVER_ERROR.DB_UNREACHABLE;
throw error;
} else if (error.code === '42P07') {
error.type = SERVER_ERROR.MIGRATION_ERROR;
throw error;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+77
to
+93
| const [mainResult, replicaResult] = await Promise.allSettled([ | ||
| appDataSourceInstance.initialize(), | ||
| exportDataSourceInstance.initialize(), | ||
| ]); | ||
|
|
||
| // Main DB is required — rethrow so the outer catch can classify and re-throw. | ||
| if (mainResult.status === 'rejected') { | ||
| throw mainResult.reason; | ||
| } | ||
|
|
||
| // Read replica is optional — log the failure but let the app start. | ||
| // Requests that attempt to use the replica connection will receive an error | ||
| // at query time instead of preventing the whole app from booting. | ||
| if (replicaResult.status === 'rejected') { | ||
| const replicaErr = replicaResult.reason as any; | ||
| log.error({ message: 'Read replica connection failed — continuing without replica', error: replicaErr }); | ||
| } |
Collaborator
Author
There was a problem hiding this comment.
@bcb37 i think this suggestion makes sense, but I'm not sure how to actually test it, or if it is worth fussing over to try to test. worst case scenario is a slight delay in startup. what do you think.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
when read-replica env var is not empty and has a bad value and can't connect, the app doesn't start. this has caused issues in staging with name changes; this should be a rare occurrence but it could still definitely happen in prod some day, so we can allow the app to start and log in background. any calls that would normally use read-replica would fail (instead of re-routing to main db, which may be undesirable).