fix: Avoid error in organization delete operation if done just after creation #1542

Merged
AgustinBettati merged 7 commits into master from CLOUDP-332705
Jan 21, 2026
@AgustinBettati AgustinBettati commented Jan 20, 2026

Proposed changes

Jira ticket: CLOUDP-332705

Context

Our CI contract tests for the organization resource have been failing consistently (reference), specifically in contract_create_delete, where the DELETE request returns an HTTP 500 Internal Server Error.

The upstream team confirmed that this error is expected if delete is called just after creation, and advised waiting a few seconds after creation before attempting the delete. They do not intend to change the error message either.

Solution

To unblock the contract tests (which create and delete in immediate succession) and to cover this potential use case for any customer, we agreed during triage to retry the delete request if the initial request fails.

CI contract tests are now passing. I also verified organization creation and deletion by creating a stack with a private publish:
Screenshot 2026-01-21 at 12 31 19

Type of change:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as
    expected)
  • This change requires a documentation update
  • If changes include removal or addition of 3rd party GitHub actions, I updated our internal document. Reach out to the APIx Integration slack channel to get access to the internal document.

Manual QA performed:

  • cfn invoke for each of CRUDL/cfn test
  • Updated resource in example
  • Published to AWS private registry
  • Used the template in example to create and update a stack in AWS
  • Deleted stack to ensure resources are deleted
  • Created multiple resources in same stack
  • Validated in Atlas UI
  • Included screenshots

Required Checklist:

  • I have signed the MongoDB CLA
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that this change does not generate any credentials and that they are NOT accidentally logged anywhere.
  • I have added any necessary documentation (if appropriate)
  • I have run make fmt and formatted my code
  • For CFN Resources: I have released my changes in the private registry and proved my change
    works in Atlas

Further comments

@AgustinBettati AgustinBettati changed the title from "chore: Ensuring successful delete operation in organization resource" to "fix: Ensuring successful delete operation in organization resource" Jan 21, 2026
@AgustinBettati AgustinBettati changed the title from "fix: Ensuring successful delete operation in organization resource" to "fix: Avoid error in organization delete operation if done just after creation" Jan 21, 2026
@AgustinBettati AgustinBettati marked this pull request as ready for review January 21, 2026 11:37
@AgustinBettati AgustinBettati requested a review from a team as a code owner January 21, 2026 11:37
Copilot AI review requested due to automatic review settings January 21, 2026 11:37
Copilot AI left a comment

Pull request overview

Implements a retry mechanism for organization deletion to handle transient HTTP 500 errors that occur when deleting an organization immediately after creation.

Changes:

  • Added retry logic with a 20-second wait before reattempting deletion on HTTP 500 errors
  • Refactored delete logic into a separate runDelete function to enable reuse for retry attempts
  • Added auto-generated type configuration file

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| cfn-resources/organization/cmd/resource/resource.go | Refactored delete operation to support retry on HTTP 500 errors by extracting core deletion logic into runDelete function |
| cfn-resources/organization/cmd/resource/config.go | Auto-generated type configuration boilerplate |

Comment thread cfn-resources/organization/cmd/resource/config.go
Comment thread cfn-resources/organization/cmd/resource/resource.go
Comment thread cfn-resources/organization/cmd/resource/resource.go
@EspenAlbert EspenAlbert left a comment

TY!

if responseMsg.Error != nil {
	// Retry once on transient server error, waiting 20 seconds before retrying the request.
	// This covers the case of contract tests which create and delete an org within seconds, encountering an error while deleting the org.
	if responseMsg.Response != nil && responseMsg.Response.StatusCode == http.StatusInternalServerError {
May want to add a log here
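One way the suggested log could be added at the retry decision is sketched below. It assumes the standard library `log` package, because the logger the repository actually uses does not appear in this conversation. `retryWait`, `shouldRetryDelete`, and `stdLog` are hypothetical names.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

// retryWait mirrors the 20-second wait mentioned in the code comment above.
const retryWait = 20 * time.Second

// shouldRetryDelete reports whether a failed delete qualifies for the single
// retry (HTTP 500 only) and, if so, emits a log line through logf before the
// caller waits and retries.
func shouldRetryDelete(statusCode int, logf func(msg string)) bool {
	if statusCode != http.StatusInternalServerError {
		return false
	}
	logf(fmt.Sprintf("organization delete returned HTTP %d; waiting %s before retrying once", statusCode, retryWait))
	return true
}

// stdLog adapts the standard library logger to the logf parameter.
func stdLog(msg string) {
	log.Println(msg)
}
```

Calling `shouldRetryDelete(resp.StatusCode, stdLog)` at the status check would record why the handler paused, which helps when a delete appears to hang for 20 seconds.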

@AgustinBettati AgustinBettati added this pull request to the merge queue Jan 21, 2026
Merged via the queue into master with commit 276eb28 Jan 21, 2026
41 checks passed
@AgustinBettati AgustinBettati deleted the CLOUDP-332705 branch January 21, 2026 15:15

6 participants