Fixes #21259: Don't store null secrets in Secrets Manager by aji-aju · Pull Request #27839 · open-metadata/OpenMetadata

aji-aju · 2026-04-30T07:08:09Z

Summary

Closes #21259. Reopened from #27838 (which was based on a fork — switched to upstream branch to match the repo's contribution pattern).

When a user clears a password field on a service backed by an external Secrets Manager (managed-aws, GCP, Azure KV, Kubernetes), the previous code stored the literal string \"null\" in the secrets backing store instead of removing the secret. Downstream consumers (e.g. the Snowflake driver) then read back the four-character string \"null\" as a real password, breaking authentication — most visibly for users who authenticate via private key and never wanted a password configured at all.

This PR fixes the root cause and updates the tests that codified the buggy contract.

The bug, demonstrated

Before the fix, with SECRET_MANAGER=managed-aws:

# Create a service with a real password, then clear it via the UI (type something, delete it, save).
$ aws secretsmanager get-secret-value \
    --secret-id /<prefix>/<cluster>/database/<service>/authtype/password \
    --query SecretString --output text
null   # ← the literal four ASCII characters, not the JSON null

After the fix, the same call returns ResourceNotFoundException (the secret has been deleted) and the entity's password field is null, not a stale secret:/ reference.

Approach

The fix is three small, layered changes — each layer hands a clean null to the next:

ExternalSecretsManager.upsertSecret() — when value is null/empty, the user's intent is "remove this credential", so delete the backing secret instead of writing \"null\". This also satisfies the GCP/Kubernetes empty-string rejection constraint from Fixes #25167: Secrets Manager Empty String Sanitization #25224 by simply not calling upsert with an empty value.
ExternalSecretsManager.storeValue() — return null for null/empty values so the entity does not carry a stale secret:/ reference to a deleted secret. The null/empty guard sits at the top of the method (above the isSecret(value) call) so the function is self-defending against null inputs from any caller.
SecretsManager.encryptPasswordFields() — propagate the null returned by storeValue() through to the entity field instead of attempting to fernet.encrypt(null).

Constraint check vs. #25224

#25224 introduced the null→\"null\" conversion specifically because GCP and Kubernetes Secrets Manager reject empty-string upserts. This PR satisfies that constraint by exclusion: we never call upsert with an empty value because we short-circuit to delete instead. The flipped Kubernetes test (testEmptySecretValueShouldNotBeStored, renamed from testEmptySecretValueShouldBeStoredAsNullString) verifies createNamespacedSecret is never called.

Tests

New: testIssue21259ClearingPasswordDoesNotStoreNullString exercises the exact user scenario (create service with password → clear password → save) and asserts the entity field is null, not a Fernet-wrapped secret:/ reference. Runs against every provider that subclasses ExternalSecretsManagerTest (AWS, AWS-SSM, GCP, Kubernetes).
New: testStoreValueWithNullDoesNotThrowNpe regression-protects the null-input defensive path.
Flipped: testEncryptDatabaseConnectionWithEmptyPassword was asserting the buggy contract; now asserts the corrected one (entity field is null).
Flipped & renamed: KubernetesSecretsManagerTest.testEmptySecretValueShouldBeStoredAsNullString → testEmptySecretValueShouldNotBeStored. The previous test name itself encoded the bug.

All 83 tests in *SecretsManager*Test pass.

Out of scope (follow-ups)

Read-side cleanup of existing dirty data: customers who already have services with \"null\" strings stored in their SM will continue to read those literal \"null\" strings until those services are re-saved. A passive read-side wrapper that maps fetched \"null\" → null (and optionally the existing migrate-secrets CLI tool to actively sweep) can land in a separate PR.
cleanNullOrEmpty() becomes effectively dead code in the upsert path. Left in place to keep this PR small and reviewable; can be removed in a follow-up.

Test plan

mvn test -pl openmetadata-service -Dtest='*SecretsManager*Test' (83 tests pass)
mvn spotless:apply (no formatting changes needed)
Manually reproduced the bug locally with managed-aws against AWS SM (output: literal null); verified the fix changes the output to ResourceNotFoundException
Manual regression sanity (set initial password, update password, clear password, recreate service, bot JWT lifecycle, test-connection workflow) — in progress

🤖 Generated with Claude Code

When a user clears a password field on a service that uses an external Secrets Manager (managed-aws, GCP, Azure KV, Kubernetes), the previous behavior stored the literal string "null" in the secrets backing store instead of removing the secret. Downstream consumers (e.g. the Snowflake driver) then read back the four-character string "null" as a real password, breaking authentication for users who actually authenticate via private key and don't want a password configured at all. The fix is three small, layered changes: 1. ExternalSecretsManager.upsertSecret() — when the value is null/empty, the user's intent is "remove this credential", so delete the backing secret instead of writing "null". This also satisfies the GCP/K8s empty-string rejection constraint from PR #25224 by simply not calling upsert with an empty value. 2. ExternalSecretsManager.storeValue() — return null for null/empty values so the entity does not carry a stale secret:/ reference to a deleted secret. Without this, the next read would attempt to follow the reference and either get a ResourceNotFoundException or (pre-fix) the literal "null" string. 3. SecretsManager.encryptPasswordFields() — propagate the null returned by storeValue() through to the entity field instead of attempting to fernet-encrypt null. Tests: - New: testIssue21259ClearingPasswordDoesNotStoreNullString — exercises the full clear-password scenario and asserts the entity field is null, not a Fernet-wrapped secret:/ reference. - Flipped: testEncryptDatabaseConnectionWithEmptyPassword — was asserting the buggy contract; now asserts the corrected one. - Flipped: KubernetesSecretsManagerTest.testEmptySecretValueShouldNotBeStored (renamed from testEmptySecretValueShouldBeStoredAsNullString) — the previous name codified the bug; the new test verifies storeValue with empty value does NOT call createNamespacedSecret and returns null. All 80 tests in *SecretsManager*Test pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… NPE The previous version of storeValue() placed the null/empty guard AFTER the isSecret() call. Today the path is unreachable because encryptPasswordFields() upstream-guards on `obj != null`, so null never actually reaches storeValue() — but the new null-handling branch was technically dead code for the null case (isSecret() does `string.startsWith(SECRET_FIELD_PREFIX)` which NPEs on null input). Move the guard to the top so storeValue() is self-defending against null inputs from any caller, present or future. Add testStoreValueWithNullDoesNotThrowNpe regression test (runs across all ExternalSecretsManager subclasses via the abstract test class). All 83 tests in *SecretsManager*Test pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-30T07:08:42Z

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

gitar-bot · 2026-04-30T07:12:14Z

@@ -33,6 +33,16 @@ protected ExternalSecretsManager(
  @Override
  protected String storeValue(String fieldName, String value, String secretId, boolean store) {
    String fieldSecretId = buildSecretId(false, secretId, fieldName.toLowerCase(Locale.ROOT));


💡 Edge Case: storeValue with store=false and null value still computes fieldSecretId

When storeValue is called with a null value, the method still calls buildSecretId(false, secretId, fieldName.toLowerCase(Locale.ROOT)) on line 35 before hitting the null guard on line 40. This is harmless (just a wasted string concatenation), but moving the null guard above the buildSecretId call would be marginally cleaner and make the early-return intent clearer.

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

gitar-bot · 2026-04-30T07:12:16Z

+    // Issue #21259: a null/empty value means "no credential". Handle this case BEFORE
+    // calling isSecret() — isSecret() invokes startsWith() on the value and would NPE
+    // on null. Returning null here ensures the entity does not carry a stale secret:/
+    // reference to a deleted secret.


💡 Quality: Verbose comments explaining the fix — prefer self-documenting code

Per the project's coding guidelines, code should be self-documenting with comments reserved for complex business logic or workarounds. The four-line block comments at lines 36-39, 58-61, and 414-416 essentially re-state what the code already makes clear. Consider trimming these to a single-line issue reference (e.g., // Fix #21259: null/empty means "remove credential") and letting the code speak for itself.

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

Copilot

Pull request overview

Fixes issue #21259 in the OpenMetadata service secrets subsystem by ensuring cleared (null/empty) secret values are treated as “remove credential” for external secrets managers, preventing the literal "null" string from being persisted and later misinterpreted as a real password.

Changes:

Update external secret storage to delete secrets (and return null to callers) when the incoming value is null/empty.
Propagate null through encryption flow to avoid attempting Fernet encryption on cleared values.
Update and add unit tests to assert the corrected contract (no "null" string storage; no stale secret: references).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java	Treat null/empty as delete intent; `storeValue()` returns null and `upsertSecret()` deletes instead of storing `"null"`.
openmetadata-service/src/main/java/org/openmetadata/service/secrets/SecretsManager.java	Ensure encryption flow propagates `null` instead of encrypting it.
openmetadata-service/src/test/java/org/openmetadata/service/secrets/ExternalSecretsManagerTest.java	Flip prior expectations and add regressions for null/empty handling and issue #21259 scenario.
openmetadata-service/src/test/java/org/openmetadata/service/secrets/KubernetesSecretsManagerTest.java	Update Kubernetes behavior test to assert empty values are not upserted (no create call) and return null.

Copilot · 2026-04-30T07:13:49Z

+    // "null". This also satisfies the GCP/Kubernetes empty-string rejection constraint
+    // (PR #25224) by simply not calling upsert with an empty value.
+    if (Objects.isNull(secretValue) || secretValue.isEmpty()) {
+      if (existSecret(secretName)) {
+        deleteSecretInternal(secretName);
+        sleep();
+      }


In upsertSecret, the null/empty branch checks existSecret(secretName) before deleting. For most providers, ExternalSecretsManager.existSecret() is implemented by calling getSecret() and swallowing any exception (returning false). That means a transient network/permission error can be treated as “secret doesn’t exist”, so the delete is silently skipped and the old secret becomes orphaned even though the entity field is cleared. Consider changing the existence check to only suppress true not-found cases and propagate other exceptions (or make deleteSecretInternal idempotent across providers and delete without a pre-check) so credential clearing doesn’t silently leave stale secrets behind.

Suggested change

// "null". This also satisfies the GCP/Kubernetes empty-string rejection constraint

// (PR #25224) by simply not calling upsert with an empty value.

if (Objects.isNull(secretValue) || secretValue.isEmpty()) {

if (existSecret(secretName)) {

deleteSecretInternal(secretName);

sleep();

}

// "null". Avoid a pre-delete existence check here because existSecret() relies on

// getSecret() and suppresses all exceptions, which can incorrectly turn transient

// provider errors into a silent "does not exist" result and leave stale secrets

// orphaned even though the entity field has been cleared.

if (Objects.isNull(secretValue) || secretValue.isEmpty()) {

deleteSecretInternal(secretName);

sleep();

gitar-bot · 2026-04-30T09:32:37Z

Code Review 👍 Approved with suggestions 0 resolved / 2 findings

Prevents null secrets from being stored in Secrets Manager as requested. Consider removing verbose comments and refactoring the redundant secret ID computation in storeValue.

💡 Edge Case: storeValue with store=false and null value still computes fieldSecretId

📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:35 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:40-44

When storeValue is called with a null value, the method still calls buildSecretId(false, secretId, fieldName.toLowerCase(Locale.ROOT)) on line 35 before hitting the null guard on line 40. This is harmless (just a wasted string concatenation), but moving the null guard above the buildSecretId call would be marginally cleaner and make the early-return intent clearer.

💡 Quality: Verbose comments explaining the fix — prefer self-documenting code

📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:36-39 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:58-61 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/SecretsManager.java:414-416

Per the project's coding guidelines, code should be self-documenting with comments reserved for complex business logic or workarounds. The four-line block comments at lines 36-39, 58-61, and 414-416 essentially re-state what the code already makes clear. Consider trimming these to a single-line issue reference (e.g., // Fix #21259: null/empty means "remove credential") and letting the code speak for itself.

🤖 Prompt for agents

Code Review: Prevents null secrets from being stored in Secrets Manager as requested. Consider removing verbose comments and refactoring the redundant secret ID computation in storeValue.

1. 💡 Edge Case: storeValue with store=false and null value still computes fieldSecretId
   Files: openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:35, openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:40-44

   When `storeValue` is called with a null `value`, the method still calls `buildSecretId(false, secretId, fieldName.toLowerCase(Locale.ROOT))` on line 35 before hitting the null guard on line 40. This is harmless (just a wasted string concatenation), but moving the null guard above the `buildSecretId` call would be marginally cleaner and make the early-return intent clearer.

2. 💡 Quality: Verbose comments explaining the fix — prefer self-documenting code
   Files: openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:36-39, openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:58-61, openmetadata-service/src/main/java/org/openmetadata/service/secrets/SecretsManager.java:414-416

   Per the project's coding guidelines, code should be self-documenting with comments reserved for complex business logic or workarounds. The four-line block comments at lines 36-39, 58-61, and 414-416 essentially re-state what the code already makes clear. Consider trimming these to a single-line issue reference (e.g., `// Fix #21259: null/empty means "remove credential"`) and letting the code speak for itself.

Options

Display: compact → Showing less information.

Comment with these commands to change:

`Compact`
`gitar display:verbose`

_{Was this helpful? React with 👍 / 👎 | Gitar}

pmbrull · 2026-04-30T09:39:10Z

+    // reference to a deleted secret.
+    if (Objects.isNull(value) || value.isEmpty()) {
+      if (store) {
+        upsertSecret(fieldSecretId, value);


mhmh I think this is hard to follow. Why would we say upsertSecret(..., null), instead of explicitly handling this as deleteSecret. IMO upsert is add or update, but should not internally delete. Let's make the process bit easier to follow

sonarqubecloud · 2026-04-30T10:36:55Z

Quality Gate passed for 'open-metadata-ingestion'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2026-04-30T11:58:52Z

🟡 Playwright Results — all passed (12 flaky)

✅ 3986 passed · ❌ 0 failed · 🟡 12 flaky · ⏭️ 86 skipped

Shard	Passed	Flaky	Skipped
🟡 Shard 1	298	1	4
🟡 Shard 2	750	4	8
🟡 Shard 3	745	1	7
🟡 Shard 4	774	1	18
✅ Shard 5	687	0	41
🟡 Shard 6	732	5	8

🟡 12 flaky test(s) (passed on retry)

Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
Features/ActivityAPI.spec.ts › Activity event is created when description is updated (shard 2, 1 retry)
Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
Features/Glossary/GlossaryHierarchy.spec.ts › should cancel move operation (shard 2, 1 retry)
Features/Glossary/GlossaryWorkflow.spec.ts › should start term as Draft when glossary has reviewers (shard 2, 1 retry)
Features/Permissions/DataProductPermissions.spec.ts › Data Product deny operations (shard 3, 1 retry)
Pages/CustomProperties.spec.ts › Email (shard 4, 1 retry)
Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
Pages/Tag.spec.ts › Verify Owner Add Delete (shard 6, 1 retry)
Pages/UserDetails.spec.ts › Create team with domain and verify visibility of inherited domain in user profile after team removal (shard 6, 1 retry)
Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)

📦 Download artifacts

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

aji-aju and others added 3 commits April 30, 2026 11:52

Merge branch 'main' into fix/21259-null-secrets-test

087a01c

Copilot AI review requested due to automatic review settings April 30, 2026 07:08

Copilot started reviewing on behalf of aji-aju April 30, 2026 07:08 View session

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Copilot AI reviewed Apr 30, 2026

View reviewed changes

Merge branch 'main' into fix/21259-null-secrets-test

a7cd611

aji-aju self-assigned this Apr 30, 2026

aji-aju added the safe to test Add this label to run secure Github workflows on PRs label Apr 30, 2026

pmbrull requested changes Apr 30, 2026

View reviewed changes

aji-aju temporarily deployed to test April 30, 2026 09:42 — with GitHub Actions Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes #21259: Don't store null secrets in Secrets Manager#27839

Fixes #21259: Don't store null secrets in Secrets Manager#27839
aji-aju wants to merge 4 commits intomainfrom
fix/21259-null-secrets-test

aji-aju commented Apr 30, 2026

Uh oh!

github-actions Bot commented Apr 30, 2026

Uh oh!

gitar-bot Bot Apr 30, 2026

Uh oh!

gitar-bot Bot Apr 30, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 30, 2026

Uh oh!

gitar-bot Bot commented Apr 30, 2026 •

edited

Loading

Uh oh!

pmbrull Apr 30, 2026

Uh oh!

sonarqubecloud Bot commented Apr 30, 2026

Uh oh!

github-actions Bot commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

aji-aju commented Apr 30, 2026

Summary

The bug, demonstrated

Approach

Constraint check vs. #25224

Tests

Out of scope (follow-ups)

Test plan

Uh oh!

github-actions Bot commented Apr 30, 2026

Uh oh!

gitar-bot Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pmbrull Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

sonarqubecloud Bot commented Apr 30, 2026

Quality Gate passed for 'open-metadata-ingestion'

Uh oh!

github-actions Bot commented Apr 30, 2026

🟡 Playwright Results — all passed (12 flaky)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gitar-bot Bot commented Apr 30, 2026 •

edited

Loading