Fixes #21259: Don't store null secrets in Secrets Manager#27839
Fixes #21259: Don't store null secrets in Secrets Manager#27839
Conversation
When a user clears a password field on a service that uses an external Secrets Manager (managed-aws, GCP, Azure KV, Kubernetes), the previous behavior stored the literal string "null" in the secrets backing store instead of removing the secret. Downstream consumers (e.g. the Snowflake driver) then read back the four-character string "null" as a real password, breaking authentication for users who actually authenticate via private key and don't want a password configured at all. The fix is three small, layered changes: 1. ExternalSecretsManager.upsertSecret() — when the value is null/empty, the user's intent is "remove this credential", so delete the backing secret instead of writing "null". This also satisfies the GCP/K8s empty-string rejection constraint from PR #25224 by simply not calling upsert with an empty value. 2. ExternalSecretsManager.storeValue() — return null for null/empty values so the entity does not carry a stale secret:/ reference to a deleted secret. Without this, the next read would attempt to follow the reference and either get a ResourceNotFoundException or (pre-fix) the literal "null" string. 3. SecretsManager.encryptPasswordFields() — propagate the null returned by storeValue() through to the entity field instead of attempting to fernet-encrypt null. Tests: - New: testIssue21259ClearingPasswordDoesNotStoreNullString — exercises the full clear-password scenario and asserts the entity field is null, not a Fernet-wrapped secret:/ reference. - Flipped: testEncryptDatabaseConnectionWithEmptyPassword — was asserting the buggy contract; now asserts the corrected one. - Flipped: KubernetesSecretsManagerTest.testEmptySecretValueShouldNotBeStored (renamed from testEmptySecretValueShouldBeStoredAsNullString) — the previous name codified the bug; the new test verifies storeValue with empty value does NOT call createNamespacedSecret and returns null. All 80 tests in *SecretsManager*Test pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… NPE The previous version of storeValue() placed the null/empty guard AFTER the isSecret() call. Today the path is unreachable because encryptPasswordFields() upstream-guards on `obj != null`, so null never actually reaches storeValue() — but the new null-handling branch was technically dead code for the null case (isSecret() does `string.startsWith(SECRET_FIELD_PREFIX)` which NPEs on null input). Move the guard to the top so storeValue() is self-defending against null inputs from any caller, present or future. Add testStoreValueWithNullDoesNotThrowNpe regression test (runs across all ExternalSecretsManager subclasses via the abstract test class). All 83 tests in *SecretsManager*Test pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| @@ -33,6 +33,16 @@ protected ExternalSecretsManager( | |||
| @Override | |||
| protected String storeValue(String fieldName, String value, String secretId, boolean store) { | |||
| String fieldSecretId = buildSecretId(false, secretId, fieldName.toLowerCase(Locale.ROOT)); | |||
There was a problem hiding this comment.
💡 Edge Case: storeValue with store=false and null value still computes fieldSecretId
When storeValue is called with a null value, the method still calls buildSecretId(false, secretId, fieldName.toLowerCase(Locale.ROOT)) on line 35 before hitting the null guard on line 40. This is harmless (just a wasted string concatenation), but moving the null guard above the buildSecretId call would be marginally cleaner and make the early-return intent clearer.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
| // Issue #21259: a null/empty value means "no credential". Handle this case BEFORE | ||
| // calling isSecret() — isSecret() invokes startsWith() on the value and would NPE | ||
| // on null. Returning null here ensures the entity does not carry a stale secret:/ | ||
| // reference to a deleted secret. |
There was a problem hiding this comment.
💡 Quality: Verbose comments explaining the fix — prefer self-documenting code
Per the project's coding guidelines, code should be self-documenting with comments reserved for complex business logic or workarounds. The four-line block comments at lines 36-39, 58-61, and 414-416 essentially re-state what the code already makes clear. Consider trimming these to a single-line issue reference (e.g., // Fix #21259: null/empty means "remove credential") and letting the code speak for itself.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
There was a problem hiding this comment.
Pull request overview
Fixes issue #21259 in the OpenMetadata service secrets subsystem by ensuring cleared (null/empty) secret values are treated as “remove credential” for external secrets managers, preventing the literal "null" string from being persisted and later misinterpreted as a real password.
Changes:
- Update external secret storage to delete secrets (and return
nullto callers) when the incoming value is null/empty. - Propagate
nullthrough encryption flow to avoid attempting Fernet encryption on cleared values. - Update and add unit tests to assert the corrected contract (no
"null"string storage; no stalesecret:references).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java | Treat null/empty as delete intent; storeValue() returns null and upsertSecret() deletes instead of storing "null". |
| openmetadata-service/src/main/java/org/openmetadata/service/secrets/SecretsManager.java | Ensure encryption flow propagates null instead of encrypting it. |
| openmetadata-service/src/test/java/org/openmetadata/service/secrets/ExternalSecretsManagerTest.java | Flip prior expectations and add regressions for null/empty handling and issue #21259 scenario. |
| openmetadata-service/src/test/java/org/openmetadata/service/secrets/KubernetesSecretsManagerTest.java | Update Kubernetes behavior test to assert empty values are not upserted (no create call) and return null. |
| // "null". This also satisfies the GCP/Kubernetes empty-string rejection constraint | ||
| // (PR #25224) by simply not calling upsert with an empty value. | ||
| if (Objects.isNull(secretValue) || secretValue.isEmpty()) { | ||
| if (existSecret(secretName)) { | ||
| deleteSecretInternal(secretName); | ||
| sleep(); | ||
| } |
There was a problem hiding this comment.
In upsertSecret, the null/empty branch checks existSecret(secretName) before deleting. For most providers, ExternalSecretsManager.existSecret() is implemented by calling getSecret() and swallowing any exception (returning false). That means a transient network/permission error can be treated as “secret doesn’t exist”, so the delete is silently skipped and the old secret becomes orphaned even though the entity field is cleared. Consider changing the existence check to only suppress true not-found cases and propagate other exceptions (or make deleteSecretInternal idempotent across providers and delete without a pre-check) so credential clearing doesn’t silently leave stale secrets behind.
| // "null". This also satisfies the GCP/Kubernetes empty-string rejection constraint | |
| // (PR #25224) by simply not calling upsert with an empty value. | |
| if (Objects.isNull(secretValue) || secretValue.isEmpty()) { | |
| if (existSecret(secretName)) { | |
| deleteSecretInternal(secretName); | |
| sleep(); | |
| } | |
| // "null". Avoid a pre-delete existence check here because existSecret() relies on | |
| // getSecret() and suppresses all exceptions, which can incorrectly turn transient | |
| // provider errors into a silent "does not exist" result and leave stale secrets | |
| // orphaned even though the entity field has been cleared. | |
| if (Objects.isNull(secretValue) || secretValue.isEmpty()) { | |
| deleteSecretInternal(secretName); | |
| sleep(); |
Code Review 👍 Approved with suggestions 0 resolved / 2 findingsPrevents null secrets from being stored in Secrets Manager as requested. Consider removing verbose comments and refactoring the redundant secret ID computation in storeValue. 💡 Edge Case: storeValue with store=false and null value still computes fieldSecretId📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:35 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:40-44 When 💡 Quality: Verbose comments explaining the fix — prefer self-documenting code📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:36-39 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/ExternalSecretsManager.java:58-61 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/SecretsManager.java:414-416 Per the project's coding guidelines, code should be self-documenting with comments reserved for complex business logic or workarounds. The four-line block comments at lines 36-39, 58-61, and 414-416 essentially re-state what the code already makes clear. Consider trimming these to a single-line issue reference (e.g., 🤖 Prompt for agentsOptionsDisplay: compact → Showing less information. Comment with these commands to change:
Was this helpful? React with 👍 / 👎 | Gitar |
| // reference to a deleted secret. | ||
| if (Objects.isNull(value) || value.isEmpty()) { | ||
| if (store) { | ||
| upsertSecret(fieldSecretId, value); |
There was a problem hiding this comment.
mhmh I think this is hard to follow. Why would we say upsertSecret(..., null), instead of explicitly handling this as deleteSecret. IMO upsert is add or update, but should not internally delete. Let's make the process bit easier to follow
|
🟡 Playwright Results — all passed (12 flaky)✅ 3986 passed · ❌ 0 failed · 🟡 12 flaky · ⏭️ 86 skipped
🟡 12 flaky test(s) (passed on retry)
How to debug locally# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace |



Summary
Closes #21259. Reopened from #27838 (which was based on a fork — switched to upstream branch to match the repo's contribution pattern).
When a user clears a password field on a service backed by an external Secrets Manager (
managed-aws, GCP, Azure KV, Kubernetes), the previous code stored the literal string\"null\"in the secrets backing store instead of removing the secret. Downstream consumers (e.g. the Snowflake driver) then read back the four-character string\"null\"as a real password, breaking authentication — most visibly for users who authenticate via private key and never wanted a password configured at all.This PR fixes the root cause and updates the tests that codified the buggy contract.
The bug, demonstrated
Before the fix, with
SECRET_MANAGER=managed-aws:After the fix, the same call returns
ResourceNotFoundException(the secret has been deleted) and the entity's password field isnull, not a stalesecret:/reference.Approach
The fix is three small, layered changes — each layer hands a clean null to the next:
ExternalSecretsManager.upsertSecret()— when value is null/empty, the user's intent is "remove this credential", so delete the backing secret instead of writing\"null\". This also satisfies the GCP/Kubernetes empty-string rejection constraint from Fixes #25167: Secrets Manager Empty String Sanitization #25224 by simply not calling upsert with an empty value.ExternalSecretsManager.storeValue()— returnnullfor null/empty values so the entity does not carry a stalesecret:/reference to a deleted secret. The null/empty guard sits at the top of the method (above theisSecret(value)call) so the function is self-defending against null inputs from any caller.SecretsManager.encryptPasswordFields()— propagate thenullreturned bystoreValue()through to the entity field instead of attempting tofernet.encrypt(null).Constraint check vs. #25224
#25224 introduced the null→
\"null\"conversion specifically because GCP and Kubernetes Secrets Manager reject empty-string upserts. This PR satisfies that constraint by exclusion: we never call upsert with an empty value because we short-circuit to delete instead. The flipped Kubernetes test (testEmptySecretValueShouldNotBeStored, renamed fromtestEmptySecretValueShouldBeStoredAsNullString) verifiescreateNamespacedSecretis never called.Tests
testIssue21259ClearingPasswordDoesNotStoreNullStringexercises the exact user scenario (create service with password → clear password → save) and asserts the entity field isnull, not a Fernet-wrappedsecret:/reference. Runs against every provider that subclassesExternalSecretsManagerTest(AWS, AWS-SSM, GCP, Kubernetes).testStoreValueWithNullDoesNotThrowNperegression-protects the null-input defensive path.testEncryptDatabaseConnectionWithEmptyPasswordwas asserting the buggy contract; now asserts the corrected one (entity field is null).KubernetesSecretsManagerTest.testEmptySecretValueShouldBeStoredAsNullString→testEmptySecretValueShouldNotBeStored. The previous test name itself encoded the bug.All 83 tests in
*SecretsManager*Testpass.Out of scope (follow-ups)
\"null\"strings stored in their SM will continue to read those literal\"null\"strings until those services are re-saved. A passive read-side wrapper that maps fetched\"null\"→null(and optionally the existingmigrate-secretsCLI tool to actively sweep) can land in a separate PR.cleanNullOrEmpty()becomes effectively dead code in the upsert path. Left in place to keep this PR small and reviewable; can be removed in a follow-up.Test plan
mvn test -pl openmetadata-service -Dtest='*SecretsManager*Test'(83 tests pass)mvn spotless:apply(no formatting changes needed)managed-awsagainst AWS SM (output: literalnull); verified the fix changes the output toResourceNotFoundException🤖 Generated with Claude Code