fix: Perform cleanup of intermediary data safely#19650
Conversation
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 1 |
| P2 | 1 |
| P3 | 0 |
| Total | 2 |
Found 2 issues.
Reviewed 8 of 8 changed files.
This is an automated review by Codex GPT-5.5
|
|
||
| @Inject | ||
| public DeepStorageIntermediaryDataManager(DataSegmentPusher dataSegmentPusher) | ||
| public DeepStorageIntermediaryDataManager(DataSegmentPusher dataSegmentPusher, DataSegmentKiller dataSegmentKiller) |
There was a problem hiding this comment.
[P1] S3 deepstore mode can no longer provision this manager
This constructor now requires a DataSegmentKiller anywhere DeepStorageIntermediaryDataManager is selected. On middleManager-style injectors that killer is PolyBind-selected by druid.storage.type, but S3 sets that to s3 while S3StorageDruidModule registers its killer under s3_zip. With druid.processing.intermediaryData.storage.type=deepstore on S3, Guice fails to provision this manager instead of starting the service.
There was a problem hiding this comment.
Hmm, this seems to be true. I was under the assumption that DataSegmentKiller is always bound to OmniDataSegmentKiller at the top level. But that is applicable only for peons (via CliPeon) and Overlord. MMs do not bind to OmniDataSegmentKiller and use the extension provided DataSegmentKiller itself.
So, when using S3, MMs would not have any initialized binding for DataSegmentKiller.
However, MMs do use IntermediaryDataManager.
I will revert this change and add javadocs to avoid further confusion.
There was a problem hiding this comment.
Trying out an optional dependency first.
There was a problem hiding this comment.
Thanks for checking. I don't think @Nullable makes this dependency optional in this path: DataSegmentKiller is still bound through LocalDataStorageDruidModule's PolyBind.createChoice, and PolyBind.ConfiggedProvider#get() throws a ProvisionException when druid.storage.type=s3 cannot be found in the killer map, which still only has s3_zip. That happens while Guice resolves the constructor argument, before this constructor can receive null, so the MiddleManager deepstore provisioning failure should still reproduce.
Reviewed 17 of 17 changed files.
| // Delegate kill to the currently bound deep storage implementation only. | ||
| // It is unnecessary and also potentially unsafe to kill all files under the | ||
| // same folder name on inactive deep storage bindings provided by other extensions. | ||
| final Supplier<DataSegmentKiller> killer = killers.get(deepStorageType); |
There was a problem hiding this comment.
[P2] Recursive cleanup looks up killers with the wrong key
The killers map is keyed by segment loadSpec killer type, not always by druid.storage.type. S3 and OSS use storage types s3 and oss but killer keys s3_zip and oss_zip, so this lookup throws during recursive cleanup for those configured stores instead of preserving the prior no-op behavior for unsupported recursive cleanup.
There was a problem hiding this comment.
Ugh!
So this means that we are always forced to use the OmniDataSegmentKiller since S3DataSegmentKiller is never truly bound as a DataSegmentKiller.
The S3DataSegmentKiller is forced to use type s3_zip since the S3LoadSpec uses s3_zip. For backwards compat reading old segments, neither of these can change.
Workaround option would be to register S3DataSegmentKiller under the s3 key as well, which should be okay since the S3DataSegmentKiller does not keep any state, so it is fine to end up with 2 S3DataSegmentKiller, one bound to key s3, the other to s3_zip.
There was a problem hiding this comment.
I think we can live with this for now since the killRecursively() method is not implemented for S3 or OSS anyway. Going to add some comments to that effect.
Description
#19187 introduced cleanup of intermediary data in
ParallelIndexSupervisorTask.cleanup()step.It invokes the
DataSegmentKiller(bound on peons toOmniDataSegmentKiller) and cleans up all files under the directory specified by the supervisor task id. TheOmniDataSegmentKillertries to do a recursive kill on all possible implementations ofDataSegmentKiller.Bug
This can lead to task failures in cases where an extension (say Azure or GCS) has been loaded on the cluster but is not configured to work as a deep store.
#19647 was meant to mitigate this issue by ignoring such exceptions and allowing the task to succeed.
The exception would still show up as a warning in the task logs.
There is also an inherent risk with this approach since it is possible (however unlikely) that the same path prefix may be under use in other deep storage environments for completely different purposes.
This is especially a concern in cases where an extension (say Azure) is loaded only to be used as an input source.
For the same reason,
OmniDataSegmentKiller.killAll()has not been implemented yet either.Changes
OmniDataSegmentKiller.killRecursively()to only invoke the currently bound deep storeIntermediaryDataManagerinParallelIndexSupervisorTaskinstead of invokingDataSegmentKillerdirectly.This allows use of a cleaner interface at the task level and also provides symmetry with
DataSegmentPusherwithinDeepStorageIntermediaryDataManager.deletePartitionsand invokeDataSegmentKiller.killRecursively()DataSegmentKiller.killAll()since it is not used anymore and is a potentially risky operationThis PR has: