Add AWS S3 Multi-Region Access Point (MRAP) support to the S3 extension #19608

### Description

Add support for AWS S3 Multi-Region Access Points (MRAPs) and S3 Access Point ARNs in Druid's S3 extension.

Currently, the `bucket` field in Druid's S3 configuration only accepts standard DNS-compliant bucket names. AWS Access Point ARNs (e.g., `arn:aws:s3::123456789123:accesspoint:bucket.mrap`) are rejected at construction time in `CloudObjectLocation` because they fail the URL-encoding equality check used to enforce DNS naming rules. Additionally, some tools produce ARNs with a slash separator (`accesspoint/alias`) instead of the colon-delimited form (`accesspoint:alias`) expected by the AWS SDK, causing further failures downstream.
 
This change:
- Relaxes the bucket name validation in `CloudObjectLocation` to permit valid S3 Access Point ARNs alongside DNS-compliant names.
- Adds `S3Utils.normalizeBucketName()` to canonicalize the slash-delimited form to the colon-delimited form at the points where bucket names are read (`S3DataSegmentPusherConfig`, `S3LoadSpec`).
- Supports both regional Access Point ARNs (`arn:aws:s3:<region>:<account>:accesspoint:<name>`) and MRAP ARNs (`arn:aws:s3::<account>:accesspoint:<name>.mrap`).
 
No API surface changes; the bucket configuration field continues to accept plain bucket names unchanged.
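A minimal sketch of what the slash-to-colon canonicalization could look like. The class name `BucketNameNormalizer`, the regular expression, and the 12-digit account ID assumption are illustrative only; the actual `S3Utils.normalizeBucketName()` may be implemented differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the proposed S3Utils.normalizeBucketName(); not Druid's actual code.
final class BucketNameNormalizer
{
  // Assumed shape: arn:<partition>:s3:<region>:<12-digit account>:accesspoint/<name>.
  // The region segment is empty for Multi-Region Access Points.
  private static final Pattern SLASH_FORM =
      Pattern.compile("^(arn:[^:]+:s3:[^:]*:[0-9]{12}:accesspoint)/(.+)$");

  static String normalizeBucketName(String bucket)
  {
    if (bucket == null) {
      return null;
    }
    Matcher m = SLASH_FORM.matcher(bucket);
    // Rewrite only the separator after "accesspoint"; plain bucket names and
    // already colon-delimited ARNs pass through unchanged.
    return m.matches() ? m.group(1) + ":" + m.group(2) : bucket;
  }

  public static void main(String[] args)
  {
    System.out.println(normalizeBucketName("arn:aws:s3::123456789123:accesspoint/bucket.mrap"));
    System.out.println(normalizeBucketName("my-bucket"));
  }
}
```

Because anything that does not match the slash-delimited ARN shape is returned untouched, existing configurations with plain bucket names would behave as before under this sketch.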

### Motivation

**Use case**

AWS Multi-Region Access Points provide a single global S3 endpoint that routes requests to the nearest healthy bucket replica across regions. Operators use MRAPs for:

- Active-active multi-region Druid deployments backed by S3 Cross-Region Replication (CRR).
- Disaster recovery setups where deep storage must remain accessible during a regional outage.
- Simplifying Druid configuration across regions: one ARN in `druid.storage.bucket` instead of per-region overrides.

Single-region Access Point ARNs are also used more broadly to enforce fine-grained IAM access controls on shared buckets without exposing the bucket name.

**Why the current behavior blocks this**

`CloudObjectLocation` enforces:
```java
Preconditions.checkArgument(
    this.bucket.equals(StringUtils.urlEncode(this.bucket)),
    "bucket must follow DNS-compliant naming conventions"
);
```

An ARN like `arn:aws:s3::123456789123:accesspoint:bucket.mrap` URL-encodes to `arn%3Aaws%3As3%3A%3A123456789123%3Aaccesspoint%3Abucket.mrap` because every colon becomes `%3A`. The encoded string never equals the original, so the check always fails. There is no escape hatch. Users who configure an MRAP ARN as the Druid storage bucket receive an `IllegalArgumentException` at startup, with no workaround short of patching the code.
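The failure can be reproduced outside Druid with the JDK alone. The sketch below assumes that `StringUtils.urlEncode` behaves like `java.net.URLEncoder` with `+` rewritten to `%20`; the class `ArnEncodingDemo` and its methods are illustrative names, not Druid code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class ArnEncodingDemo
{
  // Assumption: approximates Druid's StringUtils.urlEncode using the JDK encoder.
  static String urlEncode(String s)
  {
    try {
      return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
    }
    catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // Mirrors the equality test quoted above: a name passes only if encoding leaves it unchanged.
  static boolean passesCurrentCheck(String bucket)
  {
    return bucket.equals(urlEncode(bucket));
  }

  public static void main(String[] args)
  {
    String arn = "arn:aws:s3::123456789123:accesspoint:bucket.mrap";
    System.out.println(urlEncode(arn));                  // colons are encoded as %3A
    System.out.println(passesCurrentCheck(arn));         // false
    System.out.println(passesCurrentCheck("my-bucket")); // true
  }
}
```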

The AWS SDK for Java (v1 and v2) accepts ARN strings wherever a bucket name is expected, so no SDK-level changes are required. The fix is purely a validation relaxation and a normalization helper.
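One way the validation relaxation might look is sketched here: accept a name if it is a colon-delimited Access Point ARN, and otherwise fall back to the existing URL-encoding equality test. The class `RelaxedBucketValidation`, the ARN pattern, and the JDK-based `urlEncode` are all assumptions made for illustration; the real change to `CloudObjectLocation` may differ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Pattern;

// Hypothetical sketch of a relaxed bucket check; not Druid's actual implementation.
final class RelaxedBucketValidation
{
  // Assumed ARN shape: the region is non-empty for regional access points and
  // empty for MRAPs; the account ID is assumed to be 12 digits.
  private static final Pattern ACCESS_POINT_ARN =
      Pattern.compile("^arn:[^:]+:s3:[a-z0-9-]*:[0-9]{12}:accesspoint:[A-Za-z0-9.-]+$");

  // Assumption: approximates Druid's StringUtils.urlEncode using the JDK encoder.
  static String urlEncode(String s)
  {
    try {
      return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
    }
    catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  static boolean isValidBucket(String bucket)
  {
    // ARNs are accepted first; everything else goes through the original equality test.
    return ACCESS_POINT_ARN.matcher(bucket).matches() || bucket.equals(urlEncode(bucket));
  }

  static void checkBucket(String bucket)
  {
    if (!isValidBucket(bucket)) {
      throw new IllegalArgumentException(
          "bucket must follow DNS-compliant naming conventions or be an S3 Access Point ARN"
      );
    }
  }

  public static void main(String[] args)
  {
    checkBucket("my-bucket");
    checkBucket("arn:aws:s3::123456789123:accesspoint:bucket.mrap");
    System.out.println("both names accepted");
  }
}
```

In this sketch a slash-delimited ARN is still rejected, which is why normalization would need to run before validation.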

