Added FsspecFileIO method for OSS, virtual hosted style default to true, standardized key configurations for OSS#1788
Added FsspecFileIO method for OSS, virtual hosted style default to true, standardized key configurations for OSS#1788helmiazizm wants to merge 5 commits intoapache:mainfrom
FsspecFileIO method for OSS, virtual hosted style default to true, standardized key configurations for OSS#1788Conversation
| "aws_secret_access_key": properties.get(OSS_ACCESS_KEY_SECRET), | ||
| "aws_session_token": properties.get(OSS_SESSION_TOKEN), | ||
| } | ||
| config_kwargs = {"s3": {"addressing_style": "virtual"}, "signature_version": "v4"} |
There was a problem hiding this comment.
Should wel also make virtual addressing configurable? Similar to S3_FORCE_VIRTUAL_ADDRESSING?
| config_kwargs = {"s3": {"addressing_style": "virtual"}, "signature_version": "v4"} | |
| config_kwargs = {"s3": {"addressing_style": "virtual"}, "signature_version": "v4"} |
There was a problem hiding this comment.
Since the init function for OSS is already separated from S3, and users aren't able to interact with OSS without virtual addressing set to true anyway, I think it makes more sense to just set it to true by default to make it simpler.
There was a problem hiding this comment.
But we've already crossed that bridge with PyArrowFileIO: https://github.com/apache/iceberg-python/pull/1788/files#diff-8d5e63f2a87ead8cebe2fd8ac5dcf2198d229f01e16bb9e06e21f7277c328abdL423-L424
I think it would also be good to add a configuration option for it. we don't know if people are using it, and just removing it feels wrong.
There was a problem hiding this comment.
To be honest, I'm not entirely sure which config_kwargs keys supported by s3fs that can work with OSS connection. The documentation for boto3 compatibility only mentioned setting several keys: https://www.alibabacloud.com/help/en/oss/developer-reference/use-amazon-s3-sdks-to-access-oss#section-jmf-a67-hat
Do you think it'll make sense to copy the config_kwargs setting for S3 but with oss.config-example parameter?

This pull request introduced
FsspecFileIofor OSS configuration method as a backup whenPyArrowFileIOfail. UsingS3FileSystemclass, the method should work as long as the virtual hosted style is invoked. For both method, virtual hosted style is set to true as default.Also, since OSS configuration has its own section, I think it makes more sense to standardize the configuration keys with
oss.certain-configinstead of still relying withs3key.