feat: implement UploadPartCopy#64
Conversation
📝 Walkthrough""" WalkthroughThis update implements multipart upload part copy functionality for both S3 and Swift backends, including robust validation and error handling. It introduces a new test suite for the part copy flow, updates the Helm chart and image version to 0.7.0, and adds a new configuration key for the default bucket. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant S3/Swift Backend
participant Storage Service
Client->>S3/Swift Backend: PUT /bucket/object?partNumber=N&uploadId=... (with x-amz-copy-source)
S3/Swift Backend->>Storage Service: GET source object (or byte range)
Storage Service-->>S3/Swift Backend: Source data
S3/Swift Backend->>Storage Service: PUT part data as part N
Storage Service-->>S3/Swift Backend: Success/ETag
S3/Swift Backend->>S3/Swift Backend: Update multipart session file (JSON)
S3/Swift Backend-->>Client: XML CopyPartResult with ETag and LastModified
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
tests/swift/basic/uploadpartcopy_test.ts (1)
54-54: Consider using consistent file sizes across tests.The test creates a 10MB source file but only copies the first 5MB. Consider documenting why this specific size was chosen or using a constant for better maintainability.
+ const SOURCE_FILE_SIZE_MB = 10; + const COPY_SIZE_BYTES = 5 * 1024 * 1024; // 5MB + // Create a source object to copy from await t.step("Create Source Object", async () => { - const tempFile = await createTempFile(10); // 10MB file + const tempFile = await createTempFile(SOURCE_FILE_SIZE_MB);src/backends/swift/objects.ts (1)
1595-1601: Verify the size calculation logic.The size calculation has multiple fallbacks which could lead to incorrect values. Consider simplifying:
- const size = parseInt( - sourceSuccessResponse.headers.get("Content-Length") || "0", - 10, - ) || - (endByte !== undefined - ? (endByte - startByte + 1) - : sourceContent.byteLength); + // Prefer Content-Length header, fallback to calculated or actual size + const contentLengthHeader = sourceSuccessResponse.headers.get("Content-Length"); + const size = contentLengthHeader + ? parseInt(contentLengthHeader, 10) + : (endByte !== undefined + ? (endByte - startByte + 1) + : sourceContent.byteLength);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
chart/Chart.yaml(1 hunks)chart/values.yaml(2 hunks)src/backends/s3/objects.ts(2 hunks)src/backends/swift/objects.ts(8 hunks)tests/swift/basic/uploadpartcopy_test.ts(1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: destifo
PR: expnt/herald#54
File: src/backends/swift/objects.ts:0-0
Timestamp: 2025-07-04T07:25:29.056Z
Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as <uploadId>.json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
tests/swift/basic/uploadpartcopy_test.ts (3)
Learnt from: destifo
PR: #54
File: src/backends/swift/objects.ts:0-0
Timestamp: 2025-07-04T07:25:29.056Z
Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as .json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.
Learnt from: destifo
PR: #49
File: tests/_others/_task_queue_test.ts:7-12
Timestamp: 2025-06-26T12:47:53.021Z
Learning: The test in tests/_others/_task_queue_test.ts is specifically designed to test the task store functionality with an S3 backend. The direct type assertion globalConfig.task_store_backend as S3Config is safe because the test is intentionally S3-specific and uses S3 APIs throughout.
Learnt from: destifo
PR: #10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The routeQueryParamedRequest and headBucket functions in src/backends/s3/buckets.ts already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
src/backends/s3/objects.ts (6)
Learnt from: destifo
PR: #10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The routeQueryParamedRequest and headBucket functions in src/backends/s3/buckets.ts already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
Learnt from: destifo
PR: #54
File: src/backends/swift/objects.ts:0-0
Timestamp: 2025-07-04T07:25:29.056Z
Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as .json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.
Learnt from: destifo
PR: #49
File: src/backends/task_store.ts:170-228
Timestamp: 2025-06-26T12:48:25.255Z
Learning: In src/backends/task_store.ts, the TaskStore class creates Request objects with localhost URLs that are not actually used for HTTP requests. Instead, these Request objects serve as structured data containers that get passed to backend storage implementations (s3PutObject/swiftPutObject) which extract the necessary information (bucket, key, body, headers) and make the actual requests to the configured storage backends.
Learnt from: destifo
PR: #41
File: src/backends/mirror.ts:166-174
Timestamp: 2025-06-16T14:42:52.174Z
Learning: In src/backends/mirror.ts, when creating replica requests that reuse originalRequest.headers, the authentication headers (Authorization, x-amz-date, etc.) are automatically flushed/replaced down the request pipeline by functions like s3.putObject, so there's no signature mismatch issue with forwarding the original headers.
Learnt from: destifo
PR: #49
File: tests/_others/_task_queue_test.ts:7-12
Timestamp: 2025-06-26T12:47:53.021Z
Learning: The test in tests/_others/_task_queue_test.ts is specifically designed to test the task store functionality with an S3 backend. The direct type assertion globalConfig.task_store_backend as S3Config is safe because the test is intentionally S3-specific and uses S3 APIs throughout.
Learnt from: destifo
PR: #41
File: src/backends/swift/mod.ts:318-323
Timestamp: 2025-06-16T14:40:12.564Z
Learning: In conversion functions that return Result<Response, Error>, the Result represents whether the conversion process succeeded, not the HTTP semantics. A function like convertSwiftUploadPartToS3Response should return createOk(errorResponse) when it successfully converts to a proper S3 error response, and createErr(conversionError) only when the conversion process itself fails.
src/backends/swift/objects.ts (5)
Learnt from: destifo
PR: #54
File: src/backends/swift/objects.ts:0-0
Timestamp: 2025-07-04T07:25:29.056Z
Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as .json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.
Learnt from: destifo
PR: #10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The routeQueryParamedRequest and headBucket functions in src/backends/s3/buckets.ts already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
Learnt from: destifo
PR: #41
File: src/backends/swift/mod.ts:318-323
Timestamp: 2025-06-16T14:40:12.564Z
Learning: In conversion functions that return Result<Response, Error>, the Result represents whether the conversion process succeeded, not the HTTP semantics. A function like convertSwiftUploadPartToS3Response should return createOk(errorResponse) when it successfully converts to a proper S3 error response, and createErr(conversionError) only when the conversion process itself fails.
Learnt from: destifo
PR: #35
File: src/backends/swift/mod.ts:0-0
Timestamp: 2025-06-11T20:21:26.693Z
Learning: For the Swift backend's PUT Object path (convertSwiftPutObjectToS3Response), Swift can return only 404 (container missing), 411 (length required), and 422 (checksum mismatch). It will not surface a 408 timeout here, so mapping all handled error cases to HTTP 400 in the S3 response is correct.
Learnt from: destifo
PR: #49
File: src/backends/task_store.ts:170-228
Timestamp: 2025-06-26T12:48:25.255Z
Learning: In src/backends/task_store.ts, the TaskStore class creates Request objects with localhost URLs that are not actually used for HTTP requests. Instead, these Request objects serve as structured data containers that get passed to backend storage implementations (s3PutObject/swiftPutObject) which extract the necessary information (bucket, key, body, headers) and make the actual requests to the configured storage backends.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test-full
🔇 Additional comments (9)
chart/Chart.yaml (1)
4-4: LGTM!The Helm chart version update to 0.7.0 aligns with the new upload part copy feature implementation.
chart/values.yaml (2)
24-24: Clarify the purpose ofdefault_bucketconfiguration.The new
default_bucketconfiguration key is introduced but its purpose and usage are not documented. Please clarify how this configuration is used and consider adding a comment explaining its purpose.
40-40: LGTM!The image tag update to v0.7.0 is consistent with the Helm chart version bump.
src/backends/s3/objects.ts (1)
11-20: LGTM!The imports are properly organized and necessary for the new uploadPartCopy function.
tests/swift/basic/uploadpartcopy_test.ts (1)
106-106: LGTM!The byte range calculation is correct. Copying "bytes=0-5242880" results in 5242881 bytes (inclusive range), which matches the expected content length assertion.
Also applies to: 166-166
src/backends/swift/objects.ts (4)
827-827: LGTM!Good practice to explicitly set Content-Type headers for JSON operations. This ensures proper content negotiation with the Swift backend.
Also applies to: 1000-1000, 1321-1321, 1371-1371, 1624-1624
1039-1057: Excellent error handling improvements!The enhanced validation of session file content type and JSON parsing with detailed error logging improves robustness and debuggability.
1456-1466: Good byte range parsing implementation.The copy source range parsing correctly handles the byte range format and extracts start/end bytes for the copy operation.
1081-1089: Good addition of debug logging.The debug logging for manifest operations will help with troubleshooting multipart upload issues in production.
Also applies to: 1123-1127, 1234-1234
| export async function uploadPartCopy( | ||
| reqCtx: RequestContext, | ||
| req: Request, | ||
| bucketConfig: Bucket, | ||
| ): Promise<Result<Response, Error>> { | ||
| const logger = reqCtx.logger; | ||
| logger.info("[S3 backend] Proxying Upload Part Copy Request..."); | ||
|
|
||
| const { bucket, objectKey, queryParams } = s3Utils.extractRequestInfo(req); | ||
| if (!bucket) { | ||
| return createOk( | ||
| new Response("Bucket information missing from the request", { | ||
| status: 400, | ||
| }), | ||
| ); | ||
| } | ||
|
|
||
| const config = bucketConfig.config as S3Config; | ||
| const headers = new Headers(req.headers); | ||
| headers.set("Host", config.endpoint.replace(/^https?:\/\//, "")); | ||
|
|
||
| // Convert queryParams to URLSearchParams | ||
| const params = new URLSearchParams(); | ||
| for (const [key, values] of Object.entries(queryParams)) { | ||
| for (const value of values) { | ||
| params.append(key, value); | ||
| } | ||
| } | ||
|
|
||
| const reqUrl = | ||
| `${config.endpoint}/${bucket}/${objectKey}?${params.toString()}`; | ||
|
|
||
| const fetchFunc = async () => { | ||
| return await fetch(reqUrl, { | ||
| method: "PUT", | ||
| headers: headers, | ||
| // Use the request body directly | ||
| body: req.body, | ||
| }); | ||
| }; | ||
|
|
||
| const response = await retryWithExponentialBackoff( | ||
| fetchFunc, | ||
| ); | ||
|
|
||
| if (!isOk(response)) { | ||
| const errRes = unwrapErr(response); | ||
| logger.warn( | ||
| `Upload Part Copy Failed. Failed to connect with Object Storage: ${errRes.message}`, | ||
| ); | ||
| return response; | ||
| } | ||
|
|
||
| const successResponse = unwrapOk(response) as Response; | ||
| if (successResponse.status !== 200) { | ||
| const errMessage = `Upload Part Copy Failed: ${successResponse.statusText}`; | ||
| logger.warn(errMessage); | ||
| reportToSentry(errMessage); | ||
| } else { | ||
| logger.info(`Upload Part Copy Successful: ${successResponse.statusText}`); | ||
| } | ||
|
|
||
| return createOk(successResponse); | ||
| } |
There was a problem hiding this comment.
Verify the request body handling for upload part copy.
The uploadPartCopy function forwards the request body directly, but for S3 UploadPartCopy operations, the body is typically empty or contains XML with copy instructions. The actual copy source is specified via the x-amz-copy-source header, which should be validated and forwarded.
Consider verifying:
- Whether the body should be forwarded or set to null/empty
- The presence and validation of required headers like
x-amz-copy-source
export async function uploadPartCopy(
reqCtx: RequestContext,
req: Request,
bucketConfig: Bucket,
): Promise<Result<Response, Error>> {
const logger = reqCtx.logger;
logger.info("[S3 backend] Proxying Upload Part Copy Request...");
const { bucket, objectKey, queryParams } = s3Utils.extractRequestInfo(req);
if (!bucket) {
return createOk(
new Response("Bucket information missing from the request", {
status: 400,
}),
);
}
+ // Validate required headers for copy operation
+ const copySource = req.headers.get("x-amz-copy-source");
+ if (!copySource) {
+ return createOk(
+ new Response("x-amz-copy-source header is required", {
+ status: 400,
+ }),
+ );
+ }
+
const config = bucketConfig.config as S3Config;
const headers = new Headers(req.headers);
headers.set("Host", config.endpoint.replace(/^https?:\/\//, ""));Also, consider whether other 2xx status codes (like 201) should be accepted as success:
const successResponse = unwrapOk(response) as Response;
- if (successResponse.status !== 200) {
+ if (successResponse.status < 200 || successResponse.status >= 300) {
const errMessage = `Upload Part Copy Failed: ${successResponse.statusText}`;
logger.warn(errMessage);
reportToSentry(errMessage);
} else {
logger.info(`Upload Part Copy Successful: ${successResponse.statusText}`);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export async function uploadPartCopy( | |
| reqCtx: RequestContext, | |
| req: Request, | |
| bucketConfig: Bucket, | |
| ): Promise<Result<Response, Error>> { | |
| const logger = reqCtx.logger; | |
| logger.info("[S3 backend] Proxying Upload Part Copy Request..."); | |
| const { bucket, objectKey, queryParams } = s3Utils.extractRequestInfo(req); | |
| if (!bucket) { | |
| return createOk( | |
| new Response("Bucket information missing from the request", { | |
| status: 400, | |
| }), | |
| ); | |
| } | |
| const config = bucketConfig.config as S3Config; | |
| const headers = new Headers(req.headers); | |
| headers.set("Host", config.endpoint.replace(/^https?:\/\//, "")); | |
| // Convert queryParams to URLSearchParams | |
| const params = new URLSearchParams(); | |
| for (const [key, values] of Object.entries(queryParams)) { | |
| for (const value of values) { | |
| params.append(key, value); | |
| } | |
| } | |
| const reqUrl = | |
| `${config.endpoint}/${bucket}/${objectKey}?${params.toString()}`; | |
| const fetchFunc = async () => { | |
| return await fetch(reqUrl, { | |
| method: "PUT", | |
| headers: headers, | |
| // Use the request body directly | |
| body: req.body, | |
| }); | |
| }; | |
| const response = await retryWithExponentialBackoff( | |
| fetchFunc, | |
| ); | |
| if (!isOk(response)) { | |
| const errRes = unwrapErr(response); | |
| logger.warn( | |
| `Upload Part Copy Failed. Failed to connect with Object Storage: ${errRes.message}`, | |
| ); | |
| return response; | |
| } | |
| const successResponse = unwrapOk(response) as Response; | |
| if (successResponse.status !== 200) { | |
| const errMessage = `Upload Part Copy Failed: ${successResponse.statusText}`; | |
| logger.warn(errMessage); | |
| reportToSentry(errMessage); | |
| } else { | |
| logger.info(`Upload Part Copy Successful: ${successResponse.statusText}`); | |
| } | |
| return createOk(successResponse); | |
| } | |
| export async function uploadPartCopy( | |
| reqCtx: RequestContext, | |
| req: Request, | |
| bucketConfig: Bucket, | |
| ): Promise<Result<Response, Error>> { | |
| const logger = reqCtx.logger; | |
| logger.info("[S3 backend] Proxying Upload Part Copy Request..."); | |
| const { bucket, objectKey, queryParams } = s3Utils.extractRequestInfo(req); | |
| if (!bucket) { | |
| return createOk( | |
| new Response("Bucket information missing from the request", { | |
| status: 400, | |
| }), | |
| ); | |
| } | |
| // Validate required headers for copy operation | |
| const copySource = req.headers.get("x-amz-copy-source"); | |
| if (!copySource) { | |
| return createOk( | |
| new Response("x-amz-copy-source header is required", { | |
| status: 400, | |
| }), | |
| ); | |
| } | |
| const config = bucketConfig.config as S3Config; | |
| const headers = new Headers(req.headers); | |
| headers.set("Host", config.endpoint.replace(/^https?:\/\//, "")); | |
| // Convert queryParams to URLSearchParams | |
| const params = new URLSearchParams(); | |
| for (const [key, values] of Object.entries(queryParams)) { | |
| for (const value of values) { | |
| params.append(key, value); | |
| } | |
| } | |
| const reqUrl = | |
| `${config.endpoint}/${bucket}/${objectKey}?${params.toString()}`; | |
| const fetchFunc = async () => { | |
| return await fetch(reqUrl, { | |
| method: "PUT", | |
| headers: headers, | |
| // Use the request body directly | |
| body: req.body, | |
| }); | |
| }; | |
| const response = await retryWithExponentialBackoff( | |
| fetchFunc, | |
| ); | |
| if (!isOk(response)) { | |
| const errRes = unwrapErr(response); | |
| logger.warn( | |
| `Upload Part Copy Failed. Failed to connect with Object Storage: ${errRes.message}`, | |
| ); | |
| return response; | |
| } | |
| const successResponse = unwrapOk(response) as Response; | |
| if (successResponse.status < 200 || successResponse.status >= 300) { | |
| const errMessage = `Upload Part Copy Failed: ${successResponse.statusText}`; | |
| logger.warn(errMessage); | |
| reportToSentry(errMessage); | |
| } else { | |
| logger.info(`Upload Part Copy Successful: ${successResponse.statusText}`); | |
| } | |
| return createOk(successResponse); | |
| } |
🤖 Prompt for AI Agents
In src/backends/s3/objects.ts around lines 544 to 607, the uploadPartCopy
function incorrectly forwards the request body directly, but for S3
UploadPartCopy, the body should typically be empty or contain specific XML, and
the copy source must be specified via the 'x-amz-copy-source' header. Update the
code to validate the presence of the 'x-amz-copy-source' header and forward it
properly. Also, set the request body to null or empty as appropriate instead of
forwarding the original body. Additionally, modify the success status check to
accept other 2xx status codes like 201 as successful responses.
| export function uploadPartCopy( | ||
| _reqCtx: RequestContext, | ||
| _req: Request, | ||
| _bucketConfig: Bucket, | ||
| ): Result<Response, Error> { | ||
| return createOk(NotImplementedException()); | ||
| reqCtx: RequestContext, | ||
| req: Request, | ||
| bucketConfig: Bucket, | ||
| ): Promise<Result<Response, Error>> { | ||
| const logger = reqCtx.logger; |
There was a problem hiding this comment.
Make the uploadPartCopy function properly async.
The function signature is not marked as async but it returns a Promise and uses async operations internally via an IIFE.
-export function uploadPartCopy(
+export async function uploadPartCopy(
reqCtx: RequestContext,
req: Request,
bucketConfig: Bucket,
): Promise<Result<Response, Error>> {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export function uploadPartCopy( | |
| _reqCtx: RequestContext, | |
| _req: Request, | |
| _bucketConfig: Bucket, | |
| ): Result<Response, Error> { | |
| return createOk(NotImplementedException()); | |
| reqCtx: RequestContext, | |
| req: Request, | |
| bucketConfig: Bucket, | |
| ): Promise<Result<Response, Error>> { | |
| const logger = reqCtx.logger; | |
| export async function uploadPartCopy( | |
| reqCtx: RequestContext, | |
| req: Request, | |
| bucketConfig: Bucket, | |
| ): Promise<Result<Response, Error>> { | |
| const logger = reqCtx.logger; |
🤖 Prompt for AI Agents
In src/backends/swift/objects.ts around lines 1413 to 1418, the uploadPartCopy
function is not declared as async but returns a Promise and uses async
operations inside an immediately invoked function expression (IIFE). To fix
this, mark the uploadPartCopy function itself as async and remove the IIFE,
directly using await inside the function body to handle asynchronous operations
properly.
| return (async () => { | ||
| const config: SwiftConfig = bucketConfig.config as SwiftConfig; | ||
| const res = reqCtx.heraldContext.keystoneStore.getConfigAuthMeta(config); | ||
| const { storageUrl: swiftUrl, token: authToken } = res; | ||
| const headers = getSwiftRequestHeaders(authToken); | ||
|
|
||
| // Destination path for the part | ||
| const destUrl = `${swiftUrl}/${bucket}/${object}/${partNumber}`; | ||
|
|
||
| // Source object URL | ||
| const sourceUrl = `${swiftUrl}/${sourceBucket}/${sourceKey}`; | ||
|
|
||
| // First, fetch the source object (with range if specified) | ||
| if (copySourceRange) { | ||
| headers.set("Range", copySourceRange.replace("bytes=", "bytes=")); | ||
| } | ||
|
|
||
| const fetchSourceFunc = async () => { | ||
| return await fetch(sourceUrl, { | ||
| method: "GET", | ||
| headers: headers, | ||
| }); | ||
| }; | ||
|
|
||
| const sourceResponse = await retryWithExponentialBackoff(fetchSourceFunc); | ||
|
|
||
| if (!isOk(sourceResponse)) { | ||
| const errRes = unwrapErr(sourceResponse); | ||
| logger.warn( | ||
| `Upload Part Copy Failed. Failed to fetch source object: ${errRes.message}`, | ||
| ); | ||
| return sourceResponse; | ||
| } | ||
|
|
||
| const sourceSuccessResponse = unwrapOk(sourceResponse); | ||
| if ( | ||
| sourceSuccessResponse.status !== 200 && | ||
| sourceSuccessResponse.status !== 206 | ||
| ) { | ||
| const errMessage = | ||
| `Upload Part Copy Failed: Source object not found or inaccessible: ${sourceSuccessResponse.statusText}`; | ||
| logger.warn(errMessage); | ||
| reportToSentry(errMessage); | ||
| return createOk(InvalidRequestException(errMessage)); | ||
| } | ||
|
|
||
| // Get the source content | ||
| const sourceContent = await sourceSuccessResponse.arrayBuffer(); | ||
|
|
||
| // Put the content to the destination part | ||
| const putHeaders = getSwiftRequestHeaders(authToken); | ||
| const fetchFunc = async () => { | ||
| return await fetch(destUrl, { | ||
| method: "PUT", | ||
| headers: putHeaders, | ||
| body: sourceContent, | ||
| }); | ||
| }; | ||
|
|
||
| const response = await retryWithExponentialBackoff(fetchFunc); | ||
|
|
||
| if (!isOk(response)) { | ||
| const errRes = unwrapErr(response); | ||
| logger.warn( | ||
| `Upload Part Copy Failed. Failed to connect with Object Storage: ${errRes.message}`, | ||
| ); | ||
| return response; | ||
| } | ||
|
|
||
| const successResponse = unwrapOk(response); | ||
| const lastModified = new Date().toISOString(); | ||
| if (successResponse.status !== 201) { | ||
| const errMessage = | ||
| `Upload Part Copy Failed: ${successResponse.statusText}`; | ||
| logger.warn(errMessage); | ||
| reportToSentry(errMessage); | ||
| return createOk(InvalidRequestException(errMessage)); | ||
| } else { | ||
| logger.info(`Upload Part Copy Successful: ${successResponse.statusText}`); | ||
|
|
||
| // Update the multipart upload session file | ||
| const multipartSessionPath = `${MULTIPART_UPLOADS_PATH}/${uploadId}.json`; | ||
| const multipartSessionUrl = | ||
| `${swiftUrl}/${bucket}/${multipartSessionPath}`; | ||
|
|
||
| try { | ||
| const headers = getSwiftRequestHeaders(authToken); | ||
| // Fetch the current session JSON | ||
| const getSessionFunc = async () => { | ||
| return await fetch(multipartSessionUrl, { | ||
| method: "GET", | ||
| headers: headers, | ||
| }); | ||
| }; | ||
|
|
||
| const getSessionResponse = await retryWithExponentialBackoff( | ||
| getSessionFunc, | ||
| ); | ||
|
|
||
| if ( | ||
| !isOk(getSessionResponse) || | ||
| unwrapOk(getSessionResponse).status === 404 | ||
| ) { | ||
| logger.error( | ||
| `Multipart upload session file not found for uploadId ${uploadId} at ${multipartSessionPath}`, | ||
| ); | ||
| return createOk(NoSuchUploadException()); | ||
| } | ||
|
|
||
| const sessionResponse = unwrapOk(getSessionResponse); | ||
|
|
||
| // Check content type to ensure we're getting JSON | ||
| const contentType = sessionResponse.headers.get("Content-Type"); | ||
| if (!contentType || !contentType.includes("application/json")) { | ||
| logger.error( | ||
| `Invalid content type for multipart session file: ${contentType}. Expected application/json for uploadId ${uploadId}`, | ||
| ); | ||
| // Continue with response generation without updating session | ||
| } else { | ||
| try { | ||
| const sessionJson = await sessionResponse.json(); | ||
|
|
||
| // Prepare part metadata | ||
| const eTag = successResponse.headers.get("etag") || | ||
| successResponse.headers.get("ETag") || ""; | ||
|
|
||
| // Get content length from the source response | ||
| const size = parseInt( | ||
| sourceSuccessResponse.headers.get("Content-Length") || "0", | ||
| 10, | ||
| ) || | ||
| (endByte !== undefined | ||
| ? (endByte - startByte + 1) | ||
| : sourceContent.byteLength); | ||
|
|
||
| const partMeta = { | ||
| partNumber: Array.isArray(partNumber) | ||
| ? partNumber[0] | ||
| : partNumber, | ||
| eTag, | ||
| size, | ||
| lastModified, | ||
| }; | ||
|
|
||
| // Update or create the parts array | ||
| if (!Array.isArray(sessionJson.parts)) sessionJson.parts = []; | ||
|
|
||
| // Remove any existing entry for this partNumber | ||
| sessionJson.parts = sessionJson.parts.filter( | ||
| (p: { partNumber: string }) => | ||
| p.partNumber !== partMeta.partNumber, | ||
| ); | ||
|
|
||
| sessionJson.parts.push(partMeta); | ||
|
|
||
| // Save the updated session JSON | ||
| headers.set("Content-Type", "application/json"); | ||
| const putSessionFunc = async () => { | ||
| return await fetch(multipartSessionUrl, { | ||
| method: "PUT", | ||
| headers: headers, | ||
| body: JSON.stringify(sessionJson), | ||
| }); | ||
| }; | ||
|
|
||
| const putSessionResponse = await retryWithExponentialBackoff( | ||
| putSessionFunc, | ||
| ); | ||
|
|
||
| if (!isOk(putSessionResponse) || !unwrapOk(putSessionResponse).ok) { | ||
| logger.error( | ||
| `Failed to update multipart upload session file for uploadId ${uploadId} at ${multipartSessionPath}`, | ||
| ); | ||
| } else { | ||
| logger.info( | ||
| `Updated multipart upload session file for uploadId ${uploadId} with part ${partMeta.partNumber}`, | ||
| ); | ||
| } | ||
| } catch (error) { | ||
| logger.error( | ||
| `Error parsing multipart upload session file for uploadId ${uploadId}: ${ | ||
| error instanceof Error ? error.message : String(error) | ||
| }`, | ||
| ); | ||
| // Continue with response generation without updating session | ||
| } | ||
| } | ||
| } catch (error) { | ||
| logger.error( | ||
| `Error updating multipart upload session file for uploadId ${uploadId}: ${ | ||
| error instanceof Error ? error.message : String(error) | ||
| }`, | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| // Generate S3 CopyPartResult XML response | ||
| const eTag = successResponse.headers.get("etag") || | ||
| successResponse.headers.get("ETag") || ""; | ||
|
|
||
| const xmlResponse = `<?xml version="1.0" encoding="UTF-8"?> | ||
| <CopyPartResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> | ||
| <LastModified>${lastModified}</LastModified> | ||
| <ETag>${eTag}</ETag> | ||
| </CopyPartResult>`; | ||
|
|
||
| return createOk( | ||
| new Response(xmlResponse, { | ||
| status: 200, | ||
| headers: { | ||
| "Content-Type": "application/xml", | ||
| "x-amz-request-id": getRandomUUID(), | ||
| "x-amz-id-2": getRandomUUID(), | ||
| }, | ||
| }), | ||
| ); | ||
| })(); | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Consider refactoring the IIFE pattern in uploadPartCopy.
The function uses an immediately invoked async function expression (IIFE) pattern which makes the code harder to read and maintain. Since the function should be async, the IIFE is unnecessary.
Remove the IIFE wrapper and make the function body directly async:
- return (async () => {
const config: SwiftConfig = bucketConfig.config as SwiftConfig;
// ... rest of the implementation
return createOk(
new Response(xmlResponse, {
status: 200,
headers: {
"Content-Type": "application/xml",
"x-amz-request-id": getRandomUUID(),
"x-amz-id-2": getRandomUUID(),
},
}),
);
- })();Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/backends/swift/objects.ts between lines 1468 and 1685, the uploadPartCopy
function uses an unnecessary immediately invoked async function expression
(IIFE) which complicates readability and maintenance. Refactor by removing the
IIFE wrapper and instead declare the function itself as async, moving the IIFE's
inner code directly into the function body to simplify the async flow.
| "Authorization": | ||
| `AWS ${credentials.accessKeyId}:${credentials.secretAccessKey}`, |
There was a problem hiding this comment.
Fix the authorization header format.
The authorization header is incorrectly formatted. AWS S3 doesn't accept simple AWS accessKeyId:secretAccessKey format. This should use proper AWS Signature Version 4 or at least AWS Signature Version 2 format.
Consider using a proper AWS signature library or the SDK's request signing capabilities:
- "Authorization":
- `AWS ${credentials.accessKeyId}:${credentials.secretAccessKey}`,
+ // TODO: Use proper AWS signature
+ // For now, this might only work with non-authenticating test backendsCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In tests/swift/basic/uploadpartcopy_test.ts around lines 103 to 104, the
Authorization header is incorrectly formatted as a simple concatenation of
accessKeyId and secretAccessKey, which AWS S3 does not accept. Replace this
manual header with a properly signed Authorization header using AWS Signature
Version 4 or Version 2 by integrating an AWS SDK signing method or a dedicated
AWS signature library to generate the correct header dynamically.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
tests/swift/basic/uploadpartcopy_checksum_test.ts (6)
13-13: Remove commented out import.Remove the commented out logging middleware import as it's not being used.
-// import { loggingMiddleware } from "../../utils/mod.ts";
42-44: Remove commented out middleware code.Remove the commented out middleware stack addition as it's not being used.
-// s3.middlewareStack.add(loggingMiddleware, { -// step: "finalizeRequest", -// });
24-24: Consider reducing test file size for faster execution.The 300MB test file might significantly slow down the test suite. Consider whether a smaller file (e.g., 50-100MB) would still adequately test the multipart copy functionality while improving test performance.
-const SOURCE_SIZE = 300 * MB; +const SOURCE_SIZE = 50 * MB; // Reduced for faster test execution
56-66: Extract credential handling to avoid duplication.The credential handling logic is duplicated between the S3 client initialization (lines 31-36) and this helper function. Consider extracting it to a single location.
+// Move this before S3 client initialization +function getCredentials() { + return "accessKeyId" in containerConfig.credentials + ? { + accessKeyId: containerConfig.credentials.accessKeyId, + secretAccessKey: containerConfig.credentials.secretAccessKey, + } + : { + accessKeyId: containerConfig.credentials.username, + secretAccessKey: containerConfig.credentials.password, + }; +} + const s3 = new S3Client({ - credentials: "accessKeyId" in containerConfig.credentials - ? containerConfig.credentials - : { - accessKeyId: containerConfig.credentials.username, - secretAccessKey: containerConfig.credentials.password, - }, + credentials: getCredentials(), region: containerConfig.region, forcePathStyle: true, endpoint: proxyUrl, }); -// Helper function to get credentials in a type-safe way -function getCredentials() { - return "accessKeyId" in containerConfig.credentials - ? { - accessKeyId: containerConfig.credentials.accessKeyId, - secretAccessKey: containerConfig.credentials.secretAccessKey, - } - : { - accessKeyId: containerConfig.credentials.username, - secretAccessKey: containerConfig.credentials.password, - }; -}
197-197: Remove commented out debug code.Remove the commented out setTimeout debug line.
- // await new Promise((resolve) => setTimeout(resolve, 50000));
303-305: Consider removing or reducing the arbitrary delay.The 3-second delay might not be necessary or could be reduced. If object assembly is eventually consistent, consider using a retry mechanism with exponential backoff instead of a fixed delay.
- await t.step("Wait for object assembly", async () => { - console.log("Waiting for object assembly..."); - await new Promise((resolve) => setTimeout(resolve, 3000)); - }); + // Remove this step if object assembly is synchronous, or implement retry logic in the verification step
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/swift/basic/uploadpartcopy_checksum_test.ts(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: destifo
PR: expnt/herald#54
File: src/backends/swift/objects.ts:0-0
Timestamp: 2025-07-04T07:25:29.056Z
Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as <uploadId>.json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.
Learnt from: destifo
PR: expnt/herald#10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
tests/swift/basic/uploadpartcopy_checksum_test.ts (4)
Learnt from: destifo
PR: #54
File: src/backends/swift/objects.ts:0-0
Timestamp: 2025-07-04T07:25:29.056Z
Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as .json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.
Learnt from: destifo
PR: #49
File: tests/_others/_task_queue_test.ts:7-12
Timestamp: 2025-06-26T12:47:53.021Z
Learning: The test in tests/_others/_task_queue_test.ts is specifically designed to test the task store functionality with an S3 backend. The direct type assertion globalConfig.task_store_backend as S3Config is safe because the test is intentionally S3-specific and uses S3 APIs throughout.
Learnt from: destifo
PR: #10
File: src/backends/mirror.ts:170-172
Timestamp: 2025-01-28T17:41:43.238Z
Learning: In the mirror operations (src/backends/mirror.ts), bucket existence is guaranteed by design as mirror tasks are only created for buckets that exist in the bucket store, which is initialized during startup. Therefore, non-null assertions when finding buckets in the store are safe.
Learnt from: destifo
PR: #10
File: src/backends/s3/buckets.ts:0-0
Timestamp: 2025-01-28T17:36:22.149Z
Learning: The routeQueryParamedRequest and headBucket functions in src/backends/s3/buckets.ts already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test-full
🔇 Additional comments (1)
tests/swift/basic/uploadpartcopy_checksum_test.ts (1)
183-189: Incorrect AWS signature implementation.The authorization header is using an incorrect and insecure format. AWS signatures require proper HMAC-based calculation, not simple concatenation of credentials.
For testing purposes, if the proxy accepts this simplified format, add a comment explaining this is test-only. For production-like testing, implement proper AWS Signature V4:
// Add comment if this is intentional for testing const headers = { // WARNING: This is a test-only simplified auth format, not AWS-compliant "Authorization": `AWS ${credentials.accessKeyId}:${credentials.secretAccessKey}`, "x-amz-copy-source": `${containerName}/${sourceObjectKey}`, "x-amz-copy-source-range": `bytes=${startByte}-${endByte}`, "x-amz-date": date, };Consider using the AWS SDK's built-in request signing or implementing proper AWS Signature V4 if testing against real S3-compatible services.
⛔ Skipped due to learnings
Learnt from: destifo PR: expnt/herald#41 File: src/backends/mirror.ts:166-174 Timestamp: 2025-06-16T14:42:52.174Z Learning: In src/backends/mirror.ts, when creating replica requests that reuse originalRequest.headers, the authentication headers (Authorization, x-amz-date, etc.) are automatically flushed/replaced down the request pipeline by functions like s3.putObject, so there's no signature mismatch issue with forwarding the original headers.Learnt from: destifo PR: expnt/herald#49 File: tests/_others/_task_queue_test.ts:7-12 Timestamp: 2025-06-26T12:47:53.021Z Learning: The test in `tests/_others/_task_queue_test.ts` is specifically designed to test the task store functionality with an S3 backend. The direct type assertion `globalConfig.task_store_backend as S3Config` is safe because the test is intentionally S3-specific and uses S3 APIs throughout.Learnt from: Yohe-Am PR: expnt/herald#37 File: src/utils/signer.ts:140-142 Timestamp: 2025-05-05T19:02:11.966Z Learning: In pre-signed S3 URL requests, parameters like X-Amz-Date are parsed from query params (not headers) and are validated using Zod schema.Learnt from: destifo PR: expnt/herald#10 File: src/backends/s3/buckets.ts:0-0 Timestamp: 2025-01-28T17:36:22.149Z Learning: The `routeQueryParamedRequest` and `headBucket` functions in `src/backends/s3/buckets.ts` already implement retry logic for replica failover when the primary request fails, supporting both S3 and Swift replicas.Learnt from: destifo PR: expnt/herald#54 File: src/backends/swift/objects.ts:0-0 Timestamp: 2025-07-04T07:25:29.056Z Learning: In src/backends/swift/objects.ts, the multipart upload system was redesigned to use individual per-upload session files (stored as <uploadId>.json) instead of a centralized index.json file. This design choice was made intentionally to avoid conflicts and race conditions that occur when multiple operations try to update the same index file simultaneously. The N+1 HTTP requests approach is preferred over the single index file approach for data consistency, even though it has performance implications.Learnt from: destifo PR: expnt/herald#20 File: src/backends/swift/objects.ts:189-189 Timestamp: 2025-04-23T09:09:42.386Z Learning: In the herald project, write operations (putObject, deleteObject, copyObject, etc.) and read operations (getObject, listObjects, etc.) follow different patterns: 1. Write operations use `const response` without replica retry logic, often using mirrorOperation for eventual consistency 2. Read operations use `let response` with replica failover logic when primary fails
|
|
||
| // Create a source object to copy from (simulating the original Docker layer) | ||
| await t.step("Create Source Object (Docker Layer)", async () => { | ||
| const tempFile = await createTempFile(SOURCE_SIZE / MB); // 30MB file |
There was a problem hiding this comment.
Fix incorrect comment about file size.
The comment says "30MB file" but the actual size is 300MB based on SOURCE_SIZE.
- const tempFile = await createTempFile(SOURCE_SIZE / MB); // 30MB file
+ const tempFile = await createTempFile(SOURCE_SIZE / MB); // 300MB file📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const tempFile = await createTempFile(SOURCE_SIZE / MB); // 30MB file | |
| const tempFile = await createTempFile(SOURCE_SIZE / MB); // 300MB file |
🤖 Prompt for AI Agents
In tests/swift/basic/uploadpartcopy_checksum_test.ts at line 96, the comment
incorrectly states the file size as "30MB file" while the actual size is 300MB
based on SOURCE_SIZE. Update the comment to accurately reflect the file size as
"300MB file" to avoid confusion.
Migration notes
Summary by CodeRabbit