[Improvement-16994][TaskPlugin] support retry for every api call for serverless spark#17476
Conversation
```java
StartJobRunRequest startJobRunRequest = buildStartJobRunRequest(aliyunServerlessSparkParameters);
StartJobRunResponse startJobRunResponse = RetryUtils.retryFunction(() -> {
    try {
        return aliyunServerlessSparkClient.startJobRun(
                aliyunServerlessSparkParameters.getWorkspaceId(), startJobRunRequest);
    } catch (Exception e) {
        throw new AliyunServerlessSparkTaskException("Failed to start job run!");
    }
});
```
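The wrapper above delegates the actual retry loop to `RetryUtils.retryFunction`. The real DolphinScheduler utility may have a different signature; the following is a minimal, self-contained sketch (hypothetical `maxAttempts`/`backoffMs` parameters) of what such a retry helper typically does: re-invoke the supplier on failure, back off between attempts, and rethrow the last exception once attempts are exhausted.

```java
import java.util.function.Supplier;

// Minimal sketch of a retry helper in the spirit of RetryUtils.retryFunction.
// The parameter names and signature are assumptions for illustration only.
public class RetrySketch {

    public static <T> T retryFunction(Supplier<T> action, int maxAttempts, long backoffMs) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Simple fixed backoff between attempts.
                        Thread.sleep(backoffMs);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw last;
                    }
                }
            }
        }
        // All attempts failed: surface the last error to the caller.
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds: the helper returns on the third attempt.
        String result = retryFunction(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new RuntimeException("transient failure");
            }
            return "ok";
        }, 5, 1L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Note that this blanket retry is what prompts the idempotency discussion below: it re-sends the request even when the first attempt may have succeeded server-side.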
There seems to be a timeout issue here: if the HTTP request times out, the client will retry, but the server side might have already handled the previous request, so the request could end up being handled twice. I'm unsure whether the service side has implemented idempotency handling, because a new token is passed on each attempt, so the server side cannot tell that the second request is a retry.
@ruanwenjun @abzymeatsjtu It looks like the token is generated and set when the request is initialized (line#257), so I assume idempotency is fine here?
Ah, I missed that. If the server side can implement idempotency handling via tokens, that would be a great solution.
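The key point in the exchange above is that the client token must be generated once when the request object is built, not per attempt, so that every retry sends the same token and the server can deduplicate. The request class below is a hypothetical stub, not the real Alibaba Cloud `StartJobRunRequest`; it only illustrates the property being discussed:

```java
import java.util.UUID;

// Hedged sketch: a client token fixed at request-construction time makes
// retries idempotent, because every retry reuses the same request object
// and therefore sends an identical token. The stub class is illustrative.
public class TokenRetrySketch {

    static class StartJobRunRequestStub {
        // Generated exactly once, when the request is constructed --
        // NOT regenerated on each retry attempt.
        final String clientToken = UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        StartJobRunRequestStub request = new StartJobRunRequestStub();
        String firstAttemptToken = request.clientToken;

        // Simulated retry loop: each attempt reuses the same request object,
        // so the server sees the same token every time and can deduplicate.
        boolean sameTokenOnEveryAttempt = true;
        for (int attempt = 0; attempt < 3; attempt++) {
            sameTokenOnEveryAttempt &= request.clientToken.equals(firstAttemptToken);
        }
        System.out.println(sameTokenOnEveryAttempt);
    }
}
```

If instead the token were regenerated inside the retry lambda, each attempt would look like a fresh request to the server, which is exactly the double-handling concern raised in the first comment.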
…serverless spark (apache#17476) * [Improvement-16994][TaskPlugin] support retry for every api call for serverless spark --------- Co-authored-by: sunyifan.syf <sunyifan.syf@alibaba-inc.com> Co-authored-by: Eric Gao <ericgao.apache@gmail.com>




Support retry for every API call for EMR Serverless Spark; this will improve the robustness of this task plugin against temporary malfunctions of the remote service.
part of #16994